[
https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15728618#comment-15728618
]
SammiChen edited comment on HDFS-11072 at 12/7/16 12:13 PM:
------------------------------------------------------------
Andrew, thanks very much for taking the time to review the patch!
bq. Can we just say "replication" rather than "continuous replicate"? e.g.
"getReplicationPolicy" instead of "getContinuousReplicatePolicy"
"continuous replicate" was chosen because I expect a combination of
"replication" plus "erasure coding" in the planned phase 2 of erasure coding. So
I used "continuous replicate" to distinguish it from a future "erasure coding
replicate". Does that make sense?
bq. Note that setting a "replication" EC policy is still different from
unsetting. Unsetting means the policy will be inherited from an ancestor.
Setting a "replication" policy means the "replication" policy will be used.
Imagine a situation where "/a" has RS 6,3 set and "/a/b" has XOR 2,1
set. On "/a/b", unsetting vs. setting "replication" will have different
effects. So we also need an unset API, similar to the unset storage policy API.
I agree with you, and the implementation already matches what you describe. I
will add a new unset API.
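To make the difference concrete, here is a minimal sketch of the inheritance
rule described above. It is a toy model, not the HDFS API: the class
PolicyTree, its method names, and the policy strings are all hypothetical and
exist only for illustration.

```python
from typing import Dict, Optional

# Hypothetical marker for an explicitly set "replication" policy.
REPLICATION = "replication"


class PolicyTree:
    """Toy model of per-directory EC policies (not the HDFS API)."""

    def __init__(self) -> None:
        # Only explicitly set policies are stored, keyed by directory path.
        self._explicit: Dict[str, str] = {}

    def set_policy(self, path: str, policy: str) -> None:
        self._explicit[path] = policy

    def unset_policy(self, path: str) -> None:
        # Unsetting removes the explicit entry so the ancestor's policy applies.
        self._explicit.pop(path, None)

    def effective_policy(self, path: str) -> Optional[str]:
        # Walk from the path up to the root; the first explicit policy wins.
        current = path
        while True:
            if current in self._explicit:
                return self._explicit[current]
            if current == "/":
                # Nothing set anywhere on the way up to the root.
                return None
            parent = current.rsplit("/", 1)[0]
            current = parent or "/"


tree = PolicyTree()
tree.set_policy("/a", "RS-6-3")
tree.set_policy("/a/b", "XOR-2-1")

# Setting "replication" on /a/b pins it to replication.
tree.set_policy("/a/b", REPLICATION)
after_set = tree.effective_policy("/a/b")

# Unsetting /a/b makes it inherit from /a instead.
tree.unset_policy("/a/b")
after_unset = tree.effective_policy("/a/b")
```

In this model, after_set is the explicit "replication" policy, while
after_unset falls back to the policy set on "/a".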
bq. Do the parameters "1-2-64K" have any meaning? If not, we should explain
that they are meaningless, or hide the parameters so we don't need to talk
about them.
"1-2-64K" is auto-generated from the schema when the replicate policy is
defined. The values are meaningless. At first, I used "null" as the schema to
define the policy, but then I found a checker that requires the schema to be
non-null. Next I tried schema (0-0-0), which breaks other checkers. I think we
would like to keep these checkers to catch mistakes in real EC policies, so in
the end I chose "1-2-64k", which means 1 data block and 2 parity blocks,
roughly matching the default 3-replica case. Rakesh has suggested adding a new
unset API and a new unset policy subcommand in "erasurecode", which makes the
replicate policy internal. So users will not see the policy unless they read
the source code.
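As a rough illustration of why "null" and (0-0-0) are rejected while
"1-2-64K" passes, here is a small sketch. The Schema class, the
policy_suffix function, and the exact checks are assumptions made for this
example; the real checkers in the patch may differ. Reading "64K" as a cell
size in bytes is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Schema:
    """Hypothetical stand-in for an EC schema."""
    data_units: int
    parity_units: int


def policy_suffix(schema: Optional[Schema], cell_size: int) -> str:
    """Validate the schema, then derive a "data-parity-cellsize" string."""
    # Assumed check 1: a null schema is rejected.
    if schema is None:
        raise ValueError("schema must not be null")
    # Assumed check 2: all-zero parameters such as (0-0-0) are rejected.
    if schema.data_units <= 0 or schema.parity_units <= 0 or cell_size <= 0:
        raise ValueError("unit counts and cell size must be positive")
    return f"{schema.data_units}-{schema.parity_units}-{cell_size // 1024}K"


# 1 data block + 2 parity blocks = 3 blocks, like the default 3 replicas.
suffix = policy_suffix(Schema(data_units=1, parity_units=2), 64 * 1024)
```

Under these assumptions, the placeholder values pass the same validation that
a real EC policy would, which is why the checkers can stay in place.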
I will take care of all other comments in the new patch.
> Add ability to unset and change directory EC policy
> ---------------------------------------------------
>
> Key: HDFS-11072
> URL: https://issues.apache.org/jira/browse/HDFS-11072
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: erasure-coding
> Affects Versions: 3.0.0-alpha1
> Reporter: Andrew Wang
> Assignee: SammiChen
> Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch,
> HDFS-11072-v3.patch, HDFS-11072-v4.patch
>
>
> Since the directory-level EC policy simply applies to files at create time,
> it makes sense to make it more similar to storage policies and allow changing
> and unsetting the policy.