[
https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15723896#comment-15723896
]
Andrew Wang commented on HDFS-11072:
------------------------------------
Hi Sammi, thanks for working on this. Some review comments in addition to
Rakesh's:
* Can we just say "replication" rather than "continuous replicate"? e.g.
"getReplicationPolicy" instead of "getContinuousReplicatePolicy"
* Note that setting a "replication" EC policy is still different from
unsetting. Unsetting means the policy will be inherited from an ancestor.
Setting a "replication" policy means the "replication" policy will be used.
Imagine a situation where "/a" has RS 6,3 set and "/a/b" has XOR 2,1
set. On "/a/b", unsetting vs. setting "replication" will have different
effects. So we also need an unset API, similar to the unset storage policy API.
* Comment in ECPolicyManager: I recommend rewording it like this:
{noformat}
/*
* This is a special policy. When this policy is applied to a directory, its
* children will be replicated rather than inheriting an erasure coding policy
* from an ancestor directory.
*
* This policy is only used when setting an erasure coding policy. It will not
* be returned when get erasure coding policy is called.
*/
{noformat}
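To make the set vs. unset distinction concrete, here is a minimal, self-contained sketch of how policy resolution could behave. It is not the NameNode implementation: the class {{EcPolicyResolutionSketch}}, the marker name "REPLICATION", and the policy labels "RS-6-3" and "XOR-2-1" are all hypothetical stand-ins for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of directory-level EC policy resolution (illustration only). */
final class EcPolicyResolutionSketch {
  /** Hypothetical name for the special "replication" policy. */
  static final String REPLICATION = "REPLICATION";

  /** Explicitly set policies, keyed by absolute directory path. */
  private final Map<String, String> explicit = new HashMap<>();

  public void setPolicy(String dir, String policy) {
    explicit.put(dir, policy);
  }

  /** Unsetting removes the explicit entry, so an ancestor's policy applies. */
  public void unsetPolicy(String dir) {
    explicit.remove(dir);
  }

  /**
   * Walks from dir up to the root and returns the nearest explicitly set
   * policy. Returns null when that policy is the replication marker, or when
   * no ancestor has a policy (both mean plain replication).
   */
  public String effectivePolicy(String dir) {
    String cur = dir;
    while (true) {
      String policy = explicit.get(cur);
      if (policy != null) {
        // An explicit "replication" policy stops inheritance here.
        return REPLICATION.equals(policy) ? null : policy;
      }
      if (cur.equals("/")) {
        return null;
      }
      int slash = cur.lastIndexOf('/');
      cur = (slash == 0) ? "/" : cur.substring(0, slash);
    }
  }

  public static void main(String[] args) {
    EcPolicyResolutionSketch ns = new EcPolicyResolutionSketch();
    ns.setPolicy("/a", "RS-6-3");
    ns.setPolicy("/a/b", "XOR-2-1");

    ns.unsetPolicy("/a/b");
    System.out.println(ns.effectivePolicy("/a/b")); // RS-6-3, inherited from /a

    ns.setPolicy("/a/b", REPLICATION);
    System.out.println(ns.effectivePolicy("/a/b")); // null, i.e. replicated
  }
}
```

In this model, unsetting on "/a/b" falls back to the RS 6,3 policy on "/a", while setting "replication" on "/a/b" overrides it, which is why the two operations need separate APIs.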
* FSDirErasureCodingOp: rename "ecXAttrExisted" to "hasEcXAttr"
* FSDirErasureCodingOp: should rename createErasureCodingPolicyXAttr to
setErasureCodingPolicyXAttr, since it can now replace an existing xattr
* Why do we hide the replication policy for calls to
getErasureCodingPolicyForPath for directories? Makes sense for files since they
are just replicated, but directory-level policies act like normal EC policies
in that they can be inherited.
* Rather than adding a new function getErasureCodingPolicyXAttrForLastINode to set a
boolean, seems like we could call a "hasErasureCodingPolicy" method (the
current one is also unused). Since this is only for paths that exist, it's safe
to use FSDirectory.resolveLastINode instead of a for loop that skips nulls. We
only need that for loop when creating a new path.
* To assist with the above, I feel like we should have a
{{getErasureCodingPolicy(INode)}} method that does this block in
getErasureCodingPolicyForPath:
{code}
final XAttrFeature xaf = inode.getXAttrFeature();
if (xaf != null) {
  XAttr xattr = xaf.getXAttr(XATTR_ERASURECODING_POLICY);
  if (xattr != null) {
    ByteArrayInputStream bIn =
        new ByteArrayInputStream(xattr.getValue());
    DataInputStream dIn = new DataInputStream(bIn);
    String ecPolicyName = WritableUtils.readString(dIn);
    if (!ecPolicyName.equalsIgnoreCase(ErasureCodingPolicyManager
        .getContinuousReplicatePolicy().getName())) {
      return fsd.getFSNamesystem().getErasureCodingPolicyManager().
          getPolicyByName(ecPolicyName);
    } else {
      return null;
    }
  }
}
{code}
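As a rough illustration of the refactoring suggested above, the sketch below shows one per-inode helper backing both the "has an EC xattr" check and the policy lookup. It is self-contained and does not use the real HDFS classes: {{FakeINode}}, {{encode}}, and the "REPLICATION" name are hypothetical, and {{DataInputStream.readUTF}} stands in for {{WritableUtils.readString}}, which uses a different wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Illustration only: one per-inode helper shared by both callers. */
final class EcXAttrHelperSketch {
  /** Hypothetical name of the special replication policy. */
  static final String REPLICATION_POLICY_NAME = "REPLICATION";

  /** Stand-in for an inode; holds the raw EC xattr value, or null if absent. */
  static final class FakeINode {
    final byte[] ecXAttrValue;

    FakeINode(byte[] ecXAttrValue) {
      this.ecXAttrValue = ecXAttrValue;
    }
  }

  /** Serializes a policy name the way this sketch expects to read it back. */
  public static byte[] encode(String policyName) {
    ByteArrayOutputStream bOut = new ByteArrayOutputStream();
    try (DataOutputStream dOut = new DataOutputStream(bOut)) {
      dOut.writeUTF(policyName);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bOut.toByteArray();
  }

  /** True if the inode carries any EC xattr, including the replication one. */
  public static boolean hasEcXAttr(FakeINode inode) {
    return inode.ecXAttrValue != null;
  }

  /**
   * Returns the EC policy name set directly on this inode, or null when no
   * xattr is present or the xattr names the replication policy.
   */
  public static String getErasureCodingPolicyName(FakeINode inode) {
    if (!hasEcXAttr(inode)) {
      return null;
    }
    try (DataInputStream dIn = new DataInputStream(
        new ByteArrayInputStream(inode.ecXAttrValue))) {
      String name = dIn.readUTF();
      return name.equalsIgnoreCase(REPLICATION_POLICY_NAME) ? null : name;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With a helper of this shape, the path-based lookup could resolve the last inode once and delegate to it, rather than duplicating the xattr decoding in each caller.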
Documentation:
* "Another purpose of this special policy is to unset the erasure coding policy
of a directory back to the traditional replications.", I don't think we should
say this, since we also support actually unsetting the EC policy. The
replication policy is still a policy that overrides policies on ancestor
directories.
* Do the parameters "1-2-64K" have any meaning? If not, we should explain that
they are meaningless, or hide the parameters so we don't need to talk about
them.
Tests:
* It's better to use more specific asserts like {{assertNull}},
{{assertNotNull}}, etc. instead of just {{assertTrue}}
* Would be good to create files with different replication factors.
> Add ability to unset and change directory EC policy
> ---------------------------------------------------
>
> Key: HDFS-11072
> URL: https://issues.apache.org/jira/browse/HDFS-11072
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: erasure-coding
> Affects Versions: 3.0.0-alpha1
> Reporter: Andrew Wang
> Assignee: SammiChen
> Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch,
> HDFS-11072-v3.patch, HDFS-11072-v4.patch
>
>
> Since the directory-level EC policy simply applies to files at create time,
> it makes sense to make it more similar to storage policies and allow changing
> and unsetting the policy.