[
https://issues.apache.org/jira/browse/HDFS-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947931#comment-15947931
]
Andrew Wang commented on HDFS-10996:
------------------------------------
Hi Sammi, thanks for working on this, took an initial pass at the patch:
* We should file a follow-on to clean up all the create() overloads in
DFSClient, they seem unnecessary since DistributedFileSystem also has very
similar overloads for filling in defaults.
{code}
* Another addition is ecPolicyName which specify a specific erasure coding
* policy on this new file, stop new file from inheriting its parent
* group's whatever policy. A value of null means inheriting policy from
* parent group.
{code}
Recommend we reword as follows:
{code}
* A non-null ecPolicyName specifies an explicit erasure coding policy for
this
* file, overriding the inherited policy. A null ecPolicyName means the file
* will inherit its EC policy from an ancestor (the default).
{code}
* DistributedFileSystem, setEcPolicyName, should we have a
Preconditions.checkNotNull check? Since this is trunk-only, we can also add the
@Nonnull Java 8 annotation on the parameter.
* {{addFile}}, can we reuse the existing code in setErasureCodingPolicy that
checks if a policy is enabled? This way it's centralized.
* {{addFile}}, also prefer that we stick with checking for null rather than
{{StringUtils.isBlank}} since we haven't put any restrictions yet on EC policy
names. That'll probably happen as part of Kai's pluggable EC policy work.
* When changing the test mocks, we should use anyObject() instead of null for
the new parameter.
New unit test:
High-level:
* I don't think we need a new test class for this.
* We already have builder-related tests where we can do validation of the
additional parameter.
* Besides that, we just need a unit test that tries a few different
combinations of parent and per-file policies. This can be parameterized to make
the code simple. I don't think validating file contents is necessary since it's
done by other tests.
Nit stuff:
* Need to start the block comments with two {{*}} for the method and class
comments to be javadoc
* Timeout is set too long, the unit is milliseconds
* Typo: "stripped" -> "striped"
* {{filePath}} should be final.
* A lot of the policy-related class fields aren't used outside of setup, remove
them?
{code}
String ecPolicies =
Arrays.asList(ErasureCodingPolicyManager.getSystemPolicies()).stream()
.map(ErasureCodingPolicy::getName)
.collect(Collectors.joining(", "));
conf.set(DFSConfigKeys.DFS_NAMENODE_EC_POLICIES_ENABLED_KEY, ecPolicies);
{code}
Could you use DFSTestUtil#enableAllECPolicies instead? If you copied this from
somewhere else, please update to use the helper method as well.
{code}
Assert.assertTrue("Erasure coding policy mismatch!",
fileEcPolicy.getName().equals(ecPolicy.getName()));
{code}
Can use assertEquals here.
* testParameters, file1, no need for a try/catch, just let it throw on error
* testParameters, file2, we should assert the exception is expected with
GenericTestUtils#assertExceptionContains
> Ability to specify per-file EC policy at create time
> ----------------------------------------------------
>
> Key: HDFS-10996
> URL: https://issues.apache.org/jira/browse/HDFS-10996
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: erasure-coding
> Affects Versions: 3.0.0-alpha1
> Reporter: Andrew Wang
> Assignee: SammiChen
> Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-10996-v1.patch, HDFS-10996-v2.patch,
> HDFS-10996-v3.patch, HDFS-10996-v4.patch
>
>
> Based on discussion in HDFS-10971, it would be useful to specify the EC
> policy when the file is created. This is useful for situations where app
> requirements do not map nicely to the current directory-level policies.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]