[ 
https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8833:
----------------------------
    Attachment: HDFS-8833-HDFS-7285.08.patch

Thanks Kai for reporting the test results! Although the patch is large, the 
logic-level changes are really simple. So we shouldn't be expecting much 
changes on test results.

I just updated the patch to fix the issues Kai pointed out. When doing that I 
also found I should address a TODO about non-empty dir policies (not that we 
have reached a conclusion). The added test code is:

{code
    /**
     * Verify that setting EC policy on non-empty directory only affects
     * newly created files under the directory.
     */
    final Path notEmpty = new Path("/nonEmpty");
    fs.mkdir(notEmpty, FsPermission.getDirDefault());
    final Path oldFile = new Path(notEmpty, "old");
    fs.create(oldFile);
    fs.getClient().setErasureCodingPolicy(notEmpty.toString(), null);
    final Path newFile = new Path(notEmpty, "new");
    fs.create(newFile);
    INode oldInode = namesystem.getFSDirectory().getINode(oldFile.toString());
    assertFalse(oldInode.asFile().isStriped());
    INode newInode = namesystem.getFSDirectory().getINode(newFile.toString());
    assertTrue(newInode.asFile().isStriped());
{code}

[~walter.k.su] Could you take a look and see if it looks OK? If so I'll commit 
the patch based on your earlier +1. Thanks a lot.

[~drankye] Thanks for raising the question of unsetting EC policy on root dir, 
it's an important point actually. I think in general, we should support 
unsetting the policy on any EC dir. I'll create a follow-on JIRA to do that.

> Erasure coding: store EC schema and cell size in INodeFile and eliminate 
> notion of EC zones
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8833
>                 URL: https://issues.apache.org/jira/browse/HDFS-8833
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8833-HDFS-7285-merge.00.patch, 
> HDFS-8833-HDFS-7285-merge.01.patch, HDFS-8833-HDFS-7285.02.patch, 
> HDFS-8833-HDFS-7285.03.patch, HDFS-8833-HDFS-7285.04.patch, 
> HDFS-8833-HDFS-7285.05.patch, HDFS-8833-HDFS-7285.06.patch, 
> HDFS-8833-HDFS-7285.07.patch, HDFS-8833-HDFS-7285.08.patch
>
>
> We have [discussed | 
> https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754]
>  storing EC schema with files instead of EC zones and recently revisited the 
> discussion under HDFS-8059.
> As a recap, the _zone_ concept has severe limitations including renaming and 
> nested configuration. Those limitations are valid in encryption for security 
> reasons and it doesn't make sense to carry them over in EC.
> This JIRA aims to store EC schema and cell size on {{INodeFile}} level. For 
> simplicity, we should first implement it as an xattr and consider memory 
> optimizations (such as moving it to file header) as a follow-on. We should 
> also disable changing EC policy on a non-empty file / dir in the first phase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to