[ 
https://issues.apache.org/jira/browse/HDFS-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15566485#comment-15566485
 ] 

Andrew Wang commented on HDFS-10971:
------------------------------------

Thanks for commenting, Zhe. Good thoughts.

bq. Extending this protocol will complicate how users/admins understand EC 
policies, and probably needs more discussions (maybe after the EC feature has 
been used in production for a while).

One use case that has come up for specifying this per file is an app like HBase. 
HBase might want to write its logs replicated and its HFiles erasure coded. 
Since it has level compaction, it might want to EC only the colder HFiles. 
Particularly for the second case, a directory-level EC policy is unlikely to be 
able to capture the application's desired semantics.
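
As a rough illustration of that per-file decision, here is a minimal sketch. 
Every name in it (PerFileStorageSketch, FileKind, Storage, choose, coldLevel) 
is hypothetical and is not an existing HDFS or HBase API. It also assumes that 
a higher compaction level means a colder file.

```java
public class PerFileStorageSketch {
    /** Hypothetical storage choices an application could ask for per file. */
    public enum Storage { REPLICATED, ERASURE_CODED }

    /** Hypothetical file kinds for an HBase-like application. */
    public enum FileKind { WRITE_AHEAD_LOG, HFILE }

    /**
     * Picks a storage layout for one file.
     * Assumption: HFiles at or above coldLevel are "cold" and worth erasure coding.
     */
    public static Storage choose(FileKind kind, int compactionLevel, int coldLevel) {
        if (kind == FileKind.WRITE_AHEAD_LOG) {
            // Logs stay replicated regardless of level.
            return Storage.REPLICATED;
        }
        return compactionLevel >= coldLevel ? Storage.ERASURE_CODED : Storage.REPLICATED;
    }

    public static void main(String[] args) {
        System.out.println(choose(FileKind.WRITE_AHEAD_LOG, 0, 2));
        System.out.println(choose(FileKind.HFILE, 0, 2));
        System.out.println(choose(FileKind.HFILE, 3, 2));
    }
}
```

Hot and cold HFiles can live in the same directory, so a directory-level 
policy cannot express the last two cases at once.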

I filed HDFS-10996 to discuss this further, and I agree that we can address it 
separately.

> Distcp should not copy replication factor if source file is erasure coded
> -------------------------------------------------------------------------
>
>                 Key: HDFS-10971
>                 URL: https://issues.apache.org/jira/browse/HDFS-10971
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: distcp
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-10971.testcase.patch
>
>
> The current erasure coding implementation uses replication factor field to 
> store erasure coding policy.
> Distcp copies the source file's replication factor to the destination if 
> {{-pr}} is specified. However, if the source file is EC, the replication 
> factor (which is EC policy) should not be replicated to the destination file. 
> When an HdfsFileStatus is converted to a FileStatus, the replication factor 
> is set to 0 if it's an EC file.
> In fact, I will attach a test case that shows trying to replicate the 
> replication factor of an EC file results in an IOException: "Requested 
> replication factor of 0 is less than the required minimum of 1 for 
> /tmp/dst/dest2"
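
The behavior requested in the description can be sketched as below. This is 
not the actual distcp code or the eventual patch. PreserveReplicationSketch, 
SourceStatus, and targetReplication are hypothetical names that model only 
the decision: honor {{-pr}} for replicated sources, and fall back to the 
destination default when the source is erasure coded (reported replication 0).

```java
public class PreserveReplicationSketch {
    /** Minimum replication the NameNode accepts, per the error message in the description. */
    public static final short MIN_REPLICATION = 1;

    /** Hypothetical stand-in for the source attributes a copy tool would see. */
    public static final class SourceStatus {
        public final short replication;
        public final boolean erasureCoded;

        public SourceStatus(short replication, boolean erasureCoded) {
            this.replication = replication;
            this.erasureCoded = erasureCoded;
        }
    }

    /**
     * Returns the replication factor to request for the destination file.
     * The source value is copied only if preservation was requested, the
     * source is not erasure coded, and the value is a valid replication factor.
     */
    public static short targetReplication(SourceStatus src,
                                          boolean preserveReplication,
                                          short destinationDefault) {
        if (preserveReplication
                && !src.erasureCoded
                && src.replication >= MIN_REPLICATION) {
            return src.replication;
        }
        return destinationDefault;
    }

    public static void main(String[] args) {
        SourceStatus ecFile = new SourceStatus((short) 0, true);
        SourceStatus replicatedFile = new SourceStatus((short) 2, false);
        short destinationDefault = 3;

        // EC source: the reported replication of 0 is never sent to the destination.
        System.out.println(targetReplication(ecFile, true, destinationDefault));
        // Replicated source: -pr copies the source value.
        System.out.println(targetReplication(replicatedFile, true, destinationDefault));
    }
}
```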



