[ 
https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736500#comment-15736500
 ] 

Andrew Wang commented on HDFS-11072:
------------------------------------

Thanks for the detailed writeup Sammi!

bq. Andrew, can you explain why you think for a directory, the replication 
policy should be returned when getErasureCodingPolicy is called?

I was thinking of a use case where a user wants to redo the policies on a 
directory tree. Unless they can distinguish between states 1 and 2 vs 3 via a 
get API, they need to call set/remove on every directory to get exactly what 
they want. Another use case is distcp, where you might want to exactly replicate 
the same storage policy setup on a destination cluster.

Looking at StoragePolicy though, it just returns the inherited policy. I don't 
see a way to check if a policy is inherited or explicitly set. IMO this is a 
flaw (particularly for distcp), but it's better to follow suit for continuity. 
It's also less bad for EC since there's no way to change the EC policy for a 
file.

Also referencing StoragePolicy, there's the idea of a default storage policy 
for the cluster. This is hardcoded to HOT, and is returned when you call 
getStoragePolicy. To align with getStoragePolicy, arguably getECPolicy should 
return a special "replicated" ECPolicy, but that makes {{isErasureCoded}} 
checks more complicated.

So, all said, let's just return the inherited EC policy if it's not replicated. 
We also need to validate the {{isErasureCoded}} checks internally, since I know, 
for instance, that we restrict the set of storage policies that work on erasure 
coded files.
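
To make the inherited-versus-explicit distinction concrete, here is a minimal 
sketch in plain Java. It is a toy model and not HDFS code: the class 
{{EcPolicySketch}}, the {{Dir}} type, the method names, and the policy name 
"RS-6-3" are all assumptions made up for illustration. It only shows the lookup 
behavior discussed above, where a getter returns the nearest policy set on the 
directory or an ancestor, and returns nothing when the path is replicated.

```java
import java.util.Optional;

/** Toy model of a directory tree; not HDFS code. All names are hypothetical. */
final class EcPolicySketch {

    /** A directory with an optional explicitly set EC policy name. */
    static final class Dir {
        final Dir parent;       // null for the root
        String explicitPolicy;  // null means nothing is set on this directory

        Dir(Dir parent) { this.parent = parent; }
    }

    /** Policy set directly on this directory, ignoring ancestors. */
    static Optional<String> explicitPolicy(Dir dir) {
        return Optional.ofNullable(dir.explicitPolicy);
    }

    /**
     * Effective policy: the nearest explicitly set policy on the directory
     * or one of its ancestors. Empty means the directory is replicated.
     */
    static Optional<String> effectivePolicy(Dir dir) {
        for (Dir d = dir; d != null; d = d.parent) {
            if (d.explicitPolicy != null) {
                return Optional.of(d.explicitPolicy);
            }
        }
        return Optional.empty();
    }

    /** Mirrors the isErasureCoded idea: true only when an EC policy applies. */
    static boolean isErasureCoded(Dir dir) {
        return effectivePolicy(dir).isPresent();
    }

    public static void main(String[] args) {
        Dir root = new Dir(null);
        Dir warehouse = new Dir(root);
        Dir table = new Dir(warehouse);
        warehouse.explicitPolicy = "RS-6-3"; // illustrative policy name

        System.out.println(effectivePolicy(root));   // Optional.empty
        System.out.println(effectivePolicy(table));  // Optional[RS-6-3]
        System.out.println(explicitPolicy(table));   // Optional.empty
    }
}
```

In this sketch, a caller that only sees {{effectivePolicy}} cannot tell that 
{{table}} inherits its policy from {{warehouse}}. That is the gap for the 
directory-tree and distcp use cases, and a separate {{explicitPolicy}}-style 
lookup is one hypothetical way to close it.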

bq. Unless we introduce another policy manipulation API, such as 
"setDefaultReplicationPolicy" which handles change directory from ec policy to 
replication policy.

I like this idea, since calling the replication policy an "ECPolicy" is a 
misnomer. It's also confusing if we make people set it via setECPolicy but 
don't return it in getECPolicy.

> Add ability to unset and change directory EC policy
> ---------------------------------------------------
>
>                 Key: HDFS-11072
>                 URL: https://issues.apache.org/jira/browse/HDFS-11072
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Andrew Wang
>            Assignee: SammiChen
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch, 
> HDFS-11072-v3.patch, HDFS-11072-v4.patch
>
>
> Since the directory-level EC policy simply applies to files at create time, 
> it makes sense to make it more similar to storage policies and allow changing 
> and unsetting the policy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
