[ 
https://issues.apache.org/jira/browse/HDFS-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13277219#comment-13277219
 ] 

Eli Collins commented on HDFS-2936:
-----------------------------------

bq. For instance, I create file with 3-rep, but get only 2 DNs in pipeline due 
to load or failure for my block writes, then when I call close(), it will hang 
infinitely cause there's only two replicas. 

So the bug here isn't that we need a better way to specify an hdfs-wide min 
replication - we already have dfs.namenode.replication.min. Seems like this is 
a bug, close shouldn't hang if we have fewer than dfs.namenode.replication.min 
replicas.  
                
> Provide a better way to specify a HDFS-wide minimum replication requirement
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-2936
>                 URL: https://issues.apache.org/jira/browse/HDFS-2936
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Harsh J
>            Assignee: Harsh J
>         Attachments: HDFS-2936.patch
>
>
> Currently, if an admin would like to enforce a replication factor for all 
> files on his HDFS, he does not have a way. He may arguably set 
> dfs.replication.min but that is a very hard guarantee and if the pipeline 
> can't afford that number for some reason/failure, the close() does not 
> succeed on the file being written and leads to several issues.
> After discussing with Todd, we feel it would make sense to introduce a second 
> config (which is ${dfs.replication.min} by default) which would act as a 
> minimum specified replication for files. This is different than 
> dfs.replication.min which also ensures that many replicas are recorded before 
> completeFile() returns... perhaps something like ${dfs.replication.min.user}. 
> We can leave dfs.replication.min alone for hard-guarantees and add 
> ${dfs.replication.min.for.block.completion} which could be left at 1 even if 
> dfs.replication.min is >1, and let files complete normally but not be of a 
> low replication factor (so can be monitored and accounted-for later).
> I'm prefering the second option myself. Will post a patch with tests soon.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to