[ https://issues.apache.org/jira/browse/HDFS-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Harsh J updated HDFS-2936: -------------------------- Assignee: (was: Harsh J) > Provide a way to apply a minimum replication factor aside of strict minimum > live replicas feature > ------------------------------------------------------------------------------------------------- > > Key: HDFS-2936 > URL: https://issues.apache.org/jira/browse/HDFS-2936 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode > Affects Versions: 0.23.0 > Reporter: Harsh J > Attachments: HDFS-2936.patch > > > If an admin wishes to enforce replication today for all the users of their > cluster, he may set {{dfs.namenode.replication.min}}. This property prevents > users from creating files with < expected replication factor. > However, the value of minimum replication set by the above value is also > checked at several other points, especially during completeFile (close) > operations. If a condition arises wherein a write's pipeline may have gotten > only < minimum nodes in it, the completeFile operation does not successfully > close the file and the client begins to hang waiting for NN to replicate the > last bad block in the background. This form of hard-guarantee can, for > example, bring down clusters of HBase during high xceiver load on DN, or disk > fill-ups on many of them, etc.. > I propose we should split the property in two parts: > * dfs.namenode.replication.min > ** Stays the same name, but only checks file creation time replication factor > value and during adjustments made via setrep/etc. > * dfs.namenode.replication.min.for.write > ** New property that disconnects the rest of the checks from the above > property, such as the checks done during block commit, file complete/close, > safemode checks for block availability, etc.. > Alternatively, we may also choose to remove the client-side hang of > completeFile/close calls with a set number of retries. This would further > require discussion about how a file-closure handle ought to be handled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org