[ https://issues.apache.org/jira/browse/HDFS-13088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572328#comment-16572328 ]
Virajith Jalaparti edited comment on HDFS-13088 at 8/7/18 9:00 PM:
-------------------------------------------------------------------

Thanks for the feedback [~elgoiri] and [~ehiggs].

[^HDFS-13088.002.patch] is an alternate approach to implementing this. It adds a new parameter, {{dfs.provided.overreplication.factor}}, which specifies how many extra replicas are allowed for blocks that are PROVIDED. This is a single value for all blocks/files in the system, and it is ephemeral (not necessarily retained across Namenode restarts unless the config value remains the same). However, this approach makes no changes to {{FileSystem}} or {{INodeFile}} and is much less intrusive.

The main change to existing code is in how excess replicas are checked for in {{BlockManager#shouldProcessExtraRedundancy}}: for PROVIDED blocks, the number of replicas allowed before any are treated as excess is the block replication plus the value specified by {{dfs.provided.overreplication.factor}}. For blocks that are not PROVIDED, or for EC blocks, the earlier semantics are retained.

I still need to add tests for this, but I am posting the patch to get it out earlier.


was (Author: virajith):
Thanks for the feedback [~elgoiri] and [~ehiggs].

[^HDFS-13088.002.patch] is an alternate approach to implementing this. It adds a new parameter, {{dfs.provided.overreplication.factor}}, which specifies how many extra replicas are allowed for blocks that are PROVIDED. This is a single value for all blocks/files in the system, and it is ephemeral (not necessarily retained across Namenode restarts unless the config value remains the same). However, this approach makes no changes to {{FileSystem}} or {{INodeFile}} and is much less intrusive.

> Allow HDFS files/blocks to be over-replicated.
> ----------------------------------------------
>
>                 Key: HDFS-13088
>                 URL: https://issues.apache.org/jira/browse/HDFS-13088
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Virajith Jalaparti
>            Assignee: Virajith Jalaparti
>            Priority: Major
>         Attachments: HDFS-13088.001.patch, HDFS-13088.002.patch
>
>
> This JIRA is to add a per-file "over-replication" factor to HDFS. As
> mentioned in HDFS-13069, the over-replication factor is the number of excess
> replicas that will be allowed to exist for a file or block. This is
> beneficial if the application deems that additional replicas of a file are
> needed. In the case of HDFS-13069, it would allow copies of data in PROVIDED
> storage to be cached locally in HDFS in a read-through manner.
> The Namenode will not proactively meet the over-replication factor, i.e., it
> does not schedule replications if the number of replicas for a block is less
> than (replication factor + over-replication factor) as long as they are more
> than the replication factor of the file.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
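
For readers following the thread, the replica-count rules described in the comment and the issue description can be sketched roughly as below. This is a standalone illustration, not code from either patch: the class and method names ({{OverReplicationSketch}}, {{maxAllowedReplicas}}, {{hasExcessReplicas}}, {{needsReplication}}) are hypothetical, the over-replication value is passed in as a plain int rather than read from {{dfs.provided.overreplication.factor}}, and it assumes a block counts as under-replicated only when it has fewer live replicas than the file's replication factor.

```java
// Standalone sketch; none of these names come from the HDFS-13088 patches.
class OverReplicationSketch {

    // Upper bound on replicas before any of them are treated as excess.
    // Assumption: only PROVIDED, non-EC blocks get the extra allowance;
    // every other block keeps the earlier semantics.
    static int maxAllowedReplicas(int replication, int overReplicationFactor,
                                  boolean provided, boolean erasureCoded) {
        if (provided && !erasureCoded) {
            return replication + overReplicationFactor;
        }
        return replication;
    }

    // True when some replicas would be candidates for deletion as excess.
    static boolean hasExcessReplicas(int liveReplicas, int replication,
                                     int overReplicationFactor,
                                     boolean provided, boolean erasureCoded) {
        return liveReplicas > maxAllowedReplicas(
                replication, overReplicationFactor, provided, erasureCoded);
    }

    // The over-replication factor is never proactively met: new replicas are
    // only needed when the count drops below the plain replication factor.
    static boolean needsReplication(int liveReplicas, int replication) {
        return liveReplicas < replication;
    }

    public static void main(String[] args) {
        int replication = 3;
        int overReplication = 2; // stand-in for the configured value

        // A PROVIDED block with 4 or 5 replicas is left alone; a 6th is excess.
        for (int live = 2; live <= 6; live++) {
            System.out.println("live=" + live
                    + " needsReplication=" + needsReplication(live, replication)
                    + " excess=" + hasExcessReplicas(
                            live, replication, overReplication, true, false));
        }
    }
}
```

Under these assumptions, a PROVIDED block with replication 3 and an over-replication value of 2 is neither re-replicated nor trimmed while it has between 3 and 5 live replicas, which is the window that lets read-through cached copies survive.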