[ 
https://issues.apache.org/jira/browse/HDFS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164956#comment-13164956
 ] 

Eli Collins commented on HDFS-1765:
-----------------------------------

Thanks for taking this on, Uma. Feedback on your latest patch follows:

BlockManager:
* Why do we have to hold the write lock across chooseUnderReplicatedBlocks and 
computeReplicationWorkForBlocks now? Seems like the current granularity is 
sufficient.
* computeReplicationWork's writeUnlock needs to be in a finally clause
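
A minimal sketch of the finally pattern suggested above (the class, field, and
method names here are illustrative stand-ins, not the actual patch):
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReplicationWorkSketch {
  private final ReentrantReadWriteLock nsLock = new ReentrantReadWriteLock();

  int computeReplicationWork(int blocksToProcess) {
    nsLock.writeLock().lock();
    try {
      // chooseUnderReplicatedBlocks(...) and
      // computeReplicationWorkForBlocks(...) would run here
      return blocksToProcess;
    } finally {
      // always released, even if the work above throws
      nsLock.writeLock().unlock();
    }
  }
}
{code}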

UnderreplicatedBlocks:
* I'd make chooseUnderReplicatedBlocks synchronized instead of synchronizing on 
this
* Why reset all repl indices at the same time? Shouldn't we reset the repl 
index for a priority only when that given priority runs out?
* I'd populate priorityVsReplIndex in the 1st loop so you don't need to check 
the gets for null when setting repleIndex
* I'd make the loop over blocks a while loop, eg
{code}
  while (blockCount < blocksToProcess && neededRepls.hasNext()) {
    blockCount++;
    replIndex++;
    ...
  }
{code}
* "All priority Queues last block index used for replication work." -> "Stores 
the replication index for each priority"
* Rename "priorityVsReplIdx" to "priorityToReplIdx" and "repleIndex" to 
"replIndex"
* param javadocs for decrementReplicationIndex and chooseUnderReplicatedBlocks 
should be on their own line
* I'd remove the comment about "# of blocks to process equals either twice the 
number of live data-nodes" as it's a function of the argument blocksToProcess 
and so this comment is stale as soon as eg 
REPLICATION_WORK_MULTIPLIER_PER_ITERATION (which is multiple callers up) 
changes.
* I'd rename "replicationPriority" to just "priority"
* !neededRep.hasNext() instead of "false == neededRep.hasNext()"
* We don't need to qualify the code in chooseUnderReplicatedBlocks with 
UnderReplicatedBlocks anymore (eg LEVEL and BlockIterator) now that it lives in 
this class
* The warn of unexpected replication priority should be an assert right? The 
block should never have an invalid priority.
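
Pulling the suggestions above together, here's a rough sketch of what the
method could look like: a synchronized method, the priority-to-index map
populated up front, a while loop over the iterator, and a per-priority index
reset only when that priority runs out. All names and the queue representation
are illustrative assumptions, not the actual patch:
{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class UnderReplicatedSketch {
  static final int LEVEL = 5;
  private final List<List<String>> priorityQueues = new ArrayList<>();
  private final Map<Integer, Integer> priorityToReplIndex = new HashMap<>();

  UnderReplicatedSketch() {
    for (int p = 0; p < LEVEL; p++) {
      priorityQueues.add(new ArrayList<>());
      priorityToReplIndex.put(p, 0);  // populated in the first loop
    }
  }

  void add(int priority, String block) {
    priorityQueues.get(priority).add(block);
  }

  synchronized List<String> chooseUnderReplicatedBlocks(int blocksToProcess) {
    List<String> chosen = new ArrayList<>();
    for (int priority = 0;
         priority < LEVEL && chosen.size() < blocksToProcess; priority++) {
      int replIndex = priorityToReplIndex.get(priority);
      List<String> queue = priorityQueues.get(priority);
      Iterator<String> neededRepls =
          queue.listIterator(Math.min(replIndex, queue.size()));
      while (chosen.size() < blocksToProcess && neededRepls.hasNext()) {
        chosen.add(neededRepls.next());
        replIndex++;
      }
      if (!neededRepls.hasNext()) {
        replIndex = 0;  // reset only the priority that ran out
      }
      priorityToReplIndex.put(priority, replIndex);
    }
    return chosen;
  }
}
{code}
This shape would also make the direct unit test suggested below easy to write.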


TestReplPolicy:
* Seems like the test needs to check that there are no blocks in the high 
priority queue but there are still blocks in the low priority queue. It 
currently just tests that the high priority queue is empty, but that could be 
because it has had sufficient time to empty all queues, right?
* Maybe write a unit test directly against chooseUnderReplicatedBlocks?
* Nit: replace 3000 with a constant and set 
DFS_NAMENODE_REPLICATION_INTERVAL_KEY in the conf in the test (you could lower 
it to 1000 so the test runs more quickly).
* Can import static org.junit.Assert.* 
* "replicationinterval" is two words
* "high priority blocks to process very quickly than the low priority blocks" 
-> "high priority blocks are processed before the low priority blocks"

TestNNMetrics:
* Where you've replaced updateMetrics with the new code/comment to sleep, how 
about putting this in a method (eg waitForDeletion) and calling that?
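
For example, the helper could poll the metric until it reaches the expected
value rather than hard-coding a single sleep. This is a hypothetical sketch
(the helper name and signature are assumptions, not existing test code):
{code}
import java.util.function.LongSupplier;

public class MetricsWaitSketch {
  // Poll a metrics gauge until it reaches the expected value or we time out,
  // instead of copying a bare Thread.sleep into each test.
  static boolean waitFor(LongSupplier metric, long expected, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (metric.getAsLong() == expected) {
        return true;
      }
      Thread.sleep(50);  // give the replication monitor time to run
    }
    return metric.getAsLong() == expected;
  }
}
{code}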
                
> Block Replication should respect under-replication block priority
> -----------------------------------------------------------------
>
>                 Key: HDFS-1765
>                 URL: https://issues.apache.org/jira/browse/HDFS-1765
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Hairong Kuang
>            Assignee: Uma Maheswara Rao G
>             Fix For: 0.24.0
>
>         Attachments: HDFS-1765.patch, HDFS-1765.patch, HDFS-1765.pdf, 
> underReplicatedQueue.pdf
>
>
> Currently under-replicated blocks are assigned different priorities depending 
> on how many replicas a block has. However the replication monitor works on 
> blocks in a round-robin fashion. So the newly added high priority blocks 
> won't get replicated until all low-priority blocks are done. One example is 
> that on decommissioning datanode WebUI we often observe that "blocks with 
> only decommissioning replicas" do not get scheduled to replicate before other 
> blocks, so risking data availability if the node is shutdown for repair 
> before decommission completes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
