[
https://issues.apache.org/jira/browse/HDFS-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15022790#comment-15022790
]
Zhe Zhang commented on HDFS-9381:
---------------------------------
Just finished reviewing the patch. A few minor issues:
# {{PendingReplicationBlocks#clear}} should clear
{{unscheduledPendingReplications}} as well?
# {{unscheduledPendingReplications}} could use a simple Javadoc to state the
reason that those blocks were unscheduled. If you don't mind we could also take
the chance to add a Javadoc for {{timedOutItems}}. For a while I thought it
means the replication work times out, then realized it means the block was not
scheduled for replication work within timeout window.
# The below logic needs more discussion:
{code}
296 // If this block exist in unscheduled pending replications
list,
297 // just remove it as this block added back to
toNeededReplications
298 // now.
299 if (unscheduledPendingReplications.contains(block)) {
300 unscheduledPendingReplications.remove(block);
{code}
If the block exists in {{unscheduledPendingReplications}}, it means another EC
recovery work is still ongoing. I guess we shouldn't move it to
{{toNeededReplications}}? It won't be selected anyway.
# So {{toNeededReplications}} includes 2 types of blocks, timed out and
EC-unscheduled. It's a little hard to come up with an intuitive name. Maybe
{{promotedReplications}}? Both types are being promoted from inactive status.
# Some checkstyle issues are valid, e.g. the unused import
# This is up to you: I prefer to use {{GenericTestUtils#waitFor}} to wait for a
condition. But the wait-related code segments in this test case are simple
enough so I'm OK either way.
As a follow-on, we can consider adding to the test the logic of DN finishing an
EC recovery task, so as to activate an unscheduled replication item.
> When same block came for replication for Striped mode, we can move that block
> to PendingReplications
> ----------------------------------------------------------------------------------------------------
>
> Key: HDFS-9381
> URL: https://issues.apache.org/jira/browse/HDFS-9381
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: erasure-coding, namenode
> Affects Versions: 3.0.0
> Reporter: Uma Maheswara Rao G
> Assignee: Uma Maheswara Rao G
> Attachments: HDFS-9381.00.patch, HDFS-9381.01.patch
>
>
> Currently I noticed that we are just returning null if block already exists
> in pendingReplications in replication flow for striped blocks.
> {code}
> if (block.isStriped()) {
> if (pendingNum > 0) {
> // Wait the previous recovery to finish.
> return null;
> }
> {code}
> Here if we just return null and if neededReplications contains only fewer
> blocks(basically by default if less than numliveNodes*2), then same blocks
> can be picked again from neededReplications from next loop as we are not
> removing element from neededReplications. Since this replication process need
> to take fsnamesystmem lock and do, we may spend some time unnecessarily in
> every loop.
> So my suggestion/improvement is:
> Instead of just returning null, how about incrementing pendingReplications
> for this block and remove from neededReplications? and also another point to
> consider here is, to add into pendingReplications, generally we need target
> and it is nothing but to which node we issued replication command. Later when
> after replication success and DN reported it, block will be removed from
> pendingReplications from NN addBlock.
> So since this is newly picked block from neededReplications, we would not
> have selected target yet. So which target to be passed to pendingReplications
> if we add this block? One Option I am thinking is, how about just passing
> srcNode itself as target for this special condition? So, anyway if the block
> is really missed, srcNode will not report it. So this block will not be
> removed from pending replications, so that when it is timed out, it will be
> considered for replication again and that time it will find actual target to
> replicate while processing as part of regular replication flow.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)