[
https://issues.apache.org/jira/browse/HDFS-7960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372295#comment-14372295
]
Andrew Wang commented on HDFS-7960:
-----------------------------------
Reading through it again, a few comments:
NNRpcServer:
* there's a TODO: FIXME, we aren't passing in the BlockReportContext.
I think processReport doesn't need that last parameter anymore either, since
the information is in the BR context.
BPServiceActor:
* Is there a need for BR ids to be monotonically increasing? If not, using a
random number seems better. I see you do a fixup by checking against the
previous ID, but with random IDs this shouldn't be necessary.
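Roughly what I have in mind for the random ID, as a sketch only. The class and method names are made up, and reserving 0 as an "unset" value is my assumption, not something taken from the patch:

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper, not part of the patch: picks a random block report ID. */
class BlockReportIdSketch {
    /** Assumed sentinel meaning "no block report seen yet". */
    static final long UNSET_ID = 0L;

    /** Returns a random ID, rerolling only if it lands on the sentinel. */
    static long nextId() {
        long id;
        do {
            id = ThreadLocalRandom.current().nextLong();
        } while (id == UNSET_ID);
        return id;
    }
}
```

With 64 random bits, no comparison against the previous ID is needed in practice; the only special case is the reserved value.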
DatanodeDescriptor:
* it looks like we only get/set LastBlockReportId in removeZombieStorages. We
need to be setting it to the current BR id as BRs come in, right? This is
probably a holdover from processReport not being updated from the previous
patch rev.
If you wanted to add comments about all this, BlockReportContext's class
javadoc would be a good choice.
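To make the DatanodeDescriptor point concrete, here is a simplified sketch of the flow I'd expect. Everything in it (the class name, the map, the method signatures) is a hypothetical stand-in and not the real DatanodeDescriptor code: each storage records the ID of the last BR that mentioned it, and the zombie pass drops any storage whose recorded ID differs from the current BR's.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for per-datanode storage tracking. */
class StorageTrackerSketch {
    // storage ID -> ID of the last full block report that included that storage
    private final Map<String, Long> lastReportIdByStorage = new HashMap<>();

    /** Called for every storage in an incoming report: record the report's ID. */
    void processReport(String storageId, long reportId) {
        lastReportIdByStorage.put(storageId, reportId);
    }

    /** Removes and returns storages the current report did not mention. */
    List<String> removeZombieStorages(long currentReportId) {
        List<String> zombies = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it =
            lastReportIdByStorage.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() != currentReportId) {
                zombies.add(entry.getKey());
                it.remove();
            }
        }
        return zombies;
    }

    boolean hasStorage(String storageId) {
        return lastReportIdByStorage.containsKey(storageId);
    }
}
```

If processReport never records the ID, every storage looks stale (or none does), which is why the set needs to happen as BRs come in and not only inside the zombie pass.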
Nit:
{code}
assert (namesystem.hasWriteLock());
{code}
space after assert
Going to stop there for now, I think we need to see another rev (the
processReport FIXME basically) to get a feel for BlockReportContext.
> The full block report should prune zombie storages even if they're not empty
> ----------------------------------------------------------------------------
>
> Key: HDFS-7960
> URL: https://issues.apache.org/jira/browse/HDFS-7960
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: Lei (Eddy) Xu
> Assignee: Colin Patrick McCabe
> Priority: Critical
> Attachments: HDFS-7960.002.patch, HDFS-7960.003.patch,
> HDFS-7960.004.patch
>
>
> The full block report should prune zombie storages even if they're not empty.
> We have seen cases in production where zombie storages have not been pruned
> subsequent to HDFS-7575. This could arise any time the NameNode thinks there
> is a block in some old storage which is actually not there. In this case,
> the block will not show up in the "new" storage (once old is renamed to new)
> and the old storage will linger forever as a zombie, even with the HDFS-7596
> fix applied. This also happens with datanode hotplug, when a drive is
> removed. In this case, an entire storage (volume) goes away but the blocks
> do not show up in another storage on the same datanode.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)