[
https://issues.apache.org/jira/browse/HDFS-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971469#comment-14971469
]
Xiao Chen commented on HDFS-9231:
---------------------------------
Thanks a lot for the review [~yzhangal]!
{quote}
1. The description is not quite accurate per our discussion, suggest to modify.
Especially the patch actually does change (and fix) the behavior when without
-includeSnapshots.
{quote}
It was great to talk to you. I have updated the description. Modified patch
summary in the end of this comment.
{quote}
2. A possible optimization in FSDirSnapshotOp#getSnapshotFiles. It seems that
the sf variable could be calculated in caller for once before the loop in the
caller, and pass to this method.
{quote}
My apologies for the confusion, I added some comments in this method. But
getting sf for each snapshottable dir is needed, since /d1 and /d2 have
different snapshotlist.
{quote}
3. final INodesInPath iip = fsd.getINodesInPath4Write(snap, false); maybe
substituted with call to getINodesInPath
{quote}
Good catch! I updated the code to call {{getINode}} which invokes
{{getINodesInPath}}.
{quote}
4. The check if (!corruptFileBlocks.isEmpty()) in
listCorruptFileBlocksWithSnapshot is not needed
{quote}
Good call. Fixed.
{quote}
5. Add comment in listCorruptFileBlocks() before the call
namenode.getNamesystem().listCorruptFileBlocksWithSnapshot, to indicate that
snapshottableDirs is only relevant when -includeSnapshots is specified.
{quote}
Added a link to {{FSNamesystem#listCorruptFileBlocksWithSnapshot}} which
explains that parameter in javadoc.
{quote}
6. In listCorruptFileBlocksWithSnapshot, we can add
{code}
if (snapshottableDirs == null) {
continue;
}
{code}
to avoid the call to getSnapshotFiles.
{quote}
I'm not sure this is necessary. On one hand, it definitely saves 1 call stack.
On the other hand, with the existence of all those loops and checks, I think
the performance gain of saving 1 call stack would be trivial. And the nullity
check of snapshottableDirs is already performed as a first step in
{{getSnapshotFiles}}.
Attached patch 005 with the above modifications. Updated summary below:
- {{fsck -list-corruptfileblocks -includeSnapshots}} will also show full dir of
snapshots
- {{fsck -list-corruptfileblocks}} without -includeSnapshots will not show
corrupt blocks that only have snapshot files
- NameNode WebUI's way of showing corrupted files/blocks unchanged.
- Added a sentence in NN WebUI to hint the admin to run fsck with
-includeSnapshots, if there're snapshots present in the system.
- Some refactoring to reuse existing code in new methods getSnapshottableDirs
and ListCorruptFileBlocksWithSnapshot
- The reasoning of keep minimal change to NN WebUI and fsck without
-includeSnapshots is that getting all possible snapshots may be slow.
> fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot
> -----------------------------------------------------------------------
>
> Key: HDFS-9231
> URL: https://issues.apache.org/jira/browse/HDFS-9231
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: snapshots
> Reporter: Xiao Chen
> Assignee: Xiao Chen
> Attachments: HDFS-9231.001.patch, HDFS-9231.002.patch,
> HDFS-9231.003.patch, HDFS-9231.004.patch, HDFS-9231.005.patch
>
>
> Currently for snapshot files, {{fsck -list-corruptfileblocks}} shows corrupt
> blocks with the original file dir instead of the snapshot dir, and {{fsck
> -list-corruptfileblocks -includeSnapshots}} behave the same.
> This can be confusing because even when the original file is deleted, fsck
> will still show that deleted file as corrupted, although what's actually
> corrupted is the snapshot.
> As a side note, {{fsck -files -includeSnapshots}} shows the snapshot dirs.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)