[
https://issues.apache.org/jira/browse/HDFS-5341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802160#comment-13802160
]
Ravi Prakash commented on HDFS-5341:
------------------------------------
Hi Jiawei!
Thanks for the patch. This looks like a good change. I am reviewing
HDFS-5341.002.patch. I wasn't able to download HDFS-5341.003.patch. Did you
remove it? Sometimes JIRA has problems. Could you please upload it again?
Here are my comments.
# We usually use two spaces instead of tabs to indent. Could you please fix
that?
# This block {code}
if (blockFile != null) {
  this.blockFileLength = blockFile.length();
} else {
  this.blockFileLength = 0;
}
{code} can be written more simply as {code}
this.blockFileLength = (blockFile != null) ? blockFile.length() : 0;
{code}
# getDiskReport() should be annotated with @VisibleForTesting
# The test itself checks that the time taken by diffBlock is less than the time
taken by reconcile (= getDiskReport + diffBlock) plus a second getDiskReport.
Technically the second getDiskReport is extraneous. I'm also afraid that
getDiskReport might return in < 0.5 ms, which could cause the test to fail
intermittently. Can you think of some other way to test this? It may not be
necessary to come up with a unit test, since the change seems quite small and
straightforward, but if we can think of an easy way, then we should.
# The old Test14 should now become Test15
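One possible way around the timing concern above, offered only as a rough sketch: instead of comparing elapsed times, a test could check directly that length() is never called while the dataset lock is held, using Thread.holdsLock. Everything named below (ScanInfoSketch, LockCheckingFile, ScannerSketch, diffUnderLock) is a hypothetical stand-in for illustration and is not code from the patch or from DirectoryScanner.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the per-block record built during the disk scan.
class ScanInfoSketch {
  final File blockFile;
  final long blockFileLength;

  ScanInfoSketch(File blockFile) {
    this.blockFile = blockFile;
    // The length is read once here, at scan time, and cached for the diff.
    this.blockFileLength = (blockFile != null) ? blockFile.length() : 0;
  }
}

// Hypothetical File subclass that records whether length() was ever called
// while the current thread held the given lock. It never touches the disk.
class LockCheckingFile extends File {
  private final Object lock;
  boolean calledUnderLock = false;
  int lengthCalls = 0;

  LockCheckingFile(String path, Object lock) {
    super(path);
    this.lock = lock;
  }

  @Override
  public long length() {
    lengthCalls++;
    if (Thread.holdsLock(lock)) {
      calledUnderLock = true;
    }
    return 42L; // fixed fake length
  }
}

// Hypothetical two-phase scanner: disk report first, then diff under lock.
class ScannerSketch {
  final Object dataset = new Object(); // stands in for the dataset lock

  // Phase 1: build the report without holding the dataset lock.
  List<ScanInfoSketch> getDiskReport(List<? extends File> files) {
    List<ScanInfoSketch> report = new ArrayList<>();
    for (File f : files) {
      report.add(new ScanInfoSketch(f));
    }
    return report;
  }

  // Phase 2: hold the lock, but use only the cached lengths.
  long diffUnderLock(List<ScanInfoSketch> report) {
    long total = 0;
    synchronized (dataset) {
      for (ScanInfoSketch info : report) {
        total += info.blockFileLength;
      }
    }
    return total;
  }
}
```

A test along these lines would pass a few LockCheckingFile instances through both phases and then assert that calledUnderLock is still false on each of them. That does not depend on how fast the disk is, so it should not fail intermittently.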
> DirectoryScanner holds the dataset object lock too long in scan, which makes the
> datanode turn into a dead node and blocks all receiving threads
> ---------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-5341
> URL: https://issues.apache.org/jira/browse/HDFS-5341
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: qus-jiawei
> Assignee: qus-jiawei
> Labels: deadlock
> Fix For: 3.0.0
>
> Attachments: HDFS-5341.000.patch, HDFS-5341.001.patch,
> HDFS-5341.002.patch
>
>
> When DirectoryScanner runs its scan, it holds the dataset lock while it diffs
> the block info between memory and disk. But it does a lot of disk operations
> under that lock because it calls each file's length() function.
> Once the dataset is locked, the heartbeat thread and all DataXceiver threads
> are blocked.
> So such disk operations should move to the async disk scan.