[ https://issues.apache.org/jira/browse/HADOOP-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661796#action_12661796 ]
Raghu Angadi commented on HADOOP-4971:
--------------------------------------
test-patch :
[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no tests are needed for this patch.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
> Block report times from datanodes could converge to same time.
> -----------------------------------------------------------------
>
> Key: HADOOP-4971
> URL: https://issues.apache.org/jira/browse/HADOOP-4971
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.18.0
> Reporter: Raghu Angadi
> Assignee: Raghu Angadi
> Priority: Blocker
> Fix For: 0.18.3
>
> Attachments: HADOOP-4971-branch-18.patch, HADOOP-4971.patch,
> HADOOP-4971.patch
>
>
> Datanode block reports take quite a bit of memory to process at the namenode.
> After the initial report, DNs pick a random time for later reports to spread
> this load at the NN. This normally works fine.
> Block reports are sent inside the "offerService()" thread in the DN. If for
> some reason this thread is stuck for a long time (comparable to the block
> report interval), and the same thing happens on many DNs, all of them get back
> to the loop at the same time and start sending block reports then, and every
> hour after that at the same time.
> The RPC server and clients in 0.18 can handle this situation fine. But since
> this is a memory-intensive RPC, it leads to large GC delays at the NN. We don't
> know yet why the offerService threads seemed to be stuck, but the DN should
> re-randomize its block report time in such cases.
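
The re-randomization idea in the description can be sketched roughly as below. This is only an illustration, not the attached patch and not the actual DataNode code: the class BlockReportSchedule, its methods, and the "late by at least one full interval" threshold are all assumptions made for the example. A report that goes out close to its scheduled time keeps its phase; a report that is late by a full interval or more gets a fresh random offset, so DNs that were stuck together do not stay aligned.

```java
import java.util.Random;

/**
 * Hypothetical helper for illustration only; the names and the lateness
 * threshold are assumptions, not Hadoop's DataNode implementation.
 */
final class BlockReportSchedule {
    private final long intervalMs;   // block report interval, e.g. one hour
    private final Random random;
    private long nextReportMs;       // time at which the next report is due

    BlockReportSchedule(long intervalMs, long startMs, Random random) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        this.intervalMs = intervalMs;
        this.random = random;
        // Spread the first periodic report over one interval after startup.
        this.nextReportMs = startMs + randomOffset();
    }

    /** Uniform offset in [0, intervalMs). */
    private long randomOffset() {
        return (long) (random.nextDouble() * intervalMs);
    }

    boolean isDue(long nowMs) {
        return nowMs >= nextReportMs;
    }

    long nextReportMs() {
        return nextReportMs;
    }

    /**
     * Call after a report was sent at nowMs.
     * Returns true if the schedule was re-randomized.
     */
    boolean reportSent(long nowMs) {
        long lateMs = nowMs - nextReportMs;
        if (lateMs >= intervalMs) {
            // Assumed threshold: the loop was stuck for at least one interval.
            // Many DNs may have woken up together, so pick a new random slot.
            nextReportMs = nowMs + randomOffset();
            return true;
        }
        // Normal case: keep the original phase.
        nextReportMs += intervalMs;
        return false;
    }
}
```

The service loop would call isDue(now) on each iteration and reportSent(now) once the report RPC returns. Whether to re-randomize or simply skip ahead while keeping the old phase is a design choice; the sketch follows the wording of the description.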
--
This message is automatically generated by JIRA.