[
https://issues.apache.org/jira/browse/HDFS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yanbo Liang updated HDFS-4047:
------------------------------
Attachment: HDFS-4047.patch
Hi Eli,
Your comment is instructive. However, moving the sleep on IOE from the outer
loop to the inner loop makes the interval between heartbeats longer. When an
IOE occurs while a heartbeat is being processed, the thread sleeps for 5s, and
during that time the NN does not receive any heartbeat from this DN. This
causes some test cases to fail. For example, if the NN is waiting for a block
report and the DN's reporting thread is sleeping because of such an IOE, it
takes longer for the NN to exit safe mode. As far as I know, most of the test
failures are caused by this. So I simply removed the sleep on IOE, based on
your patch. If an IOE is thrown during the loop, we just log it and continue
the loop as soon as possible. Looking forward to your opinion.
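To make the timing argument concrete, here is a minimal sketch of the effect, not the actual BPServiceActor code. It assumes a virtual clock (nothing really sleeps), and the names HeartbeatDelaySketch, FlakyNameNode, timeToFirstHeartbeat, and attemptCostMs are all hypothetical, introduced only for illustration.

```java
import java.io.IOException;

// Hypothetical model of the DN heartbeat loop; time is a virtual counter.
final class HeartbeatDelaySketch {

    /** Stand-in for the RPC to the NN: the first `failures` calls throw. */
    static final class FlakyNameNode {
        private int remainingFailures;

        FlakyNameNode(int failures) {
            this.remainingFailures = failures;
        }

        void heartbeat() throws IOException {
            if (remainingFailures > 0) {
                remainingFailures--;
                throw new IOException("simulated RPC failure");
            }
        }
    }

    /**
     * Returns the virtual time (ms) at which the first heartbeat succeeds.
     * Passing sleepOnIoeMs = 0 models "log the IOE and continue".
     */
    static long timeToFirstHeartbeat(int failures, long attemptCostMs, long sleepOnIoeMs) {
        FlakyNameNode nn = new FlakyNameNode(failures);
        long nowMs = 0;
        while (true) {
            nowMs += attemptCostMs;          // assumed cost of one attempt
            try {
                nn.heartbeat();
                return nowMs;                // the NN finally hears from the DN
            } catch (IOException ioe) {
                System.out.println("IOException in loop: " + ioe.getMessage());
                nowMs += sleepOnIoeMs;       // the NN hears nothing during this time
            }
        }
    }

    public static void main(String[] args) {
        // Two failed attempts, 10 ms per attempt (an assumed figure).
        System.out.println(timeToFirstHeartbeat(2, 10, 5000)); // with the 5s sleep
        System.out.println(timeToFirstHeartbeat(2, 10, 0));    // log and continue
    }
}
```

In this model each failed attempt adds the full sleep to the time before the NN next hears from the DN, which is the delay that keeps the NN in safe mode longer in the tests.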
> BPServiceActor has nested shouldRun loops
> -----------------------------------------
>
> Key: HDFS-4047
> URL: https://issues.apache.org/jira/browse/HDFS-4047
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 2.0.0-alpha
> Reporter: Eli Collins
> Priority: Minor
> Attachments: HADOOP-4047.patch, HDFS-4047.patch, hdfs-4047.txt
>
>
> BPServiceActor#run and offerService both have while shouldRun loops. We only
> need the outer one, i.e. we can hoist the info log from offerService out to run
> and remove the while loop.
> {code}
> BPServiceActor#run:
>   while (shouldRun()) {
>     try {
>       offerService();
>     } catch (Exception ex) {
>       ...
> offerService:
>   while (shouldRun()) {
>     try {
> {code}
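The proposed structure can be sketched as follows. This is only an illustration of a single shouldRun loop with the info log hoisted into run, not the actual patch; the class SingleLoopActorSketch, its iteration counter, and the log messages are hypothetical stand-ins.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for BPServiceActor, reduced to its loop structure.
final class SingleLoopActorSketch {
    final List<String> log = new ArrayList<>();
    private int iterationsLeft;   // assumed stop condition for the sketch
    private int calls = 0;

    SingleLoopActorSketch(int iterations) {
        this.iterationsLeft = iterations;
    }

    boolean shouldRun() {
        return iterationsLeft > 0;
    }

    /** One pass of work; it no longer contains its own while loop. */
    void offerService() throws IOException {
        calls++;
        if (calls == 2) {
            throw new IOException("simulated failure");
        }
        log.add("heartbeat " + calls);
    }

    void run() {
        log.add("starting to offer service");   // hoisted: logged once
        while (shouldRun()) {                    // the only shouldRun loop
            iterationsLeft--;
            try {
                offerService();
            } catch (Exception ex) {
                // Log and go straight back to the loop condition.
                log.add("exception: " + ex.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        SingleLoopActorSketch actor = new SingleLoopActorSketch(3);
        actor.run();
        actor.log.forEach(System.out::println);
    }
}
```

With one loop, an exception in offerService returns control to the same shouldRun check that governs normal iterations, so there is a single place to decide whether to retry or stop.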