[jira] [Commented] (HDFS-9435) TestBlockRecovery#testRBWReplicas is failing intermittently

Rakesh R (JIRA) Tue, 17 Nov 2015 22:16:54 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-9435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010318#comment-15010318
 ]


Rakesh R commented on HDFS-9435:
--------------------------------

Thanks [~iwasakims] for the interest and useful comments.

It looks like {{#triggerBlockReportForTests}} can still return immediately, 
before the active NN has acknowledged the block report. The sequence is:

1=> During startUp(), the test calls 
{{dn.getAllBpOs().get(0).triggerBlockReportForTests()}}, which initializes 
{{final long oldBlockReportTime = scheduler.nextBlockReportTime;}}
2=> BPServiceActor#start().
3=> Starting the actor thread calls 
BPServiceActor#connectToNNAndHandshake().
4=> BPServiceActor#register()
5=> scheduler#scheduleBlockReport(dnConf.initialBlockReportDelayMs); 
The {{#scheduleBlockReport}} call now updates {{nextBlockReportTime = 
monotonicNow();}}. This again ends the waiting period of 
{{#triggerBlockReportForTests}}, so the unit test continues and falls 
into a similar error situation.
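
To make the sequence above concrete, here is a minimal, single-threaded sketch 
of the race. It is not the Hadoop code: the {{Scheduler}} class, the fake clock, 
{{sendFullReport}} and {{waiterWouldReturn}} are hypothetical stand-ins, and the 
wait condition (comparing the saved time against {{nextBlockReportTime}}) is 
assumed from the description in steps 1 and 5.

```java
public class BlockReportRaceSketch {

    /** Hypothetical stand-in for the DataNode's block report scheduler. */
    static class Scheduler {
        private long clock = 1000;            // fake monotonic clock, advanced by tick()
        long nextBlockReportTime = clock;
        boolean fullReportSent = false;

        void tick(long ms) {
            clock += ms;
        }

        // Models step 5: registration reschedules the next block report.
        void scheduleBlockReport(long delayMs) {
            nextBlockReportTime = clock + delayMs;
        }

        // Models a real report: it is sent, then the next one is scheduled.
        void sendFullReport(long intervalMs) {
            fullReportSent = true;
            nextBlockReportTime = clock + intervalMs;
        }
    }

    /** Assumed wait condition: the waiter stops once the scheduled time changes. */
    static boolean waiterWouldReturn(long oldBlockReportTime, Scheduler s) {
        return oldBlockReportTime != s.nextBlockReportTime;
    }

    public static void main(String[] args) {
        Scheduler s = new Scheduler();

        long oldBlockReportTime = s.nextBlockReportTime; // step 1: snapshot taken
        s.tick(5);                                       // steps 2-4: actor starts, registers
        s.scheduleBlockReport(0);                        // step 5: time changes, no report sent

        System.out.println("waiter returns: " + waiterWouldReturn(oldBlockReportTime, s)); // true
        System.out.println("report sent: " + s.fullReportSent);                            // false
    }
}
```

The waiter sees a changed {{nextBlockReportTime}} and returns, although no 
block report has gone out yet.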

IMHO, as you mentioned, calling {{#triggerBlockReportForTests}} twice will make 
the tests more consistent. I'm attaching a patch with the changes; please 
review it again. Thanks!
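
A rough sketch of why a second trigger helps, under the assumption that 
registration has already rescheduled the report by the time the first call 
returns. All names here ({{Scheduler}}, {{sendFullReport}}, 
{{waiterWouldReturn}}, {{REPORT_INTERVAL_MS}}) are hypothetical and only model 
the behaviour described in this comment; this is not the attached patch.

```java
public class DoubleTriggerSketch {

    // Arbitrary illustrative interval; not taken from any Hadoop configuration.
    static final long REPORT_INTERVAL_MS = 60_000L;

    /** Hypothetical stand-in for the block report scheduler. */
    static class Scheduler {
        private long clock = 1000;            // fake monotonic clock
        long nextBlockReportTime = clock;
        boolean fullReportSent = false;

        void tick(long ms) {
            clock += ms;
        }

        // Registration path: reschedules without sending anything.
        void scheduleBlockReport(long delayMs) {
            nextBlockReportTime = clock + delayMs;
        }

        // Real report path: sends, then reschedules.
        void sendFullReport(long intervalMs) {
            fullReportSent = true;
            nextBlockReportTime = clock + intervalMs;
        }
    }

    /** Assumed wait condition: stop waiting once the scheduled time changes. */
    static boolean waiterWouldReturn(long oldBlockReportTime, Scheduler s) {
        return oldBlockReportTime != s.nextBlockReportTime;
    }

    public static void main(String[] args) {
        Scheduler s = new Scheduler();

        // First trigger: snapshot is taken before registration reschedules.
        long first = s.nextBlockReportTime;
        s.tick(5);
        s.scheduleBlockReport(0);
        System.out.println("first returns early: "
                + (waiterWouldReturn(first, s) && !s.fullReportSent));   // true

        // Second trigger: snapshot is taken after registration, so only a
        // real report can change the scheduled time now.
        long second = s.nextBlockReportTime;
        s.tick(5);
        System.out.println("second still waiting: " + !waiterWouldReturn(second, s)); // true

        s.sendFullReport(REPORT_INTERVAL_MS);
        System.out.println("second returns after report: "
                + (waiterWouldReturn(second, s) && s.fullReportSent));   // true
    }
}
```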

> TestBlockRecovery#testRBWReplicas is failing intermittently
> -----------------------------------------------------------
>
>                 Key: HDFS-9435
>                 URL: https://issues.apache.org/jira/browse/HDFS-9435
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-9435-00.patch, HDFS-9435-01.patch, 
> testRBWReplicas.log
>
>
> TestBlockRecovery#testRBWReplicas is failing in the [build 
> 13536|https://builds.apache.org/job/PreCommit-HDFS-Build/13536/testReport/org.apache.hadoop.hdfs.server.datanode/TestBlockRecovery/testRBWReplicas/].
>  It looks like a bug in the tests due to a race condition.
> Note: Logs taken from the build are attached to this jira.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
