[jira] [Updated] (HDFS-1806) TestBlockReport.blockReport_08() and _09() are timing-dependent and likely to fail on fast servers

Matt Foley (JIRA) Tue, 05 Apr 2011 10:46:44 -0700

     [ https://issues.apache.org/jira/browse/HDFS-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Matt Foley updated HDFS-1806:
-----------------------------

    Attachment: blockReport_08_failure_log.html

In the attached log excerpt, from an Apache Hudson/Jenkins QA auto-test run,
replication starts at the datanode at 2:58:09,608.
The datanode moves the replica from tmp to finalized at 2:58:09,659-660,
about 51 ms later.
Then at 2:58:09,663 we see the message "Replication state before the loop 0",
which marks the START of polling -- about 3 ms after the replica was
finalized, so the poll can no longer see it in the TEMPORARY state.

So both the waitTil(100) and waitTil(50) waits in waitForTempReplica() are too
long.
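
To make the timing argument concrete, here is a rough sketch (not the test's
actual code) that replays the log timestamps as millisecond offsets from the
replication start and checks whether a fixed-interval poller would ever sample
inside the TEMPORARY window. The class PollWindowSketch, the method
pollSeesTemporary(), and the 5 ms interval are hypothetical names and values
chosen for illustration. The sketch also assumes the replica is TEMPORARY for
the whole span from replication start to finalization, which is the most
generous reading of the log.

```java
public class PollWindowSketch {

    /**
     * Returns true if a poller that first checks at firstPollMs and then
     * every intervalMs (up to timeoutMs) would take a sample inside the
     * half-open window [tempStartMs, tempEndMs). All times are offsets in
     * milliseconds from the start of replication.
     */
    public static boolean pollSeesTemporary(long tempStartMs, long tempEndMs,
                                            long firstPollMs, long intervalMs,
                                            long timeoutMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        for (long t = firstPollMs; t <= timeoutMs; t += intervalMs) {
            if (t >= tempStartMs && t < tempEndMs) {
                return true;   // sampled while the replica was TEMPORARY
            }
            if (t >= tempEndMs) {
                return false;  // replica already finalized; later polls cannot help
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Offsets taken from the log excerpt, relative to 2:58:09,608.
        long tempStart = 0;   // replication starts at 2:58:09,608
        long tempEnd = 51;    // replica finalized at 2:58:09,659
        long firstPoll = 55;  // "before the loop" message at 2:58:09,663

        // Observed behaviour: polling starts after finalization.
        System.out.println(pollSeesTemporary(tempStart, tempEnd, firstPoll, 100, 10_000));

        // Hypothetical alternative: start polling early with a short interval.
        System.out.println(pollSeesTemporary(tempStart, tempEnd, 5, 5, 10_000));
    }
}
```

With the observed offsets the first call prints false and the second prints
true. This only illustrates the arithmetic; a poll interval short enough for
one machine may still be too long on a faster one, so a fix that does not
depend on catching the state by polling would be more robust.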

> TestBlockReport.blockReport_08() and _09() are timing-dependent and likely to 
> fail on fast servers
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1806
>                 URL: https://issues.apache.org/jira/browse/HDFS-1806
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, name-node
>    Affects Versions: 0.22.0
>            Reporter: Matt Foley
>         Attachments: blockReport_08_failure_log.html
>
>
> Method waitForTempReplica() polls every 100ms during block replication, 
> attempting to "catch" a datanode in the state of having a TEMPORARY replica.  
> But examination of a current Hudson test failure log shows that the replica 
> goes from "start" to "TEMPORARY" to "FINALIZED" in only 50ms, so of course 
> the poll usually misses it.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
