[
https://issues.apache.org/jira/browse/HDFS-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613110#comment-13613110
]
Arpit Agarwal commented on HDFS-4633:
-------------------------------------
Good find in testExcludedNodesForgiveness. I verified the fix works on Windows
and OS X.
I wonder if we can reduce the test execution time a little by reducing these
delays?
{code}
// Forgive nodes in under 10s for this test case.
conf.setLong(
    DFSConfigKeys.DFS_CLIENT_WRITE_EXCLUDE_NODES_CACHE_EXPIRY_INTERVAL,
    10000);
// Sleep longer than the expiry interval so the excluded nodes are forgiven.
ThreadUtil.sleepAtLeastIgnoreInterrupts(15000);
{code}
The test passes reliably on my Windows machine with the cache expiry interval
set to 2500 ms and the sleep set to 5000 ms. I'm fine with leaving the higher
values if you prefer them to rule out spurious failures, though.
> TestDFSClientExcludedNodes fails sporadically if excluded nodes cache expires
> too quickly
> -----------------------------------------------------------------------------------------
>
> Key: HDFS-4633
> URL: https://issues.apache.org/jira/browse/HDFS-4633
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client, test
> Affects Versions: 3.0.0
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Attachments: HDFS-4633.1.patch
>
>
> {{TestDFSClientExcludedNodes}} simulates failures of individual data nodes in
> the client's write pipeline and checks the client's ability to recover.
> HDFS-4246 added support for periodic "forgiveness" by caching the list of
> known bad data nodes with a periodic eviction. The test uses a 1-second
> cache expiration. This sometimes causes failed nodes to be forgiven too
> quickly, which violates the assumptions of the test.
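
To make the timing problem concrete, here is a minimal sketch of a time-based exclusion cache. It is not the HDFS client's implementation: the class name `ExpiringExcludeList`, its methods, and the injectable clock are all hypothetical, chosen only to show how an entry is "forgiven" once the expiry interval has elapsed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/**
 * Hypothetical stand-in for an excluded-node cache with timed forgiveness.
 * This is an illustrative sketch, not the class used by the HDFS client.
 */
class ExpiringExcludeList {
    private final long expiryMillis;
    private final LongSupplier clock; // injectable so the example needs no real sleeps
    private final Map<String, Long> excludedAt = new HashMap<>();

    ExpiringExcludeList(long expiryMillis, LongSupplier clock) {
        this.expiryMillis = expiryMillis;
        this.clock = clock;
    }

    /** Records the node as bad, stamped with the current time. */
    void exclude(String node) {
        excludedAt.put(node, clock.getAsLong());
    }

    /** Returns true while the node is excluded; evicts it once the interval has elapsed. */
    boolean isExcluded(String node) {
        Long since = excludedAt.get(node);
        if (since == null) {
            return false;
        }
        if (clock.getAsLong() - since >= expiryMillis) {
            excludedAt.remove(node); // forgiven: the node may be chosen again
            return false;
        }
        return true;
    }
}
```

With a 1000 ms interval, a node excluded at time 0 is still excluded at 999 ms but is forgiven by 1500 ms. If the surrounding test steps take longer than the interval on a slow machine, a node that the test expects to be excluded has already been forgiven, which matches the sporadic failure described above.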