[ https://issues.apache.org/jira/browse/HDFS-10197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lin Yiqun updated HDFS-10197:
-----------------------------
Description:
In {{TestFsDatasetCache}}, the unit tests fail intermittently. I collected some
failure reasons from recent Jenkins reports. They are all timeout errors.
{code}
Tests in error:
  TestFsDatasetCache.testFilesExceedMaxLockedMemory:378 » Timeout Timed out wait...
  TestFsDatasetCache.tearDown:149 » Timeout Timed out waiting for condition. Thr...
{code}
{code}
Tests in error:
  TestFsDatasetCache.testPageRounder:474 » test timed out after 60000 milliseco...
  TestBalancer.testUnknownDatanodeSimple:1040->testUnknownDatanode:1098 » test ...
{code}
But there is a small difference between these failures.
* The first occurs because the total blocking time exceeded {{waitForMillis}}
(60s here), so the timeout exception is thrown and the thread diagnostic string
is printed in method {{DFSTestUtil#verifyExpectedCacheUsage}}.
{code}
long st = Time.now();
do {
  boolean result = check.get();
  if (result) {
    return;
  }
  Thread.sleep(checkEveryMillis);
} while (Time.now() - st < waitForMillis);
throw new TimeoutException("Timed out waiting for condition. " +
    "Thread diagnostics:\n" +
    TimedOutTestsListener.buildThreadDiagnosticString());
{code}
* The second occurs because the test's elapsed time exceeded its configured
timeout, as in {{TestFsDatasetCache#testPageRounder}}.
We should adjust the timeouts for these unit tests, which fail intermittently
due to timeouts.
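To illustrate the first kind of failure, here is a minimal stand-alone sketch of the same polling pattern. It is not the Hadoop code: the class name {{WaitForSketch}} is hypothetical, and {{System.nanoTime()}} stands in for {{Time.now()}}. It only shows that the same condition passes or times out depending on the wait budget, which is why raising the budget can help a slow run. Presumably the per-test limit also needs to stay larger than the polling budget, otherwise the second kind of failure could fire first.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical stand-alone class for illustration; not part of Hadoop.
class WaitForSketch {

  /**
   * Polls check every checkEveryMillis until it returns true, and throws
   * TimeoutException once waitForMillis has elapsed without success.
   */
  static void waitFor(Supplier<Boolean> check, long checkEveryMillis,
      long waitForMillis) throws TimeoutException, InterruptedException {
    long start = System.nanoTime();
    long budgetNanos = waitForMillis * 1_000_000L;
    do {
      if (check.get()) {
        return;
      }
      Thread.sleep(checkEveryMillis);
    } while (System.nanoTime() - start < budgetNanos);
    throw new TimeoutException("Timed out waiting for condition.");
  }

  public static void main(String[] args) throws Exception {
    // The condition becomes true after about 200 ms.
    long begin = System.currentTimeMillis();
    waitFor(() -> System.currentTimeMillis() - begin >= 200, 20, 1000);
    System.out.println("condition met within the 1000 ms budget");

    try {
      // Same kind of condition, but the 50 ms budget is too tight.
      long begin2 = System.currentTimeMillis();
      waitFor(() -> System.currentTimeMillis() - begin2 >= 200, 20, 50);
    } catch (TimeoutException e) {
      System.out.println("timed out: " + e.getMessage());
    }
  }
}
```

The sketch omits the thread diagnostics that the real helper appends to the exception message.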
was:
In {{TestFsDatasetCache}}, the unit tests fail intermittently. I collected some
failure reasons from recent Jenkins reports. They are all timeout errors.
{code}
Tests in error:
  TestFsDatasetCache.testFilesExceedMaxLockedMemory:378 » Timeout Timed out wait...
  TestFsDatasetCache.tearDown:149 » Timeout Timed out waiting for condition. Thr...
{code}
{code}
Tests in error:
  TestFsDatasetCache.testPageRounder:474 » test timed out after 60000 milliseco...
  TestBalancer.testUnknownDatanodeSimple:1040->testUnknownDatanode:1098 » test ...
{code}
But there is a small difference between these failures.
* The first occurs because the total blocking time exceeded {{waitForMillis}}
(60s here), so the timeout exception is thrown and the thread diagnostic string
is printed.
{code}
long st = Time.now();
do {
  boolean result = check.get();
  if (result) {
    return;
  }
  Thread.sleep(checkEveryMillis);
} while (Time.now() - st < waitForMillis);
throw new TimeoutException("Timed out waiting for condition. " +
    "Thread diagnostics:\n" +
    TimedOutTestsListener.buildThreadDiagnosticString());
{code}
* The second occurs because the test's elapsed time exceeded its configured
timeout, as in {{TestFsDatasetCache#testPageRounder}}.
We should adjust the timeouts for these unit tests, which fail intermittently
due to timeouts.
> TestFsDatasetCache failing intermittently due to timeout
> --------------------------------------------------------
>
> Key: HDFS-10197
> URL: https://issues.apache.org/jira/browse/HDFS-10197
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Reporter: Lin Yiqun
> Assignee: Lin Yiqun
> Attachments: HDFS-10197.001.patch