TestMetaReaderEditor can timeouts during test execution
-------------------------------------------------------
Key: HBASE-5114
URL: https://issues.apache.org/jira/browse/HBASE-5114
Project: HBase
Issue Type: Bug
Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Likely because it belongs to the too long list of flaky tests. A test should
not fail randomly. Here is the list, from 2974 to 2598, with a timeouted
TestMetaReaderEditor
https://builds.apache.org/job/HBase-TRUNK/2597/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2588/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2585/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2584/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2582/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2577/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2574/testReport/org.apache.hadoop.hbase.catalog/
It happens on hadoop-qa as well:
https://builds.apache.org/job/PreCommit-HBASE-Build/640/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/PreCommit-HBASE-Build/638/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/PreCommit-HBASE-Build/630/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/PreCommit-HBASE-Build/622/testReport/org.apache.hadoop.hbase.catalog/
Looking at the source code, main issues are:
- in testRetrying, some threads are started, but will not be stopped if an
assertion fails. As a consequence, the process will not stop
- still in testRetrying, it's not possible to use @Test(timeout=180000),
because if this timeout is reached, the thread won't be stopped.
This does not explain why the test fails sometimes, but explain why we have a
timeout instead of a clean failure.
I also fixed some stuff that seems unrelated, and I cannot reproduce the error
anymore locally after 10 tries. So I will try the patch on hadoop-qa.
The patch includes as well the removal of TestAdmin#testHundredsOfTable because
this test takes 2 minutes to execute; but proves nothing on success as there is
no check on ressources used.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira