[ https://issues.apache.org/jira/browse/HBASE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128692#comment-15128692 ]
stack commented on HBASE-15192:
-------------------------------
I would disable the whole test if all of its methods are flaky. Get it out of the
build. Make disabling it a subtask of HBASE-15012. Ping the author, though it
looks like it's [~zjushch] and he's not been around in a while:
commit c7309e82efb7d6ff90d8bb891f0cd9657bae518b
Author: zjushch <zjushch@unknown>
Date:   Sun Mar 24 10:26:21 2013 +0000

    HBASE-7403 Online Merge (Chunhui shen)

    git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1460306 13f79535-47bb-0310-9956-ffa450edef68
> TestRegionMergeTransactionOnCluster#testCleanMergeReference is flaky
> --------------------------------------------------------------------
>
> Key: HBASE-15192
> URL: https://issues.apache.org/jira/browse/HBASE-15192
> Project: HBase
> Issue Type: Test
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-15192.v1.patch, HBASE-15192.v2.patch
>
>
> TestRegionMergeTransactionOnCluster#testCleanMergeReference fails
> intermittently due to failed assertion on cleaned merge region count:
> {code}
> testCleanMergeReference(org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster)  Time elapsed: 64.183 sec  <<< FAILURE!
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster.testCleanMergeReference(TestRegionMergeTransactionOnCluster.java:284)
> {code}
> Before calling CatalogJanitor#scan(), the test does:
> {code}
> int newcount1 = 0;
> while (System.currentTimeMillis() < timeout) {
>   for(HColumnDescriptor colFamily : columnFamilies) {
>     newcount1 += hrfs.getStoreFiles(colFamily.getName()).size();
>   }
>   if(newcount1 <= 1) {
>     break;
>   }
>   Thread.sleep(50);
> }
> {code}
> newcount1 is not reset to 0 at the beginning of each loop iteration, so it
> accumulates across iterations.
> This means that if the check newcount1 <= 1 doesn't pass on the first
> iteration, it can't pass in any subsequent iteration.
> After the timeout is exhausted, admin.runCatalogScan() is called. However, there
> is a chance that CatalogJanitor#scan() has already been called by the Chore
> (during the wait period), leaving the cleaned count at 0 and failing the test.
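
For illustration only, the accumulation problem described above can be avoided by resetting the counter at the top of every iteration. The following is a minimal standalone sketch, not the code from the attached patches: the class StoreFileWait, the method waitForStoreFileCount, and the IntSupplier stand-ins for hrfs.getStoreFiles(...).size() are all hypothetical names introduced here so the example runs without an HBase cluster.

```java
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical helper, not part of HBase: shows a polling loop that
// recomputes the store file count from scratch on every iteration.
final class StoreFileWait {

    // Each IntSupplier stands in for "number of store files in one column
    // family" (an assumption replacing hrfs.getStoreFiles(name).size()).
    // Returns the last total observed, either <= maxFiles or the value
    // seen when the deadline passed.
    static int waitForStoreFileCount(List<IntSupplier> perFamilyCounts,
                                     int maxFiles, long timeoutMs, long sleepMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        int count;
        do {
            count = 0; // reset here, so earlier iterations do not leak into this one
            for (IntSupplier family : perFamilyCounts) {
                count += family.getAsInt();
            }
            if (count <= maxFiles) {
                break;
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                break;
            }
        } while (System.currentTimeMillis() < deadline);
        return count;
    }
}
```

With the reset in place, a count that starts above the threshold and later shrinks (for example after a compaction) lets the loop exit early instead of always running until the timeout. This sketch does not address the second problem, the race with the CatalogJanitor Chore.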