[
https://issues.apache.org/jira/browse/HBASE-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320139#comment-15320139
]
Appy commented on HBASE-15959:
------------------------------
I don't know much about MOB, but from the looks of it, the test has
uncovered a real bug here. If that's the case, the current change basically
just makes the test pass without fixing anything. The correct fix would
address the root cause, which I think is cleaning up or moving the old files
at the end of compaction.
What the test does is plausible in a real cluster: a compaction happens, and
the region is unloaded and loaded back before the cleaner kicks in. Maybe the
region server went down, maybe the cluster was rebalancing, anything.
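To make the scenario concrete, here is a minimal toy model in Java. It is not HBase code: ToyStore, flush(), compact() and reopen() are hypothetical names invented for illustration, and the model assumes that reopening a region rebuilds its file list from whatever is still in the store directory. It only shows why deferring cleanup to a separate cleaner leaves a window in which a reopened region sees stale files; it does not try to reproduce the exact file counts reported in the issue.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these classes or methods are HBase APIs.
class ToyStore {
    // Files physically present in the store directory.
    private final List<String> onDisk = new ArrayList<>();
    // Files the currently open region reads from.
    private final List<String> live = new ArrayList<>();
    private int nextId = 0;

    // A flush writes one new file and makes it live.
    void flush() {
        String file = "flush-" + nextId++;
        onDisk.add(file);
        live.add(file);
    }

    // Compaction rewrites all live files into one output file.
    // If cleanupAtEnd is false, the inputs stay on disk until some
    // separate cleaner (not modeled here) removes them.
    void compact(boolean cleanupAtEnd) {
        List<String> inputs = new ArrayList<>(live);
        String output = "compacted-" + nextId++;
        onDisk.add(output);
        live.clear();
        live.add(output);
        if (cleanupAtEnd) {
            onDisk.removeAll(inputs);
        }
    }

    // Assumption: reopening rebuilds the live set from the directory listing.
    int reopen() {
        live.clear();
        live.addAll(onDisk);
        return live.size();
    }
}

class ReopenBeforeCleanerDemo {
    // Five flushes, one compaction, then a reopen before any cleaner runs.
    static int filesAfterReopen(boolean cleanupAtEnd) {
        ToyStore store = new ToyStore();
        for (int i = 0; i < 5; i++) {
            store.flush();
        }
        store.compact(cleanupAtEnd);
        return store.reopen();
    }

    public static void main(String[] args) {
        System.out.println("deferred cleanup: " + filesAfterReopen(false));
        System.out.println("cleanup at end of compaction: " + filesAfterReopen(true));
    }
}
```

In this model the reopened region sees 6 files when cleanup is deferred (the 5 compaction inputs plus the output) and 1 file when the inputs are removed at the end of compaction, which is the behavior I'd expect the real fix to guarantee.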
> Fix flaky test TestRegionServerMetrics.testMobMetrics
> -----------------------------------------------------
>
> Key: HBASE-15959
> URL: https://issues.apache.org/jira/browse/HBASE-15959
> Project: HBase
> Issue Type: Bug
> Reporter: Appy
> Assignee: huaxiang sun
> Attachments: HBASE-15959-v001.patch, HBASE-15959-v002.patch,
> HBASE-15959-v003.patch
>
>
> It flakes
> [here|https://github.com/apache/hbase/blob/b557f0bec62a48753e5d01d7a47f3c9e5a6b3ee8/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java#L460].
> There are two odd things I identified:
> 1. In the second compaction, the
> [scanner|https://github.com/apache/hbase/blob/b557f0bec62a48753e5d01d7a47f3c9e5a6b3ee8/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/DefaultMobStoreCompactor.java#L173]
> has 10 storefiles. Shouldn't there be 6, i.e. 5 from the recent flushes and 1
> from the earlier compaction? Probably because the mob cleaner doesn't clean
> old hfiles. Does this need fixing?
> 2. Across runs, the same cell (i.e. same key) may or may not be considered a
> mob reference cell
> [here|https://github.com/apache/hbase/blob/b557f0bec62a48753e5d01d7a47f3c9e5a6b3ee8/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/DefaultMobStoreCompactor.java#L213].
> This happens at least with row keys 0 - 4 (which got compacted earlier).
> [~jmhsieh] Any ideas why this would happen?