[
https://issues.apache.org/jira/browse/HBASE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16006931#comment-16006931
]
Chia-Ping Tsai edited comment on HBASE-17887 at 5/11/17 6:24 PM:
-----------------------------------------------------------------
bq. here seems the issue is that the test case is making quick flushes
This is a great summary.
bq. 1st flush done and before the resetScannerStack
Not completely. The issue is caused by the following order.
# put data_A (seq id = 10, active stores data_A and snapshot is empty)
# snapshot of 1st flush (active is empty and snapshot stores data_A)
# put data_B (seq id = 11, active stores data_B and snapshot stores data_A)
# create user scanner (read point = 11, so it should see data_B)
# commit of 1st flush
#* clear snapshot (hfile_A has data_A, active stores data_B, and snapshot is
empty)
#* update the reader (the user scanner receives hfile_A)
# snapshot of 2nd flush (active is empty and snapshot stores data_B)
# commit of 2nd flush
#* clear snapshot (hfile_A has data_A, hfile_B has data_B, active is empty, and
snapshot is empty) -- this is the critical piece.
#* -update the reader- (has not happened yet)
# user scanner updates the kv scanners (it creates a scanner for hfile_A but
nothing for the memstore)
# user sees the older data_A -- wrong result
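
The ordering above can be replayed with a small toy model. This is only a sketch for illustration, not HBase code: ToyStore, ToyScanner, and replay are hypothetical names, cells are plain strings, and read points are left out because both puts are visible to the scanner anyway. It assumes the scanner rebuilds its view from the files it has been told about plus whatever is currently in the memstore.

```java
import java.util.ArrayList;
import java.util.List;

public class FlushRaceSketch {
    /** Toy store (hypothetical): an active segment, a snapshot, and flushed files. */
    static final class ToyStore {
        final List<String> active = new ArrayList<>();
        final List<String> snapshot = new ArrayList<>();
        final List<List<String>> hfiles = new ArrayList<>();

        void put(String cell) { active.add(cell); }

        // Flush phase 1: move the active cells into the snapshot.
        void snapshotFlush() { snapshot.addAll(active); active.clear(); }

        // Flush phase 2: write the snapshot to a new file and clear the snapshot.
        // The new file is returned so the caller decides whether scanners are notified.
        List<String> commitFlush() {
            List<String> file = new ArrayList<>(snapshot);
            hfiles.add(file);
            snapshot.clear();
            return file;
        }
    }

    /** Toy user scanner (hypothetical): knows only the files it was notified about. */
    static final class ToyScanner {
        final ToyStore store;
        final List<List<String>> knownFiles = new ArrayList<>();

        ToyScanner(ToyStore store) { this.store = store; }

        void updateReaders(List<String> newFile) { knownFiles.add(newFile); }

        // Rebuild the view from known files plus the current memstore,
        // ordered oldest to newest, and return the newest visible cell.
        String newestVisible() {
            List<String> visible = new ArrayList<>();
            for (List<String> f : knownFiles) visible.addAll(f);
            visible.addAll(store.snapshot);
            visible.addAll(store.active);
            return visible.isEmpty() ? null : visible.get(visible.size() - 1);
        }
    }

    /** Replays the steps from the list; the flag controls the second reader update. */
    static String replay(boolean notifyOnSecondCommit) {
        ToyStore store = new ToyStore();
        store.put("data_A");                         // put data_A
        store.snapshotFlush();                       // snapshot of 1st flush
        store.put("data_B");                         // put data_B
        ToyScanner scanner = new ToyScanner(store);  // create user scanner
        scanner.updateReaders(store.commitFlush());  // commit of 1st flush + update the reader
        store.snapshotFlush();                       // snapshot of 2nd flush
        List<String> hfileB = store.commitFlush();   // commit of 2nd flush clears the snapshot
        if (notifyOnSecondCommit) {
            scanner.updateReaders(hfileB);           // the update that has not happened yet
        }
        return scanner.newestVisible();              // user scanner updates the kv scanners
    }

    public static void main(String[] args) {
        System.out.println("reader update missed:    " + replay(false));
        System.out.println("reader update delivered: " + replay(true));
    }
}
```

In this model the scanner returns data_A when the second reader update has not arrived, because data_B is no longer in the memstore and hfile_B is not yet known to the scanner.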
> TestAcidGuarantees fails frequently
> -----------------------------------
>
> Key: HBASE-17887
> URL: https://issues.apache.org/jira/browse/HBASE-17887
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 2.0.0
> Reporter: Umesh Agashe
> Assignee: Chia-Ping Tsai
> Priority: Blocker
> Fix For: 2.0.0, 1.4.0, 1.2.6, 1.3.2, 1.4.1
>
> Attachments: HBASE-17887.branch-1.v0.patch,
> HBASE-17887.branch-1.v1.patch, HBASE-17887.branch-1.v1.patch,
> HBASE-17887.branch-1.v2.patch, HBASE-17887.branch-1.v2.patch,
> HBASE-17887.branch-1.v3.patch, HBASE-17887.branch-1.v4.patch,
> HBASE-17887.branch-1.v4.patch, HBASE-17887.branch-1.v4.patch,
> HBASE-17887.branch-1.v5.patch, HBASE-17887.branch-1.v6.patch,
> HBASE-17887.ut.patch, HBASE-17887.v0.patch, HBASE-17887.v1.patch,
> HBASE-17887.v2.patch, HBASE-17887.v3.patch, HBASE-17887.v4.patch,
> HBASE-17887.v5.patch
>
>
> As per the flaky tests dashboard here:
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html,
> it fails 30% of the time.
> While working on HBASE-17863, a few verification builds on the patch failed
> because TestAcidGuarantees didn't pass. IMHO, the changes for HBASE-17863 are
> unlikely to affect the get/put path.
> I ran the test with and without the patch several times locally and found
> that TestAcidGuarantees fails a similar number of times without the patch.
> Opening as a blocker, since ACID guarantees are critical to HBase.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)