[ https://issues.apache.org/jira/browse/PHOENIX-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kadir OZDEMIR updated PHOENIX-5528:
-----------------------------------
Attachment: (was: PHOENIX-5528.master.001.patch)
> Race condition in index verification causes multiple index rows to be
> returned for single data table row
> --------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-5528
> URL: https://issues.apache.org/jira/browse/PHOENIX-5528
> Project: Phoenix
> Issue Type: Bug
> Reporter: Vincent Poon
> Assignee: Kadir OZDEMIR
> Priority: Major
>
> Warning: This is an artificially generated scenario that likely has a very
> low probability of happening in practice, but it is a race condition
> nevertheless.
> Unfortunately I don't have a test case, but I was able to reproduce this by
> debugging a local regionserver and adding breakpoints at the right places to
> produce the ordering described here.
> The core problem is that when we do an update to the data table, we produce
> two unverified index rows at first. When we scan both of these index rows
> and attempt to verify via rebuilding the data table row, we cannot guarantee
> that both verifications happen before the data table update, or both happen
> after the data table update.
> I use multiple index regions here to demonstrate, but I believe it could
> happen within a single region as well.
> Steps:
> 1) Create a test table with "pk" and "indexed_val" columns, and a global
> index on "indexed_val".
> 2) upsert into test values ('test_pk', 'test_val');
> 3) Split the index table on 'test_pk':
> hbase shell: split 'test_index', 'test_pk'
> This creates two regions; call them regionA and regionB (regionB holds the
> existing index row).
> 4) Start an update: upsert into test values ('test_pk', 'new_val');
> The first thing the indexing code does is create two unverified index
> rows: one is a new version of the existing index row, and the other is for
> the new indexed value.
> We pause the thread after this is done, before the row locks and data
> table write happen.
> 5) select indexed_val from test;
> This scans both index regions in parallel. Each scan picks up an
> unverified row in its region. We pause both scans in GlobalIndexChecker.
> Let the regionB scan proceed. It will attempt to rebuild the data table
> row. The data table still has 'test_val' as the indexed value, so the
> rebuild succeeds.
> The scan on regionA is still paused.
> 6) The original update proceeds and updates the data table indexed value to
> 'new_val'.
> 7) The scan on regionA proceeds and attempts to rebuild the data table row.
> The rebuild succeeds with 'new_val' as the indexed value.
> 8) Both 'test_val' and 'new_val' are returned to the client, because both
> rebuilds succeeded.
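
The interleaving in the steps above can be sketched as a small deterministic simulation. This is a toy model and not Phoenix code: plain dicts stand in for the data table and the two index regions, the function names are hypothetical, and "verification" is reduced to checking that the data table row still carries the indexed value. It is only meant to show why one scan verifying before the data table write and the other verifying after it yields two index rows for a single data row.

```python
# Toy model (assumption): dicts stand in for the data table and index regions.
# None of these names correspond to real Phoenix or HBase APIs.

def new_state():
    return {
        # data table: pk -> indexed_val
        "data": {"test_pk": "test_val"},
        # index regions: (indexed_val, pk) -> verified flag
        "regionA": {},
        "regionB": {("test_val", "test_pk"): True},
    }

def write_unverified_index_rows(state, pk, new_val):
    """First phase of the update: both index rows become unverified."""
    old_val = state["data"][pk]
    state["regionB"][(old_val, pk)] = False  # new version of the existing row
    state["regionA"][(new_val, pk)] = False  # row for the new indexed value

def write_data_row(state, pk, new_val):
    """Second phase of the update: the data table write."""
    state["data"][pk] = new_val

def scan_region(state, region):
    """Return indexed values; unverified rows are checked against the data table."""
    results = []
    for (val, pk), verified in state[region].items():
        if verified or state["data"].get(pk) == val:
            results.append(val)
    return results

def run(order):
    """Replay one ordering of the two region scans and the data table write."""
    state = new_state()
    write_unverified_index_rows(state, "test_pk", "new_val")
    returned = []
    for step in order:
        if step == "write_data":
            write_data_row(state, "test_pk", "new_val")
        else:
            returned += scan_region(state, step)
    return sorted(returned)

# Ordering from the steps above: regionB verifies, data write lands, regionA verifies.
racy = run(["regionB", "write_data", "regionA"])
# Orderings where both verifications fall on the same side of the data write.
both_before = run(["regionA", "regionB", "write_data"])
both_after = run(["write_data", "regionA", "regionB"])

print("racy:", racy)
print("both before:", both_before)
print("both after:", both_after)
```

In this model the racy ordering returns both 'new_val' and 'test_val', while the two orderings that keep both verifications on the same side of the data table write return a single value each.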