Vincent Poon created PHOENIX-5528:
-------------------------------------

             Summary: Race condition in index verification causes multiple 
index rows to be returned for single data table row
                 Key: PHOENIX-5528
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5528
             Project: Phoenix
          Issue Type: Bug
            Reporter: Vincent Poon


Warning: This is an artificially generated scenario that likely has a very low 
probability of happening in practice, but it is a race condition nevertheless.  
Unfortunately I don't have a test case, but I was able to reproduce this by 
debugging a local regionserver and adding breakpoints at the right places to 
produce the ordering described here.

The core problem is that when we do an update to the data table, we produce two 
unverified index rows at first.  When we scan both of these index rows and 
attempt to verify via rebuilding the data table row, we cannot guarantee that 
both verifications happen before the data table update, or both happen after 
the data table update.

I use multiple index regions here to demonstrate, but I believe it could happen 
within a single region as well.

Steps:
1) Create a test table with "pk" and "indexed_val" columns, and a global index 
on "indexed_val".
2) upsert into test values ('test_pk', 'test_val');
3) Split the index table on 'test_pk', from the hbase shell:
   split 'test_index', 'test_pk'
   This creates two regions, call them regionA and regionB (regionB holds the 
existing index row).
4) Start an update: upsert into test values ('test_pk', 'new_val');
   The first thing the indexing code does is create two unverified index rows: 
one is a new version of the existing index row, and the other is for the new 
indexed value.
   We pause the thread after this is done, before the row locks and data table 
write happen.
5) select indexed_val from test;
   This scans both index regions in parallel.  Each scan picks up an 
unverified row in its region.  We pause both scans in GlobalIndexChecker.
   Let the regionB scan proceed.  It attempts to rebuild the data table 
row.  The data table still has 'test_val' as the indexed value, so the rebuild 
succeeds.
   The scan on regionA is still paused.
6) The original update proceeds and updates the data table indexed value to 
'new_val'.
7) The scan on regionA proceeds and attempts to rebuild the data table row.  
The rebuild succeeds with 'new_val' as the indexed value.
8) Both 'test_val' and 'new_val' are returned to the client, because both 
rebuilds succeeded.
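
To make the ordering in the steps above concrete, here is a minimal Python 
sketch.  It is a toy model and not Phoenix code: the dictionaries, the event 
names and the replay() function are all hypothetical, and "rebuild succeeds" is 
reduced to the assumption that the data table row currently holds the same 
indexed value as the unverified index row.

```python
# Toy model of the interleaving in the steps above (hypothetical, not Phoenix code).

def replay(events):
    """Apply the events in order and return the values the client would see."""
    # Data table state after the initial upsert.
    data_table = {"test_pk": "test_val"}
    # Unverified index rows written first by the update, keyed by index region.
    unverified = {
        "regionB": "test_val",  # new version of the existing index row
        "regionA": "new_val",   # index row for the new indexed value
    }
    returned = []
    for event in events:
        if event == "data_write":
            # The paused update finally writes the data table row.
            data_table["test_pk"] = "new_val"
        else:
            # A region scan verifies its unverified row against the data table.
            # Assumption: the rebuild succeeds only if the values match.
            value = unverified[event]
            if data_table["test_pk"] == value:
                returned.append(value)
    return returned


# Ordering from the report: one verification on each side of the data write.
reported = replay(["regionB", "data_write", "regionA"])
# Both verifications before the data write.
both_before = replay(["regionB", "regionA", "data_write"])
# Both verifications after the data write.
both_after = replay(["data_write", "regionB", "regionA"])

print(reported)     # ['test_val', 'new_val']
print(both_before)  # ['test_val']
print(both_after)   # ['new_val']
```

In this model only the ordering from the report returns two index rows for the 
single data table row; when both verifications fall on the same side of the 
data table write, exactly one row comes back.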



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
