[ 
https://issues.apache.org/jira/browse/PHOENIX-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952576#comment-16952576
 ] 

Kadir OZDEMIR commented on PHOENIX-5528:
----------------------------------------

After a chat with [~vincentpoon], we think that the patch will eliminate the 
problem if the multiple index rows for a given data table row are contained 
within a single index table region, since the set of data row keys maintained 
by GlobalIndexChecker is per table region. If there are multiple unverified 
index rows for the same data table row but they are distributed over two or 
more index table regions, then some of them can still be verified and returned 
to the client, as we do not have a global set to identify the data rows that 
have been rebuilt for a given scan.
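
To make the limitation concrete, here is a minimal Java sketch. It is not 
Phoenix code: RegionScan is a hypothetical stand-in for the per-region state 
kept by GlobalIndexChecker, and the index row key format is made up for 
illustration. It only shows that a set scoped to one region cannot deduplicate 
rebuilds of a data row whose unverified index rows live in different regions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not Phoenix code.
class PerRegionDedupSketch {

    // An unverified index row and the data row it points back to.
    static final class IndexRow {
        final String indexRowKey;
        final String dataRowKey;

        IndexRow(String indexRowKey, String dataRowKey) {
            this.indexRowKey = indexRowKey;
            this.dataRowKey = dataRowKey;
        }
    }

    // One scan over one index region, with its own set of rebuilt data rows.
    static final class RegionScan {
        private final Set<String> rebuiltDataRowKeys = new HashSet<>();

        // Returns the index rows that trigger a rebuild in this region:
        // only the first unverified index row seen per data row key.
        List<String> rowsToRebuild(List<IndexRow> unverifiedRows) {
            List<String> rebuilt = new ArrayList<>();
            for (IndexRow row : unverifiedRows) {
                if (rebuiltDataRowKeys.add(row.dataRowKey)) {
                    rebuilt.add(row.indexRowKey);
                }
            }
            return rebuilt;
        }
    }

    // Counts rebuilds of data row 'test_pk' when its two unverified index
    // rows sit in one region, or are split across two regions.
    static int rebuildCount(boolean sameRegion) {
        IndexRow oldRow = new IndexRow("test_val|test_pk", "test_pk");
        IndexRow newRow = new IndexRow("new_val|test_pk", "test_pk");
        if (sameRegion) {
            return new RegionScan()
                    .rowsToRebuild(Arrays.asList(oldRow, newRow)).size();
        }
        RegionScan regionA = new RegionScan();
        RegionScan regionB = new RegionScan();
        return regionA.rowsToRebuild(Collections.singletonList(newRow)).size()
                + regionB.rowsToRebuild(Collections.singletonList(oldRow)).size();
    }

    public static void main(String[] args) {
        System.out.println("same region rebuilds: " + rebuildCount(true));
        System.out.println("two regions rebuilds: " + rebuildCount(false));
    }
}
```

With both index rows in one region the data row is rebuilt once; split across 
two regions, each region rebuilds it independently.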

The race condition reported in this issue happens because, at the time of an 
index table scan, there are data table writes in progress and their unverified 
index rows are visible to the scan. These unverified writes lead to rebuilding 
the same data table row multiple times. For this problem to happen, the 
rebuild index write for an unverified index row and the data table write for 
the corresponding data table row must, by definition, be happening 
concurrently. In order to eliminate the race condition, we can detect this 
concurrent activity and delay the rebuild index write until the data table 
write completes or fails. In other words, we can serialize them.

IndexRegionObserver already maintains the set of pending data table writes to 
detect concurrent updates on the same data table row. If an index rebuild 
write and a data table write exist on the same row, then we can return an 
exception (say IndexRebuilWriteRetry) so that UngroupedAggregateRegionObserver, 
which initiates index rebuild writes, retries them if they fail with this 
exception. This will force the rebuild write to wait for the data table write 
to complete or fail. [~vincentpoon], [~gjacoby], [~larsh], let me know if you 
have any comments on this.
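
A rough Java sketch of the proposed retry protocol follows. Nothing here is an 
existing Phoenix API: PendingWrites stands in for the pending-write set 
attributed to IndexRegionObserver, rebuildWithRetry stands in for the caller 
that issues rebuild writes, and IndexRebuildWriteRetryException is a 
hypothetical name based on the exception suggested above (made unchecked here 
only to keep the sketch short).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the proposal, not Phoenix code.
class RebuildRetrySketch {

    // Signals that a rebuild write collided with an in-flight data write.
    static class IndexRebuildWriteRetryException extends RuntimeException {
        IndexRebuildWriteRetryException(String dataRowKey) {
            super("data table write pending on row " + dataRowKey);
        }
    }

    // Tracks data table rows that currently have a write in progress.
    static class PendingWrites {
        private final Set<String> pendingRowKeys = ConcurrentHashMap.newKeySet();

        void begin(String rowKey) { pendingRowKeys.add(rowKey); }

        void end(String rowKey) { pendingRowKeys.remove(rowKey); }

        // Rejects a rebuild write while a data write is pending on the row.
        void checkRebuildAllowed(String rowKey) {
            if (pendingRowKeys.contains(rowKey)) {
                throw new IndexRebuildWriteRetryException(rowKey);
            }
        }
    }

    // Retries the rebuild write until it is allowed or attempts run out.
    // Returns the number of attempts used. betweenAttempts is where a real
    // caller would back off; the test uses it to finish the data write.
    static int rebuildWithRetry(PendingWrites pending, String rowKey,
                                int maxAttempts, Runnable betweenAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IndexRebuildWriteRetryException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                pending.checkRebuildAllowed(rowKey);
                return attempt; // the rebuild write would be applied here
            } catch (IndexRebuildWriteRetryException e) {
                last = e;
                betweenAttempts.run();
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        PendingWrites pending = new PendingWrites();
        pending.begin("test_pk");
        int attempts = rebuildWithRetry(pending, "test_pk", 3,
                () -> pending.end("test_pk"));
        System.out.println("rebuild succeeded after attempts: " + attempts);
    }
}
```

The sketch ignores atomicity: in a real implementation the check and the 
rebuild write would presumably have to be atomic with respect to the data 
table write (for example, performed under the row lock), otherwise a data 
write could start between the check and the rebuild.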

> Race condition in index verification causes multiple index rows to be 
> returned for single data table row
> --------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5528
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5528
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Vincent Poon
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>         Attachments: PHOENIX-5528.master.001.patch
>
>
> Warning: This is an artificially generated scenario that likely has a very 
> low probability of happening in practice.  But a race condition nevertheless. 
>  Unfortunately I don't have a test case, but was able to produce this by 
> debugging a local regionserver and adding breakpoints at the right places to 
> produce the ordering here.
> The core problem is that when we do an update to the data table, we produce 
> two unverified index rows at first.  When we scan both of these index rows 
> and attempt to verify via rebuilding the data table row, we cannot guarantee 
> that both verifications happen before the data table update, or both happen 
> after the data table update.
> I use multiple index regions here to demonstrate, but I believe it could 
> happen within a single region as well.
> Steps:
> 1) Create a test table with "pk" and "indexed_val" columns, and a global 
> index on "indexed_val".
> 2) upsert into test values ('test_pk', 'test_val');
> 3) Split the index table on 'test_pk':
>    hbase shell: split 'test_index', 'test_pk'.
>    This creates two regions, call them regionA and regionB (which holds the 
> existing index row)
> 4) Start an update: upsert into test values ('test_pk', 'new_val');
>    The first thing the indexing code does is create two unverified index 
> rows: one is a new version of the existing index row, and the other is for 
> the new indexed value.
>    We pause the thread after this is done, before the row locks and data 
> table write happen.
> 5) select indexed_val from test;
>    This scans both the index regions in parallel.  Each scan picks up an 
> unverified row in its region.  We pause in GlobalIndexChecker.
>    Let the regionB scan proceed.  It will attempt to rebuild the data table 
> row.  The data table still has 'test_val' as the indexed value.  The rebuild 
> succeeds.
>    The scan on regionA is still paused.
> 6) The original update proceeds to update the data table indexed value to 
> 'new_val'.
> 7) The scan on regionA proceeds, and attempts to rebuild the data table row. 
>  The rebuild succeeds with 'new_val' as the indexed value.
> 8) Both 'test_val' and 'new_val' are returned to the client, because both 
> rebuilds succeeded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
