[jira] [Comment Edited] (PHOENIX-2582) Creating an index while a batch of rows is being written leads to missing rows in the index table

Thomas D'Silva (JIRA) Fri, 22 Jan 2016 14:21:06 -0800

    [ 
https://issues.apache.org/jira/browse/PHOENIX-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113226#comment-15113226
 ]


Thomas D'Silva edited comment on PHOENIX-2582 at 1/22/16 10:20 PM:
-------------------------------------------------------------------

Attaching a possible solution from an email conversation with [~apurtell]

In lieu of an (external) transaction manager, maybe you could run a Procedure 
that must complete before the index create is declared successful? Procedure is 
HBase's internal coordination framework. HBase 0.98 and 1.0 have 
ProcedureV1. HBase 1.1+ has ProcedureV2. 

Your procedure workers would set the writestate on each region to read-only,
wait for in-flight writes to finish, and then join the barrier. Once inside the
barrier, your workers could make the index-related state changes, or just return
if no further work is needed. Your procedure workers would reset the writestate
in the cleanup callback. Your coordinator (in the master) can wait on a monitor
for global completion or poll a completion status check. Note that Procedures
complete in either a successful or a failed state. Failure may be explicit (a
worker posted a failure notice) or a timeout. If a Procedure fails, you'll need
to retry. Once one of these has completed successfully, you would be good.
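
The flow described above can be sketched roughly as follows. This is only an illustration built on java.util.concurrent; it does not use HBase's Procedure classes, and Region, runOnce and runWithRetry are hypothetical names invented for the sketch. It assumes one worker per region, a single shared timeout, and that the index-related state change runs once when every worker has reached the barrier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class IndexCreateBarrierSketch {

    /** Hypothetical stand-in for a region's write state; not an HBase class. */
    static final class Region {
        final AtomicBoolean readOnly = new AtomicBoolean(false);
        final AtomicInteger inFlightWrites = new AtomicInteger(0);

        /** Writers register first, then check the flag, so a drain cannot miss them. */
        boolean tryBeginWrite() {
            inFlightWrites.incrementAndGet();
            if (readOnly.get()) {
                inFlightWrites.decrementAndGet();
                return false; // rejected: the region is read-only
            }
            return true;
        }

        void endWrite() {
            inFlightWrites.decrementAndGet();
        }
    }

    /** One attempt. Returns true only if every worker passed the barrier. */
    static boolean runOnce(List<Region> regions, Runnable indexStateChange, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(regions.size());
        // The barrier action runs at most once per attempt, after all workers arrive.
        CyclicBarrier barrier = new CyclicBarrier(regions.size(), indexStateChange);
        List<Future<Void>> results = new ArrayList<>();
        for (Region region : regions) {
            Callable<Void> worker = () -> {
                region.readOnly.set(true);
                try {
                    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
                    while (region.inFlightWrites.get() > 0) { // wait for in-flight writes
                        if (System.nanoTime() - deadline > 0) {
                            throw new TimeoutException("in-flight writes did not drain");
                        }
                        Thread.sleep(1);
                    }
                    barrier.await(timeoutMs, TimeUnit.MILLISECONDS); // join the barrier
                    return null;
                } finally {
                    region.readOnly.set(false); // cleanup: always reset the write state
                }
            };
            results.add(pool.submit(worker));
        }
        boolean ok = true;
        try {
            for (Future<Void> result : results) {
                try {
                    result.get(); // coordinator waits for global completion
                } catch (ExecutionException e) {
                    ok = false; // worker failure, timeout, or broken barrier
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            ok = false;
        } finally {
            pool.shutdownNow();
        }
        return ok;
    }

    /** Retries failed attempts; the index create counts as successful after one success. */
    static boolean runWithRetry(List<Region> regions, Runnable change, long timeoutMs, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (runOnce(regions, change, timeoutMs)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(new Region(), new Region(), new Region());
        boolean created = runWithRetry(regions, () -> System.out.println("index state changed"), 1000, 3);
        System.out.println("index create declared successful: " + created);
    }
}
```

In this sketch a failed attempt never runs the state change, because the barrier action fires only when every worker has arrived; a real Procedure would need its own answer for partially applied changes.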


was (Author: tdsilva):
Attaching a possible solution from an email conversation with [~apurtell]

>In lieu of an (external) transaction manager, maybe you could run a Procedure 
>that must complete before the index create is declared successful? Procedure 
>is HBase's internal coordination framework. HBase 0.98 and 1.0 have 
>ProcedureV1. HBase 1.1+ has ProcedureV2. 
>
>Your procedure workers would set the writestate on each region to readonly, 
>wait for in flight writes to finish, and then join the barrier. Once inside 
>the barrier your workers could make the index related state changes, or just 
>return if no further work needed. Your procedure workers would reset 
>writestate in the cleanup callback. Your coordinator (in the master) can wait 
>on a monitor for global completion or poll on a completion status check. Note 
>Procedures will complete in either successful or failed state. Failure may be 
>explicit (worker posted failure notice) or a timeout. If failed, you'll need 
>to retry. Once one of these has completed successfully, you would be good. 

> Creating an index while a batch of rows is being written leads to missing 
> rows in the index table
> -------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2582
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2582
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Thomas D'Silva
>
> If we create an index while we are upserting rows to the table, it's possible 
> that we miss writing the corresponding rows to the index table. 
> If a region server is writing a batch of rows and we create an index just 
> before the batch is written, we will miss writing that batch to the index 
> table. This is because we run the initial UPSERT SELECT to populate the index 
> with an SCN that we get from the server, which will be before the timestamp 
> at which the batch of rows is written. 
> We need to figure out if there is a way to determine that all pending batches 
> have been written before running the UPSERT SELECT to do the initial index 
> population.
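
To make the race in the description concrete, here is a toy model of it. It is a simplified sketch, not Phoenix code: the class and method names are hypothetical, the row keys and timestamps are made up, visibility under the SCN is modeled as "timestamp < SCN" (an assumption of the sketch), and the in-flight batch is assumed to get no index maintenance because it started before the index existed.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class IndexRaceSketch {

    /** Returns the data-table row keys that end up missing from the index. */
    static Set<String> missingFromIndex(long batchTimestamp, long scn) {
        // Data table modeled as rowKey -> write timestamp (made-up values).
        Map<String, Long> dataTable = new TreeMap<>();
        dataTable.put("row1", 100L); // written well before the index is created

        // In-flight batch: it started before the index existed, so in this
        // model the region server does no index maintenance for it.
        dataTable.put("row2", batchTimestamp);

        // Initial population (the UPSERT SELECT) only sees rows older than the SCN.
        Set<String> index = new TreeSet<>();
        for (Map.Entry<String, Long> row : dataTable.entrySet()) {
            if (row.getValue() < scn) {
                index.add(row.getKey());
            }
        }

        Set<String> missing = new TreeSet<>(dataTable.keySet());
        missing.removeAll(index);
        return missing;
    }

    public static void main(String[] args) {
        // Batch lands after the SCN: its row never reaches the index.
        System.out.println(missingFromIndex(205L, 200L));
        // Batch lands before the SCN: the initial population picks it up.
        System.out.println(missingFromIndex(150L, 200L));
    }
}
```

In this model the gap closes only if the batch timestamp falls below the SCN, which is why the description asks for a way to know that all pending batches have been written before the initial population runs.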



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
