[ 
https://issues.apache.org/jira/browse/PHOENIX-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104225#comment-16104225
 ] 

Samarth Jain commented on PHOENIX-4051:
---------------------------------------

I think updatesLock is more of a region-level lock. It looks like we generally 
don't acquire the writeLock of the updatesLock unless HBase needs to flush. So 
under high write load, which increases the chances of HBase flushing, writer 
threads may have to wait for the flush to complete. Once the flush completes 
and the write lock is released, threads waiting on updatesLock.readLock() will 
be scheduled in no particular order, so they are not guaranteed to complete 
their writes in the timestamp order of their mutations. Delaying the 
generation of the timestamp, as this patch does, seems like a good workaround. 
So, +1. 

My only minor nit would be to see if it's possible to move this check 
{code}
if (!this.builder.isPartialRebuild(m) && 
!isProbablyClientControlledTimeStamp(m)) {
{code}
out of the loop and compute it only once using the first mutation, 
miniBatchOp.getOperation(0).
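
A minimal sketch of that nit, not the actual patch code: the Mutation class and 
the two predicates below are hypothetical stand-ins for the Phoenix/HBase ones, 
and it assumes every mutation in a mini-batch gives the same answer as the 
first one.

```java
import java.util.ArrayList;
import java.util.List;

class HoistedCheckSketch {
    /** Hypothetical stand-in for a mutation; it carries only what the check needs. */
    static final class Mutation {
        final boolean partialRebuild;
        final boolean clientControlledTimestamp;
        long timestamp;

        Mutation(boolean partialRebuild, boolean clientControlledTimestamp, long timestamp) {
            this.partialRebuild = partialRebuild;
            this.clientControlledTimestamp = clientControlledTimestamp;
            this.timestamp = timestamp;
        }
    }

    // Stand-ins for builder.isPartialRebuild(m) and isProbablyClientControlledTimeStamp(m).
    static boolean isPartialRebuild(Mutation m) {
        return m.partialRebuild;
    }

    static boolean isProbablyClientControlledTimeStamp(Mutation m) {
        return m.clientControlledTimestamp;
    }

    /**
     * Evaluates the check once, on the first mutation, and applies the result
     * to the whole batch. Returns the number of mutations whose timestamp was reset.
     */
    static int resetTimestamps(List<Mutation> batch, long now) {
        if (batch.isEmpty()) {
            return 0;
        }
        Mutation first = batch.get(0);
        // Assumption: the first mutation is representative of the whole batch.
        boolean reset = !isPartialRebuild(first) && !isProbablyClientControlledTimeStamp(first);
        if (!reset) {
            return 0;
        }
        for (Mutation m : batch) {
            m.timestamp = now;
        }
        return batch.size();
    }

    public static void main(String[] args) {
        List<Mutation> batch = new ArrayList<>();
        batch.add(new Mutation(false, false, 10L));
        batch.add(new Mutation(false, false, 20L));
        int changed = resetTimestamps(batch, System.currentTimeMillis());
        System.out.println("mutations reset: " + changed);
    }
}
```

If a batch can mix partial-rebuild or client-timestamped mutations with 
ordinary ones, the per-mutation check inside the loop would still be needed.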




> Prevent out-of-order updates for mutable index updates
> ------------------------------------------------------
>
>                 Key: PHOENIX-4051
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4051
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>         Attachments: PHOENIX-4051_v1.patch
>
>
> Out-of-order processing of data rows during index maintenance causes mutable 
> indexes to become out of sync with regard to the data table. Here's a simple 
> example to illustrate the issue:
> # Assume table T(K,V) and index X(V,K).
> # Upsert T(A, 1) at t10. Index updates: Put X(1,A) at t10.
> # Upsert T(A, 3) at t30. Index updates: Delete X(1,A) at t29, Put X(3,A) at 
> t30.
> # Upsert T(A, 2) at t20. Index updates: Delete X(1,A) at t19, Put X(2,A) at 
> t20, Delete X(2,A) at t29.
> Ideally, we'd want to remove the Delete X(1,A) at t29 since this isn't 
> correct in terms of timeline consistency, but we can't do that with HBase 
> without support for deleting/undoing Delete markers. 
> The above is not what is occurring. Instead, when T(A,2) comes in, the Put 
> X(2,A) will occur at t20, but the Delete won't occur. This causes more index 
> rows than data rows, essentially making it invalid.
> A quick fix is to reset the timestamp of the data table mutations to the 
> current time within the preBatchMutate call, when the row is exclusively 
> locked. This skirts the issue because then timestamps won't overlap.
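
The scenario in the description can be replayed with a toy model. This is only 
a sketch: IndexTimelineSketch is a hypothetical class, not Phoenix code, and it 
assumes a simplified version of HBase delete semantics in which a Delete marker 
at timestamp T hides every Put on the same row with a timestamp at or below T. 
The out-of-order case models only the Put X(2,A) at t20 for the late update, 
and the fixed case assumes the late update is stamped t40 on arrival.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class IndexTimelineSketch {
    // Toy index table: Put and Delete-marker timestamps per index row key.
    private final Map<String, List<Long>> puts = new HashMap<>();
    private final Map<String, List<Long>> deletes = new HashMap<>();

    void put(String row, long ts) {
        puts.computeIfAbsent(row, k -> new ArrayList<>()).add(ts);
    }

    void delete(String row, long ts) {
        deletes.computeIfAbsent(row, k -> new ArrayList<>()).add(ts);
    }

    /** Rows that have at least one Put newer than every Delete marker on the row. */
    TreeSet<String> visibleRows() {
        TreeSet<String> visible = new TreeSet<>();
        for (Map.Entry<String, List<Long>> e : puts.entrySet()) {
            long latestDelete = Long.MIN_VALUE;
            for (long ts : deletes.getOrDefault(e.getKey(), Collections.<Long>emptyList())) {
                latestDelete = Math.max(latestDelete, ts);
            }
            for (long ts : e.getValue()) {
                if (ts > latestDelete) {
                    visible.add(e.getKey());
                    break;
                }
            }
        }
        return visible;
    }

    /** Arrival order T(A,1)@t10, T(A,3)@t30, T(A,2)@t20, with the late Delete missing. */
    static IndexTimelineSketch outOfOrder() {
        IndexTimelineSketch index = new IndexTimelineSketch();
        index.put("X(1,A)", 10);
        index.delete("X(1,A)", 29);
        index.put("X(3,A)", 30);
        index.put("X(2,A)", 20); // late update: only the Put is applied
        return index;
    }

    /** Same arrival order, but each update is stamped with the time it arrives. */
    static IndexTimelineSketch withResetTimestamps() {
        IndexTimelineSketch index = new IndexTimelineSketch();
        index.put("X(1,A)", 10);
        index.delete("X(1,A)", 29);
        index.put("X(3,A)", 30);
        index.delete("X(3,A)", 39); // T(A,2) now arrives "at" t40
        index.put("X(2,A)", 40);
        return index;
    }

    public static void main(String[] args) {
        System.out.println("out of order:     " + outOfOrder().visibleRows());
        System.out.println("reset timestamps: " + withResetTimestamps().visibleRows());
    }
}
```

In the first case two index rows stay visible for a single data row, which is 
the invalid state described above. In the second case only one index row 
remains, matching the last value written to the data table.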



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
