[jira] [Commented] (HBASE-17924) Consider sorting the row order when processing multi() ops before taking rowlocks

Jerry He (JIRA) Sun, 21 May 2017 12:48:32 -0700

    [ https://issues.apache.org/jira/browse/HBASE-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018966#comment-16018966 ]


Jerry He commented on HBASE-17924:
----------------------------------

Thinking about this a little more: it is not clear how the ordering would 
help multi() performance.

The locks taken in doMiniBatchMutate() are read locks for put and delete.
Say one batch has (a, c, b) and another thread has (b, a, c). Whether or not 
the rows are sorted, there is no conflict, because read locks are shared.
The conflict would come from increment or checkAndxxx, which acquire a write 
lock on the row. But we don't batch such ops; they are processed one at a 
time, so the write lock is on either a or b or c. Sorting the mini batch from 
the other thread from (a, c, b) to (a, b, c) does not seem to help in this 
case either.
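
To make the argument above concrete, here is a minimal sketch in plain Java. It is not HBase code: the per-row lock table, the class name RowReadLockDemo and its helper methods are all hypothetical, and java.util.concurrent's ReentrantReadWriteLock only stands in for the region's row locks. One thread holds read locks on (a, c, b), a second thread asks for read locks on (b, a, c), and a third asks for a write lock on a.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RowReadLockDemo {
    // Hypothetical per-row lock table, standing in for the region's row locks.
    private static final Map<String, ReentrantReadWriteLock> LOCKS =
        new ConcurrentHashMap<>();

    static ReentrantReadWriteLock lockFor(String row) {
        return LOCKS.computeIfAbsent(row, r -> new ReentrantReadWriteLock());
    }

    // On another thread, try to take the read lock on each row in the given
    // order without waiting. Returns true only if every attempt succeeds.
    static boolean otherThreadCanReadLock(List<String> rows) {
        return CompletableFuture.supplyAsync(() -> {
            List<String> held = new ArrayList<>();
            try {
                for (String row : rows) {
                    if (!lockFor(row).readLock().tryLock()) {
                        return false;
                    }
                    held.add(row);
                }
                return true;
            } finally {
                // Release on the same thread that acquired.
                for (String row : held) {
                    lockFor(row).readLock().unlock();
                }
            }
        }).join();
    }

    // On another thread, try to take the write lock on one row without waiting.
    static boolean otherThreadCanWriteLock(String row) {
        return CompletableFuture.supplyAsync(() -> {
            ReentrantReadWriteLock.WriteLock w = lockFor(row).writeLock();
            boolean acquired = w.tryLock();
            if (acquired) {
                w.unlock();
            }
            return acquired;
        }).join();
    }

    // Returns {second batch got all read locks, writer got the lock on "a"}.
    static boolean[] run() {
        List<String> first = List.of("a", "c", "b");
        for (String row : first) {
            lockFor(row).readLock().lock();
        }
        try {
            boolean readers = otherThreadCanReadLock(List.of("b", "a", "c"));
            boolean writer = otherThreadCanWriteLock("a");
            return new boolean[] {readers, writer};
        } finally {
            for (String row : first) {
                lockFor(row).readLock().unlock();
            }
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("second batch got all read locks: " + r[0]); // true
        System.out.println("writer got lock on a: " + r[1]);            // false
    }
}
```

In this sketch the second batch gets every read lock regardless of order, and only the single-row writer has to wait. Reordering either batch changes neither outcome.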

A user can explicitly call mutateRowsWithLocks for multiple rows from a 
coprocessor, but that is a rare case.
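
For that rarer multi-row case, where each caller holds exclusive locks on several rows, a fixed acquisition order is what rules out a lock-order deadlock. A minimal sketch of computing such an order follows, assuming Java 9+ for Arrays.compareUnsigned. The class SortedRowLockOrder and its methods are hypothetical names for illustration, not HBase APIs, and the comparator is only meant to mimic the unsigned lexicographic byte order used for row keys.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class SortedRowLockOrder {
    // Unsigned lexicographic byte order (assumed to match row key ordering).
    private static final Comparator<byte[]> ROW_ORDER = Arrays::compareUnsigned;

    // Returns the distinct requested rows in the order locks would be taken.
    static List<byte[]> lockOrder(List<byte[]> requestedRows) {
        TreeSet<byte[]> sorted = new TreeSet<>(ROW_ORDER);
        sorted.addAll(requestedRows);
        return new ArrayList<>(sorted);
    }

    // Convenience wrapper for the example: string row keys in, string keys out.
    static List<String> lockOrderOf(String... rows) {
        List<byte[]> keys = new ArrayList<>();
        for (String row : rows) {
            keys.add(row.getBytes(StandardCharsets.UTF_8));
        }
        List<String> ordered = new ArrayList<>();
        for (byte[] key : lockOrder(keys)) {
            ordered.add(new String(key, StandardCharsets.UTF_8));
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Both callers end up with the same acquisition order.
        System.out.println(lockOrderOf("a", "c", "b")); // [a, b, c]
        System.out.println(lockOrderOf("b", "a", "c")); // [a, b, c]
    }
}
```

With both callers locking in the order (a, b, c), neither can hold a later row while waiting on an earlier one.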

I wonder if I misunderstand anything ...


> Consider sorting the row order when processing multi() ops before taking 
> rowlocks
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-17924
>                 URL: https://issues.apache.org/jira/browse/HBASE-17924
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.0.0, 1.1.8
>            Reporter: Andrew Purtell
>            Assignee: Allan Yang
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: HBASE-17924.patch, HBASE-17924.v0.patch, 
> HBASE-17924.v2.patch, HBASE-17924.v3.patch, HBASE-17924.v4.patch, 
> HBASE-17924.v5.patch
>
>
> When processing a batch mutation, we take row locks in whatever order the 
> mutations were added to the multi op by the client.
>  
> {noformat}
> RSRpcServices#multi -> RSRpcServices#mutateRows -> HRegion#mutateRow -> 
> HRegion#mutateRowsWithLocks -> HRegion#processRowsWithLocks
> {noformat}
> Or
> {noformat}
> RSRpcServices#multi -> RSRpcServices#doNonAtomicRegionMutation ->
>       HRegion#get 
>     | HRegion#append 
>     | HRegion#increment 
>     | HRegionServer#doBatchOp -> HRegion#batchMutate -> 
> HRegion#doMiniBatchMutation
> {noformat}
>  
> multi() is fed by client APIs that accept a RowMutations object containing 
> actions for multiple rows. The container for ops inside RowMutations is an 
> ArrayList, which doesn't change the ordering of objects added to it. The 
> protobuf implementation of the messages for multi ops does not reorder the list 
> of actions. When processing multi ops we iterate over the actions in the 
> order rehydrated from protobuf.
> We should discuss sorting the order of ops by row key when processing multi() 
> ops before taking row locks. Does this make lock ordering more predictable 
> for server side operations? Yes, but potentially surprising for the client, 
> right? Is there any legitimate reason we should take locks out of row key 
> sorted order because the client has structured the request as such?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
