[ 
https://issues.apache.org/jira/browse/HBASE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052824#comment-16052824
 ] 

Allan Yang commented on HBASE-18144:
------------------------------------

{quote}
Allan Yang Yeah, even with sorting, allowing that threads can be preempted 
(unscheduled) while trying to attain locks, there is nothing to prevent your 
scenario. It does seem harder to manufacture if locks are being taken in order 
requiring more parties participating to achieve deadlock.
{quote}
Are you still seeing this problem after sorting? I think this temporary 
deadlock only arises when the read locks are not acquired in sorted order.

{quote}
On your #2, above, with read/write locks, we have to take the lock every time 
so the lock keep account of reads vs writes outstanding; we can't do the logic 
that was in branch-1.1.
{quote}
I don't quite understand why we can't use the branch-1.1 logic. We can do this 
in doMiniBatchMutation: for the first row in the batch, wait until we get its 
read lock (waiting on just one lock won't cause deadlock); for every other row, 
use tryLock() instead of tryLock(long time, TimeUnit unit), fail fast, and 
clean up the lock context. (If you are worried about ref counting, we just need 
to make sure we clean up the lock context after failing to get the lock.)
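
A minimal sketch of the acquisition order proposed above, not the actual HBase 
code. RowLockSketch, lockFor, acquireReadLocks and releaseAll are hypothetical 
names, rows are plain Strings instead of byte[] keys, and the ref counting / 
lock context cleanup is left out:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a region's row-lock table; not HBase code. */
final class RowLockSketch {
  private final ConcurrentMap<String, ReentrantReadWriteLock> locks =
      new ConcurrentHashMap<>();

  /** Returns the read/write lock for a row, creating it on first use. */
  ReentrantReadWriteLock lockFor(String row) {
    return locks.computeIfAbsent(row, r -> new ReentrantReadWriteLock());
  }

  /**
   * Takes read locks for a batch of rows. Only the first row blocks; every
   * later row uses the untimed tryLock() and the loop stops at the first
   * failure. Returns the locks actually held, in acquisition order.
   */
  List<Lock> acquireReadLocks(List<String> rows) {
    List<Lock> held = new ArrayList<>();
    for (int i = 0; i < rows.size(); i++) {
      Lock readLock = lockFor(rows.get(i)).readLock();
      if (i == 0) {
        // Nothing else is held yet, so blocking here cannot deadlock.
        readLock.lock();
      } else if (!readLock.tryLock()) {
        // Fail fast: the caller proceeds with the rows locked so far.
        break;
      }
      held.add(readLock);
    }
    return held;
  }

  /** Releases every lock returned by acquireReadLocks. */
  static void releaseAll(List<Lock> held) {
    for (Lock lock : held) {
      lock.unlock();
    }
  }
}
```

With this shape a batch never waits while it already holds a lock; the rows 
that were skipped are left for the caller to retry.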

{quote}
Other thoughts are that the doMiniBatch where it takes all locks up front 
before going on to apply Mutations is not appropriate after the move to 
read/write row locks; it made sense in branch-1.1. where the row lock was 
costly and reentrant. Once you had the lock, you could do a bunch of mutations 
under its umbrella. When read/write lock, the lock does the refcounting 
internally. If all read lock requests, then all threads make progress. Its the 
write lock requests that are blockers. Better to have the batch fail quickly 
with a bunch of read-lock successes than have it wait (deadlock) for 30 seconds 
at a time just because the batch had a write lock in the mix that it was unable 
to attain in time; better to fail fast and then retry on a new rpc? Someone 
probably figured the heuristic long ago. We need to do a bit of study.
{quote}
doMiniBatch does not try to get all locks before moving to the next step. On 
the contrary, if getting a lock fails, doMiniBatch applies the mutations to the 
rows whose read locks were acquired successfully. The failed ones in the batch 
are retried in batchMutate().
The only problem in branches other than branch-1.1 is that we now wait on the 
read lock every time.
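
A rough sketch of that partial-apply-and-retry structure, under stated 
assumptions: MiniBatchSketch is a hypothetical class, lock acquisition is 
modelled by a caller-supplied predicate, and the first row of each mini-batch 
is treated as always acquired (standing in for a blocking lock). It is not the 
real doMiniBatchMutation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical model of the batchMutate / doMiniBatch retry loop. */
final class MiniBatchSketch {
  final List<String> applied = new ArrayList<>();
  int miniBatches = 0;

  /**
   * Processes rows starting at index from. The first row is always taken;
   * later rows are taken only while tryAcquire succeeds. Returns the index
   * of the first row that was not processed.
   */
  int doMiniBatch(List<String> rows, int from, Predicate<String> tryAcquire) {
    miniBatches++;
    int next = from;
    while (next < rows.size()) {
      String row = rows.get(next);
      if (next > from && !tryAcquire.test(row)) {
        break; // lock not available: stop and apply what we have
      }
      applied.add(row); // stands in for applying the row's mutation
      next++;
    }
    return next;
  }

  /** Keeps calling doMiniBatch until every row has been processed. */
  void batchMutate(List<String> rows, Predicate<String> tryAcquire) {
    int next = 0;
    while (next < rows.size()) {
      next = doMiniBatch(rows, next, tryAcquire);
    }
  }
}
```

Because each mini-batch always processes at least its first row, the outer 
loop makes progress on every pass and terminates.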

{quote}
On the sort, it costs. Was wondering if we could exploit the sort otherwise 
perhaps later in the processing of mutations (is this the only sort in the 
write pipeline?)
{quote}
Yes, I'm pretty sure this is the only sort in the write pipeline. Why would 
sorting later in the processing of mutations be better?


> Forward-port the old exclusive row lock; there are scenarios where it 
> performs better
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-18144
>                 URL: https://issues.apache.org/jira/browse/HBASE-18144
>             Project: HBase
>          Issue Type: Bug
>          Components: Increment
>    Affects Versions: 1.2.5
>            Reporter: stack
>            Assignee: stack
>             Fix For: 2.0.0, 1.3.2, 1.2.7
>
>         Attachments: DisorderedBatchAndIncrementUT.patch, 
> HBASE-18144.master.001.patch
>
>
> Description to follow.



