Re: Review Request: Fix RPC deadlock when splitting regions on same RS as meta under heavy load

Todd Lipcon Tue, 07 Sep 2010 18:33:47 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/798/#review1122
-----------------------------------------------------------



Seems to make sense. Let me try it on a cluster before I +1 it.


src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
<http://review.cloudera.org/r/798/#comment3823>

    Maybe now we can add an assertion:
    
    assert !this.parent.lock.writeLock().isHeldByCurrentThread()
        : "Unsafe to hold write lock while performing RPCs";
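
To make the suggestion concrete, here is a rough sketch of the pattern as I read it from the summary: take the write lock only long enough to set the 'closing' flag, then guard the RPC path with the assert. RegionSketch, markClosing(), performRpc() and split() are hypothetical names for illustration, not the actual HRegion or SplitTransaction code, and IllegalStateException stands in for NSRE. The assert only fires when the JVM runs with -ea.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region; not the real HRegion API.
class RegionSketch {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean closing = false; // guarded by lock

    // Accessors take the read lock and bail out if the region is closing.
    // The real code would throw NotServingRegionException (NSRE) here.
    String read() {
        lock.readLock().lock();
        try {
            if (closing) {
                throw new IllegalStateException("region is closing");
            }
            return "row";
        } finally {
            lock.readLock().unlock();
        }
    }

    // Hold the write lock only long enough to flip the flag.
    void markClosing() {
        lock.writeLock().lock();
        try {
            closing = true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Placeholder for any remote call made during the split.
    boolean performRpc() {
        assert !lock.writeLock().isHeldByCurrentThread()
            : "Unsafe to hold write lock while performing RPCs";
        return true;
    }

    // Flag first, release the lock, and only then do the RPC.
    boolean split() {
        markClosing();
        return performRpc();
    }
}
```

With this shape, a reader that was waiting on the lock sees 'closing' as soon as it gets in and fails fast, and the RPC never runs while the write lock is held.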


- Todd


On 2010-09-07 13:38:39, Todd Lipcon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.cloudera.org/r/798/
> -----------------------------------------------------------
> 
> (Updated 2010-09-07 13:38:39)
> 
> 
> Review request for hbase and stack.
> 
> 
> Summary
> -------
> 
> Moves all RPCs outside of the region writeLock - the writeLock is now only 
> used long enough to set the 'closing' flag. When we drop the lock any waiters 
> will see 'closing' upon acquiring the lock, and thus throw NSRE.
> 
> In the case that we abort the split, it will reopen the region as before. 
> Accessors will have gotten NSRE but will just come back to the same region 
> eventually.
> 
> 
> This addresses bug HBASE-2964.
>     http://issues.apache.org/jira/browse/HBASE-2964
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java a692125 
>   src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java 3507c0d 
>   src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransaction.java a245d97 
> 
> Diff: http://review.cloudera.org/r/798/diff
> 
> 
> Testing
> -------
> 
> YCSB testing on my cluster - it used to deadlock due to this bug within an 
> hour. I ran a 5 hour load test overnight and it worked OK.
> 
> 
> Thanks,
> 
> Todd
> 
>
