-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/798/#review1122
-----------------------------------------------------------

Seems to make sense. Let me try it on a cluster before I +1 it.


src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
<http://review.cloudera.org/r/798/#comment3823>

    Maybe now we can do an:

    assert !this.parent.lock.writeLock().isHeldByCurrentThread()
        : "Unsafe to hold write lock while performing RPCs";


- Todd


On 2010-09-07 13:38:39, Todd Lipcon wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.cloudera.org/r/798/
> -----------------------------------------------------------
>
> (Updated 2010-09-07 13:38:39)
>
>
> Review request for hbase and stack.
>
>
> Summary
> -------
>
> Moves all RPCs outside of the region writeLock - the writeLock is now only
> used long enough to set the 'closing' flag. When we drop the lock any waiters
> will see 'closing' upon acquiring the lock, and thus throw NSRE.
>
> In the case that we abort the split, it will reopen the region as before.
> Accessors will have gotten NSRE but will just come back to the same region
> eventually.
>
>
> This addresses bug HBASE-2964.
>     http://issues.apache.org/jira/browse/HBASE-2964
>
>
> Diffs
> -----
>
>   src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java a692125
>   src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java 3507c0d
>   src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransaction.java a245d97
>
> Diff: http://review.cloudera.org/r/798/diff
>
>
> Testing
> -------
>
> YCSB testing on my cluster - it used to deadlock due to this bug within an
> hour. I ran a 5 hour load test overnight and it worked OK.
>
>
> Thanks,
>
> Todd
>
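
For readers without the diff at hand, here is a minimal sketch of the locking pattern the summary describes: take the write lock only long enough to set a 'closing' flag, do the slow work with no write lock held, and let accessors that acquire the lock afterwards see the flag and fail fast. This is not the patch itself. RegionSketch, NotServingException, markClosing, reopen, read and slowRpc are hypothetical names chosen for illustration, with NotServingException standing in for NSRE. The guard in slowRpc uses an explicit check rather than the assert suggested in the review comment so that it also fires when assertions are disabled.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region. None of these names come from HBase.
class RegionSketch {
    /** Plays the role of NSRE (NotServingRegionException) in this sketch. */
    static class NotServingException extends RuntimeException {
        NotServingException(String msg) { super(msg); }
    }

    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean closing = false; // guarded by lock

    /** Hold the write lock only long enough to set the flag. */
    void markClosing() {
        lock.writeLock().lock();
        try {
            closing = true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Abort path: clear the flag so accessors are served again. */
    void reopen() {
        lock.writeLock().lock();
        try {
            closing = false;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Accessor: anyone who gets the lock after markClosing() sees the flag. */
    String read(String row) {
        lock.readLock().lock();
        try {
            if (closing) {
                throw new NotServingException("region closing, retry: " + row);
            }
            return "value:" + row;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Stand-in for a slow RPC; refuses to run under the write lock. */
    void slowRpc() {
        if (lock.writeLock().isHeldByCurrentThread()) {
            throw new IllegalStateException(
                "Unsafe to hold write lock while performing RPCs");
        }
        // A real implementation would talk to another server here.
    }

    public static void main(String[] args) {
        RegionSketch region = new RegionSketch();
        System.out.println(region.read("row1"));
        region.markClosing();   // brief write lock
        region.slowRpc();       // no write lock held here
        try {
            region.read("row1");
        } catch (NotServingException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        region.reopen();        // simulated split abort
        System.out.println(region.read("row1"));
    }
}
```

The point of the design is that a blocked accessor never waits behind an RPC: the write lock is released almost immediately, and the flag carries the "do not serve" state from then on.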