[ 
https://issues.apache.org/jira/browse/HBASE-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035608#comment-13035608
 ] 

Dave Latham commented on HBASE-3894:
------------------------------------

bq. If client is getting what it asked for, then just log it. How would it 
happen?  I like the idea of removing lockid completely (If client does 
explicity unlock, then they deserve the headache that will ensue).

Good question, I'm not sure.  Perhaps if the client times out and tries twice?  
In that case, it may be good to preserve the lock id.

bq. Do we need HashedBytes? Was it you who said that wrapping the byte array in 
a ByteBuffer would do what we want?

It wasn't me.  ByteBuffer has 6 or 7 extra fields that aren't really necessary 
here, and I think a simple dedicated class is appropriate.  I'm also not sure 
whether to cache the hashCode.  It does save recomputation of it at release 
time.

bq. (I think presence of this class will undo a bunch of places where we are 
using TreeMap just because key is a byte array).

It would be interesting to see if using it elsewhere would have any improvement.

bq. Whats up w/ this?

{code}
+  private final ConcurrentHashMap<HashedBytes, CountDownLatch> lockedRows = 
+    new ConcurrentHashMap<HashedBytes, CountDownLatch>();
{code}

This allows threads that are trying to lock a given row to wait on a latch for 
just that row, so they don't need to awaken every time any lock is released - 
only the one they're waiting for.

bq. Interesting. You make the lockid long now instead of int. Good.

Actually, I left it as an Integer so that the change didn't touch the protocol.

bq. Does it work?

Running it in 2 test clusters today, and if it looks good will put it out into 
production tomorrow or Friday.

bq. On the patch, whats with all the ^Ms?
bq. This is the version adapted to trunk.  TestHeapSize fails. The other tests 
passed.

Sorry for the rough patch, as I noted before it wasn't ready for inclusion, I 
just wanted to get it out there for other eyes to see since jd was looking at a 
related issue.

I talked with JD yesterday, and agreed that perhaps it's best to push it trunk 
rather than 0.90 branch if other people aren't having the same problem, though 
we'll be using it here.  What do you think, Stack?

> Thread contention over row locks set monitor
> --------------------------------------------
>
>                 Key: HBASE-3894
>                 URL: https://issues.apache.org/jira/browse/HBASE-3894
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.2
>            Reporter: Dave Latham
>            Priority: Blocker
>             Fix For: 0.90.4
>
>         Attachments: concurrentRowLocks-2.patch, 
> concurrentRowLocks-trunk.patch, 
> regionserver_rowLock_set_contention.threads.txt
>
>
> HRegion maintains a set of row locks.  Whenever any thread attempts to lock 
> or release a row it needs to acquire the monitor on that set.  We've been 
> encountering cases with 30 handler threads all contending for that monitor, 
> blocked progress on the region server.  Clients timeout, and retry making it 
> worse, and the region server stops responding to new clients almost entirely.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to