[ https://issues.apache.org/jira/browse/HBASE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dave Latham updated HBASE-3893:
-------------------------------

    Attachment: regionserver_rowLock_set_contention.threads.txt

We've also run into this issue a couple of times. I'm attaching a sample thread dump. I examined a heap dump as well, and saw about 160K locks in the TreeSet of row locks.

> HRegion.internalObtainRowLock shouldn't wait forever
> ----------------------------------------------------
>
>                 Key: HBASE-3893
>                 URL: https://issues.apache.org/jira/browse/HBASE-3893
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.2
>            Reporter: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.90.4
>
>         Attachments: regionserver_rowLock_set_contention.threads.txt
>
>
> We just had a weird episode where one user was trying to insert a lot of data
> with overlapping keys into a single region (all of that is a separate
> problem), and the region server rapidly filled up all its handlers + queues
> with those calls. It wasn't quite deadlocked, but almost.
> Worse, now that we have a 60-second socket timeout, the clients were
> eventually hitting the timeout and then retrying with another call to that same
> region server.
> We should have a timeout on lockedRows.wait() in
> HRegion.internalObtainRowLock in order to survive this better.
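
The timeout proposed in the description can be sketched as below. This is a minimal, hypothetical illustration and not HBase's code: the class RowLockSketch, the methods tryLock and unlock, the String row keys, and the millisecond timeout parameter are all assumptions made for the example. Only the idea of bounding the wait on lockedRows comes from the issue.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for row-lock bookkeeping; not the HBase implementation. */
class RowLockSketch {
    // Rows currently locked. The set doubles as the monitor that waiters block on.
    private final Set<String> lockedRows = new HashSet<>();

    /**
     * Tries to lock a row, giving up after timeoutMs milliseconds.
     * Returns true if the lock was obtained, false on timeout or interrupt.
     */
    public boolean tryLock(String row, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        synchronized (lockedRows) {
            // Loop to handle spurious wakeups and wakeups caused by other rows.
            while (lockedRows.contains(row)) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return false; // timed out: the caller can fail fast instead of holding a handler
                }
                try {
                    lockedRows.wait(remainingMs); // bounded wait instead of wait()
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt status
                    return false;
                }
            }
            lockedRows.add(row);
            return true;
        }
    }

    /** Releases a row and wakes all waiters so they can re-check their own row. */
    public void unlock(String row) {
        synchronized (lockedRows) {
            lockedRows.remove(row);
            lockedRows.notifyAll();
        }
    }
}
```

With a bounded wait, a caller stuck behind a contended row gets a failure back after the timeout and frees its handler, rather than blocking indefinitely while clients time out and retry. The deadline is recomputed on each loop iteration so that repeated wakeups do not extend the total wait.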