[
https://issues.apache.org/jira/browse/HBASE-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857629#action_12857629
]
stack commented on HBASE-2322:
------------------------------
At Todd's suggestion I used his version of jCarder because it supports read/write
locks. It's available here:
http://github.com/toddlipcon/jcarder/tree/lockclasses
I ran a local test with 4 concurrent threads each loading 1M rows -- the same
scenario that got me the deadlock previously -- then ran the analysis, and it
reported no deadlocks:
{code}
stack:0.20_pre_durability Stack$ java -Xmx4G -jar ~/checkouts/jcarder/dist/jcarder.jar
Opening for reading: /Users/Stack/checkouts/0.20_pre_durability/jcarder_contexts.db
Opening for reading: /Users/Stack/checkouts/0.20_pre_durability/jcarder_events.db
Loaded from database files:
  Nodes: 166109
  Edges: 494196 (excluding 175530380 duplicated)
Cycle analysis result:
  Cycles: 0
  Edges in cycles: 0
  Nodes in cycles: 0
  Max cycle depth: 0
  Max graph depth: 8
No cycles found!
{code}
I've been running multiple MR jobs over the last day or so; previously we would
deadlock reliably at around 2% done. I've gotten past that 2% mark many times
since without deadlocking.
I'm going to say this issue was fixed by HBASE-2248. Will open a new issue if I
see it again.
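For reference, the cycle in the quoted trace boils down to a classic lock-ordering deadlock: a put handler holds the region's write lock and then asks the flusher for a flush (which needs the flush-queue monitor), while the flusher chain holds that monitor and ultimately wants the lock the handler side is holding. Below is a minimal, self-contained Java sketch of the same *shape* -- the names `updateLock` and `flushQueue` are illustrative stand-ins, not HBase's actual fields, and this is not the HBASE-2248 fix -- showing that the JVM's built-in detector (the same machinery behind the jstack output above) flags the cycle:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class DeadlockSketch {

    // Returns the ids of deadlocked threads, or null if none were detected.
    public static long[] demo() throws InterruptedException {
        // Stand-ins for the real players: the region's update lock and the
        // flusher's queue monitor (illustrative names, not HBase fields).
        ReentrantReadWriteLock updateLock = new ReentrantReadWriteLock();
        Object flushQueue = new Object();
        CountDownLatch bothHeld = new CountDownLatch(2);

        // Like an IPC handler: takes the region write lock, then asks for
        // a flush, which needs the flush-queue monitor.
        Thread handler = new Thread(() -> {
            updateLock.writeLock().lock();
            bothHeld.countDown();
            try {
                bothHeld.await();
            } catch (InterruptedException e) {
                return;
            }
            synchronized (flushQueue) { /* never reached */ }
        }, "ipc-handler");
        handler.setDaemon(true);

        // Like the cacheFlusher: holds the flush-queue monitor, then wants
        // the lock the handler is holding.
        Thread flusher = new Thread(() -> {
            synchronized (flushQueue) {
                bothHeld.countDown();
                try {
                    bothHeld.await();
                } catch (InterruptedException e) {
                    return;
                }
                updateLock.writeLock().lock(); // never returns
            }
        }, "cacheFlusher");
        flusher.setDaemon(true);

        handler.start();
        flusher.start();
        bothHeld.await(); // both threads now hold their first resource

        // Poll the JVM's deadlock detector until it sees the cycle; it
        // covers both monitors and ownable synchronizers, as in the trace.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = null;
        for (int i = 0; i < 50 && ids == null; i++) {
            Thread.sleep(100);
            ids = mx.findDeadlockedThreads();
        }
        return ids;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] ids = demo();
        System.out.println(ids == null ? "no deadlock"
                : ids.length + " threads deadlocked");
    }
}
```

The `CountDownLatch` makes the interleaving deterministic: each thread only goes for its second resource once both first resources are held, so the cycle always forms.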
> deadlock between put and cacheflusher in 0.20 branch
> ----------------------------------------------------
>
> Key: HBASE-2322
> URL: https://issues.apache.org/jira/browse/HBASE-2322
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: stack
> Priority: Blocker
> Fix For: 0.20.4, 0.21.0
>
> Attachments: hbase-2322.png
>
>
> {code}
> Found one Java-level deadlock:
> =============================
> "IPC Server handler 59 on 60020":
>   waiting for ownable synchronizer 0x00007fec9eb050f8, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "IPC Server handler 54 on 60020"
> "IPC Server handler 54 on 60020":
>   waiting to lock monitor 0x000000004190e950 (object 0x00007fec64f25258, a java.util.HashSet),
>   which is held by "regionserver/10.20.20.186:60020.cacheFlusher"
> "regionserver/10.20.20.186:60020.cacheFlusher":
>   waiting for ownable synchronizer 0x00007fec651df998, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
>   which is held by "IPC Server handler 19 on 60020"
> "IPC Server handler 19 on 60020":
>   waiting for ownable synchronizer 0x00007fec9eb050f8, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "IPC Server handler 54 on 60020"
>
> Java stack information for the threads listed above:
> ===================================================
> "IPC Server handler 59 on 60020":
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for <0x00007fec9eb050f8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
>   at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:807)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1299)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1281)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1789)
>   at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:577)
>   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> "IPC Server handler 54 on 60020":
>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.request(MemStoreFlusher.java:172)
>   - waiting to lock <0x00007fec64f25258> (a java.util.HashSet)
>   at org.apache.hadoop.hbase.regionserver.HRegion.requestFlush(HRegion.java:1549)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1534)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1318)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1281)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1789)
>   at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:577)
>   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> "regionserver/10.20.20.186:60020.cacheFlusher":
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for <0x00007fec651df998> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
>   at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
>   at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:235)
>   - locked <0x00007fec64f25258> (a java.util.HashSet)
>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:149)
> "IPC Server handler 19 on 60020":
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for <0x00007fec9eb050f8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
>   at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:807)
>   at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:980)
>   at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:873)
>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:241)
>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushSomeRegions(MemStoreFlusher.java:352)
>   - locked <0x00007fec64ed96f0> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:321)
>   - locked <0x00007fec64ed96f0> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1783)
>   at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:577)
>   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>
> Found 1 deadlock.
> {code}