[ 
https://issues.apache.org/jira/browse/HBASE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102010#comment-13102010
 ] 

Todd Lipcon commented on HBASE-4367:
------------------------------------

Thread dumps (line numbers from 0.90.4)

{noformat}
"IPC Server handler 37 on 60020":
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00002aaab584cee0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2061)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:444)
        - locked <0x00002aaab5519648> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:2586)


"regionserver60020.cacheFlusher":
        at java.util.ResourceBundle.endLoading(ResourceBundle.java:1506)
        - waiting to lock <0x00002aaab5519648> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
        at java.util.ResourceBundle.findBundle(ResourceBundle.java:1379)
        at java.util.ResourceBundle.findBundle(ResourceBundle.java:1292)
        at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1234)
        at java.util.ResourceBundle.getBundle(ResourceBundle.java:832)
        at sun.util.resources.LocaleData$1.run(LocaleData.java:127)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.util.resources.LocaleData.getBundle(LocaleData.java:125)
        at sun.util.resources.LocaleData.getTimeZoneNames(LocaleData.java:97)
        at sun.util.TimeZoneNameUtility.getBundle(TimeZoneNameUtility.java:115)
        at sun.util.TimeZoneNameUtility.retrieveDisplayNames(TimeZoneNameUtility.java:80)
        at java.util.TimeZone.getDisplayNames(TimeZone.java:399)
        at java.util.TimeZone.getDisplayName(TimeZone.java:350)
        at java.util.Date.toString(Date.java:1025)
        at java.lang.String.valueOf(String.java:2826)
        at java.lang.StringBuilder.append(StringBuilder.java:115)
        at org.apache.hadoop.hbase.regionserver.PriorityCompactionQueue$CompactionRequest.toString(PriorityCompactionQueue.java:114)
        at java.lang.String.valueOf(String.java:2826)
        at java.lang.StringBuilder.append(StringBuilder.java:115)
        at org.apache.hadoop.hbase.regionserver.PriorityCompactionQueue.addToRegionsInQueue(PriorityCompactionQueue.java:145)
        - locked <0x00002aaab55aa2a8> (a java.util.HashMap)
        at org.apache.hadoop.hbase.regionserver.PriorityCompactionQueue.add(PriorityCompactionQueue.java:188)
        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:140)
        - locked <0x00002aaab555c870> (a org.apache.hadoop.hbase.regionserver.CompactSplitThread)
        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:118)
        - locked <0x00002aaab555c870> (a org.apache.hadoop.hbase.regionserver.CompactSplitThread)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:395)
{noformat}

> Deadlock in MemStore flusher due to JDK internally synchronizing on current 
> thread
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-4367
>                 URL: https://issues.apache.org/jira/browse/HBASE-4367
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0
>
>
> We observed a deadlock in production between the following threads:
> - IPC handler thread holding the monitor lock on MemStoreFlusher inside 
> reclaimMemStoreMemory, waiting to obtain MemStoreFlusher.lock (the reentrant 
> lock member)
> - cacheFlusher thread inside flushRegion holds MemStoreFlusher.lock, and then 
> calls PriorityCompactionQueue.add, which calls 
> PriorityCompactionQueue.addToRegionsInQueue, which calls 
> CompactionRequest.toString(), which calls Date.toString. If this occurs just 
> after a GC under memory pressure, Date.toString needs to reload locale 
> information (stored in a soft reference), so it calls 
> ResourceBundle.loadBundle, which uses Thread.currentThread() as a 
> synchronizer (see sun bug http://bugs.sun.com/view_bug.do?bug_id=6915621). 
> Since the current thread is the MemStoreFlusher itself, we have a lock order 
> inversion and a deadlock.
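
The lock order inversion described above can be reproduced in isolation. What follows is only a minimal sketch, not the HBase code: the Flusher class, its reclaim method, and the latch names are hypothetical stand-ins for MemStoreFlusher, reclaimMemStoreMemory, and the two threads in the dump. It assumes the JDK behavior can be modeled by a plain synchronized (Thread.currentThread()) block, and it uses a timed tryLock on the handler side so the program terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class FlusherDeadlockSketch {

    /** Hypothetical stand-in for MemStoreFlusher: a Thread that also owns a ReentrantLock. */
    static class Flusher extends Thread {
        final ReentrantLock lock = new ReentrantLock();
        final CountDownLatch flusherHoldsLock = new CountDownLatch(1);
        final CountDownLatch handlerHoldsMonitor = new CountDownLatch(1);

        /** Handler side: takes the monitor on this object, then wants the lock. */
        synchronized boolean reclaim(long timeoutMs) throws InterruptedException {
            handlerHoldsMonitor.countDown();
            // An untimed lock() would block forever here; a timed tryLock keeps the sketch finite.
            boolean acquired = lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.unlock();
            }
            return acquired;
        }

        /** Flusher side: takes the lock, then ends up needing the monitor on itself. */
        @Override
        public void run() {
            lock.lock();
            try {
                flusherHoldsLock.countDown();
                handlerHoldsMonitor.await();
                // Stand-in for library code that synchronizes on the current thread.
                // Thread.currentThread() == this, i.e. the same monitor reclaim() holds.
                synchronized (Thread.currentThread()) {
                    // Reached only after the handler gives up and leaves reclaim().
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }
    }

    /** Returns whether the handler obtained the lock within timeoutMs. */
    static boolean handlerAcquiredLock(long timeoutMs) {
        Flusher flusher = new Flusher();
        flusher.start();
        try {
            flusher.flusherHoldsLock.await();
            boolean acquired = flusher.reclaim(timeoutMs);
            flusher.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints "handler acquired lock: false"
        System.out.println("handler acquired lock: " + handlerAcquiredLock(200));
    }
}
```

The handler can never win here: the flusher releases the lock only after entering the monitor, and the monitor is released only when the handler leaves reclaim. With an untimed lock() in place of tryLock, both threads would wait on each other indefinitely, which matches the two stacks in the dump.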

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
