[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-09-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791630#comment-14791630
 ] 

Hudson commented on HBASE-14274:


FAILURE: Integrated in HBase-TRUNK #6817 (See [https://builds.apache.org/job/HBase-TRUNK/6817/])
HBASE-14278 Fix NPE that is showing up since HBASE-14274 went in (eclark: rev c1ac4bb8601f88eb3fe246eb62c3f40e95faf93d)
* hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/impl/JmxCacheBuster.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java


> Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl
> ---
>
> Key: HBASE-14274
> URL: https://issues.apache.org/jira/browse/HBASE-14274
> Project: HBase
> Issue Type: Sub-task
> Components: test
> Reporter: stack
> Assignee: Elliott Clark
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14274-addendum.txt, 23612.stack, HBASE-14274-v1.patch, HBASE-14274.patch
>
>
> Looking into parent issue, got a hang locally of TestDistributedLogReplay.
> We have region closes here:
> {code}
> "RS_CLOSE_META-localhost:59610-0" prio=5 tid=0x7ff65c03f800 nid=0x54347 waiting on condition [0x00011f7ac000]
>    java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00075636d8c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
>   at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:78)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:120)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:41)
>   at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1500)
>   at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1344)
>   - locked <0x0007ff878190> (a java.lang.Object)
>   at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:102)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:103)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> {code}
> The closing threads are calling MetricsRegionAggregateSourceImpl.deregister. They want
> the write lock on that class's local ReentrantReadWriteLock while already holding the
> write lock of MetricsRegionSourceImpl's readWriteLock.
> Meanwhile, the JmxCacheBuster is running and trying to get metrics, taking the
> same two locks in the reverse order:
> {code}
> "HBase-Metrics2-1" daemon prio=5 tid=0x7ff65e14b000 nid=0x59a03 waiting on condition [0x000140ea5000]
>    java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007cade1480> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
>   at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.snapshot(MetricsRegionSourceImpl.java:193)
>   at 
> 
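
For readers who want to see the lock-order inversion from the description in isolation, here is a minimal sketch. It is not HBase code: the class name, the two lock variables, and the helper method are hypothetical stand-ins, and it assumes (as the description implies) that the metrics thread already holds the aggregate's read lock when it asks for the region source's read lock. It uses tryLock with a timeout instead of lock() so that the program terminates and reports the stall rather than hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; none of these names exist in HBase.
class LockOrderInversionSketch {

    /**
     * Takes 'first', waits until the peer thread holds its own first lock,
     * then tries 'second' with a timeout. Returns true if 'second' could not
     * be acquired, i.e. a plain lock() call would have parked indefinitely.
     */
    static boolean holdThenTry(Lock first, Lock second,
                               CountDownLatch bothHold, CountDownLatch bothTried) {
        first.lock();
        try {
            bothHold.countDown();
            bothHold.await();
            boolean acquired = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                second.unlock();
            }
            // Keep holding 'first' until the peer has finished its attempt too,
            // so neither thread is rescued by the other one giving up early.
            bothTried.countDown();
            bothTried.await();
            return !acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    /** Returns true when both threads failed to get their second lock. */
    static boolean demonstrate() {
        // Stand-ins for the region source's lock and the aggregate's lock.
        ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
        ReentrantReadWriteLock aggregateLock = new ReentrantReadWriteLock();
        CountDownLatch bothHold = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] blocked = new boolean[2];

        // "close" path: region write lock first, then aggregate write lock.
        Thread closer = new Thread(() -> blocked[0] = holdThenTry(
            regionLock.writeLock(), aggregateLock.writeLock(), bothHold, bothTried), "closer");
        // "getMetrics" path: aggregate read lock first, then region read lock.
        Thread metrics = new Thread(() -> blocked[1] = holdThenTry(
            aggregateLock.readLock(), regionLock.readLock(), bothHold, bothTried), "metrics");
        closer.start();
        metrics.start();
        try {
            closer.join();
            metrics.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return blocked[0] && blocked[1];
    }

    public static void main(String[] args) {
        System.out.println("both threads blocked: " + demonstrate());
    }
}
```

With lock() in place of the timed tryLock, the two threads in this sketch would park forever, which is the shape of the two thread dumps quoted above.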

[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791449#comment-14791449
 ] 

Hudson commented on HBASE-14274:


SUCCESS: Integrated in HBase-1.3-IT #162 (See [https://builds.apache.org/job/HBase-1.3-IT/162/])
HBASE-14278 Fix NPE that is showing up since HBASE-14274 went in (eclark: rev 2029e851827fa1bf59436c7baa1971b52ac5833e)
* hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/impl/JmxCacheBuster.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791454#comment-14791454
 ] 

Hudson commented on HBASE-14274:


FAILURE: Integrated in HBase-1.2-IT #152 (See [https://builds.apache.org/job/HBase-1.2-IT/152/])
HBASE-14278 Fix NPE that is showing up since HBASE-14274 went in (eclark: rev a229ac91fbab2608ae89bbe44b1dd05e5aef1183)
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/impl/JmxCacheBuster.java
* hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791556#comment-14791556
 ] 

Hudson commented on HBASE-14274:


FAILURE: Integrated in HBase-1.3 #182 (See [https://builds.apache.org/job/HBase-1.3/182/])
HBASE-14278 Fix NPE that is showing up since HBASE-14274 went in (eclark: rev 2029e851827fa1bf59436c7baa1971b52ac5833e)
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/impl/JmxCacheBuster.java
* hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791563#comment-14791563
 ] 

Hudson commented on HBASE-14274:


FAILURE: Integrated in HBase-1.2 #180 (See [https://builds.apache.org/job/HBase-1.2/180/])
HBASE-14278 Fix NPE that is showing up since HBASE-14274 went in (eclark: rev a229ac91fbab2608ae89bbe44b1dd05e5aef1183)
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/impl/JmxCacheBuster.java
* hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707825#comment-14707825
 ] 

Hudson commented on HBASE-14274:


FAILURE: Integrated in HBase-TRUNK #6746 (See [https://builds.apache.org/job/HBase-TRUNK/6746/])
HBASE-14274 Addendum sets closed to true when closing (tedyu: rev 9b2325e16fac7f2f37ac3539aee4faf6cdb8d6a6)
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
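
The addendum's commit message mentions setting a closed flag when closing. As a rough sketch of how such a flag can replace a second lock, the example below uses an AtomicBoolean so that close() is idempotent and a later snapshot() becomes a no-op without waiting on anything close() holds. This is an assumption about the general pattern, not the contents of the patch; the class names, the counter, and the method signatures are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; this is not the HBASE-14274 patch.
class ClosedFlagSketch {

    /** Stand-in for the aggregate source; it only counts deregistrations. */
    static final class Aggregate {
        final AtomicInteger deregistered = new AtomicInteger();

        void deregister(RegionSource source) {
            deregistered.incrementAndGet();
        }
    }

    /** Stand-in for a per-region metrics source guarded by a closed flag. */
    static final class RegionSource {
        private final AtomicBoolean closed = new AtomicBoolean(false);
        private final Aggregate aggregate;

        RegionSource(Aggregate aggregate) {
            this.aggregate = aggregate;
        }

        void close() {
            // getAndSet returns the previous value, so only the first caller
            // proceeds to deregister; repeated close() calls return early.
            if (closed.getAndSet(true)) {
                return;
            }
            aggregate.deregister(this);
        }

        /** Appends metrics and returns true, or returns false once closed. */
        boolean snapshot(StringBuilder out) {
            if (closed.get()) {
                return false;
            }
            out.append("metrics");
            return true;
        }
    }

    public static void main(String[] args) {
        Aggregate aggregate = new Aggregate();
        RegionSource source = new RegionSource(aggregate);
        System.out.println("snapshot before close: " + source.snapshot(new StringBuilder()));
        source.close();
        source.close();
        System.out.println("deregistrations: " + aggregate.deregistered.get());
        System.out.println("snapshot after close: " + source.snapshot(new StringBuilder()));
    }
}
```

Because close() in this sketch takes no lock before calling into the aggregate, there is no second lock for the metrics thread to invert against; a snapshot racing with close may still emit one last set of values, which the sketch accepts.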


> Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl
> ---
>
> Key: HBASE-14274
> URL: https://issues.apache.org/jira/browse/HBASE-14274
> Project: HBase
> Issue Type: Sub-task
> Components: test
> Reporter: stack
> Assignee: Elliott Clark
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14274-addendum.txt, 23612.stack, HBASE-14274-v1.patch, HBASE-14274.patch
>
>
> Looking into parent issue, got a hang locally of TestDistributedLogReplay.
> We have region closes here:
> {code}
> "RS_CLOSE_META-localhost:59610-0" prio=5 tid=0x7ff65c03f800 nid=0x54347 waiting on condition [0x00011f7ac000]
>    java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00075636d8c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
>   at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:78)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:120)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:41)
>   at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1500)
>   at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1344)
>   - locked <0x0007ff878190> (a java.lang.Object)
>   at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:102)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:103)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> {code}
> The closing threads are calling MetricsRegionAggregateSourceImpl.deregister. They want
> the write lock on that class's local ReentrantReadWriteLock while already holding the
> write lock of MetricsRegionSourceImpl's readWriteLock.
> Meanwhile, the JmxCacheBuster is running and trying to get metrics, taking the
> same two locks in the reverse order:
> {code}
> "HBase-Metrics2-1" daemon prio=5 tid=0x7ff65e14b000 nid=0x59a03 waiting on condition [0x000140ea5000]
>    java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007cade1480> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
>   at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.snapshot(MetricsRegionSourceImpl.java:193)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.getMetrics(MetricsRegionAggregateSourceImpl.java:115)
>   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:195)
>   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:172)
>   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:151)
>   at
>

[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707804#comment-14707804
 ] 

Hudson commented on HBASE-14274:


FAILURE: Integrated in HBase-1.2 #130 (See [https://builds.apache.org/job/HBase-1.2/130/])
HBASE-14274 Addendum sets closed to true when closing (tedyu: rev 1484aecc2635fdaecfeeeb368eafa2204041a8a9)
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706539#comment-14706539
 ] 

Hudson commented on HBASE-14274:


FAILURE: Integrated in HBase-TRUNK #6744 (See 
[https://builds.apache.org/job/HBase-TRUNK/6744/])
HBASE-14274 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
MetricsRegionAggregateSourceImpl (stack: rev 
bcef28eefaf192b0ad48c8011f98b8e944340da5)
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707458#comment-14707458
 ] 

Hudson commented on HBASE-14274:


SUCCESS: Integrated in HBase-1.2-IT #106 (See 
[https://builds.apache.org/job/HBase-1.2-IT/106/])
HBASE-14274 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
MetricsRegionAggregateSourceImpl (busbey: rev 
909e2fe504169f978989e4c1c3778291b3da2418)
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-21 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707575#comment-14707575
 ] 

Elliott Clark commented on HBASE-14274:
---

Yep [~tedyu], that's what I meant. Want to commit that to every branch?


[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707613#comment-14707613
 ] 

Ted Yu commented on HBASE-14274:


Pushed addendum to the 3 branches.


[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707753#comment-14707753
 ] 

Hudson commented on HBASE-14274:


FAILURE: Integrated in HBase-1.3 #126 (See 
[https://builds.apache.org/job/HBase-1.3/126/])
HBASE-14274 Addendum sets closed to true when closing (tedyu: rev 
f05770b96f6246ffe851e6878dc07c4b7c02906f)
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707756#comment-14707756
 ] 

Hudson commented on HBASE-14274:


SUCCESS: Integrated in HBase-1.3-IT #109 (See 
[https://builds.apache.org/job/HBase-1.3-IT/109/])
HBASE-14274 Addendum sets closed to true when closing (tedyu: rev 
f05770b96f6246ffe851e6878dc07c4b7c02906f)
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707743#comment-14707743
 ] 

Hudson commented on HBASE-14274:


SUCCESS: Integrated in HBase-1.2-IT #107 (See 
[https://builds.apache.org/job/HBase-1.2-IT/107/])
HBASE-14274 Addendum sets closed to true when closing (tedyu: rev 
1484aecc2635fdaecfeeeb368eafa2204041a8a9)
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java


 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
 MetricsRegionAggregateSourceImpl
 ---

 Key: HBASE-14274
 URL: https://issues.apache.org/jira/browse/HBASE-14274
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.2.0, 1.3.0

 Attachments: 14274-addendum.txt, 23612.stack, HBASE-14274-v1.patch, 
 HBASE-14274.patch


 Looking into parent issue, got a hang locally of TestDistributedLogReplay.
 We have region closes here:
 {code}
 "RS_CLOSE_META-localhost:59610-0" prio=5 tid=0x7ff65c03f800 nid=0x54347 waiting on condition [0x00011f7ac000]
    java.lang.Thread.State: WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for  <0x00075636d8c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
   at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
   at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:78)
   at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:120)
   at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:41)
   at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1500)
   at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1344)
   - locked <0x0007ff878190> (a java.lang.Object)
   at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:102)
   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:103)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 {code}
 They are trying to call MetricsRegionAggregateSourceImpl.deregister. They want
 to get the write lock on that class's local ReentrantReadWriteLock while
 holding the write lock of MetricsRegionSourceImpl's readWriteLock.
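
 The situation described here is one half of a lock-order inversion: the
 close path holds the region lock and wants the aggregate lock, while the
 metrics thread takes the same two locks in the opposite order. The Java
 sketch below is only an illustration under stated assumptions, not the
 HBase code: the class and field names are hypothetical, and it assumes the
 metrics thread holds the aggregate's read lock. It uses a timed tryLock so
 the demo reports the blocked acquisition instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockOrderSketch {
    // Hypothetical stand-ins for the two locks named in the report.
    static final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
    static final ReentrantReadWriteLock aggregateLock = new ReentrantReadWriteLock();

    /**
     * Plays the "close" path: takes the region write lock, then tries to take
     * the aggregate write lock while another thread holds the aggregate read
     * lock. Returns true if the aggregate write lock was acquired in time.
     */
    static boolean closeWhileMetricsThreadHoldsAggregate() {
        CountDownLatch aggregateHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Plays the metrics thread: holds the aggregate read lock (assumed)
        // and keeps holding it until the close path has given up.
        Thread metrics = new Thread(() -> {
            aggregateLock.readLock().lock();
            try {
                aggregateHeld.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                aggregateLock.readLock().unlock();
            }
        });
        metrics.setDaemon(true);
        metrics.start();

        boolean acquired = false;
        try {
            aggregateHeld.await();
            regionLock.writeLock().lock();
            try {
                // A plain lock() here is where the real thread parked forever.
                acquired = aggregateLock.writeLock().tryLock(200, TimeUnit.MILLISECONDS);
                if (acquired) {
                    aggregateLock.writeLock().unlock();
                }
            } finally {
                regionLock.writeLock().unlock();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown(); // let the metrics thread finish
        }
        return acquired;
    }

    public static void main(String[] args) {
        System.out.println("aggregate write lock acquired: "
            + closeWhileMetricsThreadHoldsAggregate());
    }
}
```

 In the real hang neither side gives up, because the metrics thread goes on
 to wait for the region lock that the close path still holds. Taking the two
 locks in one agreed order, or never nesting them, removes the cycle.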
 Then, elsewhere, the JmxCacheBuster is running, trying to get metrics with
 the above locks held in reverse order:
 {code}
 "HBase-Metrics2-1" daemon prio=5 tid=0x7ff65e14b000 nid=0x59a03 waiting on condition [0x000140ea5000]
    java.lang.Thread.State: WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for  <0x0007cade1480> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
   at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
   at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.snapshot(MetricsRegionSourceImpl.java:193)
   at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.getMetrics(MetricsRegionAggregateSourceImpl.java:115)
   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:195)
   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:172)
   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:151)
   at 
 

[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-21 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706275#comment-14706275
 ] 

Sean Busbey commented on HBASE-14274:
-

cherry-picked to 1.2

 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
 MetricsRegionAggregateSourceImpl
 ---

 Key: HBASE-14274
 URL: https://issues.apache.org/jira/browse/HBASE-14274
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.2.0, 1.3.0

 Attachments: 23612.stack, HBASE-14274-v1.patch, HBASE-14274.patch



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706358#comment-14706358
 ] 

Hudson commented on HBASE-14274:


FAILURE: Integrated in HBase-1.3-IT #108 (See 
[https://builds.apache.org/job/HBase-1.3-IT/108/])
HBASE-14274 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
MetricsRegionAggregateSourceImpl (stack: rev 
f4ad31c8f91782e51e0bffdcd11c587a6c1dd3e9)
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java


 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
 MetricsRegionAggregateSourceImpl
 ---

 Key: HBASE-14274
 URL: https://issues.apache.org/jira/browse/HBASE-14274
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.2.0, 1.3.0

 Attachments: 23612.stack, HBASE-14274-v1.patch, HBASE-14274.patch



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706210#comment-14706210
 ] 

stack commented on HBASE-14274:
---

[~busbey] Looks like this is needed for 1.2 too. I applied to branch-1.

 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
 MetricsRegionAggregateSourceImpl
 ---

 Key: HBASE-14274
 URL: https://issues.apache.org/jira/browse/HBASE-14274
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.3.0

 Attachments: 23612.stack, HBASE-14274-v1.patch, HBASE-14274.patch



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706009#comment-14706009
 ] 

Elliott Clark commented on HBASE-14274:
---

OK, so I can't get that log message to go away without lots of interesting
work in other places. So let's get this in, and we can work on the log message
more in a follow-on JIRA.

 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
 MetricsRegionAggregateSourceImpl
 ---

 Key: HBASE-14274
 URL: https://issues.apache.org/jira/browse/HBASE-14274
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
 Attachments: 23612.stack, HBASE-14274-v1.patch, HBASE-14274.patch



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705929#comment-14705929
 ] 

Elliott Clark commented on HBASE-14274:
---

For that one, it seems like the cache buster ran before the nn was all set up.
I don't think that it's caused by the patch; it is more likely caused by the
fact that HBase has to stop and start the metrics system.

# That shouldn't be an issue in real life.
# Let me get something up so that the logs are cleaner.


 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
 MetricsRegionAggregateSourceImpl
 ---

 Key: HBASE-14274
 URL: https://issues.apache.org/jira/browse/HBASE-14274
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
 Attachments: 23612.stack, HBASE-14274-v1.patch, HBASE-14274.patch



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705776#comment-14705776
 ] 

Elliott Clark commented on HBASE-14274:
---

We can get away without the clearJmxCache; however, I think that we need to
have agg.deregister above all the removeMetric calls, or we risk a race.
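
The ordering concern can be illustrated with a small Java sketch. This is
not the HBase implementation: Aggregate, RegionSource, and the metric names
are hypothetical stand-ins, and a concurrent set is assumed for the registry.
The point is only that close() takes the source out of the aggregate first,
so a later aggregate pass cannot visit a source whose metrics are being torn
down.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class CloseOrderSketch {
    /** Hypothetical aggregate that tracks the live region sources. */
    static class Aggregate {
        private final Set<RegionSource> sources = ConcurrentHashMap.newKeySet();

        void register(RegionSource s) { sources.add(s); }

        void deregister(RegionSource s) { sources.remove(s); }

        /** Sums the metric counts of every source still registered. */
        int countMetrics() {
            int total = 0;
            for (RegionSource s : sources) {
                total += s.metricCount();
            }
            return total;
        }
    }

    /** Hypothetical per-region source holding two illustrative metrics. */
    static class RegionSource {
        private final Aggregate agg;
        private final Map<String, Long> metrics = new ConcurrentHashMap<>();

        RegionSource(Aggregate agg) {
            this.agg = agg;
            metrics.put("putCount", 0L);
            metrics.put("getCount", 0L);
        }

        int metricCount() { return metrics.size(); }

        void close() {
            // Step 1: leave the aggregate so no new aggregate pass visits us.
            agg.deregister(this);
            // Step 2: only then remove the individual metrics.
            metrics.clear();
        }
    }

    /** Returns the aggregate metric count before and after closing one source. */
    static int[] demo() {
        Aggregate agg = new Aggregate();
        RegionSource a = new RegionSource(agg);
        RegionSource b = new RegionSource(agg);
        agg.register(a);
        agg.register(b);
        int before = agg.countMetrics();
        a.close();
        int after = agg.countMetrics();
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("before=" + counts[0] + " after=" + counts[1]);
    }
}
```

With the two steps swapped, an aggregate pass running between them could
still reach the half-cleared source, which is the kind of race the comment
warns about.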

 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
 MetricsRegionAggregateSourceImpl
 ---

 Key: HBASE-14274
 URL: https://issues.apache.org/jira/browse/HBASE-14274
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
 Attachments: 23612.stack, HBASE-14274.patch



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705723#comment-14705723
 ] 

stack commented on HBASE-14274:
---

I was going to ask if CHM would do.

What about MetricsRegionSourceImpl#close? It calls agg.deregister, which will
run the cache buster; then, still inside the lock, we call clearJmxCache
again. Could we move agg.deregister to where the clearJmxCache call is now?

How do we know this stuff is doing the metrics clearing you want, [~eclark]?
Thanks.
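
For context on the CHM question, assuming CHM here means
java.util.concurrent.ConcurrentHashMap: a set backed by one has weakly
consistent iterators, so one caller can iterate it while entries are removed,
without an external read/write lock. The Java sketch below shows only that
property; the class name and the "region-N" keys are made up for the example,
and it says nothing about what the attached patch does.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ChmRegistrySketch {
    /**
     * Fills a concurrent set with n entries, removes each entry while
     * iterating over the set, and returns how many entries remain.
     */
    static int remainingAfterDrain(int n) {
        Set<String> sources = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < n; i++) {
            sources.add("region-" + i);
        }
        for (String s : sources) {
            // Removal during iteration is allowed here; a plain HashSet
            // would typically fail with ConcurrentModificationException.
            sources.remove(s);
        }
        return sources.size();
    }

    public static void main(String[] args) {
        System.out.println("remaining: " + remainingAfterDrain(8));
    }
}
```

The trade-off is that an iteration may or may not observe entries added or
removed while it runs, so a snapshot taken this way is not a point-in-time
view of the registry.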

 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
 MetricsRegionAggregateSourceImpl
 ---

 Key: HBASE-14274
 URL: https://issues.apache.org/jira/browse/HBASE-14274
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
 Attachments: 23612.stack, HBASE-14274.patch


 Looking into parent issue, got a hang locally of TestDistributedLogReplay.
 We have region closes here:
 {code}
 "RS_CLOSE_META-localhost:59610-0" prio=5 tid=0x7ff65c03f800 nid=0x54347 waiting on condition [0x00011f7ac000]
    java.lang.Thread.State: WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for  <0x00075636d8c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
   at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
   at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:78)
   at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:120)
   at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:41)
   at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1500)
   at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1344)
   - locked <0x0007ff878190> (a java.lang.Object)
   at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:102)
   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:103)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 {code}
 They are trying to MetricsRegionAggregateSourceImpl.deregister. They want to 
 get a write lock on this classes local ReentrantReadWriteLock while holding 
 MetricsRegionSourceImpl's readWriteLock write lock.
 Then, elsewhere the JmxCacheBuster is running trying to get metrics with 
 above locks held in reverse:
 {code}
 HBase-Metrics2-1 daemon prio=5 tid=0x7ff65e14b000 nid=0x59a03 waiting 
 on condition [0x000140ea5000]
java.lang.Thread.State: WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for  0x0007cade1480 (a 
 java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
   at 
 java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
   at 
 org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.snapshot(MetricsRegionSourceImpl.java:193)
   at 
 org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.getMetrics(MetricsRegionAggregateSourceImpl.java:115)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:195)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:172)
   at 
 org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:151)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333)
   at 
 

[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705727#comment-14705727
 ] 

stack commented on HBASE-14274:
---

Was thinking of doing this:

{code}
diff --git a/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java b/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
index 7290c56..fab6861 100644
--- a/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
+++ b/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
@@ -117,7 +117,6 @@ public class MetricsRegionSourceImpl implements MetricsRegionSource {
   }

   closed = true;
-  agg.deregister(this);

   if (LOG.isTraceEnabled()) {
     LOG.trace("Removing region Metrics: " + regionWrapper.getRegionName());
@@ -131,10 +130,8 @@ public class MetricsRegionSourceImpl implements MetricsRegionSource {
   registry.removeMetric(regionScanNextKey);
   registry.removeHistogramMetrics(regionGetKey);
   registry.removeHistogramMetrics(regionScanNextKey);
-
   regionWrapper = null;
-
-  JmxCacheBuster.clearJmxCache();
+  agg.deregister(this);
 } finally {
   lock.unlock();
 }
{code}

... but not sure how the registry plays with the aggregating bean.  Do we need 
this extra run of clearJmxCache in here?

[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705781#comment-14705781
 ] 

stack commented on HBASE-14274:
---

It will run the cachebuster... It could run before stuff is removed here in 
MetricsRegionSourceImpl? Would that leave dangling metrics? [~eclark]

[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705672#comment-14705672
 ] 

stack commented on HBASE-14274:
---

Hmm... not enough. I see the cache buster runs every 5 mins anyways, not just 
on region close.
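
To make the inversion concrete, here is a small sketch that reproduces the two acquisition orders from the stack traces, using tryLock with a timeout so the demo reports the conflict instead of hanging. The class and variable names are made up for illustration, and the assumption that the metrics thread holds the aggregate's read lock when it asks the region source for a snapshot is read off the issue description, not taken from the HBase source.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockOrderDemo {

  // Take `first`, wait until the other thread also holds its first lock,
  // then try `second` with a timeout instead of blocking forever.
  private static Thread worker(Lock first, Lock second, CountDownLatch holdingFirst,
      CountDownLatch doneTrying, AtomicBoolean gotSecond) {
    return new Thread(() -> {
      first.lock();
      try {
        holdingFirst.countDown();
        holdingFirst.await();
        if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
          gotSecond.set(true);
          second.unlock();
        }
        // Keep holding `first` until the other thread has finished its attempt.
        doneTrying.countDown();
        doneTrying.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        first.unlock();
      }
    });
  }

  // Returns true when neither thread could get its second lock, i.e. plain
  // lock() calls in the same order would have deadlocked.
  static boolean ordersConflict() {
    ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
    ReentrantReadWriteLock aggregateLock = new ReentrantReadWriteLock();
    CountDownLatch holdingFirst = new CountDownLatch(2);
    CountDownLatch doneTrying = new CountDownLatch(2);
    AtomicBoolean closerGotSecond = new AtomicBoolean(false);
    AtomicBoolean metricsGotSecond = new AtomicBoolean(false);

    // Region close: region write lock first, then aggregate write lock.
    Thread closer = worker(regionLock.writeLock(), aggregateLock.writeLock(),
        holdingFirst, doneTrying, closerGotSecond);
    // Metrics thread: aggregate read lock first, then region read lock.
    Thread metrics = worker(aggregateLock.readLock(), regionLock.readLock(),
        holdingFirst, doneTrying, metricsGotSecond);

    closer.start();
    metrics.start();
    try {
      closer.join();
      metrics.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return !closerGotSecond.get() && !metricsGotSecond.get();
  }

  public static void main(String[] args) {
    System.out.println("lock orders conflict: " + ordersConflict());
  }
}
```

Since the metrics thread runs on its own schedule, moving one clearJmxCache call out of the lock does not remove either acquisition order; only dropping one of the locks or agreeing on a single order does.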

[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705671#comment-14705671
 ] 

stack commented on HBASE-14274:
---

Thanks [~apurtell]

[~eclark] How about this?

diff --git a/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java b/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
index 7290c56..87272c4 100644
--- a/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
+++ b/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
@@ -133,11 +133,10 @@ public class MetricsRegionSourceImpl implements MetricsRegionSource {
   registry.removeHistogramMetrics(regionScanNextKey);

   regionWrapper = null;
-
-  JmxCacheBuster.clearJmxCache();
 } finally {
   lock.unlock();
 }
+JmxCacheBuster.clearJmxCache();
   }

We don't need the cache buster to run under the lock, right? And we can be 
stale for a few millis?

Let me try and make a test.
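
A rough, self-contained sketch of the shape the diff is going for: do the cleanup under the write lock, release it, and only then poke the cache buster. CloseOutsideLockSketch is a hypothetical class, and the Runnable stands in for JmxCacheBuster.clearJmxCache() so the example runs without HBase on the classpath.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class CloseOutsideLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // Hypothetical stand-in for JmxCacheBuster.clearJmxCache().
  private final Runnable cacheBuster;
  private boolean closed;

  CloseOutsideLockSketch(Runnable cacheBuster) {
    this.cacheBuster = cacheBuster;
  }

  void close() {
    lock.writeLock().lock();
    try {
      if (closed) {
        return; // already closed, nothing to clear
      }
      closed = true;
      // ... remove this region's metrics from the registry here ...
    } finally {
      lock.writeLock().unlock();
    }
    // The write lock is released by now, so whatever the cache buster
    // triggers cannot block on a lock held by this thread.
    cacheBuster.run();
  }

  // Exposed only so a test can check that the callback runs unlocked.
  boolean isWriteLockedByCurrentThread() {
    return lock.isWriteLockedByCurrentThread();
  }
}
```

A test along these lines can assert from inside the callback that isWriteLockedByCurrentThread() is false, which is the property the diff is after.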

[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705605#comment-14705605
 ] 

Andrew Purtell commented on HBASE-14274:


JMX cache buster came in on HBASE-14166

[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705687#comment-14705687
 ] 

Elliott Clark commented on HBASE-14274:
---

Yeah, the cache buster runs in the background, so it will be better to put it 
outside of the lock, but it shouldn't change much.

[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705701#comment-14705701
 ] 

Elliott Clark commented on HBASE-14274:
---

I can just remove the lock in metrics region source impl.
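
One way "remove the lock" could look, as a hedged sketch rather than the patch itself: replace the region source's ReentrantReadWriteLock with an AtomicBoolean closed flag. LockFreeRegionSourceSketch and its Runnable (standing in for agg.deregister(this)) are hypothetical names used only for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class LockFreeRegionSourceSketch {
  private final AtomicBoolean closed = new AtomicBoolean(false);
  // Hypothetical stand-in for agg.deregister(this).
  private final Runnable deregister;

  LockFreeRegionSourceSketch(Runnable deregister) {
    this.deregister = deregister;
  }

  void close() {
    // compareAndSet lets exactly one caller win the close; later callers return.
    if (!closed.compareAndSet(false, true)) {
      return;
    }
    deregister.run();
  }

  // Appends this region's metrics unless the source is already closed.
  // Returns true if anything was written.
  boolean snapshot(StringBuilder out) {
    if (closed.get()) {
      return false;
    }
    out.append("regionMetrics");
    return true;
  }
}
```

With no lock in the region source, the aggregate's lock is the only one left, so the two threads cannot take locks in opposite orders. The cost is that a snapshot racing with close may still emit one last set of values for a region that is going away.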

 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
 MetricsRegionAggregateSourceImpl
 ---

 Key: HBASE-14274
 URL: https://issues.apache.org/jira/browse/HBASE-14274
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
 Attachments: 23612.stack


 Looking into parent issue, got a hang locally of TestDistributedLogReplay.
 We have region closes here:
 {code}
 "RS_CLOSE_META-localhost:59610-0" prio=5 tid=0x7ff65c03f800 nid=0x54347 waiting on condition [0x00011f7ac000]
    java.lang.Thread.State: WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for  <0x00075636d8c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
   at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
   at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:78)
   at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:120)
   at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:41)
   at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1500)
   at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1344)
   - locked <0x0007ff878190> (a java.lang.Object)
   at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:102)
   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:103)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 {code}
 They are trying to call MetricsRegionAggregateSourceImpl.deregister. They want to 
 get the write lock on that class's local ReentrantReadWriteLock while holding 
 the write lock of MetricsRegionSourceImpl's readWriteLock.
 Meanwhile, the JmxCacheBuster is running, trying to get metrics with 
 the above locks held in reverse order:
 {code}
 "HBase-Metrics2-1" daemon prio=5 tid=0x7ff65e14b000 nid=0x59a03 waiting on condition [0x000140ea5000]
    java.lang.Thread.State: WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for  <0x0007cade1480> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
   at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
   at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.snapshot(MetricsRegionSourceImpl.java:193)
   at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.getMetrics(MetricsRegionAggregateSourceImpl.java:115)
   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:195)
   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:172)
   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:151)
   at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333)
   at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:319)
   at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
   at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:57)
   at 
 

[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705898#comment-14705898
 ] 

stack commented on HBASE-14274:
---

+1

Shouldn't deadlock anymore given you've removed both locks (smile).

I tried it locally and it works. Without the patch I could reproduce the 
deadlock pretty easily; with the patch applied I could not.
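
To illustrate what "removed both locks" can look like, here is a minimal sketch of region metrics bookkeeping with no ReentrantReadWriteLock on either side. It is an assumption-laden illustration, not the attached patch: `LockFreeMetricsSketch`, `Aggregate`, `RegionSource`, `totalPuts`, and `snapshotPuts` are hypothetical names, and a concurrent set plus an atomic closed flag are swapped in for the two locks.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; class and method names are not taken from HBase.
class LockFreeMetricsSketch {

    /** Stand-in for the aggregate source: tracks region sources in a concurrent set. */
    static class Aggregate {
        private final Set<RegionSource> sources = ConcurrentHashMap.newKeySet();

        void register(RegionSource source) { sources.add(source); }

        void deregister(RegionSource source) { sources.remove(source); }

        /** Iteration is weakly consistent, so no lock is needed while summing. */
        long totalPuts() {
            long total = 0;
            for (RegionSource source : sources) {
                total += source.snapshotPuts();
            }
            return total;
        }
    }

    /** Stand-in for a per-region source: an atomic flag replaces the write lock. */
    static class RegionSource {
        private final AtomicBoolean closed = new AtomicBoolean(false);
        private final AtomicLong puts = new AtomicLong();
        private final Aggregate aggregate;

        RegionSource(Aggregate aggregate) { this.aggregate = aggregate; }

        void incrementPuts() { puts.incrementAndGet(); }

        /** Only the first caller deregisters; later calls are no-ops. */
        void close() {
            if (closed.compareAndSet(false, true)) {
                aggregate.deregister(this);
            }
        }

        /** A closed source contributes nothing to the snapshot. */
        long snapshotPuts() { return closed.get() ? 0 : puts.get(); }
    }
}
```

The trade-off in this sketch is that a snapshot racing with a close may or may not include the closing region, which is usually acceptable for metrics; in exchange, neither path ever waits on the other.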

 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
 MetricsRegionAggregateSourceImpl
 ---

 Key: HBASE-14274
 URL: https://issues.apache.org/jira/browse/HBASE-14274
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
 Attachments: 23612.stack, HBASE-14274-v1.patch, HBASE-14274.patch



[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl

2015-08-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705903#comment-14705903
 ] 

stack commented on HBASE-14274:
---

Is this related to your change? Should we protect against it?

{code}
2015-08-20 15:31:10,704 WARN  [HBase-Metrics2-1] impl.MetricsConfig(124): Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
2015-08-20 15:31:10,710 ERROR [HBase-Metrics2-1] lib.MethodMetric$2(118): Error invoking method getBlocksTotal
java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor72.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.metrics2.lib.MethodMetric$2.snapshot(MethodMetric.java:111)
    at org.apache.hadoop.metrics2.lib.MethodMetric.snapshot(MethodMetric.java:144)
    at org.apache.hadoop.metrics2.lib.MetricsRegistry.snapshot(MetricsRegistry.java:387)
    at org.apache.hadoop.metrics2.lib.MetricsSourceBuilder$1.getMetrics(MetricsSourceBuilder.java:79)
    at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:195)
    at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:172)
    at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:151)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:319)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
    at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:57)
    at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:221)
    at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:96)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:245)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$1.postStart(MetricsSystemImpl.java:229)
    at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$3.invoke(MetricsSystemImpl.java:290)
    at com.sun.proxy.$Proxy13.postStart(Unknown Source)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:185)
    at org.apache.hadoop.metrics2.impl.JmxCacheBuster$JmxCacheBusterRunnable.run(JmxCacheBuster.java:81)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.size(BlocksMap.java:198)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getTotalBlocks(BlockManager.java:3158)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlocksTotal(FSNamesystem.java:5652)
    ... 32 more
{code}

 Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs 
 MetricsRegionAggregateSourceImpl
 ---

 Key: HBASE-14274
 URL: https://issues.apache.org/jira/browse/HBASE-14274
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
 Attachments: 23612.stack, HBASE-14274-v1.patch, HBASE-14274.patch


 Looking into parent issue, got a hang locally of TestDistributedLogReplay.
 We have region closes here:
 {code}
 RS_CLOSE_META-localhost:59610-0 prio=5 tid=0x7ff65c03f800 nid=0x54347 
 waiting on condition [0x00011f7ac000]
java.lang.Thread.State: WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for