[ https://issues.apache.org/jira/browse/HBASE-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13579869#comment-13579869 ]
Ted Yu commented on HBASE-7865:
-------------------------------
From https://issues.apache.org/jira/secure/attachment/12569671/jstack_node3.txt:
{code}
"RS_CLOSE_REGION-node3,60020,1360977975359-2" prio=10 tid=0x00000000032ff000 nid=0x3a2d waiting on condition [0x00007f74f0ead000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x0000000625b93ee8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.startCacheFlush(HLog.java:1551)
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1490)
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1435)
        at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:968)
        at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:916)
        - locked <0x0000000627dee918> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:119)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

"RS_CLOSE_REGION-node3,60020,1360977975359-1" prio=10 tid=0x0000000002aab800 nid=0x3a2c waiting on condition [0x00007f74f1dbc000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x0000000625b93ee8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.startCacheFlush(HLog.java:1551)
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1490)
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1435)
        at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:968)
        at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:916)
{code}
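It looks like the regions were having trouble closing: both RS_CLOSE_REGION
threads above are parked in HLog.startCacheFlush, waiting for the same
ReentrantLock (<0x0000000625b93ee8>).
The region server log would help with further diagnosis.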
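
The excerpt above shows which threads are waiting on the lock but not which thread holds it. With jstack, the -l option adds a "Locked ownable synchronizers" section per thread, which can answer that. The same information is available in-process through the standard java.lang.management API. The following is only a minimal sketch of that idea, not HBase code: the class LockOwnerSketch and its methods are hypothetical names, and the demo uses a throwaway ReentrantLock rather than the HLog lock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper for illustration only; not part of HBase.
final class LockOwnerSketch {

    /** Name of the thread owning the lock that `waiter` is parked on, or null if unknown. */
    static String ownerOfLockBlocking(Thread waiter) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info = mx.getThreadInfo(waiter.getId());
        // info is null if the thread is no longer alive.
        return info == null ? null : info.getLockOwnerName();
    }

    /** Reproduces the "parking to wait for ReentrantLock" state on a small scale. */
    static String demo() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock(); // the calling thread plays the role of the unknown lock holder
        Thread waiter = new Thread(() -> {
            lock.lock(); // parks here, like the RS_CLOSE_REGION threads in the trace
            lock.unlock();
        }, "waiter");
        waiter.setDaemon(true);
        waiter.start();
        try {
            // Wait until the waiter is queued on the lock and actually parked.
            while (!(lock.hasQueuedThread(waiter)
                    && waiter.getState() == Thread.State.WAITING)) {
                Thread.sleep(10);
            }
            return ownerOfLockBlocking(waiter);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.unlock(); // let the waiter finish
        }
    }

    public static void main(String[] args) {
        System.out.println("lock held by: " + demo());
    }
}
```

For a live region server, re-running jstack with -l against the stuck process is the less invasive way to get the same answer.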
> HBase regionserver never stops when running `bin/stop-hbase.sh` on master
> -------------------------------------------------------------------------
>
> Key: HBASE-7865
> URL: https://issues.apache.org/jira/browse/HBASE-7865
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.5
> Reporter: Jean-Marc Spaggiari
> Attachments: jstack_node1.txt, jstack_node3.txt, jstack_node7.txt
>
>
> I had 3 region servers (out of 8) that never stopped today. This is pretty bad
> because the script is supposed to wait until all the RSs have stopped before
> restarting everything; therefore, the servers never come back online.
> HBASE-7838 will help with that by killing the RSs, but that will not
> really solve the root cause.
> Attached are the jstack outputs for the 3 servers.