[ 
https://issues.apache.org/jira/browse/HBASE-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307450#comment-15307450
 ] 

Heng Chen commented on HBASE-15900:
-----------------------------------

We could see in 0d32a6bab354e6cc170cd59a2d485797.rs.log,  the region could not 
be closed due to there is one compaction left
{code}
2016-05-31 14:18:18,414 INFO  [PriorityRpcServer.handler=19,queue=1,port=16020] 
regionserver.RSRpcServices: Close 0d32a6bab354e6cc170cd59a2d485797, moving to 
dx-pipe-regionserver8-online,16020,1464675204532
2016-05-31 14:18:18,443 DEBUG 
[RS_CLOSE_REGION-dx-pipe-regionserver20-online:16020-1] 
handler.CloseRegionHandler: Processing close of 
frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.
2016-05-31 14:18:18,443 DEBUG 
[RS_CLOSE_REGION-dx-pipe-regionserver20-online:16020-1] regionserver.HRegion: 
Closing 
frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.:
 disabling compactions & flushes
2016-05-31 14:18:18,443 DEBUG 
[RS_CLOSE_REGION-dx-pipe-regionserver20-online:16020-1] regionserver.HRegion: 
waiting for 1 compactions to complete for region 
frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.
{code}

The left compaction is one  major compaction run in longCompaction thread,  the 
progress of it reached 81.23%, then it has no response after that.
{code}
2016-05-31 05:28:29,476 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 regionserver.HStore: 0d32a6bab354e6cc170cd59a2d485797 - info: Initiating major 
compaction (all files)
2016-05-31 05:28:29,476 INFO  
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 regionserver.HRegion: Starting compaction on info in region 
frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.
2016-05-31 05:28:29,476 INFO  
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 regionserver.HStore: Starting compaction of 9 file(s) in info of 
frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.
 into 
tmpdir=hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/.tmp,
 totalSize=5.7 G
2016-05-31 05:28:29,476 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compacting 
hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/c31e84594c444d3cb8de00029e24aa34,
 keycount=78408609, bloomtype=ROW, size=4.9 G, encoding=NONE, seqNum=77065948, 
earliestPutTs=1453899210588
2016-05-31 05:28:29,476 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compacting 
hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/d3310d547e494e7db2b2e952e5fb412e,
 keycount=6356215, bloomtype=ROW, size=428.0 M, encoding=NONE, seqNum=78270525, 
earliestPutTs=1463613139775
2016-05-31 05:28:29,477 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compacting 
hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/8b11d48ac6a241f1b3ace1a822f087b0,
 keycount=1850675, bloomtype=ROW, size=124.6 M, encoding=NONE, seqNum=78624661, 
earliestPutTs=1464048401491
2016-05-31 05:28:29,477 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compacting 
hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/514ea98b95eb4b5fa4e122ebdd32d004,
 keycount=443135, bloomtype=ROW, size=29.9 M, encoding=NONE, seqNum=78710458, 
earliestPutTs=1464306574727
2016-05-31 05:28:29,478 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compacting 
hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/bf4877dbdf804492878750c30f3cfd8e,
 keycount=780345, bloomtype=ROW, size=52.6 M, encoding=NONE, seqNum=78852324, 
earliestPutTs=1464307875726
2016-05-31 05:28:29,478 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compacting 
hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/79ae0770fb58484e866c7b48f84e16a4,
 keycount=357040, bloomtype=ROW, size=24.1 M, encoding=NONE, seqNum=78916832, 
earliestPutTs=1464391849632
2016-05-31 05:28:29,478 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compacting 
hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/e7708a72734448c5a86305a0224af683,
 keycount=1725115, bloomtype=ROW, size=116.1 M, encoding=NONE, seqNum=79190322, 
earliestPutTs=1464393070342
2016-05-31 05:28:29,479 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compacting 
hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/1dcd54598541492bb20c279716488b5a,
 keycount=869015, bloomtype=ROW, size=58.6 M, encoding=NONE, seqNum=79334888, 
earliestPutTs=1464588026997
2016-05-31 05:28:29,479 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compacting 
hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/0a76eb5184dd4d42bdd16a588d295d0e,
 keycount=53330, bloomtype=ROW, size=3.7 M, encoding=NONE, seqNum=79346806, 
earliestPutTs=1464591201410
2016-05-31 05:29:29,684 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compaction progress: 
frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
 14047066/90843479 (15.46%), rate=14910.65 kB/sec, throughputController is 
NoLimitCompactionThroughputController
2016-05-31 05:30:29,684 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compaction progress: 
frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
 23487276/90843479 (25.85%), rate=10021.16 kB/sec, throughputController is 
NoLimitCompactionThroughputController
2016-05-31 05:31:29,684 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compaction progress: 
frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
 39884696/90843479 (43.90%), rate=17406.01 kB/sec, throughputController is 
NoLimitCompactionThroughputController
2016-05-31 05:32:29,686 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compaction progress: 
frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
 45639621/90843479 (50.24%), rate=6103.74 kB/sec, throughputController is 
NoLimitCompactionThroughputController
2016-05-31 05:33:30,012 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compaction progress: 
frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
 61288969/90843479 (67.47%), rate=16531.36 kB/sec, throughputController is 
NoLimitCompactionThroughputController
2016-05-31 05:34:30,225 DEBUG 
[regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
 compactions.Compactor: Compaction progress: 
frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
 73794589/90843479 (81.23%), rate=13231.85 kB/sec, throughputController is 
NoLimitCompactionThroughputController
{code}

And in jstack,  we could see the longCompaction thread is in RUNNABLE state, 
but it seems stuck.
{code}
"regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827"
 #727 daemon prio=5 os_prio=0 tid=0x00007f5324412000 nid=0x3796 runnable 
[0x00007f52fc5a8000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x00000006b5900d70> (a sun.nio.ch.Util$2)
        - locked <0x00000006b58fa2a8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000006b58d4400> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at 
org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335)
        at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
        at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.readChannelFully(PacketReceiver.java:258)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:209)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:171)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:102)
        at 
org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:186)
        at 
org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:146)
        - locked <0x00000006b597e8f8> (a 
org.apache.hadoop.hdfs.RemoteBlockReader2)
        at 
org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:686)
        at 
org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:742)
        - eliminated <0x000000062d8aeed8> (a 
org.apache.hadoop.hdfs.DFSInputStream)
        at 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:799)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:840)
        - locked <0x000000062d8aeed8> (a org.apache.hadoop.hdfs.DFSInputStream)
        at java.io.DataInputStream.read(DataInputStream.java:149)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:678)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1372)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1591)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1470)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:437)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:708)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.isNextBlock(HFileReaderV2.java:833)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.positionForNextBlock(HFileReaderV2.java:828)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2._next(HFileReaderV2.java:845)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:865)
        at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:139)
        at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:596)
        at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:255)
        at 
org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:106)
        at 
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:112)
        at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1212)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1798)
        at 
org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:519)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - <0x000000062afa5528> (a 
java.util.concurrent.ThreadPoolExecutor$Worker)
        - <0x000000062f7ee850> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
        - <0x00000006b5985ec8> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
{code}





> RS stuck in get lock of HStore
> ------------------------------
>
>                 Key: HBASE-15900
>                 URL: https://issues.apache.org/jira/browse/HBASE-15900
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.1.1, 1.3.0
>            Reporter: Heng Chen
>         Attachments: 0d32a6bab354e6cc170cd59a2d485797.jstack.txt, 
> 0d32a6bab354e6cc170cd59a2d485797.rs.log, 9fe15a52_9fe15a52_save, 
> c91324eb_81194e359707acadee2906ffe36ab130.log, dump.txt
>
>
> It happens on my production cluster when i run MR job.  I save the dump.txt 
> from this RS webUI.
> Many threads stuck here:
> {code}
> Thread 133 (B.defaultRpcServer.handler=94,queue=4,port=16020):
>    32   State: WAITING
>    31   Blocked count: 477816
>    30   Waited count: 535255
>    29   Waiting on 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@6447ba67
>    28   Stack:
>    27     sun.misc.Unsafe.park(Native Method)
>    26     java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>    25     
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>    24     
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
>    23     
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
>    22     
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
>    21     org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:666)
>    20     
> org.apache.hadoop.hbase.regionserver.HRegion.applyFamilyMapToMemstore(HRegion.java:3621)
>    19     
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3038)
>    18     
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2793)
>    17     
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2735)
>    16     
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:692)
>    15     
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:654)
>    14     
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2029)
>    13     
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213)
>    12     org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2112)
>    11     org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
>    10     
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>     9     org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>     8     java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to