[
https://issues.apache.org/jira/browse/HBASE-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625882#comment-14625882
]
Heng Chen commented on HBASE-14062:
-----------------------------------
So I think the lock is held because doRead is throwing a lot of exceptions.
When an exception is thrown, doRead calls closeConnection, and closeConnection
takes the lock.
With too many exceptions, the lock is almost always held by closeConnection,
so doAccept is left waiting for it.
Why are the exceptions being thrown?
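
To make the contention concrete, here is a minimal standalone sketch, not the actual RpcServer code. The class name, thread names, and the latch-based setup are all made up for illustration. It only assumes what the jstack below shows: the connection list is a Collections.synchronizedList around a LinkedList, and closeConnection synchronizes on that wrapper while removing. Note that LinkedList.remove(Object) is a linear scan, so each close holds the monitor longer as the list grows.

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: names here are illustrative, not taken from RpcServer.
class ConnectionListContention {
    // Stand-in for the server's connection list: a synchronized wrapper around a LinkedList.
    static final List<Object> connectionList =
        Collections.synchronizedList(new LinkedList<>());

    /** Returns the state seen for the "listener" thread while a "reader" holds the list monitor. */
    static Thread.State observeAcceptorState() {
        CountDownLatch readerHoldsLock = new CountDownLatch(1);
        CountDownLatch releaseLock = new CountDownLatch(1);

        // Plays the role of closeConnection: it owns the wrapper's monitor for a while.
        Thread reader = new Thread(() -> {
            synchronized (connectionList) {
                readerHoldsLock.countDown();
                try {
                    releaseLock.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "reader-sketch");

        // Plays the role of doAccept: add() needs the same monitor, so it has to wait.
        Thread acceptor = new Thread(() -> connectionList.add(new Object()), "listener-sketch");

        try {
            reader.start();
            readerHoldsLock.await();
            acceptor.start();

            // Poll until the acceptor is parked on the monitor (give up after 5 seconds).
            Thread.State seen = acceptor.getState();
            long deadline = System.nanoTime() + 5_000_000_000L;
            while (seen != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.sleep(1);
                seen = acceptor.getState();
            }

            releaseLock.countDown();
            reader.join();
            acceptor.join();
            return seen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("listener state while reader holds the lock: " + observeAcceptorState());
    }
}
```

If readers keep re-entering this path because doRead keeps failing, the listener thread gets very little chance to take the monitor, which would match the BLOCKED state in the report.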
> RpcServer.Listener.doAccept get blocked by LinkedList.remove
> ------------------------------------------------------------
>
> Key: HBASE-14062
> URL: https://issues.apache.org/jira/browse/HBASE-14062
> Project: HBase
> Issue Type: Bug
> Components: IPC/RPC
> Affects Versions: 0.98.12
> Reporter: Victor Xu
> Attachments: hbase.log, jstack.log
>
>
> We saw the following blocked thread in our jstack output:
> {noformat}
> "RpcServer.listener,port=60020" daemon prio=10 tid=0x00007f158097b800 nid=0x2cd05 waiting for monitor entry [0x0000000046374000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doAccept(RpcServer.java:833)
> - waiting to lock <0x00000002bb094ac8> (a java.util.Collections$SynchronizedList)
> at org.apache.hadoop.hbase.ipc.RpcServer$Listener.run(RpcServer.java:748)
> {noformat}
> And the owner of the lock is a reader thread inside LinkedList.remove:
> {noformat}
> "RpcServer.reader=9,port=60020" daemon prio=10 tid=0x00007f1580394000 nid=0x2cc19 runnable [0x0000000043b4c000]
> java.lang.Thread.State: RUNNABLE
> at java.util.LinkedList.remove(LinkedList.java:363)
> at java.util.Collections$SynchronizedCollection.remove(Collections.java:1639)
> - locked <0x00000002bb094ac8> (a java.util.Collections$SynchronizedList)
> at org.apache.hadoop.hbase.ipc.RpcServer.closeConnection(RpcServer.java:1992)
> - locked <0x00000002bb094ac8> (a java.util.Collections$SynchronizedList)
> at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:867)
> at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:645)
> - locked <0x00000002bae09a30> (a org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader)
> at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:620)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> {noformat}
> This issue blocks the RS once in a while, and I have to restart it whenever it
> happens. It seems like a bug. Any suggestions?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)