[ 
https://issues.apache.org/jira/browse/HBASE-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625882#comment-14625882
 ] 

Heng Chen commented on HBASE-14062:
-----------------------------------

So I think the lock is held because doRead throws a lot of exceptions.

When an exception is thrown, doRead calls closeConnection, and closeConnection
takes the lock.

When there are too many exceptions, closeConnection holds the lock almost
continuously, so doAccept is left waiting for it.
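
To illustrate why each closeConnection call can hold the lock for a long time,
here is a minimal sketch. It is not the HBase code: Conn, ConnectionListSketch
and both helper methods are hypothetical names made up for the example. It
relies only on the fact that LinkedList.remove(Object) scans from the head of
the list, and that the synchronized wrapper holds its monitor for the whole
scan. The concurrent set is shown purely for comparison, not as the fix chosen
for this issue.

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class ConnectionListSketch {
    // Counts equals() calls so the scan cost is visible without timing.
    static final AtomicLong EQUALS_CALLS = new AtomicLong();

    // Hypothetical stand-in for a server connection; not HBase's class.
    static final class Conn {
        final int id;
        Conn(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            EQUALS_CALLS.incrementAndGet();
            return o instanceof Conn && ((Conn) o).id == id;
        }
        @Override public int hashCode() { return id; }
    }

    // equals() calls needed to remove the newest of n connections from a
    // synchronized LinkedList. The list's monitor is held for the whole scan.
    static long equalsCallsForListRemove(int n) {
        List<Conn> list = Collections.synchronizedList(new LinkedList<>());
        Conn last = null;
        for (int i = 0; i < n; i++) {
            last = new Conn(i);
            list.add(last);
        }
        EQUALS_CALLS.set(0);
        list.remove(last);   // linear scan from the head of the list
        return EQUALS_CALLS.get();
    }

    // Same removal against a hash-based concurrent set, for comparison.
    static long equalsCallsForSetRemove(int n) {
        Set<Conn> set = ConcurrentHashMap.newKeySet();
        Conn last = null;
        for (int i = 0; i < n; i++) {
            last = new Conn(i);
            set.add(last);
        }
        EQUALS_CALLS.set(0);
        set.remove(last);    // hash lookup, no scan over all elements
        return EQUALS_CALLS.get();
    }

    public static void main(String[] args) {
        int n = 10000;
        System.out.println("list remove, equals() calls: " + equalsCallsForListRemove(n));
        System.out.println("set remove, equals() calls:  " + equalsCallsForSetRemove(n));
    }
}
```

With many open connections, every close pays a scan proportional to the list
size while holding the same monitor that doAccept needs, which fits the two
stack traces in the report.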


Why are the exceptions being thrown?

> RpcServer.Listener.doAccept get blocked by LinkedList.remove
> ------------------------------------------------------------
>
>                 Key: HBASE-14062
>                 URL: https://issues.apache.org/jira/browse/HBASE-14062
>             Project: HBase
>          Issue Type: Bug
>          Components: IPC/RPC
>    Affects Versions: 0.98.12
>            Reporter: Victor Xu
>         Attachments: hbase.log, jstack.log
>
>
> We saw this blocked thread in our jstack output:
> {noformat}
> "RpcServer.listener,port=60020" daemon prio=10 tid=0x00007f158097b800 nid=0x2cd05 waiting for monitor entry [0x0000000046374000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doAccept(RpcServer.java:833)
>         - waiting to lock <0x00000002bb094ac8> (a java.util.Collections$SynchronizedList)
>         at org.apache.hadoop.hbase.ipc.RpcServer$Listener.run(RpcServer.java:748)
> {noformat}
> And the lock is held by a reader thread inside LinkedList.remove:
> {noformat}
> "RpcServer.reader=9,port=60020" daemon prio=10 tid=0x00007f1580394000 nid=0x2cc19 runnable [0x0000000043b4c000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.LinkedList.remove(LinkedList.java:363)
>         at java.util.Collections$SynchronizedCollection.remove(Collections.java:1639)
>         - locked <0x00000002bb094ac8> (a java.util.Collections$SynchronizedList)
>         at org.apache.hadoop.hbase.ipc.RpcServer.closeConnection(RpcServer.java:1992)
>         - locked <0x00000002bb094ac8> (a java.util.Collections$SynchronizedList)
>         at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:867)
>         at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:645)
>         - locked <0x00000002bae09a30> (a org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader)
>         at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:620)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> {noformat}
> This issue blocks the RS once in a while, and I have to restart it whenever
> it happens. It seems like a bug. Any suggestions?



