Guangxu Cheng created HBASE-17798:
-------------------------------------

             Summary: RpcServer.Listener.Reader can abort due to CancelledKeyException
                 Key: HBASE-17798
                 URL: https://issues.apache.org/jira/browse/HBASE-17798
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.98.24, 1.2.4, 1.3.0, 2.0.0
            Reporter: Guangxu Cheng


In our production cluster (0.98), some requests could not be accepted because 
an RpcServer.Listener.Reader had aborted.
getReader() returns the next reader to handle a request.
Its implementation is as follows:
{code:title=RpcServer.java|borderStyle=solid}
    // The method that will return the next reader to work with
    // Simplistic implementation of round robin for now
    Reader getReader() {
      currentReader = (currentReader + 1) % readers.length;
      return readers[currentReader];
    }
{code}
If one of the readers aborts, the requests assigned to that reader will never 
be handled.
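
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the HBase code: FakeReader, RoundRobinStall and countStranded are hypothetical names, and a "dead" reader is modelled as a flag rather than a real thread. Only the round-robin selection is taken from the getReader() quoted above.

```java
class RoundRobinStall {
    // Hypothetical stand-in for RpcServer.Listener.Reader: it only records
    // whether its thread is still running and how many connections it was given.
    static final class FakeReader {
        boolean alive = true;
        int served = 0;
        int stranded = 0;

        void assign() {
            if (alive) {
                served++;
            } else {
                stranded++; // no thread is left to select on this connection
            }
        }
    }

    private final FakeReader[] readers;
    private int currentReader = 0;

    RoundRobinStall(int readerCount) {
        readers = new FakeReader[readerCount];
        for (int i = 0; i < readerCount; i++) {
            readers[i] = new FakeReader();
        }
    }

    // Same round-robin selection as the getReader() quoted above:
    // it never checks whether the chosen reader is still alive.
    FakeReader getReader() {
        currentReader = (currentReader + 1) % readers.length;
        return readers[currentReader];
    }

    // Assumption for illustration: one reader has already aborted.
    static int countStranded(int readerCount, int deadReader, int connections) {
        RoundRobinStall listener = new RoundRobinStall(readerCount);
        listener.readers[deadReader].alive = false;
        for (int i = 0; i < connections; i++) {
            listener.getReader().assign();
        }
        return listener.readers[deadReader].stranded;
    }

    public static void main(String[] args) {
        // 4 readers, reader 3 dead, 12 incoming connections
        System.out.println(countStranded(4, 3, 12)); // prints 3
    }
}
```

With 4 readers and one of them dead, every fourth connection lands on the dead reader, so 3 of the 12 connections are never served.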
Why does RpcServer.Listener.Reader abort? We added debug logging to find out.
After a while, we got the following exception:
{code}
2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
        at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}

So, when handling a request in the reader, we should catch CancelledKeyException 
instead of letting it abort the reader thread.
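
One possible shape for such handling is sketched below. This is not the HBase patch, only an illustration under stated assumptions: CancelledKeySketch, safeIsReadable and demo are hypothetical names, and a java.nio.channels.Pipe stands in for a client connection. It relies on the documented behavior that SelectionKey.isReadable() throws CancelledKeyException once the key has been cancelled.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

class CancelledKeySketch {
    // Hypothetical helper: treat a cancelled key as "nothing to read"
    // so that the calling loop can move on to the next key.
    static boolean safeIsReadable(SelectionKey key) {
        try {
            return key.isReadable();
        } catch (CancelledKeyException e) {
            // The key was cancelled (for example, the connection was closed
            // elsewhere) after it was selected; skip it rather than abort.
            return false;
        }
    }

    // Reproduces the exception path with a key that is cancelled on purpose.
    static boolean demo() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                pipe.source().configureBlocking(false);
                SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
                key.cancel(); // simulate another thread closing the connection
                return safeIsReadable(key);
            } finally {
                pipe.source().close();
                pipe.sink().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints false; no exception escapes
    }
}
```

Checking key.isValid() first narrows the window but does not close it, since the key can still be cancelled between the two calls, which is why the sketch catches the exception as well.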

----------
Since HBASE-10521, versions 1.x and 2.0 log and return when Reader#doRunLoop 
catches an InterruptedException. This leads to the same problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
