[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException
[ https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-17798:
---
    Fix Version/s: 1.3.3

> RpcServer.Listener.Reader can abort due to CancelledKeyException
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.0, 1.2.4, 0.98.24, 2.0.0
> Reporter: Guangxu Cheng
> Assignee: Guangxu Cheng
> Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: 17798-master-v2.patch, HBASE-17798-0.98-v1.patch,
> HBASE-17798-0.98-v2.patch, HBASE-17798-branch-1-v1.patch,
> HBASE-17798-branch-1-v2.patch, HBASE-17798-master-v1.patch,
> HBASE-17798-master-v2.patch, connections.png
>
>
> In our production cluster (0.98), some requests were never accepted
> because RpcServer.Listener.Reader threads had aborted.
> getReader() returns the next reader to handle a request.
> It is implemented as follows:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers aborts, the requests assigned to that reader
> will never be handled.
> Why does RpcServer.Listener.Reader abort? We added debug logging to find out.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
>   at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
>   at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
>   at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
>   at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
>   at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when handling requests in the reader, we should handle
> CancelledKeyException.
> --
> Versions 1.x and 2.0 log and return when handling the
> InterruptedException in Reader#doRunLoop after HBASE-10521, which leads to
> the same problem.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
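The failure mode in the stack trace above can be reproduced outside HBase. What follows is only a minimal sketch, not the attached patch: the class name `ReaderLoopSketch` and the helper `safeIsReadable` are hypothetical, and a `Pipe` stands in for a client connection. It relies on the documented JDK behavior that `SelectionKey.isReadable()` throws `CancelledKeyException` once the key has been cancelled, and shows one way a reader loop might skip such a key instead of letting the exception end the thread.

```java
import java.io.IOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class ReaderLoopSketch {

    /**
     * Hypothetical helper: reports whether the key can be read, treating a
     * cancelled key as "nothing to read" rather than as a fatal error.
     */
    static boolean safeIsReadable(SelectionKey key) {
        try {
            // isValid() catches keys that are already cancelled; the catch
            // block covers a key cancelled between the two calls.
            return key.isValid() && key.isReadable();
        } catch (CancelledKeyException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open()) {
            // A pipe stands in for a client connection (assumption for the demo).
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);

            // Simulate the connection being closed elsewhere.
            key.cancel();

            boolean threw = false;
            try {
                key.isReadable(); // unguarded call, as in the stack trace
            } catch (CancelledKeyException e) {
                threw = true;
            }
            System.out.println("unguarded call threw: " + threw);
            System.out.println("guarded call returned: " + safeIsReadable(key));

            pipe.source().close();
            pipe.sink().close();
        }
    }
}
```

In a real reader loop, the guarded check would let the thread move on to the next selected key, so the round-robin `getReader()` never hands connections to a reader that has stopped running.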
[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException
[ https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-17798:
--
    Fix Version/s:     (was: 2.0)
                   2.0.0

> RpcServer.Listener.Reader can abort due to CancelledKeyException
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
> Reporter: Guangxu Cheng
> Assignee: Guangxu Cheng
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 17798-master-v2.patch, connections.png,
> HBASE-17798-0.98-v1.patch, HBASE-17798-0.98-v2.patch,
> HBASE-17798-branch-1-v1.patch, HBASE-17798-branch-1-v2.patch,
> HBASE-17798-master-v1.patch, HBASE-17798-master-v2.patch
[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException
[ https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-17798:
---
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.0
                   1.4.0
           Status: Resolved  (was: Patch Available)

Thanks for the patch, Guangxu.

branch-1.3 is in quiet period. Resolving for now.

> RpcServer.Listener.Reader can abort due to CancelledKeyException
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
> Reporter: Guangxu Cheng
> Assignee: Guangxu Cheng
> Fix For: 1.4.0, 2.0
>
> Attachments: 17798-master-v2.patch, connections.png,
> HBASE-17798-0.98-v1.patch, HBASE-17798-0.98-v2.patch,
> HBASE-17798-branch-1-v1.patch, HBASE-17798-branch-1-v2.patch,
> HBASE-17798-master-v1.patch, HBASE-17798-master-v2.patch
[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException
[ https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-17798:
---
    Attachment: 17798-master-v2.patch

> RpcServer.Listener.Reader can abort due to CancelledKeyException
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
> Reporter: Guangxu Cheng
> Attachments: 17798-master-v2.patch, connections.png,
> HBASE-17798-0.98-v1.patch, HBASE-17798-0.98-v2.patch,
> HBASE-17798-branch-1-v1.patch, HBASE-17798-branch-1-v2.patch,
> HBASE-17798-master-v1.patch, HBASE-17798-master-v2.patch
[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException
[ https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-17798:
---
    Status: Patch Available  (was: Open)

> RpcServer.Listener.Reader can abort due to CancelledKeyException
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.24, 1.2.4, 1.3.0, 2.0.0
> Reporter: Guangxu Cheng
> Attachments: connections.png, HBASE-17798-0.98-v1.patch,
> HBASE-17798-0.98-v2.patch, HBASE-17798-branch-1-v1.patch,
> HBASE-17798-branch-1-v2.patch, HBASE-17798-master-v1.patch,
> HBASE-17798-master-v2.patch
[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException
[ https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guangxu Cheng updated HBASE-17798:
--
    Attachment: connections.png
                HBASE-17798-master-v2.patch
                HBASE-17798-branch-1-v2.patch
                HBASE-17798-0.98-v2.patch

> RpcServer.Listener.Reader can abort due to CancelledKeyException
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
> Reporter: Guangxu Cheng
> Attachments: connections.png, HBASE-17798-0.98-v1.patch,
> HBASE-17798-0.98-v2.patch, HBASE-17798-branch-1-v1.patch,
> HBASE-17798-branch-1-v2.patch, HBASE-17798-master-v1.patch,
> HBASE-17798-master-v2.patch
[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException
[ https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guangxu Cheng updated HBASE-17798:
--
    Attachment: HBASE-17798-master-v1.patch
                HBASE-17798-branch-1-v1.patch
                HBASE-17798-0.98-v1.patch

> RpcServer.Listener.Reader can abort due to CancelledKeyException
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
> Reporter: Guangxu Cheng
> Attachments: HBASE-17798-0.98-v1.patch,
> HBASE-17798-branch-1-v1.patch, HBASE-17798-master-v1.patch