[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17798:
---
Fix Version/s: 1.3.3

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.2.4, 0.98.24, 2.0.0
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: 17798-master-v2.patch, HBASE-17798-0.98-v1.patch, 
> HBASE-17798-0.98-v2.patch, HBASE-17798-branch-1-v1.patch, 
> HBASE-17798-branch-1-v2.patch, HBASE-17798-master-v1.patch, 
> HBASE-17798-master-v2.patch, connections.png
>
>
> In our production cluster (0.98), some requests were never served because 
> an RpcServer.Listener.Reader thread had aborted.
> getReader() returns the next reader to handle an incoming connection.
> It is implemented as follows:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers aborts, the connections assigned to that reader will 
> never be serviced.
> Why does RpcServer.Listener.Reader abort? We added debug logging to find out.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
>     at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
>     at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
>     at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
>     at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
>     at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> So the reader should handle CancelledKeyException when it processes a 
> request.
> --
> Versions 1.x and 2.0 log and return when handling the InterruptedException 
> in Reader#doRunLoop after HBASE-10521, which leads to the same problem.
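
A minimal sketch of the handling proposed in the description, assuming a plain java.nio selector loop. This is not the attached patch: ReaderLoopSketch, processSelectedKeys and demo are hypothetical names, and a Pipe stands in for a client connection. It only shows that catching CancelledKeyException per key lets the loop move on to the next key instead of ending the thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;

public class ReaderLoopSketch {

    /**
     * Walks the selected-key set once. A key that was cancelled after
     * select() returned is skipped rather than allowed to kill the loop.
     * Returns the number of keys skipped this way.
     */
    static int processSelectedKeys(Selector selector) {
        int skipped = 0;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            try {
                if (key.isReadable()) {
                    // A real reader would read the request from key.channel() here.
                }
            } catch (CancelledKeyException e) {
                // The key became invalid between select() and isReadable().
                // Count it and continue with the remaining keys.
                skipped++;
            }
        }
        return skipped;
    }

    /** Reproduces the race deterministically with a Pipe (illustrative setup only). */
    static int demo() {
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open();
                 Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                SelectionKey key = source.register(selector, SelectionKey.OP_READ);
                sink.write(ByteBuffer.wrap(new byte[] {1})); // make the source readable
                selector.select(1000);                        // key enters the selected set
                key.cancel();                                 // simulate a concurrent close
                return processSelectedKeys(selector);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cancelled keys skipped: " + demo());
    }
}
```

Without the catch block, the CancelledKeyException thrown by isReadable() would propagate out of the loop, which matches the stack trace in the description.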

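To illustrate why a single aborted reader matters, here is a small simulation that assumes only the round-robin rule from getReader() in the description and nothing else about RpcServer. FakeReader, RoundRobinStallSketch and strandedConnections are hypothetical names; the reader count of 10 and the dead index of 3 are arbitrary values picked for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class RoundRobinStallSketch {

    /** Hypothetical stand-in for a Reader thread: tracks liveness and assigned connections. */
    static final class FakeReader {
        final boolean alive;
        final List<Integer> assigned = new ArrayList<>();

        FakeReader(boolean alive) {
            this.alive = alive;
        }
    }

    private final FakeReader[] readers;
    private int currentReader = 0;

    RoundRobinStallSketch(FakeReader[] readers) {
        this.readers = readers;
    }

    // Same selection rule as the getReader() quoted in the description:
    // it never checks whether the chosen reader is still running.
    FakeReader getReader() {
        currentReader = (currentReader + 1) % readers.length;
        return readers[currentReader];
    }

    /**
     * Hands out connection ids 0..connections-1 round robin and returns how
     * many of them were assigned to the dead reader.
     */
    static int strandedConnections(int readerCount, int deadIndex, int connections) {
        FakeReader[] rs = new FakeReader[readerCount];
        for (int i = 0; i < readerCount; i++) {
            rs[i] = new FakeReader(i != deadIndex);
        }
        RoundRobinStallSketch listener = new RoundRobinStallSketch(rs);
        int stranded = 0;
        for (int c = 0; c < connections; c++) {
            FakeReader r = listener.getReader();
            r.assigned.add(c);
            if (!r.alive) {
                stranded++;
            }
        }
        return stranded;
    }

    public static void main(String[] args) {
        System.out.println(strandedConnections(10, 3, 100) + " of 100 connections stranded");
    }
}
```

With N readers and one of them dead, one in every N new connections is handed to a reader that will never select on it, so those clients wait indefinitely while the rest of the server looks healthy.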


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-30 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-17798:
--
Fix Version/s: (was: 2.0)
   2.0.0

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 17798-master-v2.patch, connections.png, 
> HBASE-17798-0.98-v1.patch, HBASE-17798-0.98-v2.patch, 
> HBASE-17798-branch-1-v1.patch, HBASE-17798-branch-1-v2.patch, 
> HBASE-17798-master-v1.patch, HBASE-17798-master-v2.patch





[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-21 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-17798:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0
   1.4.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, Guangxu.

branch-1.3 is in quiet period. Resolving for now.

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Fix For: 1.4.0, 2.0
>
> Attachments: 17798-master-v2.patch, connections.png, 
> HBASE-17798-0.98-v1.patch, HBASE-17798-0.98-v2.patch, 
> HBASE-17798-branch-1-v1.patch, HBASE-17798-branch-1-v2.patch, 
> HBASE-17798-master-v1.patch, HBASE-17798-master-v2.patch





[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-20 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-17798:
---
Attachment: 17798-master-v2.patch

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
> Attachments: 17798-master-v2.patch, connections.png, 
> HBASE-17798-0.98-v1.patch, HBASE-17798-0.98-v2.patch, 
> HBASE-17798-branch-1-v1.patch, HBASE-17798-branch-1-v2.patch, 
> HBASE-17798-master-v1.patch, HBASE-17798-master-v2.patch





[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-17798:
---
Status: Patch Available  (was: Open)

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.24, 1.2.4, 1.3.0, 2.0.0
>Reporter: Guangxu Cheng
> Attachments: connections.png, HBASE-17798-0.98-v1.patch, 
> HBASE-17798-0.98-v2.patch, HBASE-17798-branch-1-v1.patch, 
> HBASE-17798-branch-1-v2.patch, HBASE-17798-master-v1.patch, 
> HBASE-17798-master-v2.patch





[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-18 Thread Guangxu Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangxu Cheng updated HBASE-17798:
--
Attachment: connections.png
HBASE-17798-master-v2.patch
HBASE-17798-branch-1-v2.patch
HBASE-17798-0.98-v2.patch

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
> Attachments: connections.png, HBASE-17798-0.98-v1.patch, 
> HBASE-17798-0.98-v2.patch, HBASE-17798-branch-1-v1.patch, 
> HBASE-17798-branch-1-v2.patch, HBASE-17798-master-v1.patch, 
> HBASE-17798-master-v2.patch





[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-17 Thread Guangxu Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangxu Cheng updated HBASE-17798:
--
Attachment: HBASE-17798-master-v1.patch
HBASE-17798-branch-1-v1.patch
HBASE-17798-0.98-v1.patch

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
> Attachments: HBASE-17798-0.98-v1.patch, 
> HBASE-17798-branch-1-v1.patch, HBASE-17798-master-v1.patch


