[jira] [Commented] (HBASE-5543) Add a keepalive option for IPC connections

2012-03-14 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229551#comment-13229551
 ] 

Jonathan Hsieh commented on HBASE-5543:
---

A straw man:

As a mechanism for dealing with long-running RPCs, we could add something 
similar to Hadoop's Progressable class, as I understand it (the uuid could be a 
reference to this in a map or something).  The coprocessor context would hold a 
reference to a Progressable that the coprocessor would have to call 
periodically to demonstrate progress.  If it isn't called for a while, the call 
is assumed to be hung.

This could possibly be wired into the hbase rpc mechanism also -- for HBase 
ServerCallables on the server side, we might add a ref to a Progressable -- if 
a call is long running (like a bulk call), calls to the progress() method might 
reset the rpc timeout counter.
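
A rough sketch of the straw man, to make the idea concrete. Nothing here is existing HBase or Hadoop code: the Progressable interface below is a hypothetical stand-in that only mirrors the single progress() method of Hadoop's class, and ProgressWatchdog, its constructor arguments, and isPresumedHung() are invented names. The clock is injected so the behavior can be checked without sleeping.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical stand-in mirroring the shape of Hadoop's Progressable. */
interface Progressable {
    void progress();
}

/** Remembers when progress() was last called; a silent call is presumed hung. */
final class ProgressWatchdog implements Progressable {
    private final long timeoutMillis;
    private final LongSupplier clockMillis;   // injected clock (assumption, for testability)
    private final AtomicLong lastProgressMillis;

    ProgressWatchdog(long timeoutMillis, LongSupplier clockMillis) {
        this.timeoutMillis = timeoutMillis;
        this.clockMillis = clockMillis;
        this.lastProgressMillis = new AtomicLong(clockMillis.getAsLong());
    }

    /** Called by the long-running work; resets the timeout counter. */
    @Override
    public void progress() {
        lastProgressMillis.set(clockMillis.getAsLong());
    }

    /** True once more than timeoutMillis has passed since the last progress() call. */
    boolean isPresumedHung() {
        return clockMillis.getAsLong() - lastProgressMillis.get() > timeoutMillis;
    }
}

final class ProgressWatchdogDemo {
    public static void main(String[] args) {
        AtomicLong fakeNow = new AtomicLong(0);
        ProgressWatchdog watchdog = new ProgressWatchdog(1000, fakeNow::get);

        fakeNow.set(900);
        watchdog.progress();            // the work reports progress at t=900
        fakeNow.set(1800);
        System.out.println("hung at t=1800? " + watchdog.isPresumedHung());

        fakeNow.set(2500);              // no further progress() calls
        System.out.println("hung at t=2500? " + watchdog.isPresumedHung());
    }
}
```

In a real server the RPC layer would presumably consult something like isPresumedHung() instead of a fixed deadline, but how that would be wired in is left open here.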

 Add a keepalive option for IPC connections
 --

 Key: HBASE-5543
 URL: https://issues.apache.org/jira/browse/HBASE-5543
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors, ipc
Reporter: Andrew Purtell

 On the user list someone wrote in with a connection failure due to a long 
 running coprocessor:
 {quote}
 On Wed, Mar 7, 2012 at 10:59 PM, raghavendhra rahul wrote:
 2012-03-08 12:03:09,475 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server 
 Responder, call execCoprocessor([B@50cb21, getProjection(), rpc version=1, 
 client version=0, methodsFingerPrint=0), rpc version=1, client version=29, 
 methodsFingerPrint=54742778 from 10.184.17.26:46472: output error
 2012-03-08 12:03:09,476 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server 
 handler 7 on 60020 caught: java.nio.channels.ClosedChannelException
 {quote}
 I suggested in response we might consider giving our RPC a keepalive option for 
 calls that may run for a long time (like execCoprocessor).
 LarsH +1ed the idea:
 {quote}
 +1 on keepalive. It's a shame (especially for long running server code) to 
 do all the work, just to find out at the end that the client has given up.
 Or maybe there should be a way to cancel an operation if the clients decides 
 it does not want to wait any longer (PostgreSQL does that for example). Here 
 that would mean the server would need to check periodically and coprocessors 
 would need to be written to support that - so maybe that's no-starter.
 {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5543) Add a keepalive option for IPC connections

2012-03-12 Thread Himanshu Vashishtha (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227333#comment-13227333
 ] 

Himanshu Vashishtha commented on HBASE-5543:


What is the scope of the uuid token in the coprocessor context? The current 
approach is to subdivide the call by region, submit a Callable object for each 
of these regions, obtain a Future for each call, and block until all of them 
have returned a result. 
So, would it be a uuid from the client-side server proxy object, a list of 
uuids from all the involved regions, or something more elegant that I am 
missing? Please suggest. Thanks.
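
For readers unfamiliar with the pattern described above, here is a minimal sketch of the per-region fan-out using only java.util.concurrent. It is not the HBase client code: RegionFanOut, callAllRegions, the region names, and the "result-from-" strings are all placeholders made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class RegionFanOut {
    /** Submits one Callable per region and blocks until every Future has returned. */
    static List<String> callAllRegions(List<String> regionNames) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, regionNames.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String region : regionNames) {
                // Placeholder for the real per-region call.
                Callable<String> call = () -> "result-from-" + region;
                futures.add(pool.submit(call));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());   // blocks until this region's call completes
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(callAllRegions(Arrays.asList("region-a", "region-b", "region-c")));
    }
}
```

With this shape there is one outstanding call per region, which is why the scope of the token (one per client call, or one per region) matters.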





[jira] [Commented] (HBASE-5543) Add a keepalive option for IPC connections

2012-03-08 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225566#comment-13225566
 ] 

stack commented on HBASE-5543:
--

Yeah, it looks like it's inevitable that we'll ask the server to do legitimate 
stuff that takes longer than the rpctimeout while the server is still making 
headway: e.g. the reproducing test case for HBASE-4890 (fix possible NPE in 
HConnectionManager), though a little artificial, was asking the regionserver to 
open 3k regions.

If it's a task like the above, there should be a facility for telling the 
client we're still alive, or we should just refuse the request because it will 
take too long (the latter we need to do too -- from Benoiit.  If the server is 
going to take so long servicing a request that the client will be gone by the 
time it has done its work, then refuse the request... don't do the increment or 
update that the updating client will not be around to see).





[jira] [Commented] (HBASE-5543) Add a keepalive option for IPC connections

2012-03-08 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225803#comment-13225803
 ] 

Jonathan Hsieh commented on HBASE-5543:
---

Instead of adding to the rpc to make it keep alive longer, maybe make it 
async, returning some sort of uuid token that the client can poll (or get 
notified about) for progress instead?
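
To illustrate the async suggestion, here is a minimal in-process sketch of a token registry: the caller gets a UUID back immediately and polls with it later. This is only an assumption about what such a facility might look like, not a proposed HBase API; AsyncCallRegistry, submit, isDone, and fetchResult are invented names, and a real version would sit behind the RPC layer rather than in one JVM.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical server-side registry mapping uuid tokens to in-flight calls. */
final class AsyncCallRegistry {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final Map<UUID, Future<String>> calls = new ConcurrentHashMap<>();

    /** Starts the work and returns immediately with a token for later polling. */
    UUID submit(Callable<String> work) {
        UUID token = UUID.randomUUID();
        calls.put(token, pool.submit(work));
        return token;
    }

    /** Cheap poll: has the call behind this token finished? */
    boolean isDone(UUID token) {
        Future<String> future = calls.get(token);
        if (future == null) {
            throw new IllegalArgumentException("unknown token: " + token);
        }
        return future.isDone();
    }

    /** Waits for the result if needed, returns it, and forgets the token. */
    String fetchResult(UUID token) {
        Future<String> future = calls.remove(token);
        if (future == null) {
            throw new IllegalArgumentException("unknown token: " + token);
        }
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}

final class AsyncCallRegistryDemo {
    public static void main(String[] args) {
        AsyncCallRegistry registry = new AsyncCallRegistry();
        UUID token = registry.submit(() -> "projection complete");
        System.out.println("token: " + token);
        System.out.println("result: " + registry.fetchResult(token));
        registry.shutdown();
    }
}
```

Each poll is a short request of its own, so no single RPC has to outlive the rpc timeout; the cost is that the server has to keep per-token state until the client collects the result or the token expires.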





[jira] [Commented] (HBASE-5543) Add a keepalive option for IPC connections

2012-03-08 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225818#comment-13225818
 ] 

stack commented on HBASE-5543:
--

bq. Instead of adding to the rpc to make it keep alive longer, maybe make it 
async, returning some sort of uuid token that the client can poll (or get 
notified about) for progress instead?

I like this idea.
