[ 
https://issues.apache.org/jira/browse/HBASE-10185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893195#comment-13893195
 ] 

Nicolas Liochon commented on HBASE-10185:
-----------------------------------------

Yes, if it's an issue in trunk, we should solve it at least there.
In addCall, we reuse the exception across calls, so other calls may receive a
'DoNotRetry'. We can't make a decision based on the exception that caused the
connection to be closed, as it's shared between the calls.
A closed connection should come from a connection issue, i.e. we don't expect a
DoNotRetryIOException there, as a connection error is always retried.
Maybe the issue is that an error (or some errors?) in Writable.write should
not close the connection... Or that Writable.write should just not throw a
DoNotRetryIOException: it's an application-level exception, difficult to manage
in the network connectivity layer... I suppose that 0.96+ cannot have this
issue, as the serialization is done with protobuf.
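
To make the sharing problem concrete, here is a minimal sketch. It is not the actual HBaseClient code: Call, Connection and the local DoNotRetryIOException class are simplified stand-ins assumed for illustration. It only shows how one stored close exception ends up on calls that did not cause it.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class SharedCloseExceptionSketch {

    /** Local stand-in for HBase's DoNotRetryIOException (assumption for this sketch). */
    static class DoNotRetryIOException extends IOException {
        DoNotRetryIOException(String msg) { super(msg); }
    }

    /** Hypothetical, heavily simplified pending call. */
    static class Call {
        final int id;
        IOException error;
        Call(int id) { this.id = id; }
    }

    /** Hypothetical, heavily simplified connection shared by several calls. */
    static class Connection {
        final AtomicBoolean shouldCloseConnection = new AtomicBoolean(false);
        final List<Call> pendingCalls = new ArrayList<>();
        IOException closeException;

        synchronized void addCall(Call call) {
            if (shouldCloseConnection.get()) {
                // The stored exception is handed to a call that did not cause it.
                call.error = closeException;
                return;
            }
            pendingCalls.add(call);
        }

        synchronized void markClosed(IOException e) {
            if (shouldCloseConnection.compareAndSet(false, true)) {
                closeException = e;
                // Every pending call fails with the same exception instance.
                for (Call c : pendingCalls) {
                    c.error = e;
                }
                pendingCalls.clear();
            }
        }
    }

    /** Call 1 triggers the close; calls 2 and 3 are unrelated to the failure. */
    static List<Call> demo() {
        Connection conn = new Connection();
        Call a = new Call(1);
        Call b = new Call(2);
        conn.addCall(a);
        conn.addCall(b);
        conn.markClosed(new DoNotRetryIOException("write failed for call 1"));
        Call c = new Call(3);
        conn.addCall(c); // added after the close, still gets the old exception
        return Arrays.asList(a, b, c);
    }

    public static void main(String[] args) {
        for (Call call : demo()) {
            System.out.println("call " + call.id + " -> " + call.error);
        }
    }
}
```

In this sketch all three calls end up holding the same DoNotRetryIOException, although only call 1 caused it, which is why the type of the close exception says nothing reliable about any individual call.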

> HBaseClient retries even though a DoNotRetryException was thrown
> ----------------------------------------------------------------
>
>                 Key: HBASE-10185
>                 URL: https://issues.apache.org/jira/browse/HBASE-10185
>             Project: HBase
>          Issue Type: Bug
>          Components: IPC/RPC
>    Affects Versions: 0.94.12
>            Reporter: Samarth
>
> Throwing a DoNotRetryIOException inside the Writable.write(DataOutput) method 
> doesn't prevent HBase from retrying. Debugging the code locally, I figured 
> that the bug lies in the way HBaseClient simply throws an IOException when it 
> sees that a connection has been closed unexpectedly.  
> Method:
> public Writable call(Writable param, InetSocketAddress addr,
>                        Class<? extends VersionedProtocol> protocol,
>                        User ticket, int rpcTimeout)
> Excerpt of code where the bug is present:
> while (!call.done) {
>         if (connection.shouldCloseConnection.get()) {
>           throw new IOException("Unexpected closed connection");
>         }
> Throwing this IOException causes ServerCallable.translateException(t) to 
> be a no-op, resulting in HBase retrying. 
> From my limited view and understanding of the code, one way I could think of 
> handling this is by looking at the closeException member variable of a 
> connection to determine what kind of exception should be thrown. 
> Specifically, when a connection is closed, the current code does this: 
>     protected synchronized void markClosed(IOException e) {
>       if (shouldCloseConnection.compareAndSet(false, true)) {
>         closeException = e;
>         notifyAll();
>       }
>     }
> Within HBaseClient's call method, the code could possibly be modified to:
> while (!call.done) {
>         if (connection.shouldCloseConnection.get()) {
>           if (connection.closeException instanceof DoNotRetryIOException) {
>             throw connection.closeException;
>           }
>           throw new IOException("Unexpected closed connection");
>         }
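
The retry behaviour described in the issue can be illustrated with a small sketch. This is not ServerCallable's actual code: attemptsMade, Attempt and the local DoNotRetryIOException class are hypothetical names assumed for illustration. It only shows why a generic IOException keeps a retry loop going while a DoNotRetryIOException stops it.

```java
import java.io.IOException;

public class RetryDecisionSketch {

    /** Local stand-in for HBase's DoNotRetryIOException (assumption for this sketch). */
    static class DoNotRetryIOException extends IOException {
        DoNotRetryIOException(String msg) { super(msg); }
    }

    /** Hypothetical unit of work that may fail with an I/O error. */
    interface Attempt {
        void run() throws IOException;
    }

    /** Returns how many attempts were made before success or giving up. */
    static int attemptsMade(Attempt attempt, int maxAttempts) {
        int tries = 0;
        while (tries < maxAttempts) {
            tries++;
            try {
                attempt.run();
                return tries; // success
            } catch (DoNotRetryIOException e) {
                return tries; // application-level failure: give up immediately
            } catch (IOException e) {
                // Generic I/O failure: assumed transient, so loop again.
            }
        }
        return tries;
    }

    public static void main(String[] args) {
        int generic = attemptsMade(() -> {
            throw new IOException("Unexpected closed connection");
        }, 3);
        int doNotRetry = attemptsMade(() -> {
            throw new DoNotRetryIOException("bad request");
        }, 3);
        System.out.println("generic IOException attempts: " + generic);      // 3
        System.out.println("DoNotRetryIOException attempts: " + doNotRetry); // 1
    }
}
```

Once the original exception is replaced by a plain IOException("Unexpected closed connection"), a loop of this shape can no longer tell that the caller asked for no retry.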



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
