[ https://issues.apache.org/jira/browse/HBASE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-19215:
-----------------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Pushed to 1.3 and up

> Incorrect exception handling on the client causes incorrect call timeouts and 
> byte buffer allocations on the server
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-19215
>                 URL: https://issues.apache.org/jira/browse/HBASE-19215
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.1
>            Reporter: Abhishek Singh Chouhan
>            Assignee: Abhishek Singh Chouhan
>             Fix For: 2.0.0, 3.0.0, 1.4.0, 1.3.2
>
>         Attachments: HBASE-19215-branch-1.3.patch, 
> HBASE-19215.branch-1.001.patch
>
>
> Ran into an OOME on the client: java.lang.OutOfMemoryError: Direct buffer 
> memory.
> When we encounter an unhandled exception during a channel write in 
> RpcClientImpl
> {noformat}
> checkIsOpen(); // Now we're checking that it didn't become idle in between.
> try {
>   call.callStats.setRequestSizeBytes(
>       IPCUtil.write(this.out, header, call.param, cellBlock));
> } catch (IOException e) {
> {noformat}
> we end up leaving the connection open. This becomes especially problematic 
> when we get an unhandled exception between writing the length of our request 
> on the channel and subsequently writing the params and cellblocks:
> {noformat}
> dos.write(Bytes.toBytes(totalSize)); // <-- length prefix written first
> // This allocates a buffer that is the size of the message internally.
> header.writeDelimitedTo(dos);
> if (param != null) param.writeDelimitedTo(dos);
> if (cellBlock != null) dos.write(cellBlock.array(), 0, cellBlock.remaining());
> dos.flush();
> return totalSize;
> {noformat}
> After reading the length, the regionserver allocates a ByteBuffer and 
> expects it to be filled with data. However, when we encounter an exception 
> during the param write, we release the write lock in RpcClientImpl but do 
> not close the connection; the exception is handled at 
> AbstractRpcClient.callBlockingMethod and the call is retried. The next 
> client request to the same regionserver then writes to the channel, but the 
> server interprets those bytes as part of the previous request and errors 
> out during proto conversion when processing the request, since it is 
> considered malformed (in the worst case this might be misinterpreted as 
> wrong data?). The remaining bytes of the current request (the current 
> request being larger than the previous request's partially filled 
> ByteBuffer) are then read and misinterpreted as the size of a new request; 
> in my case this was in GBs. All subsequent client requests time out since 
> this ByteBuffer is never completely filled. We should close the connection 
> for any Throwable, not just IOException.
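> A hedged sketch of that idea follows (closeConn is an illustrative helper 
> name, not the actual HBASE-19215 patch): any Throwable during the write 
> means a partial frame may already be on the wire, so the socket must be 
> torn down rather than reused.
> {noformat}
> try {
>   call.callStats.setRequestSizeBytes(
>       IPCUtil.write(this.out, header, call.param, cellBlock));
> } catch (Throwable t) {
>   // The length prefix may have been flushed without the payload; the
>   // stream is desynchronized and the connection cannot be reused safely.
>   IOException fatal = (t instanceof IOException) ? (IOException) t
>       : new IOException("Unexpected throwable while writing call", t);
>   closeConn(fatal); // hypothetical helper that closes the socket
>   throw fatal;
> }
> {noformat}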



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
