[
https://issues.apache.org/jira/browse/HBASE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Purtell updated HBASE-19215:
-----------------------------------
Attachment: HBASE-19215-branch-1.3.patch
As [~abhishek.chouhan] mentioned, his patch applies to 1.4 and up. Here's a
patch for 1.3 that I'll commit shortly along with the rest.
> Incorrect exception handling on the client causes incorrect call timeouts and
> byte buffer allocations on the server
> -------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-19215
> URL: https://issues.apache.org/jira/browse/HBASE-19215
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.1
> Reporter: Abhishek Singh Chouhan
> Assignee: Abhishek Singh Chouhan
> Fix For: 2.0.0, 3.0.0, 1.4.0, 1.3.2
>
> Attachments: HBASE-19215-branch-1.3.patch,
> HBASE-19215.branch-1.001.patch
>
>
> Ran into an OOME on the client: java.lang.OutOfMemoryError: Direct buffer
> memory.
> When we encounter an unhandled exception during the channel write in
> RpcClientImpl
> {noformat}
> checkIsOpen(); // Now we're checking that it didn't became idle in between.
> try {
>   call.callStats.setRequestSizeBytes(
>       IPCUtil.write(this.out, header, call.param, cellBlock));
> } catch (IOException e) {
> {noformat}
> we end up leaving the connection open. This becomes especially problematic
> when we get an unhandled exception between writing the length of our request
> on the channel and subsequently writing the params and cellblocks
> {noformat}
> dos.write(Bytes.toBytes(totalSize));
> // This allocates a buffer that is the size of the message internally.
> header.writeDelimitedTo(dos);
> if (param != null) param.writeDelimitedTo(dos);
> if (cellBlock != null) dos.write(cellBlock.array(), 0, cellBlock.remaining());
> dos.flush();
> return totalSize;
> {noformat}
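In miniature, the framing hazard in the snippet above can be reproduced as follows. This is a hypothetical standalone sketch, not HBase code; the partial write is simulated by simply stopping after two payload bytes instead of throwing:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;

public class FramingDesync {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(wire);

        // Request 1: the length prefix promises 8 payload bytes, but the
        // write "fails" after only 2 of them reach the wire.
        dos.writeInt(8);
        dos.write(new byte[]{1, 2});

        // The client retries on the SAME connection: request 2 is a 4-byte
        // length prefix (6) followed by 6 payload bytes.
        dos.writeInt(6);
        dos.write(new byte[]{9, 9, 9, 9, 9, 9});

        // Server side: trusts the first length prefix and blindly fills a
        // buffer of that size with whatever arrives next.
        DataInputStream dis = new DataInputStream(
                new ByteArrayInputStream(wire.toByteArray()));
        byte[] frame = new byte[dis.readInt()]; // 8
        dis.readFully(frame); // 2 stale bytes + request 2's prefix + 2 payload bytes

        // What the server reads as the "next request's length" is really
        // request 2's leftover payload: 0x09090909 = 151587081 bytes.
        System.out.println("bogus next frame length = " + dis.readInt());
    }
}
```

The server then blocks waiting for an absurdly large frame that will never arrive, which matches the multi-gigabyte sizes and stuck calls the report describes.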
> After reading the length, the region server allocates a ByteBuffer and
> expects it to be filled with that many bytes. However, when we encounter an
> exception during the param write, we release the write lock in RpcClientImpl
> without closing the connection; the exception is handled in
> AbstractRpcClient.callBlockingMethod and the call is retried. The next
> client request to the same region server then writes to the channel, but the
> server interprets those bytes as the remainder of the previous request and
> errors out during proto conversion, since the request is considered
> malformed (in the worst case this might even be misinterpreted as valid
> data?). The remaining bytes of the current request (the current request's
> size exceeds the previous request's partially filled ByteBuffer) are then
> misinterpreted as the size of a new request; in my case this was in the GBs.
> All subsequent client requests time out because that ByteBuffer is never
> completely filled. We should close the connection on any Throwable, not just
> IOException.
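As a sketch of the remedy the report proposes (hypothetical class and method names, not the actual patch): broaden the catch so that any Throwable during the request write marks the connection unusable, forcing the retry onto a fresh channel instead of a desynchronized one.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Minimal illustration only: a failure at ANY point of the write, including
// an Error such as OutOfMemoryError, closes the connection rather than
// leaving a half-written frame on the wire.
public class ConnectionSketch {
    private boolean open = true;
    private final OutputStream out;

    public ConnectionSketch(OutputStream out) { this.out = out; }

    public boolean isOpen() { return open; }

    public void writeRequest(byte[] header, byte[] param) throws IOException {
        DataOutputStream dos = new DataOutputStream(out);
        try {
            dos.writeInt(header.length + param.length); // length prefix first
            dos.write(header); // a failure here leaves the peer waiting
            dos.write(param);
            dos.flush();
        } catch (Throwable t) { // broadened from catch (IOException e)
            open = false;       // never reuse a half-written channel
            if (t instanceof IOException) throw (IOException) t;
            throw new IOException("request write failed", t);
        }
    }

    public static void main(String[] args) {
        // A stream that fails right after the 4-byte length prefix,
        // reproducing the partial-write scenario from the report.
        OutputStream failing = new OutputStream() {
            private int written = 0;
            @Override public void write(int b) {
                if (++written > 4) throw new OutOfMemoryError("Direct buffer memory");
            }
        };
        ConnectionSketch conn = new ConnectionSketch(failing);
        try {
            conn.writeRequest(new byte[]{1, 2}, new byte[]{3});
        } catch (IOException expected) {
            System.out.println("closed=" + !conn.isOpen()); // prints closed=true
        }
    }
}
```

A subsequent retry would then reconnect and start with a clean stream, so the server never sees a length prefix followed by bytes from a different request.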
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)