[
https://issues.apache.org/jira/browse/HADOOP-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654027#action_12654027
]
Raghu Angadi commented on HADOOP-4797:
--------------------------------------
JVM :
- For NIO sockets, Sun's implementation uses an internal direct buffer.
It keeps up to 3 such buffers per thread and allocates a new one whenever
none of the existing buffers is large enough.
RPC Server :
- While sending and receiving serialized data, the handlers invoke
read() or write() with the _entire_ buffer.
- If there are RPCs that return a lot of data (e.g. listFiles() on a
large directory), the server ends up creating large direct buffers.
- In one such case, clients listed a large directory (35k files, 6MB of
serialized data).
-- In addition, the clients increased the number of files after such
calls.
-- As a result, the server ends up creating thousands of 6MB buffers,
since each time the JVM requires a slightly larger direct buffer.
-- A full GC might help, but it is not a viable option.
-- It is not clear whether this memory would be returned to the OS
even after a full GC.
I think the fix is fairly straightforward: the RPC server should read and
write in smaller chunks. For example:
{code}
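// Replace
nWritten = write(buf, 0, len);
// with
nWritten = 0;
while (nWritten < len) {
  // Hand at most 64KB to each write() so the internal direct buffer stays small.
  int ret = write(buf, nWritten, Math.min(len - nWritten, 64 * 1024));
  if (ret <= 0) break;
  nWritten += ret;
  //...
}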
{code}
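The same idea can be sketched for NIO channels and ByteBuffers. This is only an illustration, not the actual patch: the class name ChunkedChannelIO, the helpers channelWrite() and channelRead(), and the CHUNK_SIZE constant are hypothetical, and 64KB is simply the figure used above. It assumes the JVM sizes its internal direct buffer to the number of bytes remaining in the buffer passed to read() or write(), so temporarily lowering the buffer's limit caps that size.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper; names and chunk size are assumptions for illustration.
final class ChunkedChannelIO {
    static final int CHUNK_SIZE = 64 * 1024;

    // Writes buf to the channel, exposing at most CHUNK_SIZE bytes per write() call.
    // Returns the number of bytes written, which may be less than buf.remaining()
    // if the channel stops accepting data (e.g. a non-blocking socket).
    static int channelWrite(WritableByteChannel channel, ByteBuffer buf) throws IOException {
        int originalLimit = buf.limit();
        int total = 0;
        try {
            while (buf.position() < originalLimit) {
                int chunk = Math.min(originalLimit - buf.position(), CHUNK_SIZE);
                buf.limit(buf.position() + chunk);   // expose only one chunk
                int ret = channel.write(buf);
                if (ret <= 0) break;
                total += ret;
                if (ret < chunk) break;              // channel is full for now
            }
        } finally {
            buf.limit(originalLimit);                // always restore the caller's limit
        }
        return total;
    }

    // Reads into buf, exposing at most CHUNK_SIZE bytes of free space per read() call.
    // Returns the number of bytes read, or -1 if end of stream is hit before any data.
    static int channelRead(ReadableByteChannel channel, ByteBuffer buf) throws IOException {
        int originalLimit = buf.limit();
        int total = 0;
        try {
            while (buf.position() < originalLimit) {
                int chunk = Math.min(originalLimit - buf.position(), CHUNK_SIZE);
                buf.limit(buf.position() + chunk);
                int ret = channel.read(buf);
                if (ret < 0) return total > 0 ? total : ret;  // end of stream
                total += ret;
                if (ret < chunk) break;              // no more data available right now
            }
        } finally {
            buf.limit(originalLimit);
        }
        return total;
    }
}
```

A handler would call ChunkedChannelIO.channelWrite(channel, buf) where it previously called channel.write(buf) directly, and likewise for reads. Both helpers can return before the buffer is fully drained or filled, so callers still need their usual retry logic.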
> RPC Server can leave a lot of direct buffers
> ---------------------------------------------
>
> Key: HADOOP-4797
> URL: https://issues.apache.org/jira/browse/HADOOP-4797
> Project: Hadoop Core
> Issue Type: Bug
> Components: ipc
> Affects Versions: 0.17.0
> Reporter: Raghu Angadi
> Assignee: Raghu Angadi
>
> RPC server can unwittingly soft-leak direct buffers. In one observed case,
> one of the namenodes at Yahoo took 40GB of virtual memory though it was
> configured for 24GB of memory. Most of the memory outside the Java heap is
> expected to be direct buffers. This was shown to be caused by how the RPC
> server reads and writes serialized data. The cause and proposed fix are in
> the following comment.
>