[
https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185421#comment-14185421
]
Colin Patrick McCabe commented on HDFS-7276:
--------------------------------------------
Are {{Packet}} objects long-lived? Do they regularly get promoted to the JVM's
old generation? If so, then {{ByteArrayManager}} will provide some benefit,
because it avoids full GCs resulting from garbage in the old generation. On the
other hand, if
{{Packet}} objects are not long-lived, then {{ByteArrayManager}} is not
necessary, and may even make performance worse. The JVM is pretty good at
dealing with short-lived objects, at least when using CMS. Clearly
{{ByteArrayManager}} is not going to handle many arrays of different sizes as
well as the JVM can. It also has additional overhead from logging, volatiles,
and all the bookkeeping.

I haven't thought about this that much, but my gut feeling is that most Packet
objects are not long-lived, and manually managing this memory is not a win.
Maybe someone else can comment with some numbers or analysis?

I agree that it might sometimes be useful to limit the number of Packet objects
that are in flight. Of course, we don't need to manually manage memory to do
this; we just need a counter and a condition variable. I think we
definitely need to make this configurable, not mandatory as it is in this
patch. Some people will not want to limit the number of packets in flight, or
will want to set a different limit than 2048 (the limit in this patch).
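
To make the counter-and-condition-variable idea concrete, here is a rough
sketch. It is not taken from the patch or from HDFS: the class name
{{PacketLimiter}} and the convention that a limit of 0 or less means
"unlimited" are assumptions made up for illustration. A real client would read
the limit from configuration and call acquire() before creating a packet and
release() once the packet has been acked.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical helper (not part of HDFS): caps the number of packets in
 * flight with a plain counter and a condition variable, without pooling
 * or otherwise managing the byte arrays themselves.
 */
final class PacketLimiter {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition belowLimit = lock.newCondition();
  private final int limit;   // assumption: <= 0 disables the limit
  private int inFlight = 0;  // guarded by lock

  PacketLimiter(int limit) {
    this.limit = limit;
  }

  /** Blocks while the limit is reached, then counts one more packet. */
  void acquire() throws InterruptedException {
    lock.lock();
    try {
      // Loop to guard against spurious wakeups.
      while (limit > 0 && inFlight >= limit) {
        belowLimit.await();
      }
      inFlight++;
    } finally {
      lock.unlock();
    }
  }

  /** Counts one packet as finished and wakes one waiting writer, if any. */
  void release() {
    lock.lock();
    try {
      if (inFlight <= 0) {
        throw new IllegalStateException("release() without acquire()");
      }
      inFlight--;
      belowLimit.signal();
    } finally {
      lock.unlock();
    }
  }

  /** Current number of packets counted as in flight. */
  int inFlight() {
    lock.lock();
    try {
      return inFlight;
    } finally {
      lock.unlock();
    }
  }
}
```

With this shape, the packet buffers are still allocated and collected by the
JVM as usual; only the count of outstanding packets is bounded, and setting
the limit to 0 turns the bound off entirely.
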
> Limit the number of byte arrays used by DFSOutputStream
> -------------------------------------------------------
>
> Key: HDFS-7276
> URL: https://issues.apache.org/jira/browse/HDFS-7276
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Tsz Wo Nicholas Sze
> Attachments: h7276_20141021.patch, h7276_20141022.patch,
> h7276_20141023.patch, h7276_20141024.patch
>
>
> When there are a lot of DFSOutputStreams writing concurrently, the number of
> outstanding packets could be large. The byte arrays created by those packets
> could occupy a lot of memory.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)