[ 
https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185421#comment-14185421
 ] 

Colin Patrick McCabe commented on HDFS-7276:
--------------------------------------------

Are {{Packet}} objects long-lived?  Do they regularly get promoted to the 
JVM's old (tenured) generation?  If so, then {{ByteArrayManager}} will provide 
some benefit, because it avoids full GCs resulting from garbage in the old 
generation.  On the other hand, if {{Packet}} objects are not long-lived, then 
{{ByteArrayManager}} is not necessary, and may even make performance worse.  
The JVM is pretty good at dealing with short-lived objects, at least when 
using CMS.  Clearly {{ByteArrayManager}} is not going to handle many arrays of 
different sizes as well as the JVM can.  It also has additional overhead from 
logging, volatiles, and all the bookkeeping.

I haven't thought about this that much, but my gut feeling is that most 
{{Packet}} objects are not long-lived, and that manually managing this memory 
is not a win.  Maybe someone else can comment with some numbers or analysis?

I agree that it might sometimes be useful to limit the number of {{Packet}} 
objects that are in flight.  Of course, we don't need to manually manage 
memory to do this; we just need a counter and a condition variable.  I think 
we definitely need to make this configurable, not mandatory as it is in this 
patch.  Some people will not want to limit the number of packets in flight, or 
will want to set a different limit than the 2048 used in this patch.
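
As a rough sketch of the counter-and-condition-variable idea, something like 
the following could work.  {{PacketLimiter}} and its methods are hypothetical 
names made up for illustration; they are not part of the patch or of HDFS, and 
treating a limit of zero or less as "unlimited" is just one assumed way to 
make the limit optional.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper (not from the HDFS-7276 patch): caps packets in flight. */
final class PacketLimiter {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition belowLimit = lock.newCondition();
  private final int limit;   // assumption: limit <= 0 means "no limit"
  private int inFlight = 0;  // guarded by lock

  PacketLimiter(int limit) {
    this.limit = limit;
  }

  /** Blocks until a slot is free, then counts one more packet in flight. */
  void acquire() throws InterruptedException {
    lock.lock();
    try {
      while (limit > 0 && inFlight >= limit) {
        belowLimit.await();
      }
      inFlight++;
    } finally {
      lock.unlock();
    }
  }

  /** Non-blocking variant: returns false instead of waiting. */
  boolean tryAcquire() {
    lock.lock();
    try {
      if (limit > 0 && inFlight >= limit) {
        return false;
      }
      inFlight++;
      return true;
    } finally {
      lock.unlock();
    }
  }

  /** Called when a packet is done; wakes one waiting writer, if any. */
  void release() {
    lock.lock();
    try {
      if (inFlight <= 0) {
        throw new IllegalStateException("release without matching acquire");
      }
      inFlight--;
      belowLimit.signal();
    } finally {
      lock.unlock();
    }
  }

  /** Current number of packets counted as in flight. */
  int inFlight() {
    lock.lock();
    try {
      return inFlight;
    } finally {
      lock.unlock();
    }
  }
}
```

A writer would call {{acquire()}} before allocating a packet and 
{{release()}} once the packet has been acked, with the limit read from 
configuration.  The JVM still allocates and collects the byte arrays; only the 
number outstanding is bounded.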

> Limit the number of byte arrays used by DFSOutputStream
> -------------------------------------------------------
>
>                 Key: HDFS-7276
>                 URL: https://issues.apache.org/jira/browse/HDFS-7276
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>         Attachments: h7276_20141021.patch, h7276_20141022.patch, 
> h7276_20141023.patch, h7276_20141024.patch
>
>
> When there are a lot of DFSOutputStream instances writing concurrently, the 
> number of outstanding packets could be large.  The byte arrays created by 
> those packets could occupy a lot of memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
