[ https://issues.apache.org/jira/browse/HDFS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677633#comment-13677633 ]

Suresh Srinivas edited comment on HDFS-4879 at 6/7/13 12:13 AM:
----------------------------------------------------------------

bq. In the case of a delete of 25M blocks, like we saw, the linked list is 
going to be significantly bigger than the ArrayList. I think each Node object 
takes up 64 bytes, right? So the short lived linked list would be 1.6GB instead 
of 400MB, which is likely to push it out of the young generation. It also has 
worse locality of access, causing many more CPU cache misses when traversing it.
Each node object takes 40 bytes. But namenodes are run with enough headroom, 
so this much memory should not be an issue. As for CPU cache misses, given 
that the delete had to touch so many objects and the namenode in general has 
so much active memory, garbage collection, etc., I do not think it is such a 
big deal. Even if the namenode were to do more work, this kind of big delete 
is such a rarity that trying to make it performant seems unnecessary to me.
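
(For scale, a back-of-the-envelope comparison of the figures quoted above; the 
class name is made up for illustration, the per-entry sizes are just the 
numbers from this thread, and actual overhead depends on the JVM and pointer 
compression.)

{code:java}
public class DeleteListMemoryEstimate {
  // Rough memory math for 25M collected blocks, using only the per-entry
  // sizes quoted in this thread (assumptions, not measured values).
  public static void main(String[] args) {
    long entries = 25_000_000L;
    long mb = 1024L * 1024L;
    System.out.println("LinkedList @ 64 B/node:  " + entries * 64 / mb + " MB"); // ~1.6 GB estimate
    System.out.println("LinkedList @ 40 B/node:  " + entries * 40 / mb + " MB"); // ~950 MB estimate
    System.out.println("ArrayList  @ 16 B/entry: " + entries * 16 / mb + " MB"); // ~400 MB, per the description
  }
}
{code}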
                
> Add "blocked ArrayList" collection to avoid CMS full GCs
> --------------------------------------------------------
>
>                 Key: HDFS-4879
>                 URL: https://issues.apache.org/jira/browse/HDFS-4879
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 3.0.0, 2.0.4-alpha
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-4879.txt, hdfs-4879.txt
>
>
> We recently saw an issue where a large deletion was issued which caused 25M 
> blocks to be collected during {{deleteInternal}}. Currently, the list of 
> collected blocks is an ArrayList, meaning that we had to allocate a 
> contiguous 25M-entry array (~400MB). After a NN has been running for a long 
> time, the old generation may become fragmented such that it's hard to find a 
> 400MB contiguous chunk of heap.
> In general, we should try to design the NN such that the only large objects 
> are long-lived and created at startup time. We can improve this particular 
> case (and perhaps some others) by introducing a new List implementation which 
> is made of a linked list of arrays, each of which is size-limited (e.g. to 1MB).
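
A minimal sketch of such a "blocked ArrayList" is below, assuming an 
append-only, iterate-once usage pattern like the delete case; the class name, 
method set, and chunk-capacity parameter are illustrative assumptions, not the 
actual patch.

{code:java}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

/**
 * Append-only list backed by a chain of fixed-capacity ArrayLists, so no
 * single allocation grows with the total element count.
 */
public class BlockedArrayList<T> implements Iterable<T> {
  private final int chunkCapacity;                        // max entries per chunk
  private final List<List<T>> chunks = new LinkedList<>();
  private List<T> lastChunk;
  private int size;

  public BlockedArrayList(int chunkCapacity) {
    this.chunkCapacity = chunkCapacity;
  }

  public void add(T element) {
    if (lastChunk == null || lastChunk.size() == chunkCapacity) {
      // Start a new bounded chunk instead of resizing one huge array.
      lastChunk = new ArrayList<>(chunkCapacity);
      chunks.add(lastChunk);
    }
    lastChunk.add(element);
    size++;
  }

  public int size() {
    return size;
  }

  @Override
  public Iterator<T> iterator() {
    // Flatten the chunks for sequential traversal.
    return chunks.stream().flatMap(List::stream).iterator();
  }
}
{code}

Choosing the chunk capacity so that each backing array stays near the 1MB 
target keeps the largest single allocation small and young-generation-friendly, 
even when tens of millions of blocks are collected in one delete.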
