[ 
https://issues.apache.org/jira/browse/HDFS-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160847#comment-14160847
 ] 

Colin Patrick McCabe commented on HDFS-7174:
--------------------------------------------

This is a great idea, [~kihwal].  Definitely something we need.

This reminds me of why [~tlipcon] created {{ChunkedArrayList}}.  We found that 
block reports were generating too much garbage when they created their giant 
{{ArrayList}}s.  We hit the same problem described here: resizing a giant 
{{ArrayList}} required an enormous amount of copying, and it turned the 
previous backing array into one giant piece of garbage (which could trigger a 
full GC).  I was about to suggest using {{ChunkedArrayList}}, but I don't 
think it supports insertion into the middle of the list, unfortunately.  It 
might not be too hard to extend {{ChunkedArrayList}} to support insertion into 
the middle, though... perhaps we should consider this.
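
To make the idea concrete, here is a minimal standalone sketch of a chunked 
list that allows insertion into the middle (this is not the actual 
{{ChunkedArrayList}} code, and the chunk size is just illustrative): an insert 
only shifts elements within one small chunk and splits the chunk when it 
fills, instead of copying one giant backing array.

{code:java}
// Illustrative sketch only, not ChunkedArrayList itself.  Each chunk is a
// small ArrayList, so an insert shifts at most one chunk's worth of elements
// and a full chunk is split in two instead of resizing a giant array.
import java.util.ArrayList;
import java.util.List;

class ChunkedInsertList<T> {
  private static final int CHUNK_CAPACITY = 1024;   // illustrative value
  private final List<ArrayList<T>> chunks = new ArrayList<>();

  void insert(int index, T value) {
    if (chunks.isEmpty()) {
      chunks.add(new ArrayList<T>(CHUNK_CAPACITY));
    }
    // Walk the chunks to find the one containing the logical index.
    int chunkIdx = 0;
    ArrayList<T> chunk = chunks.get(0);
    while (index > chunk.size() && chunkIdx < chunks.size() - 1) {
      index -= chunk.size();
      chunk = chunks.get(++chunkIdx);
    }
    chunk.add(index, value);              // shifts within this chunk only
    if (chunk.size() > CHUNK_CAPACITY) {  // split a full chunk in two
      int mid = chunk.size() / 2;
      ArrayList<T> upper = new ArrayList<>(chunk.subList(mid, chunk.size()));
      chunk.subList(mid, chunk.size()).clear();
      chunks.add(chunkIdx + 1, upper);
    }
  }
}
{code}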

As [~hitliuyi] pointed out, the current patch has a problem.  If a directory's 
size keeps crossing back and forth over {{switchingThreshold}} (say, by 
repeatedly adding and removing a single entry), we pay a very high conversion 
cost each time.  To prevent this, the threshold for converting an 
{{INodeHashedArrayList}} back to a simple {{INodeArrayList}} should be lower 
than the threshold for doing the opposite conversion.
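
In other words, the two thresholds should form a hysteresis band, roughly like 
the sketch below (the class, field names, and values here are made up for 
illustration, not taken from the patch):

{code:java}
// Hysteresis sketch: convert to the hashed layout only above a high-water
// mark, and convert back only below a lower low-water mark, so a directory
// oscillating around a single threshold does not thrash between layouts.
class DirectoryChildren {
  private static final int TO_HASHED_THRESHOLD = 10000;  // convert up here
  private static final int TO_ARRAY_THRESHOLD  = 5000;   // convert back here

  private boolean hashed = false;
  private int size = 0;

  void onChildAdded() {
    size++;
    if (!hashed && size >= TO_HASHED_THRESHOLD) {
      hashed = true;   // switch to the INodeHashedArrayList-style layout
    }
  }

  void onChildRemoved() {
    size--;
    if (hashed && size <= TO_ARRAY_THRESHOLD) {
      hashed = false;  // switch back to the simple array list
    }
  }
}
{code}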

I also agree with [~jingzhao] that scaling could become a problem with the 
proposed scheme, since it has only a single level of partitioning.  I guess 
the counter-argument here is that there won't be that many giant directories, 
and that this works for your needs.

[~shv] wrote: I am probably late to the party, but for whatever it's worth. 
Did you consider using balanced trees for the inode lists, something like 
B-trees?

B-trees would be an excellent solution here.  Since they generally store 
entries in arrays inside the leaf nodes, they also get you the benefit of 
tighter packing in memory.  I guess the tricky part is writing the code.

> Support for more efficient large directories
> --------------------------------------------
>
>                 Key: HDFS-7174
>                 URL: https://issues.apache.org/jira/browse/HDFS-7174
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>         Attachments: HDFS-7174.new.patch, HDFS-7174.patch, HDFS-7174.patch
>
>
> When the number of children under a directory grows very large, insertion 
> becomes very costly.  E.g. creating 1M entries takes tens of minutes.  This 
> is because the complexity of a single insertion is O(n), so the total cost 
> of populating a directory grows as n^2 (the integral of a linear function); 
> for 1M entries that is on the order of 5*10^11 element moves.  It also 
> causes allocations and copies of big arrays.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
