[jira] [Commented] (HDFS-7174) Support for more efficient large directories

Yi Liu (JIRA) Thu, 02 Oct 2014 22:07:03 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157676#comment-14157676
 ]


Yi Liu commented on HDFS-7174:
------------------------------

[~kihwal], I like the idea. It improve performance of add/remove for large 
directory, but almost doesn't increase memory usage (has some increment when 
switching to use {{INodeHashedArrayList}}, explain as following) and search 
operation performance is not decreased.
* 1024 size list table needs 1024 * 4 = 4K size memory to store references 
(assume reference is 4 bytes) of ArrayLists, and originally we only have 1 
ArrayList, but now the number is 1024 ArrayLists, and in Java, Class has 
overhead, suppose it's 12 bytes, so total increased Class overhead is ~1024*12 
= 12K. In conclusion there are 16K memory increased for this FSDirectory.

Currently the children of INodeDirectory are stored as sorted array list. The 
time complexity:
Add: O(log n) + resize the list and array copy.
Delete: O(log n) + array copy.
Search: O(log n)
When the list gets bigger,  _resize list (allocate new memory) and array copy_ 
becomes performance issue. 
Your idea is good to solve this issue.

{quote}
The biggest performance hit is due to the way insertion is done in ArrayList. 
An insertion requires copying of average of 1/2 n entries. Also as the list 
gets bigger, a new one with a bigger size needs to be allocated, copied and 
partially copied again. For big directories, we are talking megabytes. Far 
better performance can be obtained by using different data structures, but I 
believe ArrayList (originally primitive array) was chosen to minimize the 
memory usage.
{quote}
Agree

{quote}
I wonder, is it worthwhile to try to store the inodeid as an long[] directly?
As a reference, for c++ on modern machines inserting elements in std::vector is 
consistently faster than std::list
{quote}
ArrayList is also efficient for memory, it stores array of references, just 
have a Class overhead (Maybe 12 bytes?).
std::vector is similar with ArrayList in Java (Seems ArrayList and vector in 
java just have difference in synchronization and  growth)

I have few comments:
*1.*  I see you choose the threshold 50000, and from the performance grid, 
{{ArrayList}} and {{Auto Switching}} are close at this point.  I suggest we 
choose a bit bigger value, since we need to consider the memory usage, as 
explained in #1
*2.* Following code
{code}
@Override
  public void add(int index, INode node) {
    getList().add(index, node);
    size++;
    checkSize();
  }
{code}
We can check the size firstly and then do add, it will be more efficient, same 
as delete
*3.* We also need to consider the behavior when reach the threshold. There may 
be add/remove alternately, and cause switch the list frequently?  Maybe we can 
define separate thresholds for add/remove, and remove threshold is a bit 
smaller.



> Support for more efficient large directories
> --------------------------------------------
>
>                 Key: HDFS-7174
>                 URL: https://issues.apache.org/jira/browse/HDFS-7174
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>         Attachments: HDFS-7174.new.patch, HDFS-7174.patch, HDFS-7174.patch
>
>
> When the number of children under a directory grows very large, insertion 
> becomes very costly.  E.g. creating 1M entries takes 10s of minutes.  This is 
> because the complexity of an insertion is O\(n\). As the size of a list 
> grows, the overhead grows n^2. (integral of linear function).  It also causes 
> allocations and copies of big arrays.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-7174) Support for more efficient large directories

Reply via email to