[
https://issues.apache.org/jira/browse/HDFS-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157676#comment-14157676
]
Yi Liu commented on HDFS-7174:
------------------------------
[~kihwal], I like the idea. It improve performance of add/remove for large
directory, but almost doesn't increase memory usage (has some increment when
switching to use {{INodeHashedArrayList}}, explain as following) and search
operation performance is not decreased.
* 1024 size list table needs 1024 * 4 = 4K size memory to store references
(assume reference is 4 bytes) of ArrayLists, and originally we only have 1
ArrayList, but now the number is 1024 ArrayLists, and in Java, Class has
overhead, suppose it's 12 bytes, so total increased Class overhead is ~1024*12
= 12K. In conclusion there are 16K memory increased for this FSDirectory.
Currently the children of INodeDirectory are stored as sorted array list. The
time complexity:
Add: O(log n) + resize the list and array copy.
Delete: O(log n) + array copy.
Search: O(log n)
When the list gets bigger, _resize list (allocate new memory) and array copy_
becomes performance issue.
Your idea is good to solve this issue.
{quote}
The biggest performance hit is due to the way insertion is done in ArrayList.
An insertion requires copying of average of 1/2 n entries. Also as the list
gets bigger, a new one with a bigger size needs to be allocated, copied and
partially copied again. For big directories, we are talking megabytes. Far
better performance can be obtained by using different data structures, but I
believe ArrayList (originally primitive array) was chosen to minimize the
memory usage.
{quote}
Agree
{quote}
I wonder, is it worthwhile to try to store the inodeid as an long[] directly?
As a reference, for c++ on modern machines inserting elements in std::vector is
consistently faster than std::list
{quote}
ArrayList is also efficient for memory, it stores array of references, just
have a Class overhead (Maybe 12 bytes?).
std::vector is similar with ArrayList in Java (Seems ArrayList and vector in
java just have difference in synchronization and growth)
I have few comments:
*1.* I see you choose the threshold 50000, and from the performance grid,
{{ArrayList}} and {{Auto Switching}} are close at this point. I suggest we
choose a bit bigger value, since we need to consider the memory usage, as
explained in #1
*2.* Following code
{code}
@Override
public void add(int index, INode node) {
getList().add(index, node);
size++;
checkSize();
}
{code}
We can check the size firstly and then do add, it will be more efficient, same
as delete
*3.* We also need to consider the behavior when reach the threshold. There may
be add/remove alternately, and cause switch the list frequently? Maybe we can
define separate thresholds for add/remove, and remove threshold is a bit
smaller.
> Support for more efficient large directories
> --------------------------------------------
>
> Key: HDFS-7174
> URL: https://issues.apache.org/jira/browse/HDFS-7174
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Priority: Critical
> Attachments: HDFS-7174.new.patch, HDFS-7174.patch, HDFS-7174.patch
>
>
> When the number of children under a directory grows very large, insertion
> becomes very costly. E.g. creating 1M entries takes 10s of minutes. This is
> because the complexity of an insertion is O\(n\). As the size of a list
> grows, the overhead grows n^2. (integral of linear function). It also causes
> allocations and copies of big arrays.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)