[ https://issues.apache.org/jira/browse/HADOOP-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641674#action_12641674 ]

Ahad Rana commented on HADOOP-4483:
-----------------------------------

Re: That is wrong. clear() removes all elements. You only have to set
root = null.

You are right. I see that TreeSet uses TreeMap as the underlying container, and
each iterator.remove causes a re-balance of the red-black tree. And yes, it
looks like TreeSet.clear() just sets the backing TreeMap's root = null, which is
obviously faster. I would still argue that under normal load (from my initial
observations) the number of blocks in the Collection is <= 10, and since delete
in the CLRS implementation of the red-black tree takes O(log n), there might not
be much gained from the .clear() optimization. In the edge case where the system
is under heavy load, I observed block counts in excess of 1000. In that case, a
partial removal of the blocks (up to the max-blocks limit) would still require
the iterator.remove pattern. So perhaps in the long term it would be better to
replace the underlying data structure, as Hairong suggests. It would be
interesting to find out whether the code (post patch) is a performance
bottleneck before undertaking the more aggressive modifications.
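
For illustration, here is a minimal sketch of the two removal patterns discussed
above (the names are hypothetical, not the actual DatanodeDescriptor code):
clear() when the whole set is drained, and iterator.remove when only up to
maxblocks entries are taken.

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;
    import java.util.TreeSet;

    public class BlockDrain {
      // Hypothetical stand-in for getBlockArray: drain at most maxblocks entries.
      static <T> List<T> drain(TreeSet<T> blocks, int maxblocks) {
        List<T> out = new ArrayList<T>(Math.min(maxblocks, blocks.size()));
        if (blocks.size() <= maxblocks) {
          // Whole set fits: copy everything, then clear() drops the backing
          // TreeMap's root in O(1) instead of removing node by node.
          out.addAll(blocks);
          blocks.clear();
          return out;
        }
        // Heavy-load case: take only maxblocks entries; each iterator.remove
        // re-balances the red-black tree, costing O(log n) per removal.
        Iterator<T> it = blocks.iterator();
        while (it.hasNext() && out.size() < maxblocks) {
          out.add(it.next());
          it.remove();
        }
        return out;
      }
    }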

> getBlockArray in DatanodeDescriptor does not honor passed in maxblocks value
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-4483
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4483
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.1
>         Environment: hadoop-0.18.1 running on a cluster of 16 nodes.
>            Reporter: Ahad Rana
>            Priority: Critical
>             Fix For: 0.18.2
>
>         Attachments: patch.HADOOP-4483
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The getBlockArray method in DatanodeDescriptor.java should honor the passed-in
> maxblocks parameter. In its current form it passes an array sized to
> min(maxblocks, blocks.size()) into the Collection.toArray method. As the
> javadoc for Collection.toArray indicates, toArray may discard the passed-in
> array (and allocate a new one) if the number of elements returned by the
> iterator exceeds the size of the passed-in array. As a result, the flawed
> implementation of this method returns all the invalid blocks for a data node
> in one go, and thus triggers the NameNode to send a DNA_INVALIDATE command to
> the DataNode with an excessively large number of blocks. This INVALIDATE
> command, in turn, can take a very long time to process at the DataNode, and
> since DatanodeCommand(s) are processed in between heartbeats at the DataNode,
> this causes the NameNode to consider the DataNode offline / unresponsive (due
> to a lack of heartbeats).
> In our use case at CommonCrawl.org, we regularly do large-scale HDFS file
> deletions after certain stages of our map-reduce pipeline. These deletes would
> make certain DataNode(s) unresponsive, and thus impair the cluster's ability
> to properly balance file-system reads / writes across the whole available
> cluster. This problem only surfaced once we migrated from our 16.2 deployment
> to the current 18.1 release.
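
To make the Collection.toArray behavior described above concrete, here is a
small self-contained demo (plain Java; Integer stands in for Block, and the
numbers are illustrative rather than taken from the Hadoop code):

    import java.util.TreeSet;

    public class ToArrayDemo {
      public static void main(String[] args) {
        TreeSet<Integer> blocks = new TreeSet<Integer>();
        for (int i = 0; i < 1000; i++) {
          blocks.add(i);
        }
        int maxblocks = 100;
        // Collection.toArray(T[]) only uses the passed array if every element
        // fits; otherwise it allocates a new array sized to the whole
        // collection, so the min(maxblocks, size) hint does not cap the result.
        Integer[] out =
            blocks.toArray(new Integer[Math.min(maxblocks, blocks.size())]);
        System.out.println(out.length);   // prints 1000, not 100
      }
    }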

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
