[jira] Commented: (HDFS-1287) Why TreeSet is used when collecting block information FSDataSet::getBlockReport

Scott Carey (JIRA) Fri, 09 Jul 2010 10:12:19 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886774#action_12886774
 ]


Scott Carey commented on HDFS-1287:
-----------------------------------

As a general rule, if you need an ordered set or map as the result of an 
operation in Java, it is much faster to use a HashMap or HashSet and sort 
the results afterwards than to use a TreeMap or TreeSet and iterate. Inserts 
are very slow in those tree-based structures.  It's faster still if you can 
just use a List for the whole thing.  If you have to detect duplicates, probe 
a Map or Set before adding to the list, then sort the result at the end.
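
A minimal sketch of that last pattern, assuming the block ids are plain long 
values. The class and method names (BlockIdCollector, collectSorted) are 
hypothetical and are not taken from the HDFS code; this only illustrates the 
general idea, not how FSDataSet should necessarily be changed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of HDFS: illustrates "collect, then sort".
final class BlockIdCollector {

    // Collects ids into a list, skipping duplicates via a HashSet probe,
    // and sorts once at the end instead of keeping a TreeSet ordered
    // on every insert.
    static List<Long> collectSorted(long[] ids) {
        Set<Long> seen = new HashSet<>();
        List<Long> result = new ArrayList<>(ids.length);
        for (long id : ids) {
            if (seen.add(id)) { // add() returns false if the id was already seen
                result.add(id);
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(collectSorted(new long[] {5L, 3L, 5L, 1L}));
    }
}
```

If the ids are already known to be unique, as the reporter suggests, the 
`seen` set can be dropped and the list sorted directly.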

> Why TreeSet is used when collecting block information 
> FSDataSet::getBlockReport
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-1287
>                 URL: https://issues.apache.org/jira/browse/HDFS-1287
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: NarayanaSwamy
>
> As a return value we are converting this to an array and returning it, and 
> in the name node we are also iterating ... so can we use a list instead of 
> a set? (As the block ids are unique, there should not be duplicates)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
