[ 
https://issues.apache.org/jira/browse/COLLECTIONS-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642375#action_12642375
 ] 

Henri Yandell commented on COLLECTIONS-302:
-------------------------------------------

Ran a basic test and interestingly ArrayList came out better.

{code:java}
import java.util.*;
import org.apache.commons.collections.*;

public class Bob {

    public static void main(String[] args) throws Exception {
        test(Integer.parseInt(args[0]));
    }
    public static void test(int n) {
        Collection a = new ArrayList();
        for(int i=0; i<n; i++) {
            a.add("bob"+i);
        }
        Collection b = new ArrayList();
        for(int i=0; i<n; i+=2) {
            b.add("bob"+i);
        }
        long t1 = System.currentTimeMillis();
        CollectionUtils.subtract(a, b);
        long t2 = System.currentTimeMillis();
        System.err.println("T" + n + ": " + (t2-t1));
    }

}
{code}

For an input of 10,000, both were around 550 msec. For 100,000 the ArrayList 
was 58000, while the LinkedList was 84000. Hardly scientific as I'm not 
repeating the test in the same run so could be missing out on JIT improving a 
second run, not running multiple times etc. My suspicion is that the ArrayList 
constructor checks to see if things are ArrayLists and does quick arraycopies, 
while the LinkedList constructor just sits and plods along. I retested by 
changing the input to LinkedLists from ArrayLists and the time doubled up to 
102000. Of course when I try LinkedList passing in to the LinkedList variant, 
it goes up to 125000. Ah well.

Point of all that - apart from implying that more testing is needed - is that 
the collection type used might want to depend on the type of the 'a' variable. 

> CollectionUtils.subtract() should not use ArrayList to improve speed
> --------------------------------------------------------------------
>
>                 Key: COLLECTIONS-302
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-302
>             Project: Commons Collections
>          Issue Type: Improvement
>          Components: Collection
>            Reporter: Joachim Rudolph
>            Priority: Minor
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The implementation of version 3.2.1 is
> public static Collection subtract(final Collection a, final Collection b) {
>         ArrayList list = new ArrayList( a );
>         for (Iterator it = b.iterator(); it.hasNext();) {
>             list.remove(it.next());
>         }
>         return list;
>     }
> when a and b are large and similar the subtract implementation will call 
> ArrayList.remove() frequently which copies a potentially large part of the 
> list using system.arraycopy.
> Suggestion : use LinkedList ( at least for large lists )

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to