Oswaldo Olivo created COLLECTIONS-534:
-----------------------------------------

             Summary: Performance bug in CollectionBag::retainAll
                 Key: COLLECTIONS-534
                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-534
             Project: Commons Collections
          Issue Type: Bug
          Components: Bag, Collection
    Affects Versions: 4.0
         Environment: Ubuntu 12.04
            Reporter: Oswaldo Olivo


Hi,
There seems to be a performance bug in the method retainAll in the 
CollectionBag class.
The problem is that the code is checking containment over the parameter 
collection, which could be expensive for some types of collections like 
ArrayLists.
One solution could be to convert the Collection into a HashSet and check 
containment in the HashSet.
 If this is done, then running retainAll on a 1,000,000 collection would take 
less than 2 seconds instead of 27 mins, according to my experiments.

____________________________________________
Here's a function to expose the bug:

 public static void collectionBagRetainAllTest() {

        ArrayList<Integer> coll=new ArrayList<Integer>();
        for(int i=0;i<=1000000;++i)
            coll.add(new Integer(i));

        TreeBag<Integer> treeBag=new TreeBag<Integer>(coll);

        CollectionBag<Integer> bag = new CollectionBag<Integer>(treeBag);

        bag.retainAll(coll);
     }
_____________________________________

Here's a proposed patch:

  public boolean retainAll(final Collection<?> coll) {
        if (coll != null) {
            boolean modified = false;
            final Iterator<E> e = iterator();
            HashSet<Object> set=new HashSet<Object>(coll);
            while (e.hasNext()) {
                if (!set.contains(e.next())) {
                    e.remove();
                    modified = true;
                }
            }
            return modified;
        } else {
            // let the decorated bag handle the case of null argument
            return decorated().retainAll(null);
        }
    }
_____________________________________



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to