Oswaldo Olivo created COLLECTIONS-545:
-----------------------------------------
Summary: Undocumented performance issue in the removeAll method in
CollectionUtils
Key: COLLECTIONS-545
URL: https://issues.apache.org/jira/browse/COLLECTIONS-545
Project: Commons Collections
Issue Type: Bug
Components: Collection
Affects Versions: 4.1
Environment: Ubuntu 14.04
Reporter: Oswaldo Olivo
Priority: Trivial
This bug is analogous to https://issues.apache.org/jira/browse/COLLECTIONS-544
The method removeAll in CollectionUtils is inefficient when the second
parameter collection has a slow containment method.
The following is the current implementation with its documentation:
============================
/**
* Removes the elements in <code>remove</code> from
<code>collection</code>. That is, this
* method returns a collection containing all the elements in <code>c</code>
* that are not in <code>remove</code>. The cardinality of an element
<code>e</code>
* in the returned collection is the same as the cardinality of
<code>e</code>
* in <code>collection</code> unless <code>remove</code> contains
<code>e</code>, in which
* case the cardinality is zero. This method is useful if you do not wish
to modify
* the collection <code>c</code> and thus cannot call
<code>collection.removeAll(remove);</code>.
*
* @param <E> the type of object the {@link Collection} contains
* @param collection the collection from which items are removed (in the
returned collection)
* @param remove the items to be removed from the returned
<code>collection</code>
* @return a <code>Collection</code> containing all the elements of
<code>collection</code> except
* any elements that also occur in <code>remove</code>.
* @throws NullPointerException if either parameter is null
* @since 4.0 (method existed in 3.2 but was completely broken)
*/
public static <E> Collection<E> removeAll(final Collection<E> collection,
final Collection<?> remove) {
return ListUtils.removeAll(collection, remove);
}
=======================================
We can notice the inefficiency by looking at the removeAll method in ListUtils.
The removeAll method from ListUtils is implemented and documented as follows:
=======================================
/**
* Removes the elements in <code>remove</code> from
<code>collection</code>. That is, this
* method returns a list containing all the elements in
<code>collection</code>
* that are not in <code>remove</code>. The cardinality of an element
<code>e</code>
* in the returned collection is the same as the cardinality of
<code>e</code>
* in <code>collection</code> unless <code>remove</code> contains
<code>e</code>, in which
* case the cardinality is zero. This method is useful if you do not wish
to modify
* <code>collection</code> and thus cannot call
<code>collection.removeAll(remove);</code>.
* <p>
* This implementation iterates over <code>collection</code>, checking each
element in
* turn to see if it's contained in <code>remove</code>. If it's not
contained, it's added
* to the returned list. As a consequence, it is advised to use a
collection type for
* <code>remove</code> that provides a fast (e.g. O(1)) implementation of
* {@link Collection#contains(Object)}.
*
* @param <E> the element type
* @param collection the collection from which items are removed (in the
returned collection)
* @param remove the items to be removed from the returned
<code>collection</code>
* @return a <code>List</code> containing all the elements of
<code>c</code> except
* any elements that also occur in <code>remove</code>.
* @throws NullPointerException if either parameter is null
* @since 3.2
*/
public static <E> List<E> removeAll(final Collection<E> collection, final
Collection<?> remove) {
final List<E> list = new ArrayList<E>();
for (final E obj : collection) {
if (!remove.contains(obj)) {
list.add(obj);
}
}
return list;
}
=======================================
In the case of ListUtils:removeAll, the inefficiency is properly documented.
Perhaps the disclaimer about potential inefficiencies depending on the type
of the parameter collection in ListUtils:removeAll should also be included in
CollectionUtils:removeAll.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)