Hi, Stuart.
That all looks pretty good to me. I think, "membership semantics" is a
good term.
Just a few minor comments:
Collection.java:
110 * that use different membership semantics. For operations that
involve more than
111 * one collection, it is specified which collection's membership
semantics are
112 * used by the operation.
addAll() and copyOf() involve more than one collection, though I agree
that they do not need to be updated to specify membership semantics.
AbstractCollection.java:
404 * obtaining an iterator from the {@code iterator} method. Each element
405 * is passed to the {@code contains} method of the specified collection.
406 * If this call returns {@code false}, the element is removed from
^^^^^^^^^
Is "this call" a little ambiguous? Maybe:
"If contains() returns false..."
or
"If false is returned..."
?
List.java:
Should containsAll(), removeAll(), retainAll() have the @implNote about
contains() performance?
Thanks,
-Brent
On 5/2/19 6:36 PM, Stuart Marks wrote:
Hi all,
Please review these spec and implementation changes to remove the
"optimization" to AbstractSet.removeAll. Briefly, this method was
specified (and implemented) to iterate one collection or the other
depending on the relative sizes of the collections. The problem is that
this would cause an unexpected semantic shift, since one or the other
collection's contains() method would be called depending on their
relative sizes, and the contains() methods might implement different
semantics depending upon the kind of collection.
The fix is to remove the specification and implementation of
AbstractSet.removeAll and to inherit AbstractCollection.removeAll, which
does the iteration one way consistently.
I've removed overriding removeAll method implementations from
IdentityHashMap's view sets which were added in order to avoid the
"optimization" inherited from AbstractSet.removeAll.
I've added some words to the Collection interface to introduce the term
"membership semantics" for a concept that's been around for a long time
but which never had a name, essentially the contains() method. I've then
updated the specifications of containsAll, removeAll, retainAll, and
(where necessary) equals to specify which collection's membership
semantics are used.
Finally, since this change may introduce some performance issues the
optimization was intended to avoid, I've added some implementation notes
to the various methods to warn about potential performance issues if
this collection's (or the other's) contains() method is linear or worse.
There have been various discussions in the past (see JDK-8178425 for
example) that propose optimizations to the various bulk operations. They
usually involve modifying the decision criteria for iterating this
collection vs iterating the other collection. However, as we concluded
previously, doing this introduces semantic problems. Another approach to
optimizating the bulk cases is to copy the other collection into an
intermediate collection that has an O(1) contains() method; however,
this also changes the semantics and thus must be ruled out. Such
approaches should be left to the caller.
Bug:
https://bugs.openjdk.java.net/browse/JDK-6394757
Previous discussions:
http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-January/058140.html
http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-February/058378.html
(see also additional bugs and linked to the above bug)
Webrev:
http://cr.openjdk.java.net/~smarks/reviews/6394757/webrev.0/
Thanks,
s'marks