Re: RFR (M): JDK-6394757: AbstractSet.removeAll semantics are surprisingly dependent on relative sizes

Brent Christian Tue, 07 May 2019 16:44:34 -0700

Hi, Stuart.

That all looks pretty good to me. I think "membership semantics" is a good term.


Just a few minor comments:

Collection.java:

 110  * that use different membership semantics. For operations that involve more than
 111  * one collection, it is specified which collection's membership semantics are
 112  * used by the operation.

addAll() and copyOf() involve more than one collection, though I agree that they do not need to be updated to specify membership semantics.



AbstractCollection.java:


404 * obtaining an iterator from the {@code iterator} method. Each element
405 * is passed to the {@code contains} method of the specified collection.
406 * If this call returns {@code false}, the element is removed from
         ^^^^^^^^^

Is "this call" a little ambiguous?  Maybe:

"If contains() returns false..."
or
"If false is returned..."

?

List.java:

Should containsAll(), removeAll(), retainAll() have the @implNote about contains() performance?


Thanks,
-Brent

On 5/2/19 6:36 PM, Stuart Marks wrote:

Hi all,
Please review these spec and implementation changes to remove the "optimization" to AbstractSet.removeAll. Briefly, this method was specified (and implemented) to iterate one collection or the other depending on the relative sizes of the collections. The problem is that this would cause an unexpected semantic shift, since one or the other collection's contains() method would be called depending on their relative sizes, and the contains() methods might implement different semantics depending upon the kind of collection.
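
The semantic shift described above can be sketched as follows. This is only an illustration, not the JDK source: the class RemoveAllDemo and the helpers sizeDependentRemoveAll, consistentRemoveAll, and caseInsensitiveSet are hypothetical names that model the two iteration strategies, so the result does not depend on which JDK release runs it. The set uses a case-insensitive comparator, while the list uses equals().

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class RemoveAllDemo {

    // Hypothetical model of the size-dependent strategy: which collection
    // decides membership depends on the relative sizes.
    static <E> boolean sizeDependentRemoveAll(Set<E> self, Collection<?> other) {
        boolean modified = false;
        if (self.size() > other.size()) {
            // Iterate the other collection; self.remove() decides membership.
            for (Object e : other) {
                modified |= self.remove(e);
            }
        } else {
            // Iterate self; other.contains() decides membership.
            for (Iterator<E> it = self.iterator(); it.hasNext(); ) {
                if (other.contains(it.next())) {
                    it.remove();
                    modified = true;
                }
            }
        }
        return modified;
    }

    // Hypothetical model of the consistent strategy: always iterate self and
    // always ask other.contains().
    static <E> boolean consistentRemoveAll(Collection<E> self, Collection<?> other) {
        boolean modified = false;
        for (Iterator<E> it = self.iterator(); it.hasNext(); ) {
            if (other.contains(it.next())) {
                it.remove();
                modified = true;
            }
        }
        return modified;
    }

    // Builds a set whose membership semantics ignore case.
    static Set<String> caseInsensitiveSet(String... elements) {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.addAll(List.of(elements));
        return set;
    }

    public static void main(String[] args) {
        Set<String> first = caseInsensitiveSet("a", "b", "c");
        sizeDependentRemoveAll(first, List.of("A"));
        System.out.println(first);   // [b, c]: the set's comparator matched "A" to "a"

        Set<String> second = caseInsensitiveSet("a", "b", "c");
        sizeDependentRemoveAll(second, List.of("A", "B", "C"));
        System.out.println(second);  // [a, b, c]: the list's equals() matched nothing

        Set<String> third = caseInsensitiveSet("a", "b", "c");
        consistentRemoveAll(third, List.of("A"));
        System.out.println(third);   // [a, b, c]: the argument's contains() is always used
    }
}
```

In this sketch, adding elements to the argument list flips the outcome of the size-dependent version from "removes a" to "removes nothing", while the consistent version gives the same answer regardless of size.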
The fix is to remove the specification and implementation of AbstractSet.removeAll and to inherit AbstractCollection.removeAll, which does the iteration one way consistently.
I've removed overriding removeAll method implementations from IdentityHashMap's view sets which were added in order to avoid the "optimization" inherited from AbstractSet.removeAll.
I've added some words to the Collection interface to introduce the term "membership semantics" for a concept that's been around for a long time but which never had a name, essentially the contains() method. I've then updated the specifications of containsAll, removeAll, retainAll, and (where necessary) equals to specify which collection's membership semantics are used.
Finally, since this change may introduce some performance issues the optimization was intended to avoid, I've added some implementation notes to the various methods to warn about potential performance issues if this collection's (or the other's) contains() method is linear or worse.
There have been various discussions in the past (see JDK-8178425 for example) that propose optimizations to the various bulk operations. They usually involve modifying the decision criteria for iterating this collection vs iterating the other collection. However, as we concluded previously, doing this introduces semantic problems. Another approach to optimizing the bulk cases is to copy the other collection into an intermediate collection that has an O(1) contains() method; however, this also changes the semantics and thus must be ruled out. Such approaches should be left to the caller.
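
As a rough sketch of what leaving the copy to the caller might look like, the hypothetical helper removeAllViaHashCopy below copies the argument into a HashSet before calling removeAll. The class and method names are assumptions made for illustration only. The example also shows why the library cannot do this silently: the copy replaces the argument's membership semantics with equals()/hashCode(), so it is only appropriate when the caller knows that is what they want.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class CallerSideCopy {

    // Hypothetical caller-side helper: trades memory for an O(1) contains(),
    // at the cost of switching to equals()/hashCode() membership semantics.
    static <E> boolean removeAllViaHashCopy(Collection<E> target, Collection<?> other) {
        Set<Object> copy = new HashSet<>(other);
        return target.removeAll(copy);
    }

    public static void main(String[] args) {
        // The argument treats strings as equal regardless of case.
        Set<String> ignoreCase = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        ignoreCase.addAll(List.of("A", "b"));

        List<String> direct = new ArrayList<>(List.of("a", "B"));
        direct.removeAll(ignoreCase);
        System.out.println(direct);  // []: the TreeSet's contains() matched both elements

        List<String> copied = new ArrayList<>(List.of("a", "B"));
        removeAllViaHashCopy(copied, ignoreCase);
        System.out.println(copied);  // [a, B]: the HashSet copy matches by equals() only
    }
}
```

When both collections already use equals()-based membership, such as two ArrayLists of strings, the copy would not change the result and only affects performance.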
Bug:

     https://bugs.openjdk.java.net/browse/JDK-6394757

Previous discussions:
     http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-January/058140.html
     http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-February/058378.html
     (see also additional bugs linked to the above bug)

Webrev:

     http://cr.openjdk.java.net/~smarks/reviews/6394757/webrev.0/

Thanks,

s'marks
