On 2/15/19 11:30 AM, Alan Snyder wrote:
I think this situation is a mess.

The “general contract” of a Collection, I believe, is that it contains zero
or more identified member objects, as defined by appropriate equals() method.
I know this is hard to specify precisely, but I presume we all “know it when
we see it”.

There is value to “collections” whose members are not objects but are
equivalence classes of objects, as defined by a nonstandard equality test,
but I think it is a mistake to call them Collections.

If a method takes a parameter of type Collection, it should be free to assume
that the parameter object supports the “general contract” of Collection. Is
there any plausible alternative?

[back from vacation]

Yes, this situation is something of a mess.

I think the original idea was that the "general contract" was that collections contained elements whose membership was defined by equals(). However, for a very long time now there have been collections in the JDK such as SortedSet and IdentityHashMap (more precisely, IdentityHashMap's keyset). I think we have to accept the fact that these are collections even though their membership semantics differ from those provided by equals().

Changing one method in the JDK to support a non-standard Collection parameter
does not solve the problem, because non-JDK methods with Collection (etc.)
parameters could have similar “misbehavior”. How would the developer know
when a specific TreeSet can or cannot be passed to a method? Does every
method that accepts a Collection (etc.) parameter require boilerplate (as in
the disjoint example) explaining its exact requirements or how it can go
wrong?

Perhaps it would be useful to define specific behaviors that these
nonstandard “collections” might support. For example, a Membership interface
(with method contains(e)) would be perfect for a removeAll(Membership) method
on Collection, implemented as you propose.
The wording on Collections.disjoint() is particularly clumsy. I'm not entirely sure what to do about it and its implementation; suggestions welcome. But I also don't think Collections.disjoint() is a very important problem, compared to what's going on in the rest of the framework.

Introducing a new interface type is basically a non-starter. Objects of type Collection get passed around a lot, and users can usefully rely on things like size(), they can iterate the Collection, etc.

Once we've admitted to the fact that different collections can have different membership semantics, the main point of this bug (JDK-6982173) is to fix up the semantics of AbstractSet.removeAll() to remove the semantic shifting that can occur based on the relative sizes of the collections.

There's another bug JDK-8190545 that needs to be fixed at some point, which is a larger scale cleanup to clarify and remove assumptions about collection membership. It starts out from the point that SortedSet and its ilk define membership using a Comparator; yet TreeSet.contains() still talks about using equals(), which is simply incorrect. Pulling on that thread for a while reveals a bunch of places in the spec that need adjustment. See the comments in the bug report and the linked email discussions.

After these are fixed, the specification and implementation will at least be self-consistent. However, it will still be the case that HashSet, TreeSet, and IdentityHashMap's keyset will have different membership semantics, and if you pass them around interchangeably you might get unexpected results. I think we just have to live with that.

s'marks

Reply via email to