Re: Migrating methods in Collections

Brian Goetz Wed, 23 Dec 2015 07:54:12 -0800

Some good thoughts, and some wishful thinking.

tl;dr Summary:

- I think it's a stretch to say that equals() and contains() can be
anyfied while still accepting Object. I think there are linguistic
solutions so that existing Object-accepting code can continue to run
unchanged for reference instantiations, and that the signatures can be
generally rescued, but I think that's something different than "methods
accepting Objects can be anyfied."

- toArray() is indeed a problem. I believe that the same tools for
rescuing equals() can also probably be applied towards toArray().

If methods accepting Object arguments can be anyfied, this removes
some methods from your problem table: Collection.{contains(Object),
remove(Object)}, List.{indexOf(Object), lastIndexOf(Object)} and
Map.{containsKey(Object), containsValue(Object), remove(Object)}.

I realize that there are still a bunch of unresolved issues
in pulling this off. But ignoring them for now...

I agree the same solution should work for all these methods. But I
don't think we'll get to the point where the signature of equals() or
contains() simply accepts Object. Several major concerns:

- Boxing. If these methods accept Object, there is going to be some
degree of boxing that we can't eliminate. Whether this is "some" or "a
lot", I can't imagine getting it down to the point where we're
comfortable.

- Intrusion. Do we really want to ask authors to deal with Object in
Complex.equals()? I would think these methods would want to start with
a V and go from there, not have to reason about "if it's anything other
than a boxed V, forget about it, otherwise cast and unbox." This is not
logic we want the user to have to write for each of these methods.
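
For concreteness, here is roughly what that Object-accepting logic looks
like in today's Java. This is only a sketch: Complex is a hypothetical
ordinary class standing in for a value type, and its fields are assumed
for illustration.

```java
import java.util.Objects;

// Hypothetical stand-in for a value type; an ordinary final class today.
final class Complex {
    private final double re;
    private final double im;

    Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    @Override
    public boolean equals(Object o) {
        // "If it's anything other than a Complex, forget about it..."
        if (!(o instanceof Complex)) {
            return false;
        }
        // "...otherwise cast" and only then compare the actual state.
        Complex other = (Complex) o;
        return Double.compare(re, other.re) == 0
            && Double.compare(im, other.im) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(re, im);
    }
}
```

Only the last comparison is about Complex itself; everything before it
is the type test and cast that an equals(V) signature would not need.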


What we want, I think, is for the signature of those methods to be:
 - x(Object) // for reference instantiations
 - x(T)      // for value instantiations

That Object is the erasure of T is a powerful connection we can hang our
hat on here. I think there are at least three linguistic approaches to
rescuing these methods:


 - contravariant type args (<U super T>)

 - some sort of peeling that treats x(T) and x(Object) as separate
methods, but usually defaults/bridges one of them, so you just have to
implement the appropriate one

 - some way of expressing a signature that means "T when a value, or
Object when a reference"

All of these have cons, but we've got a long enough list to suggest that
there is *a* solution here, and maybe there's a better one if we pull on
that string some more.

So let's assume there's *some* way to write equals/contains/etc so the
right things happen. Your list above stands, except that there's still
some degree of migration.

One natural follow-on question is that if we can anyfy contains, why
can't we do so for containsAll(Collection<?>)? And similarly for
removeAll(Collection<?>), retainAll(Collection<?>). In other words,
is this or some variant allowed?
   <any T> boolean containsAll(Collection<T>)

Good thought! Gavin and I bashed our heads against this one for a while
about a year ago.

First, note that we only have three such methods:
remove/retain/containsAll. And we can "retire" two of them as being
inferior to removeIf. Which means there's just one method here to rescue.
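
As a minimal sketch of why removeAll and retainAll can be retired, both
can be phrased with Collection.removeIf. The class and method names
below are hypothetical helpers, not proposed API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper class, for illustration only.
class RemoveIfSketch {
    // Same effect as c.removeAll(other): drop every element found in other.
    static <T> boolean removeAllVia(Collection<T> c, Collection<?> other) {
        return c.removeIf(other::contains);
    }

    // Same effect as c.retainAll(other): drop every element not found in other.
    static <T> boolean retainAllVia(Collection<T> c, Collection<?> other) {
        return c.removeIf(e -> !other.contains(e));
    }

    public static void main(String[] args) {
        List<Integer> a = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        removeAllVia(a, Arrays.asList(2, 4));
        System.out.println(a); // [1, 3]

        List<Integer> b = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        retainAllVia(b, Arrays.asList(2, 4, 9));
        System.out.println(b); // [2, 4]
    }
}
```

Note that the predicate still calls contains on the other collection, so
this leans on the contains rescue discussed earlier rather than avoiding
it.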

If we have <U super T> vars, I think we can do the same trick. But the
other tricks don't work as well, because of a (sensible but frustrating)
limitation of old generics interop -- if you have a method with generics:


    void foo(T t)
    void moo(Foo<T> f)

you can do a "raw override"

    void foo(Object t)  // acceptable raw override
    void moo(Foo f)     // acceptable raw override

and that's fine, but you can't do the same with a wildcard:

    void moo(Foo<?> f)  // not OK

So this wouldn't be source-compatible for existing subclasses of
Collection.
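
The interop rule can be tried out in current Java. The following is a
sketch with hypothetical classes (Foo, Base, RawSub); the methods return
strings only so that the override can be observed. RawSub plays the role
of a subclass written without generics.

```java
// Hypothetical generic types, mirroring the snippets above.
class Foo<T> { }

class Base<T> {
    String foo(T t) { return "Base.foo"; }
    String moo(Foo<T> f) { return "Base.moo"; }
}

// A subclass written pre-generics style: it extends the raw type and
// declares the erased signatures, which the compiler accepts as overrides.
@SuppressWarnings("rawtypes")
class RawSub extends Base {
    @Override String foo(Object t) { return "RawSub.foo"; } // acceptable raw override
    @Override String moo(Foo f) { return "RawSub.moo"; }    // acceptable raw override
}

// By contrast, a wildcard parameter is not accepted as an override of
// moo(Foo<T>). Uncommenting this class should make javac report a name
// clash (same erasure, yet neither method overrides the other).
//
// class WildSub<T> extends Base<T> {
//     String moo(Foo<?> f) { return "WildSub.moo"; }       // not OK
// }

class RawOverrideDemo {
    public static void main(String[] args) {
        Base<String> b = new RawSub();            // unchecked conversion, warning only
        System.out.println(b.foo("x"));           // RawSub.foo
        System.out.println(b.moo(new Foo<>()));   // RawSub.moo
    }
}
```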

However, it's possible that the third variant in our candidate list above
-- which amounts to some way of writing the dependent type "if T is
erased, then the bound of T, otherwise T" -- might be able to get us
here. Or not. If this is the worst of our problems, we have already won.

If so, the main remaining questions surround optionality of results,
that I'll answer separately.


Right, there's a real space of API design here.

Doing nothing about List.remove(index) seems to be a legal option.

Yes, that's a legal option (just as today, you can overload foo(T) and
foo(String)). Not sure if it *should* be a legal option (at the very
least, the compiler should warn you of this, as it should also probably
with overloads that fail to follow a meet rule.)

No existing code will encounter an ambiguity that is not already
present due to autoboxing (for List<Integer>). New code using or
implementing List<int> will need some way to disambiguate. But I think
that some syntax will be needed to allow this anyway. It might be nice
to introduce a method removeAt to reduce the need to use this syntax,
but that doesn't seem necessary?
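
The existing autoboxing ambiguity for List<Integer> can be seen in
current Java. This is a small sketch; the class and method names are
hypothetical, and copies are made so the two calls can be compared on
the same input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class RemoveOverloadSketch {
    // An int argument selects remove(int index): no boxing is attempted.
    static List<Integer> removeByIndex(List<Integer> src, int index) {
        List<Integer> copy = new ArrayList<>(src);
        copy.remove(index);
        return copy;
    }

    // Boxing explicitly is how callers select remove(Object o) today.
    static List<Integer> removeByValue(List<Integer> src, int value) {
        List<Integer> copy = new ArrayList<>(src);
        copy.remove(Integer.valueOf(value));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> src = Arrays.asList(1, 2, 3);
        System.out.println(removeByIndex(src, 1)); // [1, 3]
        System.out.println(removeByValue(src, 1)); // [2, 3]
    }
}
```

With List<int> there would be no boxed form to fall back on, which is
why new code would need some other way to pick between the two.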


Can you expand on what you might want for disambiguation here?

About the two Collection toArray() methods:

The no-arg version must return Object[]. I don't see how anyfying (in
any way) can guarantee compatible results.  The <T> T[] toArray(T[]
array) version has worse problems: most current implementations use
reflection if the argument array is not big enough (because there is
no syntax for "new T[n]").  I don't see offhand how to compatibly
mangle reflective code.  Plus, the spec explicitly says that if the
array is too large, a null is appended to elements. Null is of course
not a legal value for non-ref types.

I think "null" can be compatibly replaced with "the default value for
the type", which is the same as "null" for all existing code. So that's
not a blocker. Reflection is harder, but it's quite possible that this
will come out in the "specialization wash". If we can have an anyfied
version of Arrays.copyOf -- which seems doable -- then I think that
problem goes away too.
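
A small sketch of the two behaviours involved, in current Java: the
null written by toArray(T[]) into an oversized array, and the fact that
Arrays.copyOf already pads with the element type's default value (null
for references, 0 for int). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class ToArrayPaddingSketch {
    // Pre-fills the target with "x" so the null written after the last
    // element is visible; slots beyond it are left untouched.
    static String[] intoOversized(List<String> src, int capacity) {
        String[] target = new String[capacity];
        Arrays.fill(target, "x");
        return new ArrayList<>(src).toArray(target);
    }

    // Arrays.copyOf pads a longer copy with the default value of the type.
    static int[] padInts(int[] src, int newLength) {
        return Arrays.copyOf(src, newLength);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(intoOversized(Arrays.asList("a", "b"), 4)));
        // [a, b, null, x]
        System.out.println(Arrays.toString(padInts(new int[] {1, 2}, 4)));
        // [1, 2, 0, 0]
    }
}
```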

That said, maybe the second version of toArray() should be abandoned in
the ref layer for compatibility only, and we should add the new total method


    T[] toArray(IntFunction<T[]> generator)

as we did with streams. (I think we should introduce this method
regardless, actually, for all the reasons that came up when we were
discussing it for streams. This is not a method we could have
(credibly) had in 1.2, but with lambdas in the language, it's kind of a
no-brainer.)
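
To show the shape of such a method, here is a sketch written as a static
helper over an existing Collection. The class name and the helper are
hypothetical; it assumes the collection is not modified during the copy
and that the generator returns an array of exactly the requested length.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical helper class, for illustration only.
class ToArrayGeneratorSketch {
    // The caller supplies the array constructor, so no reflection and no
    // null terminator are needed.
    static <T> T[] toArray(Collection<? extends T> c, IntFunction<T[]> generator) {
        T[] result = generator.apply(c.size());
        int i = 0;
        for (T e : c) {
            result[i++] = e;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", "b", "c");
        String[] out = toArray(names, String[]::new);
        System.out.println(Arrays.toString(out)); // [a, b, c]
    }
}
```

The call site reads the same as Stream.toArray(String[]::new), which is
the precedent referred to above.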

I don't see a good alternative to leaving both forms of toArray as-is,
and to box results -- requiring that even custom non-ref
implementations do so. But this suggests that we should find some
other way (possibly in a utility class) to create a val-type array of
elements in a val-type collection.

Speaking only about *signatures* now, I think the same techniques that
allow us to rescue contains(Object) may do the same for toArray().


 - <U super T> U[] toArray() could work;

 - peeling into separate Object[] toArray() for ref / T[] toArray() for
val could work;

 - expressing the dependent type (T.erased ? T.bound : T) would also work.
