[The Java Posse] Creative solution for the equality problem: Am I missing something?

Reinier Zwitserloot Sat, 16 Oct 2010 23:06:48 -0700

For those who DON'T know why equals() is really complicated, scroll to
the end for an explanation. Without knowing about it this post is
probably not going to make much sense. If you understand why a
hypothetical "ColoredList extends ArrayList" class, which adds a color
property to any list, MUST have an equals implementation that says
that a red empty list is equal to a blue empty list, even though that
seems silly, you don't need to read the footnote.


What we really need is for AbstractList's equals() method to be
intelligent enough to tell two cases apart. If 'other' is a subclass
of AbstractList that adds no state relevant for equality, it can do
its comparison as usual. If 'other' is a subclass that DOES add state
relevant for equality, such as a color property, AbstractList's equals
method should conclude immediately with: Not equal, even if the
contents are.

A few people have proposed such a system, including a somewhat well
known writeup by Venners and Odersky. It's very long so I'll explain
the gist here, but the full paper can be found here:
http://www.artima.com/lejava/articles/equality.html

What they propose is adding a protected boolean canEquals(Object o)
method. The equals() method will actually call other.canEquals(this),
and if that is false, return false. The standard implementation of any
canEquals method pretty much always looks like: return (o instanceof
Point3D);, where Point3D is replaced with the closest parent (or
yourself) that added equality-significant state. Thus, ArrayList and
LinkedList would not override AbstractList's canEquals (which has:
return (o instanceof AbstractList);), but something like a ColoredList
WOULD override and replace it with "return (o instanceof
ColoredList)". This works... provided you don't forget to override
the canEquals() method, which is easy to forget because it is
certainly not a standard Java idiom. It also introduces another
method to the API.
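
To make the canEquals idea concrete, here is a minimal sketch using
Point and Point3D. The field names and the LoggingPoint class (a
stand-in for a subclass that adds no equality-relevant state) are my
own assumptions for illustration, not code from the article:

```java
class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    // Point is the closest ancestor that added equality-significant state.
    protected boolean canEquals(Object o) { return o instanceof Point; }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        // Ask the other object whether it is willing to be equal to us.
        return p.canEquals(this) && p.x == x && p.y == y;
    }

    @Override public int hashCode() { return 31 * x + y; }
}

class Point3D extends Point {
    final int z;
    Point3D(int x, int y, int z) { super(x, y); this.z = z; }

    // Adds state (z), so it only accepts other Point3D instances.
    @Override protected boolean canEquals(Object o) { return o instanceof Point3D; }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Point3D)) return false;
        Point3D p = (Point3D) o;
        return p.canEquals(this) && super.equals(p) && p.z == z;
    }

    @Override public int hashCode() { return 31 * super.hashCode() + z; }
}

// Hypothetical implementation-detail subclass: no new state, no overrides.
class LoggingPoint extends Point {
    LoggingPoint(int x, int y) { super(x, y); }
}

class CanEqualsDemo {
    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point3D q = new Point3D(1, 2, 3);
        Point proxy = new LoggingPoint(1, 2);
        System.out.println(p.equals(q));     // false
        System.out.println(q.equals(p));     // false
        System.out.println(p.equals(proxy)); // true
        System.out.println(proxy.equals(p)); // true
    }
}
```

Note that Point3D has to remember two overrides (equals and canEquals),
which is exactly the part that is easy to get wrong.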

My flash of insight here is to use this trick to entirely avoid the
need for a canEquals method *AND* automatically do the right thing,
leaving virtually no room for accidental error:

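if (!(o instanceof Self)) return false;
// getMethod throws the checked NoSuchMethodException; handle or declare it.
Method m1 = o.getClass().getMethod("equals", Object.class);
Method m2 = Self.class.getMethod("equals", Object.class);
// getMethod hands back a fresh Method object per call, so compare with equals(), not ==.
if (!m1.equals(m2)) return false;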

The idea is: If a hypothetical other.equals(this) call would end up
using the same equals method as myself, then these objects could be
equal, even if their actual types don't match. A class that defines a
new equivalence relation, like Point3D or ColoredList, HAS to override
equals so it can include its new property (z for Point3D, color for
ColoredList) in the comparison. However, an implementation detail,
such as ArrayList and LinkedList, or a JPA proxy, has absolutely no
need to override AbstractList/Point's equals method, and in fact,
they don't. I've double-checked the Java sources: neither LinkedList
nor ArrayList overrides AbstractList's default equals implementation.
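
For anyone who wants to play with it, here is a rough, self-contained
sketch of the reflection check applied to Point and Point3D. The
helper name sameEqualsMethod and the ProxyPoint class are hypothetical,
made up for this example; treat it as one possible way to wire the
idea up, not a polished implementation:

```java
class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Point)) return false;
        // Would o.equals(this) dispatch to this very method? If not, o has its own notion of equality.
        if (!sameEqualsMethod(o.getClass(), Point.class)) return false;
        Point p = (Point) o;
        return p.x == x && p.y == y;
    }

    @Override public int hashCode() { return 31 * x + y; }

    // Hypothetical helper: true if both classes resolve equals(Object) to the same declaration.
    static boolean sameEqualsMethod(Class<?> a, Class<?> b) {
        try {
            // Fully qualified name keeps the sketch free of import lines.
            java.lang.reflect.Method m1 = a.getMethod("equals", Object.class);
            java.lang.reflect.Method m2 = b.getMethod("equals", Object.class);
            return m1.equals(m2); // Method.equals compares declaring class, name and signature
        } catch (NoSuchMethodException e) {
            throw new AssertionError(e); // every class has a public equals(Object)
        }
    }
}

class Point3D extends Point {
    final int z;
    Point3D(int x, int y, int z) { super(x, y); this.z = z; }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Point3D)) return false;
        if (!sameEqualsMethod(o.getClass(), Point3D.class)) return false;
        Point3D p = (Point3D) o;
        // Cannot delegate to super.equals here: Point's check would reject a Point3D argument.
        return p.x == x && p.y == y && p.z == z;
    }

    @Override public int hashCode() { return 31 * super.hashCode() + z; }
}

// Hypothetical proxy-style subclass: adds no state and does not override equals.
class ProxyPoint extends Point {
    ProxyPoint(int x, int y) { super(x, y); }
}

class ReflectiveEqualsDemo {
    public static void main(String[] args) {
        Point p = new Point(1, 2);
        System.out.println(p.equals(new Point3D(1, 2, 3))); // false
        System.out.println(new Point3D(1, 2, 3).equals(p)); // false
        System.out.println(p.equals(new ProxyPoint(1, 2))); // true
        System.out.println(new ProxyPoint(1, 2).equals(p)); // true
    }
}
```

One wrinkle this sketch exposes: a subclass that overrides equals can
no longer reuse super.equals() for the inherited fields, because the
parent's check would reject it.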

I guess there's a somewhat theoretical case where a subclass
overrides equals() purely for efficiency reasons, and this check would
then wrongly report it as never equal to its siblings. That's probably
an acceptable price to pay to gain the advantage of not having another
method cluttering up the API, and a far smaller chance of breaking the
contract by forgetting to override canEquals.

Am I missing something, or is this too hacky a solution?


FOOTNOTE: Why is equals problematic?

Equality in Java is a lot more problematic than you might think at
first glance. Josh Bloch, when he wrote Effective Java, proposed the
following template for writing equals methods. Let's assume we have a
simple Point class:

public boolean equals(Object o) {
    if (o == null) return false;
    if (o == this) return true;
    if (!(o instanceof Point)) return false;
    if (((Point)o).x != this.x) return false;
    if (((Point)o).y != this.y) return false;
    return true;
}

Simple enough. But wrong. In the second edition, the instanceof check
was revised to this:

if (o.getClass() != this.getClass()) return false;

and the reason is the equals contract, which says that equality in
Java must be reflexive (a.equals(a) must always hold), symmetric (if
a.equals(b), then b.equals(a) must also hold) and transitive (if
a.equals(b), and b.equals(c), then a.equals(c) must hold). Reflexivity
is simple enough, but the others aren't. Let's say there's a subclass
of Point named Point3D, which adds a z coordinate.

Equals is easily rewritten to include: if (((Point3D)o).z != this.z)
return false; - but what should Point3D do when you give it a plain
Point? There's only one thing to do, because of the symmetry rule: It
should compare x and y and not compare z (as the non-3D point has
none). It HAS to do this - because when calling
point2d.equals(point3d), that's what happens, and you have to do the
same as it.

But now we're in deep trouble. If [0, 0, 1] is equal to [0, 0], and
[0, 0] is in turn equal to [0, 0, 2], we are forced by the
transitivity rule to conclude that [0, 0, 1] is equal to [0, 0, 2].
But that's preposterous! Nobody would expect these two different points
in 3D space to nevertheless be .equals() to each other. And yet,
that's the ONLY way to get equality right if Point is written with
that instanceof check.
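
Here is a small sketch of that trap, assuming Point uses the
instanceof check and Point3D does the "symmetric" thing of ignoring z
when handed a plain Point. The class bodies are my own reconstruction
for illustration:

```java
class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return p.x == x && p.y == y;
    }

    @Override public int hashCode() { return 31 * x + y; }
}

class Point3D extends Point {
    final int z;
    Point3D(int x, int y, int z) { super(x, y); this.z = z; }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Point)) return false;
        // Plain Point: ignore z so that we agree with point2d.equals(point3d).
        if (!(o instanceof Point3D)) return super.equals(o);
        return super.equals(o) && ((Point3D) o).z == z;
    }
    // hashCode is inherited on purpose: objects that can be equal must share a hash.
}

class TransitivityDemo {
    public static void main(String[] args) {
        Point a = new Point3D(0, 0, 1);
        Point b = new Point(0, 0);
        Point c = new Point3D(0, 0, 2);
        System.out.println(a.equals(b)); // true
        System.out.println(b.equals(c)); // true
        System.out.println(a.equals(c)); // false, so transitivity is broken
    }
}
```

The only way to restore transitivity here is to make the last line
print true as well, i.e. to stop comparing z altogether.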

This is why Josh revised Effective Java. But now we have a problem:
Technically, one should only use subclassing when changing the nature
of objects, i.e. you have a class named "Shape" and you subclass it to
create "Square". It's perfectly all right that any random shape is
never equal to a square, but unfortunately there's a lot of
subclassing merely for implementation details. For example, LinkedList
and ArrayList are virtually identical to each other and certainly model
the same construct; they are just different implementations.
AbstractList's equals() method is basically broken because a
LinkedList can be equal to an ArrayList - it uses the instanceof
style. As long as you only create implementations which don't add new
state of their own, you're fine, but if you ever create a ColoredList
class, which gives all lists color, you MUST write its equals method
so that an empty red list is equal to an empty blue list, even though
that seems ridiculous. After all, you can't change ArrayList's
equals() method, and it will ignore that color property. Then, by way
of transitivity, red lists equal blue lists if their contents are
equal. Your hands are tied.
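
To see what goes wrong if you ignore that advice, here is a sketch of
the tempting ColoredList that does compare color. The String color
field and the constructor are assumptions of mine; only the class name
comes from the discussion above:

```java
class ColoredList<E> extends java.util.ArrayList<E> {
    final String color;
    ColoredList(String color) { this.color = color; }

    // Tempting but contract-breaking: insists on a ColoredList of the same color.
    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof ColoredList)) return false;
        ColoredList<?> other = (ColoredList<?>) o;
        return super.equals(other) && color.equals(other.color);
    }

    // Kept consistent with the list contents only; color is left out of the hash.
    @Override public int hashCode() { return super.hashCode(); }
}

class ColoredListDemo {
    public static void main(String[] args) {
        java.util.List<String> plain = new java.util.ArrayList<>();
        ColoredList<String> red = new ColoredList<>("red");
        System.out.println(plain.equals(red)); // true: the plain list knows nothing about color
        System.out.println(red.equals(plain)); // false, so symmetry is broken
    }
}
```

The plain list's equals() cannot be changed, so the only way to make
the two lines agree is for ColoredList to stop looking at color.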
