Re: Value equality

Brian Goetz Wed, 18 May 2016 07:58:20 -0700

Great summary of the options.

For those who didn't read the whole thing:
 - CE is bitwise equality -- "are these two things identical copies"
 - OE is calling Object.equals()

 - NE (for values) is the synthetic "recurse with == on primitive components, NE on value components, and OE on reference components"
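To make the three notions concrete, here is a minimal sketch in today's Java. Since the language has no value types yet, Point and Labeled are ordinary final classes standing in for values, and the ne() helpers are hypothetical names invented for this illustration, not a proposed API.

```java
import java.util.Objects;

public class EqualityKinds {
    // Stand-in for a value type (assumption: an ordinary final class with final fields).
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // A "value" with one primitive, one value, and one reference component.
    static final class Labeled {
        final long id;        // primitive component
        final Point origin;   // value component
        final String label;   // reference component
        Labeled(long id, Point origin, String label) {
            this.id = id; this.origin = origin; this.label = label;
        }
    }

    // NE on Point: == on each primitive component.
    static boolean ne(Point a, Point b) {
        return a.x == b.x && a.y == b.y;
    }

    // NE on Labeled: == on primitives, NE on values, OE on references.
    static boolean ne(Labeled a, Labeled b) {
        return a.id == b.id
            && ne(a.origin, b.origin)
            && Objects.equals(a.label, b.label);
    }

    public static void main(String[] args) {
        String s1 = new String("hi");
        String s2 = new String("hi");
        System.out.println(s1 == s2);      // CE on references: false, distinct objects
        System.out.println(s1.equals(s2)); // OE: true, same characters

        Labeled a = new Labeled(7L, new Point(1, 2), s1);
        Labeled b = new Labeled(7L, new Point(1, 2), s2);
        System.out.println(ne(a, b));      // NE: true, despite distinct label objects
    }
}
```

Note that a bitwise comparison of a and b would report "different" here, because the two label references differ even though the strings are equal. That gap is exactly the difference between CE and NE.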

If it were 1995, and we were inventing Java (and we didn't have our heads addled with an interpreter-based cost model), what would we do? I think we'd bind ==(ref,ref) to OE, with an (uglier-named) API point for CE (e.g., Objects.isSameReference) which would be used (a) for known-interned things, (b) for IdentityHashMap, (c) as a default implementation of Object.equals(), and (d) possibly as a short-circuiting optimization *inside* overrides of equals().
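A rough sketch of how that might look, written in real Java. The isSameReference method below is a hypothetical stand-in for the uglier-named API point (no such method exists in java.util.Objects), and the Name class is made up for the example. In the hypothetical language, a == b on references would mean what a.equals(b) means here.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class JPrimeSketch {
    // Hypothetical CE API point; in real Java this is just ==.
    static boolean isSameReference(Object a, Object b) {
        return a == b;
    }

    static final class Name {
        final String text; // assumed non-null for this sketch
        Name(String text) { this.text = text; }

        @Override public boolean equals(Object o) {
            // Use (d): CE as a short-circuit inside an equals() override.
            if (isSameReference(this, o)) return true;
            return o instanceof Name && text.equals(((Name) o).text);
        }

        @Override public int hashCode() { return text.hashCode(); }
    }

    public static void main(String[] args) {
        Name a = new Name("x");
        Name b = new Name("x");

        System.out.println(a.equals(b));           // OE: true
        System.out.println(isSameReference(a, b)); // CE: false

        // Use (b): IdentityHashMap keys on CE, so a and b stay separate entries.
        Map<Name, Integer> byIdentity = new IdentityHashMap<>();
        byIdentity.put(a, 1);
        byIdentity.put(b, 2);
        System.out.println(byIdentity.size());     // 2
    }
}
```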

This hypothetical world (call it J') still gives users the choice of CE vs OE whenever they want, while nudging users towards OE (by giving it the prime syntactic real estate) which is probably what they want most of the time.

Why didn't we do this in 1995? Hard to know (I'll ask James next time I see him), but I'd posit two main forces:

- C bias. Since C has *only* CE (and it was desirable to make Java feel like "a safer C") it probably seemed like a big improvement already to offer programmers both CE and OE on all references, and binding == to OE probably seemed too radical at the time.

- Cost-model bias. In the Java 1.0 days, pointer comparison was probably 100x faster in the interpreter than a virtual call to Object.equals(). If binding == to OE was even considered, it was probably deemed implausible.

Of course, both of these feel a bit silly 20 years later, but here we are. So, in a J' world, what would we do with ==(val,val)? I think it would be a no-brainer -- bind it to NE, since Java developers would already associate == with a deeper comparison. Then we'd just have to adjust whatever the API point for CE is to also accommodate CE on values, and we'd be done.


But, we don't live in J' world.  So our choices become:

P1: Bind ==(val,val) to CE, as we do with refs. Optimization challenges with the usual (a==b || a.equals(b)) idiom [1], but the rules work the same for values and refs.

P2: Bind ==(val,val) to NE. This is J' world for values and J world for refs. (With even bigger optimization challenges for the (a==b || a.equals(b)) idiom.) Rules are different for values and refs, meaning (a) users will have to keep in mind which world they're in, (b) when migrating a class from ref to value they'll have to find and update all equality comparisons (!), (c) writing code that's generic over values and refs has to use an idiom that works on both, (d) when migrating code from ref-generic to any-generic, inspect every equality comparison to make sure it's still what was intended.

P3: Add a new equality operator. I've already been laughed at enough, thank you.

P4: Ban ==(val,val). This might be fine in value-only code, but it complicates writing generic code, especially migrating generic code.

[1] John points out that if == is CE, then (a==b||a.equals(b)) will redundantly load the fields on failed ==. But, many equals implementations start with "a==b" as a short-circuiting optimization, which means "a==b" will be a common (pure) subexpression in the resulting expansion (and for values, methods are monomorphic and will get inlined more frequently), so the two checks can be collapsed.
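As a hand-worked illustration of that footnote (not a claim about what any particular JIT actually does), the sketch below shows the idiom next to a manually inlined version. Pair is a made-up final class standing in for a value; idiomInlined is what you get by pasting the body of equals() into the idiom and dropping the second, now-redundant a == b test.

```java
public class IdiomSketch {
    static final class Pair {
        final int lo, hi;
        Pair(int lo, int hi) { this.lo = lo; this.hi = hi; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;            // short-circuit on identity
            if (!(o instanceof Pair)) return false;
            Pair p = (Pair) o;
            return lo == p.lo && hi == p.hi;
        }

        @Override public int hashCode() { return 31 * lo + hi; }
    }

    // The usual idiom: identity check, then equals().
    static boolean idiom(Pair a, Pair b) {
        return a == b || a.equals(b);
    }

    // The same logic after inlining equals() by hand. The inlined body would
    // repeat "a == b", but that test is already known to be false at this
    // point, so it is dropped. Since Pair is final and b is statically a Pair,
    // the instanceof test reduces to a null check.
    static boolean idiomInlined(Pair a, Pair b) {
        if (a == b) return true;
        if (b == null) return false;
        return a.lo == b.lo && a.hi == b.hi;
    }

    public static void main(String[] args) {
        Pair p = new Pair(1, 2);
        Pair q = new Pair(1, 2);
        Pair r = new Pair(1, 3);
        System.out.println(idiom(p, q) == idiomInlined(p, q)); // true
        System.out.println(idiom(p, r) == idiomInlined(p, r)); // true
    }
}
```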

Going back to op==, there are two plausible options for binding it to
new types:

(P1) Syntax of op==(val,val) and op==(any,any) binds to CE as with
op==(ref,ref).  Therefore, NE is uniformly reached by today's idiom,
which traverses value fields twice.

(P2) Syntax of op==(val,val) and op==(any,any) is direct access to
NE.  CE is reachable by experts at System.isEqualCopy.  The old idiom
for NE still works, but calls equals twice.

(P3) Same as P1, op== is uniform access to CE.  New op (spelled
"===", ".==", "=~", etc.) is uniform, optimizable access to NE,
attracting users away from legacy idiom for NE.
