Along the lines of the previous mail, people have asked, and will ask, "why
can't I redefine equals/hashCode?" The answer has two layers:
- The constraints on equals/hashCode are stronger for records, and
users might inadvertently violate them. (They can be specified in the
overrides of equals/hashCode in AbstractRecord, so there at least can be
a place where this specification lives, even if no one reads it.)
- In conjunction with ancillary fields, the constraints are sure to be
violated, whether inadvertently or deliberately.
Let's take a look at what sorts of modifications to equals/hashCode
would be OK, should we decide to relax this restriction. Equality
should still derive from the record's state, but there might be
acceptable variations.
Would it be OK to _widen_ the definition of equality, by ignoring a
component of the record?
This is an example of what Gunnar asked for, which is to restrict
equality to the primary key fields:
record PersonEntity(int primaryKey, String name, int age) {
// equality based only on primaryKey
}
Is this OK? Well, let's look at our model:
- Does ctor(dtor(c)) == c? Yes.
- if S1==S2, does ctor(S1) == ctor(S2)? Yes.
- For equal instances, does mutating them in the same way yield equal
instances? Yes.
- For equal instances, does calling the same method on both with the
same parameters yield equivalent results? No.
So, if p1 == p2, we cannot rely on p1.age() == p2.age(), so this fails
the requirements of our pseudo-formal model. (Assuming our model is the
right one.)
So, how would we feel about that? Two records that are equals() to each
other, but not substitutable?
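To make the failure concrete, here is a minimal sketch of the widened-equality PersonEntity. It assumes a JDK where records are available and accept explicit equals/hashCode declarations (the relaxation under discussion); the sample values and the WidenedEqualityDemo class are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical record whose equality is widened to the primary key only.
record PersonEntity(int primaryKey, String name, int age) {
    @Override
    public boolean equals(Object o) {
        return o instanceof PersonEntity other && other.primaryKey == primaryKey;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(primaryKey);
    }
}

final class WidenedEqualityDemo {
    public static void main(String[] args) {
        PersonEntity p1 = new PersonEntity(42, "Alice", 30);
        PersonEntity p2 = new PersonEntity(42, "Alice", 31);

        System.out.println(p1.equals(p2));        // true: same primary key
        System.out.println(p1.age() == p2.age()); // false: equal, but not substitutable

        // A hash-based collection treats the two as one element and keeps
        // whichever arrived first, silently dropping the other's state.
        Set<PersonEntity> people = new HashSet<>();
        people.add(p1);
        people.add(p2);
        System.out.println(people.size());                  // 1
        System.out.println(people.iterator().next().age()); // 30
    }
}
```

The collection case is where the lack of substitutability tends to bite: which age survives depends on insertion order, not on anything in the record's declared state.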
A more subtle version of this would be to consider all components, but
use a more inclusive notion of equality for some component, such as
comparing array components by contents.
record Numbers(int[] numbers) {
// equality based on Arrays.equals()
}
- Does ctor(dtor(c)) == c? Yes.
- Do equal state vectors produce equal records? Yes.
- Do identical mutations on equal records produce equal records? Yes.
- Do identical operations on equal records produce equal results?
Almost...
The Almost qualification can be seen here:
int[] a1 = {1, 2, 3}; // any contents will do
int[] a2 = Arrays.copyOf(a1, a1.length);
Numbers r1 = new Numbers(a1), r2 = new Numbers(a2);
boolean same = r1.numbers().equals(r2.numbers()); // false, though r1.equals(r2)
The accessor will yield up the array references, which will not be
equals() to each other. This is essentially the same problem as above.
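A self-contained sketch of the array case follows. It assumes records that accept explicit equals/hashCode declarations; ArrayEqualityDemo and the array contents are illustrative only.

```java
import java.util.Arrays;

// Hypothetical record that compares its array component by contents.
record Numbers(int[] numbers) {
    @Override
    public boolean equals(Object o) {
        return o instanceof Numbers other && Arrays.equals(numbers, other.numbers);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(numbers);
    }
}

final class ArrayEqualityDemo {
    public static void main(String[] args) {
        int[] a1 = {1, 2, 3};
        int[] a2 = Arrays.copyOf(a1, a1.length);
        Numbers r1 = new Numbers(a1), r2 = new Numbers(a2);

        System.out.println(r1.equals(r2));                             // true
        // The accessor hands back the raw references, and arrays only
        // have identity-based equals().
        System.out.println(r1.numbers().equals(r2.numbers()));         // false
        System.out.println(Arrays.equals(r1.numbers(), r2.numbers())); // true
    }
}
```

So the "same method, same result" property holds only if the caller also knows to compare results with Arrays.equals() rather than equals().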
You get a similar result if your record represents something like a
rational number and you don't normalize to lowest terms in the
constructor; then you can have q1 equal q2, but q1.numerator() !=
q2.numerator().
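Here is a sketch of the rational-number case, next to a variant that normalizes in the constructor and so can keep the default state-based equality. All names (Rational, NormalizedRational, RationalMath) are hypothetical; it assumes records with a compact constructor form that may reassign its parameters, and it ignores long overflow.

```java
// Shared helper (hypothetical).
final class RationalMath {
    private RationalMath() {}

    static long gcd(long a, long b) {
        a = Math.abs(a);
        b = Math.abs(b);
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }
}

// Not normalized: equality is by value, so it is wider than the state.
record Rational(long numerator, long denominator) {
    Rational {
        if (denominator == 0) throw new IllegalArgumentException("denominator must be nonzero");
    }

    @Override
    public boolean equals(Object o) {
        // Cross-multiplication: n1/d1 == n2/d2  iff  n1*d2 == n2*d1.
        return o instanceof Rational r
                && numerator * r.denominator == r.numerator * denominator;
    }

    @Override
    public int hashCode() {
        // Hash the reduced form so equal values hash alike.
        long g = RationalMath.gcd(numerator, denominator);
        long sign = denominator < 0 ? -1 : 1;
        return 31 * Long.hashCode(sign * numerator / g) + Long.hashCode(sign * denominator / g);
    }
}

// Normalized in the constructor: default equals/hashCode are enough.
record NormalizedRational(long numerator, long denominator) {
    NormalizedRational {
        if (denominator == 0) throw new IllegalArgumentException("denominator must be nonzero");
        long g = RationalMath.gcd(numerator, denominator);
        long sign = denominator < 0 ? -1 : 1;
        numerator = sign * numerator / g;
        denominator = sign * denominator / g;
    }
}

final class RationalDemo {
    public static void main(String[] args) {
        Rational q1 = new Rational(1, 2), q2 = new Rational(2, 4);
        System.out.println(q1.equals(q2));                    // true
        System.out.println(q1.numerator() == q2.numerator()); // false

        NormalizedRational n1 = new NormalizedRational(1, 2), n2 = new NormalizedRational(2, 4);
        System.out.println(n1.equals(n2));                    // true
        System.out.println(n1.numerator() == n2.numerator()); // true
    }
}
```

The normalized variant gives up ctor(dtor(c)) == c only in the sense that the constructor arguments are canonicalized; once constructed, equal records expose equal state.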
Are any of these variations compelling enough to suggest we've got the
wrong model?
On 3/16/2018 2:55 PM, Brian Goetz wrote:
There are a number of potentially open details on the design for
records. My inclination is to start with the simplest thing that
preserves the flexibility and expectations we want, and consider
opening up later as necessary.
One of the biggest issues, which Kevin raised as a must-address issue,
is having sufficient support for precondition validation. Without
foreclosing on the ability to do more later with declarative guards, I
think the recent construction proposal meets the requirement for
lightweight enforcement with minimal or no duplication. I'm hopeful
that this bit is "there".
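As a rough illustration of what lightweight precondition enforcement without duplication could look like, here is a sketch using the compact constructor form that records eventually shipped with. Treat it as an assumption: the exact syntax of the construction proposal referred to here may differ, and Range is a made-up example.

```java
// Hypothetical record with a validated invariant (lo <= hi).
record Range(int lo, int hi) {
    // Compact form: no parameter list and no field assignments are
    // repeated; the check runs before the fields are initialized.
    Range {
        if (lo > hi) {
            throw new IllegalArgumentException("lo (" + lo + ") must not exceed hi (" + hi + ")");
        }
    }
}

final class RangeDemo {
    public static void main(String[] args) {
        System.out.println(new Range(1, 5).hi()); // 5
        try {
            new Range(5, 1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```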
Our goal all along has been to define records as being “just macros”
for a finer-grained set of features. Some of these are motivated by
boilerplate; some are motivated by semantics (coupling semantics of
API elements to state.) In general, records will get there first, and
then ordinary classes will get the more general feature, but the
default answer for "can you relax records, so I can use it in this
case that almost but doesn't quite fit" should be "no, but there will
probably be a feature coming that makes that class simpler, wait for
that."
Some other open issues (please see my writeup at
http://cr.openjdk.java.net/~briangoetz/amber/datum.html for
reference), and my current thoughts on these, are outlined below.
Comments welcome!
- Extension. The proposal outlines a notion of abstract record,
which provides a "width subtyped" hierarchy. Some have questioned
whether this carries its weight, especially given how Scala doesn't
support case-to-case extension (some see this as a bug, others as an
existence proof.) Records can implement interfaces.
- Concrete records are final. Relaxing this adds complexity to the
equality story; I'm not seeing good reasons to do so.
- Additional constructors. I don't see any reason why additional
constructors are problematic, especially if they are constrained to
delegate to the default constructor (which in turn is made far simpler
if there can be statements ahead of the this() call.) Users may find
the lack of additional constructors to be an arbitrary limitation (and
they'd probably be right.)
- Static fields. Static fields seem harmless.
- Additional instance fields. These are a much bigger concern. While
the primary arguments against them are of the "slippery slope"
variety, I still have deep misgivings about supporting unrestricted
non-principal instance fields, and I also haven't found a reasonable
set of restrictions that makes this less risky. I'd like to keep
looking for a better story here, before just caving on this, as I
worry doing so will come back to bite us.
- Mutability and accessibility. I'd like to propose an odd choice
here, which is: fields are final and package (protected for abstract
records) by default, but finality can be explicitly opted out of
(non-final) and accessibility can be explicitly widened (public).
- Accessors. Perhaps the most controversial aspect is that records
are inherently transparent to read; if something wants to truly
encapsulate state, it's not a record. Records will eventually have
pattern deconstructors, which will expose their state, so we should go
out of the gate with the equivalent. The obvious choice is to expose
read accessors automatically. (These will not be named getXxx; we are
not burning the ill-advised JavaBeans naming conventions into the
language, no matter how much people think they already are.) The obvious
naming choice for these accessors is fieldName(). No provision for
write accessors; that's bring-your-own.
- Core methods. Records will get equals, hashCode, and toString.
There's a good argument for making equals/hashCode final (so they
can't be explicitly redeclared); this gives us stronger preservation
of the data invariants that allow us to safely and mechanically
snapshot / serialize / marshal (we'd definitely want this if we ever
allowed additional instance fields.) No reason to suppress override
of toString, though. Records could be safely made cloneable() with
automatic support too (like arrays), but not clear if this is worth it
(it's darn useful for arrays, though.) I think the auto-generated
getters should be final too; this leaves arrays as second-class
components, but I am not sure that bothers me.
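Several of the points above (an additional constructor that delegates to the default one, a static field, accessors named fieldName(), an overridden toString) can be sketched together. This uses the record syntax that later shipped in Java and may not match every choice proposed in this mail; Point and PointDemo are illustrative names.

```java
// Hypothetical record exercising the features listed above.
record Point(int x, int y) {
    // Static fields carry no per-instance state, so they seem harmless.
    static final Point ORIGIN = new Point(0, 0);

    // Additional constructor, constrained to delegate to the default one.
    Point(int both) {
        this(both, both);
    }

    // toString may be redeclared; equals/hashCode are left as generated.
    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }
}

final class PointDemo {
    public static void main(String[] args) {
        Point p = new Point(3);
        System.out.println(p.x() + " " + p.y());         // accessors are x() and y(), not getX()
        System.out.println(p.equals(new Point(3, 3)));   // true: equality derives from state
        System.out.println(p);                           // (3, 3)
        System.out.println(Point.ORIGIN.equals(new Point(0, 0))); // true
    }
}
```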