Seeing no dissent on the claim that the essential use case for ancillary
fields is caching derived properties, let me talk about how I would like
to handle this: lazy (final) fields.
For background, this is something we've been exploring for a long time
(see for example
http://cr.openjdk.java.net/~jrose/draft/lazy-final.html), but this is
also something that we can do in the context of the language if we're
willing to relax the requirements a bit.
The basic idea is that we can describe fields as `lazy` (either static
or instance fields), with an initializer, which are implicitly `final`,
and have the compiler rewrite reads of those fields to do a lazy
initialization instead. For static fields, we can use ConstantDynamic
and get lazy initialization for free; for instance fields, we have to do
a little more work (CASes, fences), but the game is the same.
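To make the instance-field case concrete, here is a rough hand-written approximation of what a read of a lazy instance field might expand to. It is only a sketch under stated assumptions: the class `PersonName`, its fields, and the `computations` counter are hypothetical names invented for illustration, and the actual translation strategy is not specified in this mail. It uses `VarHandle` for the acquire read and the CAS.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for: private lazy String full = first + " " + last;
final class PersonName {
    private static final VarHandle FULL;
    static {
        try {
            FULL = MethodHandles.lookup()
                    .findVarHandle(PersonName.class, "full", String.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private final String first;
    private final String last;
    private String full; // null means "not computed yet"

    // Counts initializer runs; present only to make the behavior observable.
    final AtomicInteger computations = new AtomicInteger();

    PersonName(String first, String last) {
        this.first = first;
        this.last = last;
    }

    private String computeFull() {
        computations.incrementAndGet();
        return first + " " + last;
    }

    String full() {
        // Fast path: an acquire read sees a fully published value, if any.
        String v = (String) FULL.getAcquire(this);
        if (v == null) {
            String computed = computeFull();
            // CAS from null: the first writer wins, racing threads adopt its value.
            String witness =
                    (String) FULL.compareAndExchange(this, (String) null, computed);
            v = (witness == null) ? computed : witness;
        }
        return v;
    }
}
```

Under contention the initializer may run more than once in this sketch, but every caller observes the same published value, which is acceptable when the initializer is a pure function of the object's state.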
This is useful well beyond records. For example, classes like `String`
cache a lazily computed hash code; these classes could just do

    private lazy int cacheHash = computeHashCode();
    public int hashCode() { return cacheHash; }

It's also useful for frequently used static fields:

    private static lazy Logger logger = Logger.of("com.foo.bar");

Much lazy initialization code is error-prone, so this would eliminate
those errors; it's also tempting to avoid lazy initialization where it
might be marginally useful. (Static initializers are also one of the
big pain points in AOT; this eliminates many static initializers.)
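For comparison, this is roughly what one has to write by hand today to get a lazily initialized static field: the initialization-on-demand holder idiom. The class name `BarService` and the `loggerCreated` flag are hypothetical and exist only for illustration; `java.util.logging.Logger.getLogger` stands in for the `Logger.of` call above, which is not a JDK API.

```java
import java.util.logging.Logger;

final class BarService {
    // Flipped when the holder initializes; only here to make laziness observable.
    static volatile boolean loggerCreated = false;

    // The JVM initializes this nested class on first use, under the class
    // initialization lock, so no explicit synchronization is needed.
    private static final class LoggerHolder {
        static final Logger LOGGER = create();

        private static Logger create() {
            loggerCreated = true;
            return Logger.getLogger("com.foo.bar");
        }
    }

    static Logger logger() {
        return LoggerHolder.LOGGER;
    }
}
```

A lazy static field would collapse the holder class and the accessor into a single declaration, and the enclosing class would no longer need a static initializer for it.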
What does this have to do with records? Well, if the goal is to cache
lazily computed values derived from the state, then lazy fields would
give us that without opening up to the full generality of ancillary
fields. We'd then say that records can only have additional _lazy_
instance fields.
(Sometimes lazy fields are cast in the opposite direction -- cached
methods rather than lazy fields. There is an obvious set of tradeoffs
for how to structure it, but neither is strictly more powerful than the
other.)
On 4/13/2018 1:15 PM, Kevin Bourrillion wrote:
As one of the voices demanding we allow ancillary fields, I can
confirm that I had only these derived-state use cases in mind. I don't
see anything else as legitimate. That is, I think that the semantic
invariants you're trying to preserve for records are worth fighting
for, and additional /non-derived/ state would violate them.
On Fri, Apr 13, 2018 at 9:46 AM, Brian Goetz <brian.go...@oracle.com> wrote:
Let's see if we can make some progress on the elephant in the room
-- ancillary fields. Several have expressed the concern that
without the ability to declare some additional instance state, the
feature will be too limited.
The argument in favor of additional fields is the obvious one;
more classes can be records. And there are some arguably valid
use cases for additional fields that don't conflict with the
design center for records. The best example is derived state:
- When a field is a cached property derived from the record state
(such as how String caches its hashCode)
Arguably, if a field is derived deterministically from immutable
record state, then it is not creating any new record state. This
surely seems within the circle.
The argument against is more of a slippery-slope one; I believe
developers would like to view this feature through the lens of
syntactic boilerplate, rather than through semantics. If we let
them, they would surely and routinely do the following:

    record A(int a, int b) {
        private int c;

        public A(int a, int b, int c) {
            this(a, b);
            this.c = c;
        }

        public boolean equals(Object other) {
            return default.equals(other) && ((A) other).c == c;
        }
    }

Here, `c` is surely part of the state of `A`. And, they wouldn't
even know what they'd lost; they would just assume records are a
way of "kickstarting" a class declaration with some public fields,
and then you can mix in whatever private state you want.
Why is this bad? While "reduced-boilerplate classes" is a valid
feature idea, our design goal for records is much more than that.
The semantic constraints on records are valuable because they
yield useful invariants; that they are "just" their state vector,
that they can be freely taken apart and put back together with no
loss of information, and hence can be freely serialized/marshaled
to JSON and back, etc.
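The "take apart and put back together" invariant can be illustrated with ordinary classes. This is only a sketch: `Pair` and `PairWithHiddenState` are hypothetical plain-class stand-ins (record syntax is still under design in this thread), with the second one modeled on `A` above.

```java
import java.util.Objects;

// State is exactly (a, b): accessors expose everything equals() looks at.
final class Pair {
    private final int a;
    private final int b;

    Pair(int a, int b) { this.a = a; this.b = b; }

    int a() { return a; }
    int b() { return b; }

    @Override public boolean equals(Object o) {
        return o instanceof Pair && ((Pair) o).a == a && ((Pair) o).b == b;
    }

    @Override public int hashCode() { return Objects.hash(a, b); }
}

// Like `A`: the extra field c takes part in equals() but is not part of
// the (a, b) state description, so it cannot be recovered from accessors.
final class PairWithHiddenState {
    private final int a;
    private final int b;
    private final int c;

    PairWithHiddenState(int a, int b) { this(a, b, 0); }

    PairWithHiddenState(int a, int b, int c) { this.a = a; this.b = b; this.c = c; }

    int a() { return a; }
    int b() { return b; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof PairWithHiddenState)) return false;
        PairWithHiddenState p = (PairWithHiddenState) o;
        return p.a == a && p.b == b && p.c == c;
    }

    @Override public int hashCode() { return Objects.hash(a, b, c); }
}
```

Rebuilding a `Pair` from its accessors, `new Pair(p.a(), p.b())`, always yields an object equal to `p`. The same round trip on a `PairWithHiddenState` created with a nonzero `c` yields an object that is not equal to the original, which is the loss of information described above.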
We currently prohibit records like `A` via a number of
restrictions: no additional fields, no override of equals. We
don't need all of these restrictions to achieve the desired goal,
but we also can't relax them all without opening the gate. So we
should decide carefully which we want to relax, as making the
wrong choice constrains us in the future.
Before I dive into details of how we might extend records to
support the case of "cached derived state", I'd like to first come
to some agreement that this covers the use cases that we think
fall into the "legitimate" uses of additional fields.
On 3/16/2018 2:55 PM, Brian Goetz wrote:
There are a number of potentially open details on the design
for records. My inclination is to start with the simplest
thing that preserves the flexibility and expectations we want,
and consider opening up later as necessary.
One of the biggest issues, which Kevin raised as a
must-address issue, is having sufficient support for
precondition validation. Without foreclosing on the ability to
do more later with declarative guards, I think the recent
construction proposal meets the requirement for lightweight
enforcement with minimal or no duplication. I'm hopeful that
this bit is "there".
Our goal all along has been to define records as being “just
macros” for a finer-grained set of features. Some of these
are motivated by boilerplate; some are motivated by semantics
(coupling semantics of API elements to state.) In general,
records will get there first, and then ordinary classes will
get the more general feature, but the default answer for "can
you relax records, so I can use it in this case that almost
but doesn't quite fit" should be "no, but there will probably
be a feature coming that makes that class simpler, wait for that."
Some other open issues (please see my writeup at
http://cr.openjdk.java.net/~briangoetz/amber/datum.html
for reference), and my current thoughts on these, are outlined
below. Comments welcome!
- Extension. The proposal outlines a notion of abstract
record, which provides a "width subtyped" hierarchy. Some
have questioned whether this carries its weight, especially
given how Scala doesn't support case-to-case extension (some
see this as a bug, others as an existence proof.) Records can
implement interfaces.
- Concrete records are final. Relaxing this adds complexity
to the equality story; I'm not seeing good reasons to do so.
- Additional constructors. I don't see any reason why
additional constructors are problematic, especially if they
are constrained to delegate to the default constructor (which
in turn is made far simpler if there can be statements ahead
of the this() call.) Users may find the lack of additional
constructors to be an arbitrary limitation (and they'd
probably be right.)
- Static fields. Static fields seem harmless.
- Additional instance fields. These are a much bigger
concern. While the primary arguments against them are of the
"slippery slope" variety, I still have deep misgivings about
supporting unrestricted non-principal instance fields, and I
also haven't found a reasonable set of restrictions that makes
this less risky. I'd like to keep looking for a better story
here, before just caving on this, as I worry doing so will end
up biting us in the back.
- Mutability and accessibility. I'd like to propose an odd
choice here, which is: fields are final and package (protected
for abstract records) by default, but finality can be
explicitly opted out of (non-final) and accessibility can be
explicitly widened (public).
- Accessors. Perhaps the most controversial aspect is that
records are inherently transparent to read; if something wants
to truly encapsulate state, it's not a record. Records will
eventually have pattern deconstructors, which will expose
their state, so we should go out of the gate with the
equivalent. The obvious choice is to expose read accessors
automatically. (These will not be named getXxx; we are not
burning the ill-advised Javabean naming conventions into the
language, no matter how much people think it already is.) The
obvious naming choice for these accessors is fieldName(). No
provision for write accessors; that's bring-your-own.
- Core methods. Records will get equals, hashCode, and
toString. There's a good argument for making equals/hashCode
final (so they can't be explicitly redeclared); this gives us
stronger preservation of the data invariants that allow us to
safely and mechanically snapshot / serialize / marshal (we'd
definitely want this if we ever allowed additional instance
fields.) No reason to suppress override of toString, though.
Records could be safely made cloneable() with automatic
support too (like arrays), but not clear if this is worth it
(it's darn useful for arrays, though.) I think the
auto-generated getters should be final too; this leaves arrays
as second-class components, but I am not sure that bothers me.
--
Kevin Bourrillion | Java Librarian | Google, Inc. | kev...@google.com