Reference-default style

Dan Smith Thu, 19 Dec 2019 13:14:53 -0800

As we flesh out the migration story for inline classes, I've found it useful to 
identify two different styles that programmers will want to follow as they make 
use of inline classes. I want to introduce some terminology to talk about these 
styles, which will hopefully help us think about what use cases we're designing 
for.


-----
Inline-default

In this style, most clients of an inline class will want to treat it like 
another primitive type. They'll use the type directly, and maybe allocate flat 
arrays (or, eventually, other specialized data structures). Operations that 
make use of null or erased generics will be uncommon.

Examples:
- Numeric types that are effectively variations on primitives
- Typed wrappers for single values (e.g., measurements, pointers)
- Low-level flat building blocks for data structures
- Multiple-return structures (like cursors)

Most of our design is tailored to this style. In the language model, 'Point' is 
an inline type, and you need to qualify it as 'Point.ref' to access the 
equivalent reference type (if you must; we expect people to utter 'Point.ref' 
about as frequently as they currently utter 'Integer').
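
For comparison, here is a small sketch using only today's Java (no inline 
classes involved). It assumes 'int' stands in for the inline type and 'Integer' 
for the 'Point.ref' view; the class and method names are made up for 
illustration. The reference type appears only where null or an erased generic 
forces it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration with existing Java types only: 'int' plays the role of the
// inline type, 'Integer' the role of its reference view. Names are hypothetical.
final class PrimitiveVsReference {
    // Typical inline-default usage: the type is used directly, in a flat array.
    static int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // The reference type shows up where null is a meaningful answer...
    static Integer firstNegative(int[] values) {
        for (int v : values) {
            if (v < 0) {
                return v;
            }
        }
        return null;
    }

    // ...or where an erased generic API requires a reference type.
    static List<Integer> boxed(int[] values) {
        List<Integer> out = new ArrayList<>();
        for (int v : values) {
            out.add(v);
        }
        return out;
    }
}
```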

-----
Reference-default

In this style, most clients of an inline class will interact with it through a 
reference type. They'll use nulls and erased generics the same way they always 
have. Clients may not even realize that there is an inline class under the 
hood. Flattening is not a priority, and may even be unwanted (because of 
cycles, tearing, etc.).

Examples:
- Published classes that are already committed to a reference view (Optional, 
LocalDateTime)
- Components in a system that makes heavy use of null or generics
- General-purpose records (e.g., a POJO view of a database)
- Nodes in recursive data structures
- Behavior abstractions (e.g., functions)
- APIs that don't want their clients to have to think about inline types
- Classes without a natural default value
- APIs that want to limit access to default values

"Why even bother with an inline class?" is a fair question for these use cases. 
Some answers:
- Principle of least privilege: if you don't need identity, don't claim it
- Potential for GC improvements* and opportunistic JIT flattening
- A subset of users need flattening, but few enough that it doesn't deserve 
"default" treatment
- A migration strategy—new code should work with inline types, but old code is 
written for reference types (sorry, new code)

(*On GC: do we have good numbers on this? Personally, my choices about adding 
class abstractions often come down to "is this abstraction worth the allocation 
and GC pressure costs associated with lots of new objects?" My performance 
model here is horrible, so who knows if this is a smart question to ask, but it 
would be nice if we could say broadly "use an inline class and stop worrying 
about it.")

Brian proposes an approach to supporting the reference-default style in "State 
of Valhalla", but I'm not sure it's ideal—this design space seems fairly 
unexplored to me still. Briefly, here are some ways the language might support 
it:

1) As a design pattern

We tell programmers to produce two declarations, an inline class and an 
interface (or abstract class, per another thread). The interface gets the "good 
name" and exposes the intended API. The inline class may be exposed with an 
alternate name, for clients who need it, or hidden as private.

The language is still 100% inline-default—the 'Foo.ref' type still exists, but 
it's redundant and nobody needs to use it.

If we did nothing (and honestly, that's an attractive feature roadmap! super 
cheap!), I think we'd see this design pattern developing naturally in the 
community.
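
A minimal sketch of what this pattern might look like follows. Since inline 
class syntax is not available in released JDKs, an ordinary final class stands 
in for the inline class here; the names 'Point', 'PointValue', and 'of' are 
hypothetical and not taken from any proposal.

```java
// The interface gets the "good name" and exposes the intended API.
interface Point {
    int x();
    int y();
    Point plus(Point other);

    // Factory, so that most clients never need to name the implementation.
    static Point of(int x, int y) {
        return new PointValue(x, y);
    }
}

// Stand-in for the inline class. With inline classes this would be declared
// as one; a plain final class is used so the sketch compiles today.
// It has an alternate name and could be exposed only to clients who need it.
final class PointValue implements Point {
    private final int x;
    private final int y;

    PointValue(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override public int x() { return x; }
    @Override public int y() { return y; }

    @Override public Point plus(Point other) {
        return new PointValue(x + other.x(), y + other.y());
    }

    // State-based equality, approximating the identity-free behavior
    // an inline class would be expected to have.
    @Override public boolean equals(Object o) {
        if (!(o instanceof PointValue)) {
            return false;
        }
        PointValue p = (PointValue) o;
        return x == p.x && y == p.y;
    }

    @Override public int hashCode() {
        return 31 * x + y;
    }
}
```

Clients written against 'Point' keep null and erased generics as usual (for 
example, 'Point p = null;' or a 'List<Point>'), while a client that cares about 
flattening would name 'PointValue' directly.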

2) As an "advanced" feature of inline classes

This is the State of Valhalla strategy: inline classes are designed to be 
inline-default, but as a special-case feature, you can also declare the 
'Foo.ref' interface, give it a name, and wire it up to the inline class 
declaration.

In reference-default style, the programmer gives the "good name" to the 
reference projection, and either gives an alternate name to the inline class or 
is able to elide it entirely (in that case, clients use 'Foo.inline').

Ways this differs from (1):
- The 'Foo.inline' type operator
- Implicit conversions (although sealed types can get us there in (1))
- There are two types, not three (and two JVM classes, not three)
- Opportunities for "boilerplate reduction" in the two declarations

3) As an equal partner with inline-default

An inline class declaration introduces two types, an inline type and a 
reference type. But a modifier on the declaration determines whether the "good 
name" goes to the inline type or the reference type. The other type can be 
derived using an operator ('Foo.ref' or 'Foo.inline'). There's never a need for 
an alternate name.

In this case, the language isn't biased to one style or the other; each 
declaration picks one. The trade-off is that clients need to keep track of one 
more bit when thinking about the inline class ("Is this a *foo* inline class or 
a *bar* inline class?" Actual terminology to be bikeshedded...)

4) As the only supported style

An inline class declaration always gives the "good name" to the reference type, 
and you always use an operator to get to the inline type ('Foo.inline'—but 
we're gonna need better syntax.)

This one would represent a significant shift in the design center of the 
feature. If you want flattening everywhere, you're going to need to make 
liberal use of the '.inline' operator. But if you just want to declare that a 
bunch of your classes don't have identity, and hopefully get a cheap 
performance boost as a result, it's simple. The burden of learning something 
new is shifted to "advanced" users and APIs to whom flattening is important.

5) As a use-site contextual option

There's a single inline class declaration with two corresponding types. At the 
use site, the programmer provides context that picks one or the other for the 
"good name" (perhaps as a property of the 'import' statement, or some new 
compiler directive in a source file header, package/module declaration, 
command-line flag, ...).

This probably makes the most sense paired with (4): the *default* default is 
the reference type, but the language lets you switch to the inline type if you 
want. Then, unless the client opts in to inline types, they get familiar 
reference type behavior from the class.

Conclusion:

I'm not ready to completely dismiss any of these designs, but my preferences at 
the moment are (1) and (3). Options (4) and (5) are more ambitious, discarding 
some of our assumptions and taking things in a different direction.

Like many design patterns, (1) suffers from boilerplate overhead ((2) too, 
without some language help). It also risks some missed opportunities for 
optimization or language convenience, because the relationship between the 
inline and reference type is incidental. (I'd like to get a clearer picture of 
whether this really matters or not.)

(2), (3), and (5) suffer from added language complexity. (2) tries to manage it 
by pushing the feature off into "advanced" territory. But, ultimately, you 
can't understand the language without understanding those advanced features—the 
first time you encounter reference-default style, you'll have to rethink your 
understanding of how inline classes work.

(5) feels like something fundamentally new in Java, although if you squint it's 
"just" a variation on name resolution. What originally prompted this idea was 
seeing a similar approach in attempts to introduce nullability type 
operators—legacy code has the "wrong" default, so you need some lightweight way 
to pick a different default.

(4) is a simple and consistent story, but probably not the feature we're 
building. It hinges on how important we think the inline-default use cases are, 
and how painful we think the '.inline' operator (spelling TBD) would be to use 
in those cases.

Since (1) is already done (it's the "do nothing" option), it makes sense to use 
it as a baseline, and then ask whether any of the alternatives are a 
significant enough improvement that they're worth developing. This will be 
informed by our understanding of the use cases for the two styles, and some 
real-world experience would probably help that understanding.
