Closing the loop on this story....

To summarize what's been said on this thread:

 - Everyone agrees that there are at least some value types that don't have a natural (non-null) default, and that making up a default with zeros for these types is, at best, surprising.  As Kevin put it:

Most value types, I think, don't have a zero, and I believe it will be quite damaging to treat them as if they do. If Java doesn't either provide or simulate nullability, then users will be left jumping through a lot of hoops to simulate nullability themselves (`implements IsValid`??).

 - John made an impassioned plea for not inventing new, null-like-but-not-null mechanisms, which can be summarized as "no new nulls":

    http://cr.openjdk.java.net/~jrose/values/nullable-values.html

 - The motivation for supporting "nullable" values is not because we think values should have null in their value set; this is better handled by Optional or type combinators like `Foo?`.  This is really about what happens when someone stumbles across a value that has not been initialized by a constructor (and the most common case here is array elements.)

 - From a user-model perspective, there are a few options. Several folks were bullish on letting the user provide an initial value (say, via a no-arg constructor), but I think this idea runs off the road, since there are some types that _simply have no reasonable (non-null) default value_.  These include domain types like

    value record PersonName(String first, String last);

A default name of ("", "") is only slightly less stupid than a default of (null, null).  These also include inner class instances; if there's no enclosing instance available, what are we going to do?
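To make the hazard concrete, here's a small sketch in today's Java (a simulation, not the proposed value syntax; `DefaultDemo` is a made-up name).  Today, a reference array starts out all-null, so nothing fabricates a `PersonName`; under flattening with a made-up zero default, the same slot would instead hold an instance no constructor ever produced:

```java
// Today's behavior: uninitialized array slots are null, which fails fast on use.
record PersonName(String first, String last) {}

public class DefaultDemo {
    public static void main(String[] args) {
        PersonName[] names = new PersonName[1];
        System.out.println(names[0]);  // null today -- detectable
        // Under flattening with a zero default, names[0] would instead be
        // something like PersonName(first=null, last=null): a "person" that
        // no constructor ever produced, silently usable.
    }
}
```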


Separately, we have explored a number of ways we might implement this in the VM, and I think we have a sensible story.  Some value types are _zero intolerant_ -- this means that the all-zero value is not a member of their value set.  The key observation is:

    nullability, zero-tolerance, flattenability -- pick two

That is, you can have nullable, zero-tolerant values (think `Point?`), but they don't get flattened; or you can have zero-tolerant, flattenable values, but they can't be null.  The third combination (thanks Frederic!) is that it is possible to have nullable, flattenable values, if we make the all-zero representation illegal, and then we use the all-zero representation in the heap to represent `null`, and `getfield` / `aaload` will check for zero on fetch and if zero, put a null on the stack.  (There's a much bigger writeup on this coming; this is the executive summary.)  And because values are monomorphic, different value types can make different choices.
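The zero-as-null encoding in that third combination can be modeled in ordinary Java (hypothetical `ZeroAsNullCell` name; this is a model of the idea, not VM code): the flattened storage reserves the all-zero bit pattern for null, stores reject zero as a value encoding, and loads check for zero and push null instead.

```java
// Model of "all-zero in the heap means null" for a nullable, flattened value.
// A real value class would simply arrange its encoding so that no constructed
// instance is all-zero bits; here we use a raw long as the stand-in.
final class ZeroAsNullCell {
    private long bits;  // flattened storage; fields start all-zero, i.e. null

    void store(Long value) {
        if (value == null) { bits = 0L; return; }  // null is written as zero
        if (value == 0L)
            throw new IllegalArgumentException("all-zero encoding reserved for null");
        bits = value;
    }

    Long load() {
        // getfield/aaload-style check: zero on fetch becomes null on the stack
        return bits == 0L ? null : bits;
    }
}
```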

Further, a key use case is _migrating_ value-based classes (LocalDateTime, Optional) to value types.  The key impediment so far here has been nullability; we can represent them as nullable + flattenable if we're willing to give up zeros.  Since zeros is a pure implementation detail, a class that wants to migrate can always find a representation where there is at least one non-zero bit.
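For instance (a hypothetical device, not what LocalDateTime actually does), a migrating class could reserve one always-set field in its layout, so the all-zero pattern can never collide with a constructed instance:

```java
// Sketch: guaranteeing at least one non-zero bit by construction.
// The tag field is illustrative; a real class might instead bias an existing
// field so that no legal state encodes as all zeros.
final class MigratedBox {
    private final long payload;
    private final byte tag;  // always 1 after construction

    MigratedBox(long payload) {
        this.payload = payload;
        this.tag = 1;  // ensures the flattened layout is never all-zero bits
    }

    long payload() { return payload; }
    boolean constructed() { return tag == 1; }  // false only for the zero default
}
```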

So, the sweet spot seems to be:

 - Values, by default, are non-nullable and flattenable.  The compiler translates value `Point` as `QPoint;`.

 - Users can denote the union of the value set and { null } using an emotional type: `Point?`, which the compiler translates as `LPoint;`.  If a user wants a nullable `Point`, they ask for it; what they give up is flattenability / scalarization.  (I resisted the emotional types as long as I could, but the alignment with the VM implementation was too strong to resist, and this yields significant dividends when we get to the generics story.)  Let's not harp on the details of these types just yet; that's a separate shed to paint.

 - Values that need to defend against uninitialized data, or that are migrated from references, can declare themselves to be "null-default"; the cost is that they must be intolerant of the all-zero value.  These are always translated with `L` carriers, since they are nullable.  Users of these classes pay the extra penalty of checking for zeros when we go between heap and stack, so they are slightly slower, but they are still flattened and scalarized, which is the big benefit. (Again, I resisted John's point about nulls, but eventually the gravity was too strong; if we don't use null here, we'll reinvent a worse null.)

These correspond to the 3-choose-2 combinations derived from the observation above.


From a user model perspective, users choose between zero-default values (the default) and null-default values (opt in), as the semantics demand.  This is easy to understand (in fact, the biggest risk might be that users will like it _too much_, and reach for null-default value classes more often than they should.)  And if you want to represent "maybe Point", you use `Point?` or `Optional<Point>` as needed.

From a VM perspective, we need to support null-default values; while we've not implemented this yet, it seems pretty reasonable.

The bonus is that we have cleared the last blocker to migrating value-based classes to value types; for migrated values, we implicitly make them null-default (also: same treatment for inner value classes), and then migrating Optional and LocalDateTime becomes a completely compatible, in-place move.





On 10/11/2018 10:14 AM, Brian Goetz wrote:
Our story is "Codes like a class, works like an int".  A key part of this is that value types support the same lifecycle as objects, and the same ability to hide their internals.

Except, the current story falls down here, because authors must contend with the special all-zero default value; unlike classes, we cannot guarantee that instances are the result of a constructor.  For some classes (e.g., Complex, Point, etc), this forced-on-you default is just fine, but for others (e.g., wrappers for native resources), this is not unlike the regrettable situation with serialization, where class implementations may confront instances that could not have resulted from a constructor.

Classes guard against this through the magic of null; an instance method will never have to contend with a null receiver, because by the time we transfer control to the method, we'd already have gotten an NPE.  Values do not have this protection.  While there are many things for which we can say "users will learn", I do not think this is one of them; if a class has a constructor, it will be assumed that the receiver in a method invocation is an instance that has resulted from construction.  I do not think we can expose the programming model as-is; it claims to be like classes, but in this aspect is more like structs.

So, some values (but not all) will want some sort of protection against uninitialized values.  One approach here would be to try to emulate null, by, say, injecting checks for the default value prior to dereferences.  Another would be to take the route C# did, and allow users to specify a no-arg constructor, which would customize the default value.  (Since both are opt-ins, we can educate users about the costs of selecting these tools, and users can get the benefits of flatness and density even if these have additional runtime costs.)  The latter route is less rich, but probably workable.  Both eliminate the (likely perennial) surprise over uninitialized values for zero-sensitive classes.
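The first option (injected default-value checks) can be sketched today with an explicit guard (hypothetical `NativeHandle` class; under real flattening the zero check would be compiler-injected rather than hand-written):

```java
// Sketch of "check for the default value prior to dereference" for a
// zero-sensitive value: handle 0 marks an instance no constructor produced.
final class NativeHandle {
    private final long handle;

    NativeHandle(long handle) {
        if (handle == 0)
            throw new IllegalArgumentException("0 is reserved for uninitialized");
        this.handle = handle;
    }

    long use() {
        // the injected check: refuse to operate on the all-zero default
        // (unreachable via this constructor, but flattening can bypass it)
        if (handle == 0) throw new IllegalStateException("uninitialized value");
        return handle;
    }
}
```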



