Closing the loop on this story....

To summarize what's been said on this thread:

 - Everyone agrees that there are at least some value types that don't have a natural (non-null) default, and that making up a default with zeros for these types is, at best, surprising.  As Kevin put it:

Most value types, I think, don't have a zero, and I believe it will be quite damaging to treat them as if they do. If Java doesn't either provide or simulate nullability, then users will be left jumping through a lot of hoops to simulate nullability themselves (`implements IsValid`??).

 - John made an impassioned plea for not inventing new, null-like-but-not-null mechanisms, which can be summarized as "no new nulls":

    http://cr.openjdk.java.net/~jrose/values/nullable-values.html

 - The motivation for supporting "nullable" values is not because we think values should have null in their value set; this is better handled by Optional or type combinators like `Foo?`.  This is really about what happens when someone stumbles across a value that has not been initialized by a constructor (and the most common case here is array elements.)

 - From a user-model perspective, there are a few options. Several folks were bullish on letting the user provide an initial value (say, via a no-arg constructor), but I think this idea runs off the road, since there are some types that _simply have no reasonable (non-null) default value_.  These include domain types like

    value record PersonName(String first, String last);

A default name of ("", "") is only slightly less stupid than a default of (null, null).  These also include inner class instances; if there's no enclosing instance available, what are we going to do?
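To make the hazard concrete, here's a small sketch in today's Java (a simulation, not the proposed value syntax; `DefaultDemo` is a made-up name).  Today, a reference array starts out all-null, so nothing fabricates a `PersonName`; under flattening with a made-up zero default, the same slot would instead hold an instance no constructor ever produced:

```java
// Today's behavior: uninitialized array slots are null, which fails fast on use.
record PersonName(String first, String last) {}

public class DefaultDemo {
    public static void main(String[] args) {
        PersonName[] names = new PersonName[1];
        System.out.println(names[0]);  // null today -- detectable
        // Under flattening with a zero default, names[0] would instead be
        // something like PersonName(first=null, last=null): a "person" that
        // no constructor ever produced, silently usable.
    }
}
```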


Separately, we have explored a number of ways we might implement this in the VM, and I think we have a sensible story.  Some value types are _zero intolerant_ -- this means that the all-zero value is not a member of their value set.  The key observation is:

    nullability, zero-tolerance, flattenability -- pick two

That is, you can have nullable, zero-tolerant values (think `Point?`), but they don't get flattened; or you can have zero-tolerant, flattenable values, but they can't be null.  The third combination (thanks Frederic!) is that it is possible to have nullable, flattenable values, if we make the all-zero representation illegal, and then we use the all-zero representation in the heap to represent `null`, and `getfield` / `aaload` will check for zero on fetch and if zero, put a null on the stack.  (There's a much bigger writeup on this coming; this is the executive summary.)  And because values are monomorphic, different value types can make different choices.
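The zero-as-null encoding in that third combination can be modeled in ordinary Java (hypothetical `ZeroAsNullCell` name; this is a model of the idea, not VM code): the flattened storage reserves the all-zero bit pattern for null, stores reject zero as a value encoding, and loads check for zero and push null instead.

```java
// Model of "all-zero in the heap means null" for a nullable, flattened value.
// A real value class would simply arrange its encoding so that no constructed
// instance is all-zero bits; here we use a raw long as the stand-in.
final class ZeroAsNullCell {
    private long bits;  // flattened storage; fields start all-zero, i.e. null

    void store(Long value) {
        if (value == null) { bits = 0L; return; }  // null is written as zero
        if (value == 0L)
            throw new IllegalArgumentException("all-zero encoding reserved for null");
        bits = value;
    }

    Long load() {
        // getfield/aaload-style check: zero on fetch becomes null on the stack
        return bits == 0L ? null : bits;
    }
}
```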

Further, a key use case is _migrating_ value-based classes (LocalDateTime, Optional) to value types.  The key impediment so far here has been nullability; we can represent them as nullable + flattenable if we're willing to give up zeros.  Since zeros is a pure implementation detail, a class that wants to migrate can always find a representation where there is at least one non-zero bit.
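For instance (a hypothetical device, not what LocalDateTime actually does), a migrating class could reserve one always-set field in its layout, so the all-zero pattern can never collide with a constructed instance:

```java
// Sketch: guaranteeing at least one non-zero bit by construction.
// The tag field is illustrative; a real class might instead bias an existing
// field so that no legal state encodes as all zeros.
final class MigratedBox {
    private final long payload;
    private final byte tag;  // always 1 after construction

    MigratedBox(long payload) {
        this.payload = payload;
        this.tag = 1;  // ensures the flattened layout is never all-zero bits
    }

    long payload() { return payload; }
    boolean constructed() { return tag == 1; }  // false only for the zero default
}
```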

So, the sweet spot seems to be:

 - Values, by default, are non-nullable and flattenable.  The compiler translates value `Point` as `QPoint;`.

 - Users can denote the union of the value set and { null } using an emotional type: `Point?`, which the compiler translates as `LPoint;`.  If a user wants a nullable `Point`, they ask for it; what they give up is flattenability / scalarization.  (I resisted the emotional types as long as I could, but the alignment with the VM implementation was too strong to resist, and this yields significant dividends when we get to the generics story.)  Let's not harp on the details of these types just yet; that's a separate shed to paint.

 - Values that need to defend against uninitialized data, or that are migrated from references, can declare themselves to be "null-default"; the cost is that they must be intolerant of the all-zero value.  These are always translated with `L` carriers, since they are nullable.  Users of these classes pay the extra penalty of checking for zeros when we go between heap and stack, so they are slightly slower, but they are still flattened and scalarized, which is the big benefit. (Again, I resisted John's point about nulls, but eventually the gravity was too strong; if we don't use null here, we'll reinvent a worse null.)

These correspond to the 3-choose-2 combinations derived from the observation above.


From a user model perspective, users choose between zero-default values (the default) and null-default values (opt in), as the semantics demand.  This is easy to understand (in fact, the biggest risk might be that users will like it _too much_, and reach for null-default value classes more often than they should.)  And if you want to represent "maybe Point", you use `Point?` or `Optional<Point>` as needed.

From a VM perspective, we need to support null-default values; while we've not implemented this yet, it seems pretty reasonable.

The bonus is that we have cleared the last blocker to migrating value-based classes to value types; for migrated values, we implicitly make them null-default (also: same treatment for inner value classes), and then migrating Optional and LocalDateTime becomes a completely compatible, in-place move.





On 10/11/2018 10:14 AM, Brian Goetz wrote:
Our story is "Codes like a class, works like an int".  A key part of this is that value types support the same lifecycle as objects, and the same ability to hide their internals.

Except, the current story falls down here, because authors must contend with the special all-zero default value; unlike classes, we cannot guarantee that instances are the result of a constructor.  For some classes (e.g., Complex, Point, etc), this forced-on-you default is just fine, but for others (e.g., wrappers for native resources), this is not unlike the regrettable situation with serialization, where class implementations may confront instances that could not have resulted from a constructor.

Classes guard against this through the magic of null; an instance method will never have to contend with a null receiver, because by the time we transfer control to the method, we'd already have gotten an NPE.  Values do not have this protection.  While there are many things for which we can say "users will learn", I do not think this is one of them; if a class has a constructor, it will be assumed that the receiver in a method invocation is an instance that has resulted from construction.  I do not think we can expose the programming model as-is; it claims to be like classes, but in this aspect is more like structs.

So, some values (but not all) will want some sort of protection against uninitialized values.  One approach here would be to try to emulate null, by, say, injecting checks for the default value prior to dereferences.  Another would be to take the route C# did, and allow users to specify a no-arg constructor, which would customize the default value.  (Since both are opt-ins, we can educate users about the costs of selecting these tools, and users can get the benefits of flatness and density even if these have additional runtime costs.)  The latter route is less rich, but probably workable.  Both eliminate the (likely perennial) surprise over uninitialized values for zero-sensitive classes.
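The first option (injected default-value checks) can be sketched today with an explicit guard (hypothetical `NativeHandle` class; under real flattening the zero check would be compiler-injected rather than hand-written):

```java
// Sketch of "check for the default value prior to dereference" for a
// zero-sensitive value: handle 0 marks an instance no constructor produced.
final class NativeHandle {
    private final long handle;

    NativeHandle(long handle) {
        if (handle == 0)
            throw new IllegalArgumentException("0 is reserved for uninitialized");
        this.handle = handle;
    }

    long use() {
        // the injected check: refuse to operate on the all-zero default
        // (unreachable via this constructor, but flattening can bypass it)
        if (handle == 0) throw new IllegalStateException("uninitialized value");
        return handle;
    }
}
```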



