----- Original Message -----
> From: "Remi Forax" <[email protected]>
> To: "Brian Goetz" <[email protected]>
> Cc: "valhalla-spec-experts" <[email protected]>
> Sent: Saturday, August 3, 2019 19:48:01
> Subject: Re: Collapsing the requirements
> ----- Original Message -----
>> From: "Brian Goetz" <[email protected]>
>> To: "valhalla-spec-experts" <[email protected]>
>> Sent: Saturday, August 3, 2019 18:37:56
>> Subject: Collapsing the requirements
>
>> As Remi noted, we had some good discussions at JVMLS this week. Combining that with some discussions John and I have been having over the past few weeks, I think the stars are aligning to enable us to dramatically slim down the requirements. The following threads have been in play for a while:
>>
>> - John: I hate the LPoint/QPoint distinction
>> - Brian: I hate null-default types
>> - Remi: I hate the V? type
>>
>> But the argument for each of these depended, in some way, on the others. I believe, with a few compromises, we can now prune them as a group, which would bring us to a much lower energy state.
>>
>> ## L^Q World: Goodbye `LV;`
>>
>> We’ve taken it as a requirement that for a value type V, we have to support both LV and QV, where LV is the null-adjunction of QV. This has led to a lot of complexity in the runtime, where we have to manage dual mirrors.
>>
>> The main reason why we wanted LV was to support in-place migration. (In Q-world, LV was the box for QV, so it was natural for migration.) But, as we’ve worked our migration story, we’ve discovered we may not need LV for migration. And if we don’t, we surely don’t need it for anything else; worst-case, we can erase LV to `LValObject` or even `LObject` (or, if we’re worried about erasure and overloading, to something like `LObject//V` using John’s type-operator notation.)
>>
>> Assuming we can restructure the migration story to not require LV to represent a VM-generated “box” (which I believe we can, see below), we can drop the requirement for LV. An inline class V gives rise to a single type descriptor, QV (or whatever we decide to call it; John may have plans here.)
>>
>> ## Goodbye `V?`
>>
>> The other reason we wanted LV was that it was the obvious representation for the language type `V?` (V adjoined with null.) Uses for `V?` include:
>>
>> - Denoting non-flattened value fields;
>> - Denoting non-flattened value arrays;
>> - Denoting erased generics over values (`Foo<V?>`);
>> - Denoting the type that is the adjunction of null to V (V | Null), when we really want to talk about nullability.
>>
>> But, we can do all this without a `V?` type; for every V, there is already at least one supertype of V that includes `V|Null`: Object, and any interface implemented by V. If we arrange that every value type V has a supertype V’, not implemented by any other type, then the value set of this V’ is exactly that of `V?`. And we can use V’ to do all the things `V?` did with respect to V, including subtyping. The language doesn’t need the `?` type operator, it just needs to ensure that V’ always exists. Which turns out to be easy, and also turns out to be essential to the migration story.
>>
>> #### Eclairs
>>
>> We can formalize this by requiring that every value type have a companion interface (or abstract class) supertype. Define an envelope-class pair (“eclair”) as a pair (V, I) such that:
>>
>> - V is an inline class
>> - I is a sealed type
>> - I permits V (and only V)
>> - V <: I
>>
>> (We can define eclairs for indirect classes, but they are less interesting, because indirect classes already contain null.)
>>
>> If every value type is a member of an eclair, we can use V when we want the flattenable, non-nullable, specializable type; and we use I when we want the non-flattenable, nullable, erased “box”. We don’t need to denote `V?`; we can just use I, which is an ordinary, nominal type.
>>
>> Note that the VM can optimize eclairs about as well as it could for LV; it knows that I is the adjunction of null to V, so that all non-null values of I are identity free and must be of type V.
>>
>> What we lose relative to V? is access to fields; it was possible to do `getfield` on an LV, but not on I. If this is important (and maybe it’s not), we can handle this in other ways.
>>
>> #### With sugar on top, please
>>
>> We can provide syntax sugar (please, let’s not bike shed it now) so that an inline class _automatically_ acquires a corresponding interface (if one is not explicitly provided), onto which the public members (and type variables, and other supertypes) of C are lifted. For sake of exposition, let’s say this is called `C.Box`, and is a legitimate inner class of C (which can be generated by the compiler as an ordinary classfile.) We’ve been here before, and abandoned it because “Box” seemed misleading, but let’s call it that for now. And now it is a real nominal type, not a fake type. In the simplest case, merely declaring an inline class could give rise to V.Box.
>>
>> Now, the type formerly known as `V?` is an ordinary, nominal interface (or abstract class) type. The user can say what they mean, and no magic is needed by either the language or the VM. Goodbye `V?`.
>>
>> #### Boxing conversion
>>
>> Given the constraints of the eclair relationship, it would be reasonable for the compiler to derive from this that there is a boxing conversion between C and I (I is just the value set of C, plus null, which is the relationship boxes have with their corresponding primitives.) The boxing operation is a no-op (since C <: I) and the unboxing operation is a null checking cast.
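The eclair shape and its boxing conversion can be approximated in today's Java (17 or later), as a rough sketch only. A record is used here as a stand-in for an inline class, which it is not: it has identity and is not flattened. The names `EclairSketch`, `PointBox`, `Point`, `box`, and `unbox` are hypothetical and do not come from the proposal.

```java
import java.util.Objects;

class EclairSketch {
    // The "I" half of the eclair: a sealed type that permits only Point.
    sealed interface PointBox permits Point {
        int x();
        int y();
    }

    // The "V" half. ASSUMPTION: a record stands in for an inline class;
    // it models the subtyping relationship only, not flattening.
    record Point(int x, int y) implements PointBox {}

    // Boxing is a no-op widening, because Point <: PointBox.
    static PointBox box(Point p) {
        return p;
    }

    // Unboxing is a null-checking cast; sealing guarantees that any
    // non-null PointBox is a Point.
    static Point unbox(PointBox b) {
        return (Point) Objects.requireNonNull(b, "box is null");
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        PointBox b = box(p);          // nullable, erased "box" view
        Point back = unbox(b);        // fails with NPE if b were null
        System.out.println(back.equals(p));
    }
}
```

In the proposal the compiler would insert these conversions implicitly; here they are ordinary static methods so the null check is visible.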
>>
>> #### Erased generics
>>
>> Using the eclair wrapper also kicks the problem of erased generics down the road; if we use `Foo<I>` for erased generics, and temporarily ban `Foo<V>`, when we get to specialized generics, it will be obvious what `Foo<V>` means (their common supertype will be `Foo<? extends I>`). This is a less confusing world, as then “List of erased V” and “List of specialized V” don’t coexist; there’s only “List of V” and “List of V’s Box”.
>>
>> ## Migration
>>
>> The ability to migrate Optional and friends to values has been an important goal, but it has been the source of significant complexity. Our previous story leaned hard on “When we migrate X to a value, LX will describe the box, so old callsites will continue to link.” But it turned out that brought a lot of baggage (forwarding bridges, null-default values) and compromises (null-default values lose their calling-convention optimizations), and over the past few weeks John and I have been cooking up a simpler eclair-based recipe for this.
>>
>> The world is indeed full of existing utterances of `LOptional`, and they will still want to work. Fortunately, Optional follows the rules for being a value-based class. We start with migrating Optional from a reference class to an eclair with a public abstract class and a private value implementation. Now, existing code just works (source and binary), and optionals are values. But, this isn’t good enough; existing variables of type Optional are not flattened.
>>
>> One of the objections raised to in-place migration was nullity; in order to migrate Optional to a true value, it would have to be a null-default value, and this already entailed compromises. If we’re willing to compromise further, we can get what we want without the baggage. And that compromise is: give up the name.
>>
>> So we define a new public value class `Opt<T>` which is the value half of the eclair, and the existing Optional is the interface/abstract class half. Now, existing fields / arrays can migrate gradually to Opt, as they want the benefit of flattening; existing APIs can continue to truck in Optional (which have about the same optimizations as a null-default value would have on the stack.)
>>
>> This works because of the boxing conversion. Suppose we have old code that does:
>>
>>     Optional o = makeAnOptional()
>>
>> when the user changes this to
>>
>>     Opt o = …
>>
>> the compiler sees the RHS is an Optional and the LHS is an Opt, and there is a boxing conversion between them, so we insert an unbox conversion (null check) and we’re done. Users can migrate their fields gradually. The cost: the good name gets burned. But there is a compatible migration path from ref to value.
>>
>> Later, when we have bridges (we don’t need them yet!), we can migrate the library uses from Optional to Opt.
>>
>> ## Null-default values
>>
>> About 75% of the motivation for null-default values (another huge source of complexity) was to support the migration of value-based classes. And it wasn’t even a great solution, because we still lost some key optimizations (e.g., calling conventions.) With the Optional -> Opt path, we don’t need null-default values, we get ordinary values. So while we pay the cost of changing the name, we gain the benefit that the new values, once the full migration is effected, don’t carry the legacy performance baggage.
>>
>> Another 20% of the motivation was for security-sensitive classes whose default value did not represent a useful value, for which we wanted not null-default-ness but really initialization safety. Let’s look at another way to get there.
>>
>> There are a few ways to get there.
>> One is to treat this problem as protecting such classes from uninitialized fields or array elements; another is to ensure that such classes (a) have no public fields and (b) perform the correct check at the top of each method (which can be injected by the compiler.) I don’t want to solve that problem right here, but I think there are enough ways to get there that we can assume this isn’t a hard requirement.
>>
>> The other 5% was just the user-based “I want null in my value set.” For those, we can tell users: use the interface box when you need null.
>>
>> ## Summary
>>
>> In one swoop, we can banish LV from the VM, V? from the language, and null-default values, by making a simple requirement: every value type is paired with an interface or abstract class “box”. For most values, this can be automatically generated by the compiler and denoted via a well-known name (e.g., V.Box); for some values, such as those that are migrated from reference types, we can explicitly declare the box type and pick explicit names for both types.
>>
>> There’s a lot to work out, but I think it should be clear enough that this is a much, much lower energy state than what we were aiming at for L10, and also a simpler user model.
>>
>> Let’s focus discussions on validating the model first before we dive into mechanism or surface syntax.
>
> Trying to implement the Eclair interface by hand, it seems we need the interface method and the implementation method to use covariant return types, the box version returning a box while the inline class version returns the inline class (which is fine because it's a subtype); otherwise, when you call a method of the inline class the result is the box, so you lose the non-null property when chaining calls.
So it depends on:

- whether you want to 'emulate' a value-based class, in which case the eclair is, for example, Optional and the inline class can have a specific name;
- whether you want an inline class and an eclair only for interacting with erased generics, like Complex and Complex.box;
- whether the inline class uses covariant return types.

So there is no good solution that fits them all, which suggests that we should not provide any special compiler support.

Rémi
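To make the covariant-return point concrete, here is a minimal hand-written sketch in Java 17 or later. It again uses a record as a stand-in for an inline class (an assumption; records are not inline classes), and the names `CovariantEclair`, `ComplexBox`, `Complex`, and `add` are hypothetical, chosen only for illustration.

```java
class CovariantEclair {
    // Box half: its methods are declared in terms of the box type.
    sealed interface ComplexBox permits Complex {
        double re();
        double im();
        ComplexBox add(ComplexBox other);
    }

    // Value half. ASSUMPTION: a record stands in for an inline class.
    record Complex(double re, double im) implements ComplexBox {
        // Covariant return: narrows ComplexBox to Complex, so calls made
        // through a Complex-typed receiver keep the non-null value type.
        @Override
        public Complex add(ComplexBox other) {
            return new Complex(re + other.re(), im + other.im());
        }
    }

    public static void main(String[] args) {
        Complex a = new Complex(1, 2);
        Complex b = new Complex(3, 4);

        // Chaining stays in Complex, no cast or null check needed.
        Complex sum = a.add(b).add(a);

        // Through the box type, the static result type is the box again.
        ComplexBox viaBox = ((ComplexBox) a).add(b);

        System.out.println(sum + " " + viaBox);
    }
}
```

Without the narrowed return type on `Complex.add`, the expression `a.add(b).add(a)` would have static type `ComplexBox`, and assigning it to a `Complex` variable would need an explicit unboxing cast at every step of the chain.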
