Re: Superclasses for inline classes

John Rose Thu, 19 Dec 2019 18:14:09 -0800

On Dec 18, 2019, at 3:57 PM, Dan Smith <daniel.sm...@oracle.com> wrote:
> 
> [Expanding on and summarizing discussion about abstract superclasses from 
> today's meeting.]
> 
> -----
> Motivation
> 
> There are some strong incentives for us to support inline classes that have 
> superclasses other than Object. Briefly, these include:
> 
> - Identity -> inline migration candidates (notably java.lang.Integer) often 
> extend abstract classes
> - A common refactoring may be to extend an existing class with a (possibly 
> private) inline implementation
> - Abstract classes are more expressive than interfaces
> - If we compile Foo.ref to an abstract class, we can better represent the 
> full API of an inline class using an abstract class


I’m glad we are cracking open this can of worms; I’ve always thought
that interfaces as inline supers were good enough but not necessarily
the whole story.  

(At the risk of instilling more terror, I’ll say that I think that an abstract
super to an inline could contribute non-static fields, in a way that is
meaningful, useful, and efficient.  The initialization of such inherited
fields would of course use withfield and would require special rules
to allow the initialization to occur in the subclass constructor/factory.
I suppose this is a huge feature, as Dan says later.  A similar effect will
be available from templates, with less special pleading.)

(Does it make sense to allow an abstract class to *also* be inline?
Maybe, although there is a serious question about its default value.
If a type is abstract its default value is probably required to be null.)

A useful organizing concept for abstract supers, relative to inlines,
is a pair of bits, not both true, “always-inlined” and “never-inlined”.
Object and interfaces have neither mark by default.  The super of
an identity class cannot be “always-inlined” and the super of an
inline class cannot be “never-inlined”.  Or, an identity (resp. inline)
class has the “never-inlined” (resp. “always-inlined”) bit set.  And
for every T <: U in the class hierarchy, if T is always-inlined, then
U must not be never-inlined, and vice versa.  Thus if U is marked
then every T <: U is forbidden to have the opposite mark.  Or,
even more simply, both bits are deemed to inherit down to all
subtypes, and no type may contain both marks.
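
As a rough illustration of the "inherit down, never both" formulation, here
is a minimal Java sketch over a modeled hierarchy.  The names InlineMarks,
Type, and Mark are made up for this example and the class names in main are
placeholders; nothing here is a real API.

```java
import java.util.EnumSet;
import java.util.Set;

final class InlineMarks {
    enum Mark { ALWAYS_INLINED, NEVER_INLINED }

    /** Modeled type (hypothetical): its own declared marks plus its direct supertypes. */
    static final class Type {
        final String name;
        final Set<Mark> declared;
        final Type[] supers;

        Type(String name, Set<Mark> declared, Type... supers) {
            this.name = name;
            this.declared = declared;
            this.supers = supers;
        }

        /** Marks inherit down: union of the declared marks and every supertype's marks. */
        Set<Mark> effectiveMarks() {
            EnumSet<Mark> marks = EnumSet.noneOf(Mark.class);
            marks.addAll(declared);
            for (Type s : supers) {
                marks.addAll(s.effectiveMarks());
            }
            return marks;
        }

        /** Well formed means the type does not end up carrying both marks. */
        boolean isWellFormed() {
            return effectiveMarks().size() < 2;
        }
    }

    static final Set<Mark> NONE = EnumSet.noneOf(Mark.class);

    public static void main(String[] args) {
        Type object = new Type("Object", NONE);
        Type shape = new Type("Shape", NONE, object);                              // unmarked abstract super
        Type point = new Type("Point", EnumSet.of(Mark.ALWAYS_INLINED), shape);    // inline class
        Type widget = new Type("Widget", EnumSet.of(Mark.NEVER_INLINED), shape);   // identity class
        Type bad = new Type("Bad", EnumSet.of(Mark.ALWAYS_INLINED), widget);       // inline under identity

        System.out.println(point.isWellFormed());   // true
        System.out.println(widget.isWellFormed());  // true
        System.out.println(bad.isWellFormed());     // false
    }
}
```

The unmarked Shape can sit above both kinds of subclass, which is the
property we want Object and interfaces to keep.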

I don’t know how to derive those bits from surface syntax.  A marker
interface for each is a first cut: AlwaysInlined, NeverInlined.  Marker
interfaces are kind of smelly.  These particular ones work a little better
than their complements (InlineFriendly, IdentityFriendly) because
they exclude options rather than include them.
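
For what it's worth, the marker-interface first cut can be tried in today's
Java.  This is only a sketch: AlwaysInlined and NeverInlined are the
hypothetical markers named above, the final class Point merely stands in for
an inline class (which the language does not have yet), and the check is an
ordinary reflective test rather than anything the compiler or JVM enforces.

```java
final class MarkerCheck {
    /** Hypothetical marker interfaces; they exclude options rather than grant them. */
    interface AlwaysInlined {}
    interface NeverInlined {}

    /** An unmarked abstract super: usable by both kinds of subclass. */
    static abstract class Shape {}

    /** Stand-in for an inline class. */
    static final class Point extends Shape implements AlwaysInlined {}

    /** Stand-in for an identity class. */
    static class Widget extends Shape implements NeverInlined {}

    /** Ill formed: inherits NeverInlined from Widget and adds AlwaysInlined. */
    static final class Bad extends Widget implements AlwaysInlined {}

    /** True if the class carries both marks, which the proposed rule forbids. */
    static boolean hasConflictingMarks(Class<?> c) {
        return AlwaysInlined.class.isAssignableFrom(c)
            && NeverInlined.class.isAssignableFrom(c);
    }

    public static void main(String[] args) {
        System.out.println(hasConflictingMarks(Shape.class));   // false
        System.out.println(hasConflictingMarks(Point.class));   // false
        System.out.println(hasConflictingMarks(Widget.class));  // false
        System.out.println(hasConflictingMarks(Bad.class));     // true
    }
}
```

Note that javac happily accepts Bad today; the conflict would have to be
rejected by a new rule, which is part of why plain markers feel smelly.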

(Could a *non-abstract* inline be a super of another inline?  No, I’d
like to draw the line there, because that leads to paradoxes with flattening,
or else makes the super non-flattenable in most uses, or violates a
substitutability rule.)

> To be clear, much of this has to do with migration, and I subscribe to a 
> fairly expansive view of how much we should permit and encourage migration. I 
> think most every project in the world has at least a few opportunities to use 
> inline classes. Our design should limit the friction necessary (e.g., 
> disruptive redesigns of type hierarchies) to integrate inline classes with 
> the existing body of code.

You have a point.  Migration is not a task but a way of life?

> We've considered, as an alternative, supporting transparent migration of 
> existing classes to interfaces. But this raises many difficult issues 
> surrounding source, binary, and behavioral compatibility. It would be nice 
> not to have to tackle those issues, nor introduce a lot of caveats into the 
> class -> interface migration story.

As I said earlier, for value types we have this recurring need to bend
interfaces to be more like abstract classes, or else allow abstract classes
to become more like interfaces.

> -----
> Constraints
> 
> Inline class instantiation is fundamentally different from identity class 
> instantiation. While the language seeks to smooth over these differences, 
> under the hood all inline objects come from 'defaultvalue' and 'withfield' 
> invocations. There is no opportunity in these bytecodes for a superclass to 
> execute initialization code.

In the case we are discussing, the interface-like trick that abstract
classes need to learn is to have (declaratively) empty constructors.

I think that if a class (abstract or not) has a non-empty constructor,
it must also be given the “never-inlined” mark.  (This is one reason
that mark isn’t simply a marker interface.)  In this way (or some
equivalent) a class with a non-empty constructor will never attempt
to be the super of an inline.

> (Could we redesign the construction model to properly delegate to a 
> superclass? Sure, but that's a huge new feature that probably isn't justified 
> by the use cases.)

Probably not.  Unless folks demand to factor fields as well as behaviors
into abstract supers of inlines.

> As a result, constructors, instance initializers, and instance fields in a 
> superclass are unusable to inline class instances. In fact, their existence 
> would be a vulnerability, since authors typically make assumptions about 
> initialization having occurred.

If constructors are declaratively empty, it follows that subclasses must be
given both responsibility and authority to initialize all fields of supers
with empty constructors.  The easiest way to handle this is to forbid
such fields.  Another way is to treat field initialization as a protected
activity (may occur in a subclass constructor).  Currently it is arguably
a private activity (must occur in the same class constructor).

So:

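// Hypothetical syntax: S declares a bodiless, declaratively empty constructor.
abstract class S {
  __DeclarativelyEmpty S();
  final int x;  // blank final, left for a subclass to initialize
}
final class C extends S {
  final int y;
  C(int a, int b) {
    super.x = a;  // "protected initialization" of the inherited blank final
    this.y = b;
  }
}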

My take is that it’s doable, and its worth is unproven at present.

> Fortunately, 'Object' doesn't require any initialization and so can safely be 
> extended. Our goal is to expand the set of safe-to-extend classes.
> 
> -----
> Language model
> 
> An inline class may extend another class, as long as the superclass has the 
> following properties:
> - It has no instance fields
> - It has no constructors
> - It has no instance initializers
> - It is abstract* or Object
> - It extends another class with these properties

It all comes down to this I think:

- The class has a declaratively empty constructor.  (Perhaps it may have 
others!)

With the following axioms and inferences:

- Object has a declaratively empty constructor.  (And no others, in fact.)
- All interfaces have a declaratively empty constructor.  (And no others, in 
fact.)
- A class (other than Object) with a declaratively empty constructor must have 
a super with declaratively empty constructor.
- A class with a declaratively empty constructor must have only static initializers.
- Checking of blank final initialization is performed as if a declaratively 
empty constructor in fact has an empty body.

Therefore, a class with a declaratively empty constructor must not declare 
final fields, or else we must extend DA/DU rules for blank finals to allow 
“protected initialization”, as sketched above.

I think it’s better to tease the conditions apart in this way rather than lump 
them all together.
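
To make the teased-apart conditions concrete, here is a rough Java sketch
that checks them over a modeled class hierarchy.  ClassModel and all of its
fields are hypothetical names invented for the example, and this version
takes the simple route of forbidding blank finals rather than allowing
"protected initialization".

```java
final class DeclarativelyEmpty {
    /** Modeled class; every field here is an assumption made for illustration. */
    static final class ClassModel {
        final String name;
        final ClassModel superclass;              // null for Object and for interfaces
        final boolean isInterface;
        final boolean hasDeclarativelyEmptyCtor;
        final boolean hasInstanceInitializers;
        final boolean hasBlankFinalFields;

        ClassModel(String name, ClassModel superclass, boolean isInterface,
                   boolean hasDeclarativelyEmptyCtor, boolean hasInstanceInitializers,
                   boolean hasBlankFinalFields) {
            this.name = name;
            this.superclass = superclass;
            this.isInterface = isInterface;
            this.hasDeclarativelyEmptyCtor = hasDeclarativelyEmptyCtor;
            this.hasInstanceInitializers = hasInstanceInitializers;
            this.hasBlankFinalFields = hasBlankFinalFields;
        }
    }

    /** Applies the axioms one at a time, walking up the superclass chain. */
    static boolean mayBeSuperOfInline(ClassModel c) {
        if (c.isInterface) return true;                  // interfaces qualify by axiom
        if (!c.hasDeclarativelyEmptyCtor) return false;  // the key property
        if (c.hasInstanceInitializers) return false;     // only static initializers allowed
        if (c.hasBlankFinalFields) return false;         // no protected initialization in this sketch
        if (c.superclass == null) return true;           // Object qualifies by axiom
        return mayBeSuperOfInline(c.superclass);         // the property is contagious upward
    }

    public static void main(String[] args) {
        ClassModel object = new ClassModel("Object", null, false, true, false, false);
        ClassModel base = new ClassModel("Base", object, false, true, false, false);
        ClassModel stateful = new ClassModel("Stateful", object, false, false, true, false);
        ClassModel child = new ClassModel("Child", stateful, false, true, false, false);

        System.out.println(mayBeSuperOfInline(base));      // true
        System.out.println(mayBeSuperOfInline(stateful));  // false
        System.out.println(mayBeSuperOfInline(child));     // false
    }
}
```

Each condition is a separate line in the check, so relaxing one of them
later (say, the blank-final rule) does not disturb the others.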

> Subtype polymorphism works the same for superclasses as it does for 
> superinterfaces.
> 
> (*Remi points out that we could drop the 'abstract' restriction, meaning 
> there may be identity instances of the superclass. Given the restriction on 
> fields, though, I'm struggling to envision a use case; the consensus is that 
> 'new Object()' is probably something we want to *stop* supporting.)

Pushing the other way, and given that the restriction on fields could be 
lifted, there are good use cases like Integer.  The inline subtypes of Integer 
would use the declaratively empty constructor, while the identity instances 
would use a different constructor.

And Integer would be marked neither “always-inlined” nor “never-inlined”, so the 
subtype “int” would work OK as an inline.  Its constructor would do a 
“withfield” to initialize Integer.value, as a protected blank final.

If we conclude that there are no use cases for such abstract-final fields, fine.
But claiming that such a use case cannot exist because there are no such fields
is a mere circularity.

> Call a class that satisfies these constraints an "initialization-free class" 
> (bikeshedding on this term is welcome!).

See above.  And note that a class might have *both* initialization free modes
and regular constructors, *at the same time*.  The key property is that there
is a constructor which is declaratively empty.  That provides us the API
surface we need to fit things together properly in the subclass.

> Like an interface, its value set may include references to both inline class 
> instances and identity class instances.

OK.

> We *do not* want the initialization-free property to be expressed as a class 
> modifier—this feature is too obscure to deserve that much prominence, 
> encouraging every class author to consider one more degree of freedom; and we 
> don't want every class to have to manually opt in.

I agree.  It’s really a modifier on a constructor, isn’t it?  I suppose it always
goes on the default (nullary) constructor.

> But we *do* need the initialization-free property to be part of the public 
> information about the class. For example, the javadoc should say something 
> like "this is an initialization-free class". Otherwise, it's impossible to 
> tell the difference between, e.g., a class with no fields and a class with 
> private fields.
> 
> In the past, a Java class declaration that lacks a constructor always got a 
> default constructor. In this model, however, an initialization-free class has 
> no constructor at all. A 'super()' call directed at such a class is a no-op.

Yep.  And that’s (one reason) why declarative emptiness differs from textual
emptiness and must be contagious upward.

> 
> -----
> Compilation & JVM support
> 
> There are two alternative compilation strategies:
> 
> 1) An initialization-free class is compiled like always, including an 
> '<init>' method of the form 'aload_0; invokespecial; return;'. Some metadata 
> (flag or attribute) indicates that the class is initialization-free.

Yep, maybe.  The property is verifiable by inspecting bytecodes.
But if we are going to verify that property, since we are changing
the verifier, we could make the constructor unambiguously empty.
I suggest marking it ACC_ABSTRACT, which is currently a disallowed
marking, but it carries the correct connotations (body is disallowed,
not merely trivial).  A class with an ACC_ABSTRACT constructor
is forbidden to extend a class without a corresponding constructor.
Perhaps Object is a special case, or perhaps its sole constructor is
explicitly marked __DeclarativelyEmpty.  Perhaps the spelling of
__DeclarativelyEmpty is “abstract”.  That wouldn’t make me sad.

> The initialization-free flag is partially validated at class load time: it's 
> an error to claim to be initialization-free and be non-abstract (and 
> non-Object), declare instance fields, or extend a non-initialization-free 
> superclass.
> 
> We don't validate <init> method contents. If someone chooses to generate an 
> "initialization-free" class file that contains <init> code, they accept the 
> risk that the code won't run.

Ugh.  We don’t because we can’t reliably.  But we should.  I think
the ACC_ABSTRACT option is better for that reason.
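
A sketch of why the flag is easy to check, assuming a hand-rolled model of a
method_info entry (MethodModel is a hypothetical name, not a real class-file
API).  ACC_ABSTRACT is 0x0400, the same bit that java.lang.reflect.Modifier
tests, so no bytecode inspection is needed.  Today's JVM rejects this flag on
<init>; accepting it is the proposed change.

```java
import java.lang.reflect.Modifier;

final class AbstractInitCheck {
    /** Modeled method_info entry (hypothetical): name, descriptor, access flags. */
    static final class MethodModel {
        final String name;
        final String descriptor;
        final int accessFlags;

        MethodModel(String name, String descriptor, int accessFlags) {
            this.name = name;
            this.descriptor = descriptor;
            this.accessFlags = accessFlags;
        }
    }

    /** True for a nullary <init> that carries ACC_ABSTRACT (0x0400). */
    static boolean isAbstractNullaryInit(MethodModel m) {
        return m.name.equals("<init>")
            && m.descriptor.equals("()V")
            && Modifier.isAbstract(m.accessFlags);
    }

    public static void main(String[] args) {
        MethodModel proposed = new MethodModel("<init>", "()V", Modifier.PROTECTED | Modifier.ABSTRACT);
        MethodModel ordinary = new MethodModel("<init>", "()V", Modifier.PUBLIC);
        MethodModel withArg = new MethodModel("<init>", "(I)V", Modifier.PUBLIC | Modifier.ABSTRACT);

        System.out.println(isAbstractNullaryInit(proposed));  // true
        System.out.println(isAbstractNullaryInit(ordinary));  // false
        System.out.println(isAbstractNullaryInit(withArg));   // false
    }
}
```

The point is that the check reads one flag word and one descriptor, with no
need to trust or parse a Code attribute.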

> On loading, an inline class must extend an initialization-free class.
> 
> 2) An initialization-free class is compiled without an '<init>' method.
> 
> For binary compatibility, existing references to 'Foo.<init>' must 
> successfully resolve, with invocation being a no-op. (This is something 
> new—resolution to fake declarations—and potentially concerning.)
> 
> At class load time, an inline class must extend a chain of superclasses that 
> are abstract (or Object), lack '<init>' methods, and lack instance field 
> declarations.

That’s relatively magical, reasoning from the absence of something to
the presence of a special contract.  Yuck.  Also it makes it impossible
for a class to have other kinds of constructors (Integer).  Maybe that’s
OK in the end, but it seems to cut off some natural moves.

> Existing classes that meet these requirements may act as inline class 
> superclasses (probably surprisingly, since currently a class without an 
> <init> method can't be initialized at all).
> 
> (My thoughts on (1) vs. (2): both are plausible, and I like the lack of 
> metadata overhead in (2), but otherwise (1) seems much cleaner.)
> 
> In either case, the big new feature here is that an inline class may have a 
> superclass other than Object. This may violate some existing assumptions in 
> the implementation, although it sounds like we can hope nothing new really 
> needs to be done to support it.

So, (3) allow the constructor with no arguments to be declared “abstract”
with no body.  Amend the JVM rules to allow this, and to check upward
to the super for the same condition.  During static checking of source,
treat such a constructor as empty, *and* as forbidding non-static initializers.
I think that gets what we need in a much clearer manner.

Then, all supers of an inline are required to have either no constructors
(interfaces) or a nullary abstract constructor.  Done.

I like (3) better than (1) or (2).  :-)

— John
