Re: Value types questions & comments

Brian Goetz Tue, 12 Apr 2016 13:53:29 -0700

> I would assume we're not actually changing anything about primitive boxing, 
> here...?


So, this is rife with tradeoffs….

The legacy boxes are inferior to the new boxes, for a number of reasons.  The 
association between QComplex; and LComplex; is mechanical and simple, whereas 
the association between int and Integer is ad-hoc and complex.  And since the 
new boxes are new, they can be defined from the get-go to have relaxed identity 
semantics, enabling optimizations and defending against possible bugs (e.g., 
they could throw when synchronized upon).  By contrast, it is valid today to 
synchronize on a j.l.Integer, and existing code does this (shame, shame), so we 
can’t necessarily take liberties with the identity of the box for optimization 
purposes.  
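To make the identity hazard concrete, here is a minimal sketch in today's Java (not in any value-types prototype). The class and method names are made up for illustration. It relies only on the language rule that boxing an int in the range -128..127 yields the same Integer object, so two locks that look independent end up sharing one monitor.

```java
// Sketch: identity-sensitive use of a legacy box (names are hypothetical).
final class LegacyBoxIdentity {
    // Both fields are boxed from the same small int, so they refer to
    // the same cached Integer object and therefore share one monitor.
    static final Integer LOCK_A = 42;
    static final Integer LOCK_B = 42;

    // True because boxing of values in -128..127 is identity-preserving.
    static boolean locksShareMonitor() {
        return LOCK_A == LOCK_B;
    }

    // Locking LOCK_A also means the current thread holds LOCK_B's monitor.
    static boolean holdsBothWhenLockingA() {
        synchronized (LOCK_A) {
            return Thread.holdsLock(LOCK_B);
        }
    }

    public static void main(String[] args) {
        System.out.println(locksShareMonitor());
        System.out.println(holdsBothWhenLockingA());
    }
}
```

Code like this observes the identity of the box, which is why that identity cannot simply be relaxed for existing classes such as Integer.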

So it would be great if we could get away with having new mechanically 
generated primitive box classes, and deprecate Integer, but I have deep doubts 
we’ll be able to get away with that.  So you’re probably right that we’re stuck 
with primitive boxing mostly as is.  

> 
> We have classes at the source level — this will probably expand to include 
> value types.  We have class files — this will probably similarly expand.  I 
> don’t think these will be controversial.  But I think we need to call the 
> runtime entities something else — like TYPE and TYPE MIRROR.  The meaning of 
> “class” is already too overloaded.  Again, though, the game here is to frame 
> the old reality as a lower-dimensional projection of the new reality, and 
> this doesn’t seem impossible.  
> 
> "Is a class from the source/bytecode perspective, isn't a class from the 
> runtime perspective" is worth shooting for, but it seems difficult to even 
> get it down to something that simple. I mean, at runtime this is still a 
> thing that gets loaded and initialized by a class loader, yes? I fear we will 
> never find a clean way to address this.

One terminology we’ve been experimenting with is having “class” and “species” 
(think back to middle school: kingdom, phylum, class, order, family, genus, 
species.)  List is a class; List<int> and List<erased> are species of List.  
Similarly, the boxed projection and the value projection of Complex are both 
species of class Complex.  

Not clear whether this is the right terminology, but it gives users a way to 
keep thinking that List is a class, while recognizing that the beasts List<int> 
and List<erased> are at the same time both of class List and also of different 
species.  
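As a rough illustration using only what Java can express today: every reference parameterization of a generic class shares one runtime class, which corresponds loosely to the erased species. List<int> is not expressible in current Java, so it does not appear here, and the class name ErasedSpecies is made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: under erasure, different parameterizations share one runtime class.
final class ErasedSpecies {
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // Both objects report java.util.ArrayList; the type argument is
        // not part of the runtime class.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```

Under the proposed terminology, a specialized List<int> would be a different species of the same class, which is exactly the distinction this snippet cannot show today.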

> I think the fact that we are now talking about user-defined named types with 
> fields, methods, constructors, and implemented interfaces makes this 
> something very different. 

So, how about: 
 - Java has always had values
 - Primitives are the BUILT-IN values
 - Java now gets USER-DEFINED values in addition to USER-DEFINED classes
 - USER-DEFINED values and classes can have fields, methods, constructors, and 
implement interfaces

Does this stacking make it sound less radical?  

I agree that there’s a real pedagogical challenge here, but I think it can be 
made to seem like less of a hurdle.  

> 
> Now, we redefine “C implements I” as follows:
>  - C has all the methods that I has;
>  - Ref[C] is a subtype of I
> 
> 
> Ah, I think this helps some. Maybe. So a layperson explanation is: When 
> writing a value type, you can declare interfaces, but you are actually 
> declaring which interfaces the boxed form of the value will implement, not 
> the value itself. But then if all you do is call myValue.myInterfaceMethod(), it will 
> just skip boxing behind the scenes. Something like that?

That’s exactly how it works, yes.  And, you could put the “skip boxing behind 
the scenes” part in a smaller font, since that’s just an optimization (and, 
even when you explicitly box and then access a box member, there’s some chance 
that the box will still be elided due to escape analysis.)  
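A small sketch of the two call paths, written in today's Java. An ordinary final, immutable class stands in for a value type, since value types do not exist in the language yet. The names Shape, Square, and BoxedInterfaceCall are hypothetical.

```java
// Sketch: direct member access versus access through the interface type.
interface Shape {
    double area();
}

// Stand-in for a value type: final, with only final fields.
final class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

final class BoxedInterfaceCall {
    // Corresponds to myValue.myInterfaceMethod(): the static type is the
    // value itself, so no box would be needed conceptually.
    static double direct(Square s) {
        return s.area();
    }

    // Corresponds to viewing the value as the interface type, which in
    // the model described above goes through the boxed form (Ref[C]).
    static double viaInterface(Shape s) {
        return s.area();
    }

    public static void main(String[] args) {
        Square sq = new Square(3.0);
        System.out.println(direct(sq));
        System.out.println(viaInterface(sq));
    }
}
```

Both calls return the same result; whether a box is actually materialized on the second path is an optimization question, not a semantic one.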

> Using a value type for something that isn't a value raises alarm bells for 
> me. At the minimum I would expect this user to have to implement eq/hc by 
> hand, because the default behavior users want 99% of the time is (deep) 
> content-based equality.

This may be the reality-distortion field speaking, but in my view a reference 
*is* a kind of value — albeit a very special kind.  They’re immutable, like 
other values.  Almost all their state is encapsulated (they can be compared by 
identity, that’s it).  They can only be constructed by privileged factories (we 
call these constructors.)  But, ultimately, they behave like values — they are 
passed by value, they have no identity of their own.  
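The "passed by value" point can be seen in a few lines of ordinary Java. This is only an illustration of existing language behavior, with made-up names: reassigning a parameter changes the callee's copy of the reference, while mutating through the reference affects the shared object.

```java
// Sketch: a reference is copied into the callee like any other value.
final class ReferencesAreValues {
    // Rebinds only the local copy of the reference; the caller's
    // variable still points at the original object.
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    // Mutates the object that both caller and callee refer to.
    static void mutate(StringBuilder sb) {
        sb.append("!");
    }

    static String demo() {
        StringBuilder original = new StringBuilder("hello");
        reassign(original); // no visible effect for the caller
        mutate(original);   // visible: the shared object changed
        return original.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "hello!"
    }
}
```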

For the Cursor class, the natural definition of equals *is* the componentwise 
one — two cursors are the same if they refer into the same source at the same 
position.  But yes, there are cases where we’d want to hand-override equals 
(which is allowed) to do a deeper comparison (generally when our components are 
value-like references, like strings or dates or big decimals.)  
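Here is a sketch of the two flavors of equality, again using ordinary final classes as stand-ins for value types. The fields of Cursor (a source reference and a position) and the Label class are assumptions made for illustration; the thread does not spell them out.

```java
// Sketch: componentwise equality, where the reference component is
// compared by identity ("same source, same position").
final class Cursor {
    private final CharSequence source; // hypothetical field
    private final int position;        // hypothetical field

    Cursor(CharSequence source, int position) {
        this.source = source;
        this.position = position;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Cursor)) return false;
        Cursor other = (Cursor) o;
        return source == other.source && position == other.position;
    }

    @Override
    public int hashCode() {
        return 31 * System.identityHashCode(source) + position;
    }
}

// Sketch: hand-written deeper equality, appropriate when the component
// is a value-like reference such as a String.
final class Label {
    private final String text;

    Label(String text) {
        this.text = text;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Label && text.equals(((Label) o).text);
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }
}
```

Two cursors over the same buffer object at the same position are equal, while cursors over distinct buffers with identical contents are not. Two labels with equal text are equal even when the String objects differ.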

Another place where references to mutable objects will show up in values is if 
we use values as a substrate for multiple return / tuples.  Here, the value is 
just an ad-hoc container for multiple values — and object references are 
entirely reasonable to use in this context.   
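A sketch of the multiple-return use case in today's Java, with a small final class playing the role of the ad-hoc value. The names Tokenized and MultipleReturn are hypothetical. Note that one component is a reference to a mutable list, which is unremarkable here because the container exists only to carry two results back to the caller.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for an ad-hoc value used as a tuple of two results.
final class Tokenized {
    final List<String> tokens; // reference to a mutable object
    final int consumed;        // number of input characters examined

    Tokenized(List<String> tokens, int consumed) {
        this.tokens = tokens;
        this.consumed = consumed;
    }
}

final class MultipleReturn {
    // Returns both the tokens and the consumed length in one result.
    static Tokenized splitOnSpaces(String input) {
        List<String> tokens = new ArrayList<>();
        for (String part : input.split(" ")) {
            if (!part.isEmpty()) {
                tokens.add(part); // skip empty strings between adjacent spaces
            }
        }
        return new Tokenized(tokens, input.length());
    }

    public static void main(String[] args) {
        Tokenized result = splitOnSpaces("a b  c");
        System.out.println(result.tokens + " " + result.consumed);
    }
}
```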

> Gratuitous aside about language syntax even though it is not actually 
> important right now: since we write "enum Foo" not "enum class Foo", I would 
> be quite surprised if we used "value class" here, since between the two only 
> enums are the ones that are real classes in every sense of the word.


Sure, this is one of our tools for helping frame the correct mental model.  If 
we decide that the terminology falls out as “classes are the entities that have 
fields, methods, and constructors”, then “value class” reinforces that.  But we 
could go other ways too.
