Re: The Record Attribute - What does it mean to be a record at runtime?

2020-11-10 Thread Alex Buckley

On 11/10/2020 1:09 PM, fo...@univ-mlv.fr wrote:

*From: *"Chris Hegarty" 
*To: *"Remi Forax" 
*Cc: *"amber-spec-experts" 
*Sent: *Tuesday, 10 November 2020 21:51:38
*Subject: *Re: The Record Attribute - What does it mean to be a record
at runtime?

Remi,

A point of clarification ( which I realise may have been ambiguous
in my last email ).

On 10 Nov 2020, at 19:36, fo...@univ-mlv.fr
 wrote:

...

A better option would be to update the VM implementation to only
assert that the fields of a record-like class are trusted if that
class 1) contains a structurally sound Record attribute, 2) is a
direct subclass of j.l.Record, 3) is final, and 4) is non-abstract.
This would align Core Reflection and the VM in this respect, while
the JVMS would remain unchanged - it's an implementation detail.


No, it's pushing the JLS semantics into the JVM.

What you want to know is if a field is trusted final or not, to
disallow the creation of VarHandle, the use of Unsafe, etc.
So the JDK API should ask the VM if a field is trusted final or
not instead of asking if the class is a record.


Agreed. The VM exposes a (private) interface to the core JDK
libraries to tell whether a field is trusted or not. No issue.  ( I
did not mean to say otherwise )

My issue is with how the VM determines whether a field is trusted or
not. The VM trusts fields in (among other types) “record” classes.
So what is a record class to the VM?   (that is the question that I
am trying to resolve) - the answer is not in the JVMS ( which is fine ).


For me having a Record attribute is enough, i.e. the current definition 
of what a record is for the VM is enough.


The problem is that if a class has to be a subclass of java.lang.Record, 
then any language that wants to use the Record attribute has to 
agree to the contract of j.l.Record.
For example, j.l.Record has a precise definition of how to compare 
floating point numbers, and it assumes that the record components 
are ordered.
Those kinds of constraints are fine for Java the language, but they are 
not something that the VM should enforce for any other language.


The JLS, and therefore the Java SE Platform that incorporates the JLS, 
defines what a record class is. A record class is a class that extends 
j.l.Record, declares no non-static fields explicitly, has no native methods, etc. The 
Record attribute indicates that a class file wishes to be treated as a 
record class, so a compiler and the reflection libraries must hold a 
class file with a Record attribute to the standard expected of a record 
class (extends j.l.Record, etc). In principle, a JVM implementation 
could also have this responsibility, but in practice, no-one wants to do 
these exhaustive checks at class load time. So, a non-Java compiler can 
happily spin a class file with a Record attribute and no j.l.Record 
superclass, but it doesn't represent a record class and the reflection 
libraries should not say it does, hence no "trusted" final fields.
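
A minimal sketch of how this distinction looks from Core Reflection, assuming JDK 16 or later (where records are a permanent feature). `Point` and `Plain` are hypothetical names used only for illustration. The sketch prints, side by side, whether a class is reported as a record, whether its direct superclass is j.l.Record, and whether it is final.

```java
import java.lang.reflect.Modifier;

public class RecordReflectionSketch {
    // Hypothetical record class: javac makes it final and a direct subclass of java.lang.Record.
    record Point(int x, int y) {}

    // Hypothetical ordinary final class with a final field; it is not a record class.
    static final class Plain {
        final int x = 0;
    }

    // Summarizes the properties discussed in this thread for a given class.
    static String describe(Class<?> c) {
        boolean extendsRecord = c.getSuperclass() == Record.class;
        boolean isFinal = Modifier.isFinal(c.getModifiers());
        return c.getSimpleName() + ": isRecord=" + c.isRecord()
                + ", extendsRecord=" + extendsRecord
                + ", final=" + isFinal;
    }

    public static void main(String[] args) {
        System.out.println(describe(Point.class));
        System.out.println(describe(Plain.class));
    }
}
```

A class file spun by a non-Java compiler with a Record attribute but a different superclass cannot be produced from Java source, so it is not shown here; the point of the sketch is only which properties reflection can consult.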


Alex


Re: Next up for patterns: type patterns in switch

2020-08-14 Thread Alex Buckley
On 8/14/2020 4:24 AM, Brian Goetz wrote:

My initial preference was to make the guard logic part of the pattern;
ideally, to make “Pattern && boolean-expr” a pattern.  But this is 
messy, since it invites ambiguities, and not really needed in the other 
primary consumer of patterns, since boolean expressions can already be 
conjoined with &&, and flow scoping already does everything we want. 
The real problem is switch is too inflexible, a problem revealed when 
its case labels are made more powerful.  So it seems that the sensible
thing to do is to make guards a feature of switch, and say a case label 
is one of:


     case 
     case 
     case  when 


Got it, thanks.

Alex


Re: Next up for patterns: type patterns in switch

2020-08-13 Thread Alex Buckley

On 7/23/2020 11:52 AM, Brian Goetz wrote:

On 7/23/2020 2:38 PM, Remi Forax wrote:
On guards, they do not work well with exhaustiveness. Still not sure 
they are useful.


It works fine, it's just more work for the user to get right.

We induce a domination ordering on patterns.  If `T <: U`, then `T t` < 
`U u` (`T t` is more specific than `U u`).  Similarly, for all guard 
conditions g, `P & g` < `P`.  What this says is that if you want 
exhaustiveness, you need an unguarded pattern somewhere, either:


     case A:
     case B & g:
     case B:  // catches B & !g
     case C:

or

     case A:
     case B & g:
     case C:
     case Object:    // catches B & !g

I understand your diffidence about guards, but I'm not sure we can do 
nothing.  The main reason that some sort of guards feel like a forced 
move (could be an imperative guard, like `continue`, but I don't think 
anyone would be happy with that) is that the fall-off-the-cliff behavior 
is so bad.  If you have a 26-arm switch, and you want the equivalent of 
the second of the above cases -- B-but-not-g gets shunted into the 
bottom clause -- you may very well have to refactor away from switch, or 
at least mangle your switch badly, which would be pretty bad.
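
A minimal sketch of the first ordering above (a guarded case followed by an unguarded case of the same type). It assumes Java 21 or later and uses the `when` guard syntax that eventually shipped, not the `&` notation used in this mail; the class name and the concrete types standing in for A, B, and C are hypothetical.

```java
public class GuardOrderingSketch {
    // The guarded Integer case must come before the unguarded one,
    // otherwise the unguarded case would dominate it and javac rejects the switch.
    static String classify(Object o) {
        return switch (o) {
            case String s -> "A: string";
            case Integer i when i > 100 -> "B & g: large integer";
            case Integer i -> "B: other integer";  // catches B & !g
            case Long l -> "C: long";
            default -> "anything else";
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(500));
        System.out.println(classify(7));
        System.out.println(classify(3.14));
    }
}
```

Dropping the unguarded `Integer` case would still compile here because of the `default` label, which plays the role of `case Object` in the second ordering.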


Is the following what you mean by "mangle your switch badly" ?

switch (o) {
  case A: ...
  case B: do some B-ish stuff ... also, if (g) {...}
  case C: ...
  ...
  case Z: ...
  case Object: if (o instanceof B && !g) { do the B-ish non-g thing }
}


Is a guard (a) part of the `case` construct, or (b) part of the pattern 
operand for a `case` construct? The original mail introduced "guard 
expression" as "a boolean expression that conditions whether the case 
matches", which sounds like (a). However, the purpose of a `case` 
construct is to enumerate one or more possible values of the selector 
expression, and if a `case` construct has a post-condition `& g()` then 
it's not just enumerating, and it isn't a `case` construct anymore. I 
mean, we don't want to see guards in the `case` constructs of legacy 
switches, right? (`switch (i) { case 100 & g():`)  So, is the answer (b) ?


Alex


Re: [records] Mark generated toString/equals/hashCode in class files somehow?

2020-08-11 Thread Alex Buckley
If the mandated status of a class/member was to be reified in the class 
file, then you would need Core Reflection and Language Model APIs to 
expose that status, along the lines of isSynthetic.
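
A minimal sketch of what Core Reflection can say today, assuming JDK 16 or later. `Point` and `Named` are hypothetical records; the only reflective query used is the existing `Method::isSynthetic`. No "is mandated" query for methods is shown because none is assumed to exist.

```java
import java.lang.reflect.Method;

public class MandatedVisibilitySketch {
    // Hypothetical record whose toString is generated by the compiler.
    record Point(int x, int y) {}

    // Hypothetical record that declares toString explicitly in source.
    record Named(String name) {
        @Override
        public String toString() {
            return "Named:" + name;
        }
    }

    // Reports whether the declared toString method of the class is flagged synthetic.
    static boolean isSyntheticToString(Class<?> c) {
        try {
            Method m = c.getDeclaredMethod("toString");
            return m.isSynthetic();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(c + " declares no toString", e);
        }
    }

    public static void main(String[] args) {
        // Neither method is synthetic, so this query cannot tell them apart.
        System.out.println(isSyntheticToString(Point.class));
        System.out.println(isSyntheticToString(Named.class));
    }
}
```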


Alex

On 8/10/2020 8:26 PM, Tagir Valeev wrote:

Thank you, Alex!

I created an issue to track this:
https://bugs.openjdk.java.net/browse/JDK-8251375
I'm not sure about the component. I set 'javac', though it also
touches JLS and JVMS; hopefully, no actual changes in HotSpot are
necessary as HotSpot can happily ignore this flag.
I tried to formulate proposed changes in JLS and JVMS right in the
issue. For now, I omit fields from the discussion because supporting
fields isn't critical for my applications.
But it's also ok to extend this to fields.

What else can I do to move this forward?

With best regards,
Tagir Valeev.


Re: [records] Mark generated toString/equals/hashCode in class files somehow?

2020-08-07 Thread Alex Buckley
You are right that ACC_MANDATED is not expressible for methods. This is 
unfortunate not only for equals/hashCode/toString in a record class, but 
also for values/valueOf in an enum class.


ACC_SYNTHETIC indicates an implementation artifact -- something that 
varies from compiler to compiler (or from one release of a compiler to 
the next release of the same compiler). It would be wrong to use 
ACC_SYNTHETIC to mark the five methods in the previous paragraph. They 
are language artifacts, whose existence + signature + semantics are the 
same across compilers.


It would be legitimate to add ACC_MANDATED to method_info.access_flags. 
ACC_MANDATED is defined as 0x8000 in other contexts, so convention 
dictates that it would have to be defined as 0x8000 in 
method_info.access_flags too. Happily, 0x8000 is available there. This 
also applies to field_info.access_flags.
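
A minimal sketch of how a class-file tool might decode such a flag, assuming ACC_MANDATED were permitted in method_info.access_flags with the value 0x8000 as suggested above. That permission is hypothetical as of this mail; the class and method names are illustrative only.

```java
public class AccessFlagSketch {
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_SYNTHETIC = 0x1000;
    // Hypothetical for method_info: the value is borrowed from the contexts where ACC_MANDATED is already defined.
    static final int ACC_MANDATED = 0x8000;

    // Maps raw access flags to the origin of a member: mandated, synthetic, or explicit.
    static String origin(int accessFlags) {
        if ((accessFlags & ACC_MANDATED) != 0) {
            return "mandated";
        }
        if ((accessFlags & ACC_SYNTHETIC) != 0) {
            return "synthetic";
        }
        return "explicit";
    }

    public static void main(String[] args) {
        System.out.println(origin(ACC_PUBLIC | ACC_MANDATED));
        System.out.println(origin(ACC_PUBLIC | ACC_SYNTHETIC));
        System.out.println(origin(ACC_PUBLIC));
    }
}
```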


Alex

On 8/6/2020 9:20 PM, Tagir Valeev wrote:

Hello, Jonathan!

I believe the current JVM specification doesn't say that methods can be
marked with ACC_MANDATED [1]. I wouldn't mind if it were used instead
of SYNTHETIC. To me, anything is OK if I can avoid bytecode
inspection.

With best regards,
Tagir Valeev.

[1] https://docs.oracle.com/javase/specs/jvms/se14/html/jvms-4.html#jvms-4.6

On Fri, Aug 7, 2020 at 11:12 AM Jonathan Gibbons
 wrote:


Tagir,

The concept and word you are looking for is "mandated", which is similar
to but different from "synthetic".

See
https://docs.oracle.com/en/java/javase/14/docs/api/java.compiler/javax/lang/model/util/Elements.Origin.html#MANDATED

-- Jon


On 8/6/20 8:48 PM, Tagir Valeev wrote:

Hello!

I'm working on a class-file decompiler for records and discovered that
there's no special flag (like ACC_SYNTHETIC) for the generated
equals/hashCode/toString methods. As a result, the only way to determine
whether such a method was explicitly written in the source code is to
look into the method implementation and check whether it uses an
ObjectMethods.bootstrap indy. This looks implementation-dependent and
somewhat fragile (though, of course, we will do this if we have no other
options). We also have a stub decompiler that decompiles declarations
only, without checking method bodies at all, and it also wants to know
whether the equals/hashCode/toString methods were autogenerated. Finally,
other bytecode tools like code coverage may need this to avoid
calculating coverage for methods not present in the source.

Is it possible to mark generated methods via ACC_SYNTHETIC or any
other flag or add any attribute that can be used to differentiate
auto-generated methods from the ones presented in the source code?

Having a synthetic mark for auto-generated canonical constructor or
accessor methods is less critical (as their bodies could be actually
written in the source code like this) but it would be also nice to
have it.

With best regards,
Tagir Valeev.


Re: Switch Expression - complete normally - spec omission?

2020-05-15 Thread Alex Buckley

On 5/14/2020 7:22 PM, Manoj Palat wrote:

I think there is a spec omission regarding "complete normally" for switch 
statements whose switch block consists of switch rules.

Ref JLS 14 Sec 14.22
...
A switch statement whose switch block consists of switch rules can complete
normally iff at least one of the following is true:
– One of the switch rules introduces a switch rule expression (which is
necessarily a statement expression).
– One of the switch rules introduces a switch rule block that can complete
normally.
– One of the switch rules introduces a switch rule block that contains a 
reachable
break statement which exits the switch statement.
...
Now consider:

switch (b) {
  case 1 -> {
    throw new Exception();
  }
}

As per the above definition, this switch statement cannot complete normally;
but consider "b" having a value other than 1 and then it completes normally.


/* -
To demonstrate the point, the switch statement above uses a switch rule 
block that completes abruptly, but could alternatively have used a 
switch rule `throw` statement:


switch (b) {
  case 1 -> throw new Exception();
}

Also, to clarify for the many readers of this list, we are discussing 
switch statements, which never require a `default` label. We are not 
discussing switch expressions, which almost always require a `default` 
label.

- */



Also consider, 14.11.3. which says:
"If no switch label matches, the entire switch statement completes normally."

which looks inconsistent. On one hand it says "completes normally"; on the
other, "iff at least one of the following ...", with no mention of default.

Hence, shouldn't the item,
-> – The switch block does not contain a default label.

also be added in the list in 14.22?


I agree that without a `default` switch label, 14.11.3 may evaluate "If 
no switch label matches, the entire switch statement completes 
normally."  Since we can see a way for the statement to complete 
normally, 14.22 ought to say that the statement _can_ complete normally. 
So, it looks right to add "– The switch block does not contain a default 
label."
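
A minimal sketch of the case under discussion, assuming JDK 14 or later (switch rules in statements). The class name is hypothetical, and an unchecked exception stands in for `Exception` so that the method needs no throws clause.

```java
public class SwitchCompletionSketch {
    static String run(int b) {
        // A switch statement made of switch rules, with no default label.
        switch (b) {
            case 1 -> throw new IllegalStateException("b was 1");
        }
        // Reached whenever no switch label matches, i.e. the switch completed normally.
        return "switch completed normally";
    }

    public static void main(String[] args) {
        System.out.println(run(2));
        try {
            run(1);
        } catch (IllegalStateException e) {
            System.out.println("threw: " + e.getMessage());
        }
    }
}
```

If the statement could never complete normally, the `return` after it would be unreachable and the method would not compile, which is why the wording in 14.22 matters.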


Alex


Re: Class & interface terminology

2020-05-08 Thread Alex Buckley

On 5/7/2020 4:19 PM, Alex Buckley wrote:
We ought to be able to represent the taxonomy in a table, where any 
combination of row title and column title gives something meaningful:


            | Declaration     | Type
------------+-----------------+---------------------
Class    C  | class C {..}    | C in `C c;`
         C  | class C {..}    | C in `C c;`

> ...

Please incorporate a table like this in the intro. ... 30 pages of
detailed JLS changes look impressive but all they do is embody a
taxonomy that should be explainable to every Java programmer in under
30 lines.
This top-down explanation of the taxonomy will also help when I apply 
the Consistent Terminology spec draft to the JLS proper. As thorough as 
the hundreds of edits in the spec draft are, there are bound to be 
omissions and incongruities somewhere, so having a complete and 
out-of-band overview of the desired taxonomy will get them fixed faster.


Alex


Re: Class & interface terminology

2020-05-07 Thread Alex Buckley

On 5/7/2020 3:24 PM, Dan Smith wrote:

Are you comfortable with the tension between "annotation interface
declaration" and "enum declaration", or do we need to keep pulling on
the string to get "enum class declaration"?


As weird as "annotation interface declaration" is, there has to be some 
word between "annotation" and "declaration", and if we're throwing out 
types and bringing in classes and interfaces, then "annotation 
type^H^H^H^Hinterface declaration" is the answer, that's it, nothing 
more to say. "enum declaration" OTOH doesn't need anything inserted.



(One argument for being okay with some incongruousness here: enum
declarations literally say 'enum' in their syntax. Annotation
[interface] declarations do not say 'annotation'—instead, they say
'interface'.)


If you pronounce `@` as "annotation", then `@interface Foo {..}` is an 
annotation interface declaration (of Foo) just as congruously as `enum 
Color {..}` is an enum declaration (of Color).



In that vein: If there are class and interface types for variables,
then there are also annotation types for variables -- `Foo x =
blah.getDeclarationAnnotation(Foo.class)` is legal, and lets you
call Foo's methods on `x` in order to retrieve element values. So:

Annotation: @Foo
Annotation interface: Foo, with elements name and age
Annotation interface declaration: @interface Foo { String name(); int age(); }
Annotation type: Foo, as in the static type of a variable declared as `Foo x;`

Also, there are enum types -- in `Color c = Color.RED;`, the first
Color is an enum type and the second Color is an enum class, right?
That's the kind of discussion I'm hoping for in the spec draft's
intro, expanding on the mysterious clause "A class type or an
interface type is variable or expression type".


Sure, I can expand on that some.

"Annotation type" and "enum type" have reasonable interpretations,
but you'd rarely actually want to use those terms, because these are
just special cases of "interface type" and "class type". One of the
big reasons for emphasizing that an [enum/enum type/enum class] *is
a* class is so that it's clear that everything we say about class
types includes enum types.


I agree that in a top-down discussion, you wouldn't need to say 
"annotation type" or "enum type" very often ... but in a bottom-up 
discussion about the meaning of some source code, you'd want to say 
"`col` has an enum type, and we switch over it here, RED, GREEN, BLUE 
..." rather than "`col` has a class type".


We ought to be able to represent the taxonomy in a table, where any 
combination of row title and column title gives something meaningful:


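                  | Declaration     | Type
------------------+-----------------+---------------------
Class       C     | class C {..}    | C in `C c;`
            C     | class C {..}    | C in `C c;`
Enum        Color | enum Color {..} | Color in `Color c;`
Record      Point | record Point... | Point in `Point p;`
------------------+-----------------+---------------------
Interface   I     |                 |
Annotation  Foo   |                 |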

Please incorporate a table like this in the intro. Maybe the table above 
is no good because it never admits "enum class" or "annotation 
interface". 30 pages of detailed JLS changes look impressive but all 
they do is embody a taxonomy that should be explainable to every Java 
programmer in under 30 lines.


(In exchange, I will provide the ASCII art for 
toplevel/nested/member/local/anonymous classes!)


Alex


Re: Class & interface terminology

2020-05-07 Thread Alex Buckley

On 5/7/2020 2:29 PM, Dan Smith wrote:
Is there a particular part of JLS that you think should be rephrased 
to reflect these assertions?


No, I just wanted to get the assertions on the record in email, so they
could be added to the discussion at the start of the Consistent
Terminology spec. 10 years from now, this email thread will be lost to 
history, but a spec that's part of a preview feature in Java SE 15 will 
not be lost.


It would be good to enhance the introduction of the draft to 
acknowledge the common terms "static type" (perhaps in connection 
with "A class type or an interface type is [a] variable or 
expression type") and "dynamic type".


In general, I'm keeping types at arm's length here, just enough to
be able to say "classes and interfaces are not the same thing as 
types".


Perhaps this is all you're after? "A class type or an interface type 
is the static type of an expression that is known to produce 
instances of a class or interface; these terms should be avoided
when talking about the declaration."


Sure. I'm just looking for the start of the Consistent Terminology spec
to respond, via an enhanced introduction, to points raised on
amber-spec-experts.


Yeah, I noticed that is why JLS 3 got trapped into saying
"annotation type" instead of just "annotation". I agree that bare
"annotation" should be readily interpreted as the use-site
construct.

...
I went with "annotation declaration" rather than "annotation 
interface declaration" just to try to be concise. Similarly, it's 
"enum declaration" and "record declaration", not "enum class 
declaration" and "record class declaration".


But I can see how the overloading of the term "annotation" makes
this confusing. It is *not* my intent to suggest that an annotation 
[interface] declaration introduces an entity called an "annotation". 
No, it's always an "annotation interface".


Maybe we're better off with "annotation interface declaration"?


It's pretty sad to pull on a piece of string that starts in chapter 1 
and goes all the way into chapter 9, only to find that rusty tin can of 
a term attached to the end. However, it's the consistent choice, and I 
want to move away from polishing individual terms in order to understand 
the complete taxonomy.


In that vein: If there are class and interface types for variables, then 
there are also annotation types for variables -- `Foo x = 
blah.getDeclarationAnnotation(Foo.class)` is legal, and lets you call 
Foo's methods on `x` in order to retrieve element values. So:


Annotation: @Foo
Annotation interface: Foo, with elements name and age
Annotation interface declaration: @interface Foo { String name(); int 
age(); }
Annotation type: Foo, as in the static type of a variable declared as 
`Foo x;`
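
A minimal sketch of the four terms in running code. The names `Foo`, `name`, and `age` come from the list above; `Target` and the element values are hypothetical. The sketch uses `Class::getAnnotation`, a real Core Reflection method, in place of the illustrative `getDeclarationAnnotation` call mentioned above.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class AnnotationTaxonomySketch {
    // Annotation interface declaration. RUNTIME retention lets reflection see its uses.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Foo {
        String name();
        int age();
    }

    // An annotation: a use of the annotation interface Foo, supplying element values.
    @Foo(name = "duke", age = 25)
    static class Target {}

    static String read() {
        // Annotation type: Foo is the static type of the variable x.
        Foo x = Target.class.getAnnotation(Foo.class);
        return x.name() + ":" + x.age();
    }

    public static void main(String[] args) {
        System.out.println(read());
    }
}
```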


Also, there are enum types -- in `Color c = Color.RED;`, the first Color 
is an enum type and the second Color is an enum class, right? That's the 
kind of discussion I'm hoping for in the spec draft's intro, expanding 
on the mysterious clause "A class type or an interface type is variable 
or expression type".


Alex


Re: Class & interface terminology

2020-05-07 Thread Alex Buckley

On 5/7/2020 11:48 AM, Dan Smith wrote:

I'll add that it's useful to think in terms of three different entities:

1) Class and interface declarations. These are syntax. Example: "interface List 
..."
2) Classes and interfaces. These are "symbols". Example: the interface List
3) Class and interface types. These are a kind of type. Example: the type 
List

There's a one-to-one relationship between class/interface
declarations and classes/interfaces. There is, in general, a
one-to-many relationship between classes/interfaces and types. (More
accurately, types can use classes and interfaces however they like to
describe sets of values.)
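
A minimal sketch of the one-to-many relationship between an interface and its types. The parameterizations `List<String>` and `List<Integer>` are my own choice of example, and the class and field names are hypothetical. Both fields use the single interface `java.util.List`, yet they have different types.

```java
import java.lang.reflect.Field;
import java.util.List;

public class ClassVersusTypeSketch {
    // Two fields with different types that both use the one interface List.
    static List<String> names;
    static List<Integer> ids;

    // The interface used by the field's type.
    static String interfaceOf(String fieldName) {
        return field(fieldName).getType().getName();
    }

    // The full type of the field, including type arguments.
    static String typeOf(String fieldName) {
        return field(fieldName).getGenericType().getTypeName();
    }

    private static Field field(String fieldName) {
        try {
            return ClassVersusTypeSketch.class.getDeclaredField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(interfaceOf("names") + " / " + typeOf("names"));
        System.out.println(interfaceOf("ids") + " / " + typeOf("ids"));
    }
}
```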

Thank you Dan. The First Edition of the JLS revolved around

  type declarations of class and interface types

because (2) and (3) were the same thing in 1996. Less than 10 years 
later, JSR 14 had introduced a plethora of new terms -- generic types, 
parameterized types, raw types, reifiable types -- which the Third 
Edition of the JLS integrated heroically, but without the exhaustive 
analysis that you have performed here. I look forward to a future JLS 
that revolves around


  class and interface declarations of classes and interfaces.


Speaking of "generic", I would be grateful if you can clarify (maybe 
here, in the first instance, then later in the draft) :


- A generic class or interface declaration begets a generic class or 
interface. (1.1 "Class and interface declarations may be generic" + 8 
"Classes may be generic ...")


- A generic class or interface begets many parameterized class or 
interface types.


- A generic method declaration begets a generic method, which undergoes 
generic method invocation.



It would be good to enhance the introduction of the draft to acknowledge 
the common terms "static type" (perhaps in connection with "A class type 
or an interface type is [a] variable or expression type") and "dynamic 
type".



I confess to similar twitchings as Remi in relation to "annotation 
interfaces". (Not "enum classes", or "record classes" -- all good 
there.) My concern is that 99% of conversations around metadata in Java 
programs are interested in the `@Foo` annotations appearing around their 
source code, not about the Foo annotation interface defined far away by 
someone else. That is, "annotation" usually means "an instance, 
providing literal values for the elements of the annotation interface", 
and not "a declaration of an annotation interface" ... yet the draft has 
"annotation" as "a declaration":


  An annotation declaration specifies a new annotation interface,
  a special kind of interface.

What you're saying here is:  Just like a class declaration introduces a 
class Foo, which `new` turns into an object (an instance of class Foo), 
an annotation declaration introduces an annotation Foo, which `@` turns 
into an object (an instance of annotation Foo).  I always thought of the 
annotation as being `@Foo`, not `Foo`, though I concede that people will 
_say_ "Look at the annotation Foo here, it means ..." while pointing to 
the character sequence `@Foo`.


Alex


Re: [sealed] Runtime checking of PermittedSubtypes

2020-04-22 Thread Alex Buckley

On 4/22/2020 4:32 PM, Dan Smith wrote:

Another module system sanity check: is mutual recursion allowed between unnamed 
modules of different loaders? (Pre-9, mutual recursion between different 
loaders was certainly possible...)


Yes. Anything that was possible with class loaders on JDK 8 is still 
possible on JDK 9+. Class loaders in JDK 9+ are free to mutually 
delegate for classes that are in unnamed modules rather than in 
layer-defined run-time modules. We even arrange a complete readability 
graph between those unnamed modules -- "Every unnamed module reads every 
run-time module." -- in case class C in one unnamed module attempts to 
access (possibly via a `Class` object, so no need for c.p. resolution) 
class D in another unnamed module, without any concern or distress over 
the higher-level story that C refers to D and D refers to C.


Alex


Re: [sealed] Runtime checking of PermittedSubtypes

2020-04-22 Thread Alex Buckley

On 4/22/2020 4:15 PM, Dan Smith wrote:

Okay, so it would be valid to talk about "the run-time module of C"
while C is being derived?

I think this sentence from 5.3.6 anticipates this sort of early
query, but I might be misinterpreting the parenthetical:

"We say that a class is in a run-time module iff the class's run-time
package is associated (or will be associated, if the class is
actually created) with that run-time module."

I'm a little nervous about the idea that some associations haven't
happened yet. I'm not clear on the timing.


Yes, it would be valid. The wording of the 5.3.6 sentence is trying to 
afford a degree of laziness to implementations in how they make the 
association between run-time package and run-time module. This is 
because "run-time package" is a pretty ephemeral concept: before any 
class is created, does any run-time package exist? Many implementors 
would say "no" -- and yet, the invocation of ModuleLayer::defineModules 
has already happened, so e.g. the VM has already associated the run-time 
module `java.base` with the run-time package `java.lang` of the 
bootstrap loader, even before the class `Object` is loaded. So, if you 
have the name of a class, and a loader, you can find out from the VM 
which run-time module the class has been or will be created as a member 
of ("in"). The loader narrows down the run-time modules a bit; the 
package name narrows them down further. The module:loader:package 
relationships are immutable.
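
A minimal sketch of these relationships as seen through the `java.lang.Module` API, assuming JDK 9 or later. The class name is hypothetical. It asks which run-time module `Object` is in, and whether the system class loader's unnamed module reads that module.

```java
public class RuntimeModuleSketch {
    // Name of the run-time module a class is in, or a marker for an unnamed module.
    static String moduleOf(Class<?> c) {
        Module m = c.getModule();
        return m.isNamed() ? m.getName() : "<unnamed>";
    }

    // Whether the unnamed module of the system class loader reads the module of c.
    static boolean unnamedReads(Class<?> c) {
        Module unnamed = ClassLoader.getSystemClassLoader().getUnnamedModule();
        return unnamed.canRead(c.getModule());
    }

    public static void main(String[] args) {
        System.out.println(Object.class.getPackageName() + " -> " + moduleOf(Object.class));
        System.out.println("unnamed reads java.base: " + unnamedReads(Object.class));
    }
}
```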


Alex


Re: [sealed] Runtime checking of PermittedSubtypes

2020-04-22 Thread Alex Buckley

On 4/22/2020 1:43 PM, Remi Forax wrote:

2a) If the superclass belongs to a different run-time module, error.
2b) If the superclass doesn't have the subclass's name in its PermittedSubtypes,
error.

2b doesn't need to load anything, because 2a has guaranteed that both classes
have the same defining loader. (We do the same thing with nestmates.)

I'm a little unsure about 2a, though, because I don't have a great grasp of how
modules work—when I'm still deriving a class (JVMS 5.3.5), can I tell what my
runtime module will be?


From the class name, which is qualified, you have the package name, and from a 
package name and a classloader, you have the module.
A module knows all the packages inside itself. During module resolution, each 
module advertises all its packages, so when a classloader is created on a resulting 
configuration, it knows the Map.
If the classloader doesn't find the module associated with a package, it means 
that it's a package from the classpath, so the package will be created lazily 
using the unnamed module of the classloader.


As Remi says, by the time class loading happens, a layer has already 
been set up to map modules (and hence their packages) to loaders. JVMS 
5.3.6 discusses this, including the possibility that a class is "in a 
run-time module".


We made a big deal in Java 9 about how "Class Loading Doesn't Change" 
for the module system -- there's no additional argument to loadClass or 
defineClass to indicate module membership -- but it is legitimate in 
Java 15 for defineClass to start caring about module membership because 
it's done to respect sealing rather than to support the module system 
itself.


Alex


Re: JLS question regarding Text Blocks

2020-04-06 Thread Alex Buckley

On 4/6/2020 11:43 AM, Jim Laskey wrote:

In section 3.10.6 Text Blocks of the updated spec;

"The opening delimiter is a sequence that starts with three double quote 
characters ("""), continues with zero or more space, tab, and form feed 
characters, and concludes with a line terminator."


However, the JEP 378 description reads "The opening delimiter is a 
sequence of three double quote characters (""") followed by zero or more 
white spaces followed by a line terminator."


javac uses the Character.isWhitespace test for characters following the 
opening three double quote characters.


Character::isWhitespace allows not only the JLS definition of whitespace 
-- space, tab, and FF -- but also LF and CR. So, javac would accept """ 
followed by some LF/CR characters (its idea of "zero or more white 
spaces") followed by a final LF/CR character (the "line terminator", at 
least according to the definition in JLS 3.4; can't find a definition in 
the API). That can't be right -- it extends the delimiter line into the 
content, so to speak.
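
A minimal sketch of the mismatch, comparing `Character.isWhitespace` with the three characters named in the 3.10.6 wording quoted above. The class and method names are hypothetical, and this is not javac's actual scanner code.

```java
public class DelimiterWhitespaceSketch {
    // The characters the quoted JLS text allows after the opening delimiter: space, tab, form feed.
    static boolean isDelimiterWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\f';
    }

    // True when Character.isWhitespace accepts a character that the quoted JLS text does not.
    static boolean acceptedOnlyByApi(char c) {
        return Character.isWhitespace(c) && !isDelimiterWhitespace(c);
    }

    public static void main(String[] args) {
        System.out.println("LF: " + acceptedOnlyByApi('\n'));
        System.out.println("CR: " + acceptedOnlyByApi('\r'));
        System.out.println("space: " + acceptedOnlyByApi(' '));
    }
}
```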


Alex


Re: @since specification for preview features

2020-04-06 Thread Alex Buckley
Thanks Jim. I recorded this policy in JEP 12, under "Relationship to 
Java SE APIs":


"The API developer must also add an @since tag that indicates the 
release when @preview was first added. (If the essential API element is 
eventually made final and permanent in Java SE $Z, then the @since tag 
must be changed to indicate the $Z release.)"


Alex

On 4/6/2020 10:03 AM, Jim Laskey wrote:
FTR: A question was raised about which Java version should be used with 
the @since tag of a method associated with a preview feature. The evident 
answer is that, while previewing, the value should be the version of 
Java where the preview feature was introduced. When the feature becomes 
standard, the value should be the Java version where the feature 
became standard.


Ex. Text Blocks became a preview feature in JDK 13, thus 
String::stripIndent had a @since 13 tag. Text Blocks continued as a 
preview feature in JDK 14, so the tag remained the same.  When Text 
Blocks become standard in JDK 15, the tag in String::stripIndent will 
become @since 15.


Cheers,

-- Jim



Re: Clarifying record reflective support

2019-12-03 Thread Alex Buckley

On 12/3/2019 8:49 AM, Dan Smith wrote:

So,

Fine: "isRecord returns true if the class extends java.lang.Record
and has a Record attribute." (a little more detailed than most
reflection methods, but that's probably good)

Overkill: "isRecord returns true if the class extends
java.lang.Record and has a Record attribute that conforms to the
following rules ..."


Yes. "has a Record attribute" is the most that the broadly-read API spec 
should admit about the class file. Even "has a *well-formed* Record 
attribute" would be too much, since it quickly devolves into your 
overkill scenario.


Alex


Re: Clarifying record reflective support

2019-12-03 Thread Alex Buckley

On 12/3/2019 8:39 AM, John Rose wrote:
On Dec 3, 2019, at 8:31 AM, Alex Buckley wrote:


For example, if the Record.components[i].attributes table contains a 
Signature attribute, then that's fine per Table 4.7-C; but if said 
table contains a Code attribute, then the overall Record attribute is 
malformed, and I would expect a ClassFormatError as I would for a 
malformed descriptor in Signature. (I don't wish to rat-hole on 
whether HotSpot actually checks Signature so deeply. The purpose of 
Signature is clear, and aligned with the purpose of Record, and 
purpose is what should drive depth-of-check.)


As I argued in my just-sent mail, such purposes split into two parts:  The
purposes necessary for bytecode execution, and those necessary for 
reflection.

This leads to a middle ground of “reflectively invalid” attributes which can
nonetheless pass class file loading.


By "purpose", I mean the fundamental role of the attribute in the class 
file. That's what JVMS 4.7 seeks to document by offering three lists. A 
related but lower-level question is how much checking occurs at load 
time versus reflection time.


Let me step back. I don't disagree with anything you say. I agree that 
the second list, where Record lives because it's a Java language helper, 
should be shallowly checked so that the deep stuff is left to reflection 
(i.e. it could be "reflectively invalid" despite having passed class 
file format checking). What I am trying to do is help Chris understand 
the JVMS draft so he can ship in JDK 14 -- is there anything in section 
4.7 of the draft that needs to be changed for JDK 14?


Alex


Re: Clarifying record reflective support

2019-12-03 Thread Alex Buckley

On 12/3/2019 8:35 AM, John Rose wrote:

I have two concerns concerning JVM behavior:

1. Keep class file loading fast and simple.  Don’t go beyond precedent in 
structure checking.  The current implementation is good from this POV; it 
just ensures basic referential integrity at the constant pool level, plus 
shallow syntax checking of names and descriptors.

Yes, integrity+shallow_syntax checks are the essence of format checking 
at class load time. 
http://cr.openjdk.java.net/~gbierman/jep359/jep359-20191125/specs/records-jvms.html#jvms-4.7 
currently specifies that Record undergoes format checking at class load 
time.



Attributes are bundled up for later, as usual.


Checking which sub-attributes appear in `attributes` tables is also part 
of format checking. In keeping with the "shallow" mantra, only a name 
check of the attribute is needed at load time. ("This record component 
has an attribute which might be total junk but is called Signature, so 
PASS, but also has an attribute which might be well-formed but is called 
Code, so FAIL.")
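
As an illustration of the name-only check described above, here is a minimal Java sketch. The class name, method name, and both name sets are hypothetical; in particular the "permitted" set is an assumed reading of Table 4.7-C, and the "misplaced" set is only a small illustrative subset of the predefined attributes. The point is that the decision is made on the attribute name alone, never on its contents.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a shallow, name-only check over the attributes
// table of one record component. Attribute contents are never inspected.
final class RecordComponentAttributeCheck {

    // Assumed: attribute names allowed on a record component.
    private static final Set<String> PERMITTED = Set.of(
            "Signature",
            "RuntimeVisibleAnnotations",
            "RuntimeInvisibleAnnotations",
            "RuntimeVisibleTypeAnnotations",
            "RuntimeInvisibleTypeAnnotations");

    // Assumed: a few predefined attributes that belong elsewhere in the class file.
    private static final Set<String> MISPLACED = Set.of(
            "Code", "ConstantValue", "Exceptions");

    static void check(List<String> attributeNames) {
        for (String name : attributeNames) {
            if (PERMITTED.contains(name)) {
                continue; // PASS, even if the body is junk
            }
            if (MISPLACED.contains(name)) {
                // FAIL, even if the body is well-formed
                throw new ClassFormatError(
                        "attribute " + name + " not permitted on a record component");
            }
            // Any other name is treated as unrecognized and skipped.
        }
    }

    public static void main(String[] args) {
        check(List.of("Signature"));            // passes silently
        try {
            check(List.of("Signature", "Code"));
        } catch (ClassFormatError e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Deeper validation of, say, the Signature string itself would be left to reflection time under this scheme.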



2. Perform deeper checks only when reflection is performed.  This is when 
things get
sticky.  There are many things which can go wrong during reflection on a class 
file
that has passed all the (relatively shallow) load-time checks.  If a field or 
method descriptor
mentions a type which cannot be loaded, reflecting that field will fail, even 
though the
bytecodes are perfectly serviceable (as long as the unloaded type is only used 
to pass
nulls in bytecode execution).  The same is true for non-loadable types in 
InnerClasses.
If a generic signature syntax is wrong, you find out during reflection.  We 
could try to
test for such things earlier, but that would slow down application startup, 
which is a
weak spot for us, that we don’t want to weaken further.


I agree with the above (setting aside anything to do with Signature 
because that attribute is a mess). Deep syntax checking and deep 
sub-attribute checking is for reflection time, not class load time.



The application of point #2 to records is that a record component which has a
non-loadable type descriptor should fail (with a low-level error) on reflection,
even though the record can be used for normal bytecode execution.


Yes.


Likewise,
if a bogus record component mentions a field that doesn’t exist, this should
(I think) fail at reflection time; there’s no reason to check for this 
particular
error at earlier class load time.  And so on for any other structural problems
that can happen with records.


Yes.

(Chris can probably stop reading here.)


The upshot of this is that reflective APIs should be allowed to throw low-level
errors if the class file has a deep error in it.  Such errors cannot *all* be 
ruled
out at class load time (in principle, not just practically; details on request),
and so reflection *must* be allowed to be an incomplete operation.

This subtly affects the reflective API points which return information that
depends on classfile attributes (when, as is often the case, such attributes 
cannot
be fully validated at class load time).  Such API points as getInnerClasses and
getRecordComponents must be allowed to throw errors for “partially broken”
class files.

(Do we need a specific term for “partially broken” here? We might say
“reflectively invalid” I suppose.)

The javadoc is relatively silent about reflectively invalid class files,
but don’t take that as evidence that failure is impossible.
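
For readers who have not used the API under discussion, a small Java sketch of a reflective consumer follows. It assumes a JDK where records and `Class.getRecordComponents()` are final (Java 16 or later); the class name `RecordReflectionDemo`, the record `Point`, and the helper `describe` are hypothetical. The class file here is well-formed, so nothing fails; the comment marks where a low-level error could plausibly surface for a "reflectively invalid" class file.

```java
import java.lang.reflect.RecordComponent;

final class RecordReflectionDemo {

    record Point(int x, int y) {}

    // Renders a record class as "Name(type name, ...)".
    static String describe(Class<?> c) {
        // For a reflectively invalid class file, a low-level error could
        // surface from this call or from later queries on the components.
        RecordComponent[] components = c.getRecordComponents();
        if (components == null) {
            return c.getSimpleName() + " is not a record"; // null means "not a record class"
        }
        StringBuilder sb = new StringBuilder(c.getSimpleName()).append('(');
        for (int i = 0; i < components.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(components[i].getType().getSimpleName())
              .append(' ')
              .append(components[i].getName());
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(Point.class));
        System.out.println(describe(String.class));
    }
}
```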


The four paragraphs above suggest that the API spec should document 
"low-level errors" about invalid class files (that is, invalid according 
to deep, reflection-time, checks). The next paragraph, however, suggests 
that the API spec doesn't need to document such errors/failure modes:



Should getRC document failure modes?  Maybe, but if they are just the
same as those affecting getMethods, getInnerClasses, etc., there’s no need
to.  New failure modes, such as “record component not found”, might be
documented, maybe.  (I don’t see this happening in the JVM code, which
is a good sign.)  But *all* such errors are artifacts introduced by
broken tools, and it appears that they are sufficiently rare that they
can be swept under the rug, in the javadoc.  After all, errors need not
be documented, especially endemic ones like OOME and SOE, and
the reflection doc creates CNFE with a similar level of silence.


Alex


Re: Clarifying record reflective support

2019-12-03 Thread Alex Buckley

On 12/3/2019 6:24 AM, Chris Hegarty wrote:

At least from the JVM side of things, one could lean on the JVMS,
Section 4.1 “The ClassFile Structure”. While the draft records JVMS does
amend various sections and subsections of Chapter 4, it does not touch
section 4.1.  Reading between the lines, I think one way of ensuring the
well-formedness of the Record attribute would be to add a reference to
it from the top-level `attributes[]` format, e.g.

   "If a Java Virtual Machine implementation recognizes class files
whose version number is XX.xx or above, it must recognize and
correctly read the Record (§4.7.xx) attribute found in the attributes
table of a ClassFile structure of a class file whose version number
is XX.xx or above."

This is similar to the Signature, and a few other, attributes.

If we had this, then the implementation could rely on simply the
presence of a Record attribute, and no further checking would be
necessary.


http://cr.openjdk.java.net/~gbierman/jep359/jep359-20191125/specs/records-jvms.html#jvms-4.7 
already specifies Record as an attribute similar to other 
language-oriented attributes such as Exceptions and Signature: "critical 
to correct interpretation of the class file by the class libraries of 
the Java SE Platform". Record must therefore be format-checked eagerly 
(at class load time) rather than lazily (at reflection time).


For example, if the Record.components[i].attributes table contains a 
Signature attribute, then that's fine per Table 4.7-C; but if said table 
contains a Code attribute, then the overall Record attribute is 
malformed, and I would expect a ClassFormatError as I would for a 
malformed descriptor in Signature. (I don't wish to rat-hole on whether 
HotSpot actually checks Signature so deeply. The purpose of Signature is 
clear, and aligned with the purpose of Record, and purpose is what 
should drive depth-of-check.)


Alex


Re: Updated Draft specs for JEP 359 (Records)

2019-11-27 Thread Alex Buckley

On 11/27/2019 12:59 AM, Gavin Bierman wrote:

I don't see any changes to 
http://cr.openjdk.java.net/~gbierman/jep359/jep359-20191125/specs/records-jvms.html
 ?


Should be there now.


This updated JVMS draft looks good, thanks.

Alex


Re: Updated Draft specs for JEP 359 (Records)

2019-11-26 Thread Alex Buckley



On 11/26/2019 8:48 AM, Dan Smith wrote:

Here's my slightly-tweaked version of this note:


It is a limitation of the `class` file that, while a method parameter or a
module may be marked `ACC_MANDATED` ([4.7.24], [4.7.25]), there is no
equivalent way to flag compiler-generated methods and fields which are not
considered implementation artifacts (JLS 13.1).
This limitation means that reflective APIs may not accurately indicate the
mandated status of such members.


Thanks. (I note the change from "oversight in the design of" to 
"limitation of" :-) )


I don't see any changes to 
http://cr.openjdk.java.net/~gbierman/jep359/jep359-20191125/specs/records-jvms.html 
?


Alex


Re: Updated Draft specs for JEP 359 (Records)

2019-11-26 Thread Alex Buckley

Updated JLS draft looks good, thanks.

On 11/26/2019 6:17 AM, Gavin Bierman wrote:

Thanks Alex; have made these changes to the online version.

Gavin


Re: Updated Draft specs for JEP 359 (Records)

2019-11-25 Thread Alex Buckley

// Cutting amber-dev

On 11/25/2019 3:23 PM, Gavin Bierman wrote:

http://cr.openjdk.java.net/~gbierman/jep359/jep359-20191125/specs/records-jls.html


The JLS draft is good. Some technical rewordings:

1. 8.10.3: "and the type as the declared type" -- missing a "same"

2. 8.10.3: Say "This field is annotated with the annotations, if any, 
that appear on the corresponding record component and whose annotation 
types are applicable in the field declaration context or in type 
contexts or both."  (Later, for implicitly declared accessor methods, 
the phrase "whose annotation type is" should be "whose annotation types 
are". We do not constrain all the annotations to have the same 
annotation type, nor all the annotation types to be applicable in the 
same way. You could introduce a universal quantifier if you like -- 
"For each annotation A that appears on the corresponding ..." -- but I 
don't think it's necessary.)


3. 8.10.3: "A method ***declared*** in a record type R is said to be an 
accessor method" -- this raises questions of "explicitly or implicitly". 
It's too easy to think it means "explicitly" when that's not always 
true. Sidestep the problem by speaking more directly: "In a record type 
R, an _accessor method for a record component_ is a method whose name is 
the same as the name of the record component, and whose formal parameter 
list is empty." Then some refactoring because long list items are hard 
to follow:


-
For each record component appearing in the record component list:
1. An implicitly declared private final field ...
2. An accessor method for the record component. [THAT'S IT. DON'T RULE 
ON THE FORM OF EXPLICITLY DECLARED METHODS HERE. JUST MAKE SURE ALL THE 
METHODS EXIST:]


   If an accessor method for the record component is not explicitly 
declared, then one is implicitly declared with the following properties:

   - The name is the same as ...
   - ...

   It is a compile-time error if the implicitly declared accessor 
method is override-equivalent (8.4.2) with a non-private method of the 
class Object. [THIS PARAGRAPH IS TROUBLE. PLEASE MAKE AN EXPLICIT RULE 
IN 8.10.1 THAT SPELLS OUT IN CLEAR NORMATIVE TEXT THAT A RECORD 
COMPONENT MUST NOT BE CALLED CLONE, FINALIZE, ETC. RELYING ON A 
SUBTLE RULE IN A DIFFERENT SECTION ABOUT IMPLICITLY DECLARED STUFF IS 
*TOO HARD*. IT SUGGESTS THAT THE COMPILER SHOULD POSITION THE CARET FOR 
THE ERROR AT SOME POINT IN THE RECORD'S BODY, WHERE THE IMP.DECL. METHOD 
WOULD LIVE, RATHER THAN IN THE RECORD'S HEADER.]


[ITEM 2 ENDS HERE. NON-LIST TEXT FOLLOWS.]

If an accessor method for a record component is declared explicitly, 
then it must satisfy the following rules, or else a compile-time error 
occurs:

- The return type of the accessor method ...
- ...

[THE FOLLOWING PARAGRAPH IS DISTINCTLY SURE -- "IS NOT ANNOTATED" -- 
THAT AN EXPLICIT METHOD HAS NO ANNOTATIONS LIKE THE ONES ON THE RECORD 
COMPONENT. WHAT IF THE DEVELOPER WRITES ANNOTATIONS EXPLICITLY? PLEASE 
CHANGE THE PARAGRAPH TO AN INFORMATIVE NOTE WHERE YOU CAN SAY: 
ANNOTATIONS THAT APPEAR ON THE CORRESPONDING RECORD COMPONENTS ARE NOT 
CARRIED OVER TO EXPLICITLY DECLARED ACCESSOR METHODS, IN CONTRAST TO HOW 
IMPLICITLY DECLARED ACCESSOR METHODS ARE ANNOTATED ACCORDING TO BLAH BLAH.]
An explicitly declared accessor method is not annotated with any 
applicable annotation that appears on the corresponding record component.

-

4. 8.10.4: It's odd to see in "derived constructor signature" that a 
ctor has a name R, since 8.8 doesn't admit to a ctor having a name. That 
said, 8.8.2 speaks of "two constructors with override-equivalent 
signatures (§8.4.2) in a class", and the definition of override 
equivalent implies a name is present, so I'll set my concern aside. Just 
a minor rewording for flow -- push derivation of formal parameter list 
down one level, as it's not used by anything else (OK, one place, but 
I'll get to that) :


-
To support proper ... corresponding to the record components.

A record type R has a derived constructor signature with the name R, 
with no type parameters, and with a formal parameter list that is 
derived from the record component list of R as follows:


- For each record component ...
-

8.10.5: Construct good things, or else people will wonder:  "At most one 
compact constructor declaration can be declared for a record type." + 
"In a record type R, the signature of a compact constructor declaration 
is the _derived constructor signature_ of R (8.10.4)."   Don't say in 
normative text that "one which is derived from the record component list 
of R is added implicitly" -- "added" is too imperative -- this is an 
opportunity to write an informative note that compares and contrasts a 
compact ctor with all other ctors in Java -- "Unlike ctors in records, 
and indeed in classes in general, no explicit formal parameter list is 
given for a compact ctor. The record component list provides the X from 
which a Y is derived."
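
To make the contrast concrete, here is a short Java sketch of a compact constructor, assuming a JDK where records are final (Java 16 or later). The names `CompactConstructorDemo` and `Range` are hypothetical. Note that no formal parameter list is written: `lo` and `hi` are derived from the record header.

```java
final class CompactConstructorDemo {

    record Range(int lo, int hi) {
        // Compact form: no explicit formal parameter list. The parameters
        // lo and hi come from the record component list, and the fields are
        // assigned from them after this body completes normally.
        public Range {
            if (lo > hi) {
                throw new IllegalArgumentException("lo must not exceed hi");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(new Range(1, 5));
        try {
            new Range(5, 1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```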

Re: Updated Draft specs for JEP 359 (Records)

2019-11-25 Thread Alex Buckley

// Cutting amber-dev

On 11/25/2019 3:23 PM, Gavin Bierman wrote:

http://cr.openjdk.java.net/~gbierman/jep359/jep359-20191125/specs/records-jvms.html


The JVMS draft is good. It should have an informative note in 4.7.8: "It 
is an oversight in the design of the `class` file that there is no way 
to flag compiler-generated methods which are not considered 
implementation artifacts (JLS 13.1). This oversight means that 
reflective APIs may not accurately indicate the mandated status of such 
methods."  These words may not look like much to you, but commentary 
about what the class file CAN'T do is pure gold to readers, and will 
help us recollect the situation many years from now. I accept that "more 
notes == more stuff to update when situations change" -- that is the 
risk we take for the reward of informing our readers.


Alex


Re: Updated Draft specs for JEP 359 (Records)

2019-11-25 Thread Alex Buckley

On 11/25/2019 12:42 PM, Dan Smith wrote:

On Nov 6, 2019, at 11:21 AM, Alex Buckley wrote:
For spec clarity, please rename `component_info` to
`record_component_info`.


I hadn't seen this comment, but I've now applied this change as
requested (will show up next time Gavin posts an update).


Thanks.


As an aside, please drop "We're being intentionally vague here
about just what it means for a class to have a "component"."


Yep, that was intended for earlier in the design, no longer relevant.
Done.


OK.


and strengthen the opener: "The Record attribute is a
variable-length attribute in the attributes table of a ClassFile
structure. ***A `Record` attribute indicates that this class is a
_record type_ (JLS §8.10), declared with a list of _record
components_.***"  [Almost certainly declared _in source code_, but
maybe this class file was auto-generated, so no need to say how the
record type was declared ... but it was, since here we are in its
class file.]


This came up in the CSR, and the conclusion was to *weaken* it—JVMS
doesn't care about record types and under what conditions a class is
or isn't considered a record. This sentence is purely meant to give
readers some idea about why this attribute exists. So I've revised it
to:

"The `Record` attribute records information about the components of a
record type (JLS 8.10)."


OK, though as an editorial matter, avoid a record recording something. 
Prefer "... of a ClassFile structure. ***A `Record` attribute denotes 
information about the components of a record type (JLS 8.10).***"


Alex


Re: Spec: ACC_MANDATED

2019-11-22 Thread Alex Buckley

On 11/22/2019 12:10 PM, Jonathan Gibbons wrote:
Could someone also specify definitively the behavior when a user chooses 
to explicitly define a method, such as `equals` or `hashCode` for a 
record.   In other words, just because a method may be mandated in JLS, 
I'm expecting that this does not imply the use of ACC_MANDATED in those 
situations where the user explicitly defines the method.


Right, a "mandated" method is one created by the compiler because the 
JLS mandated (i.e. forced) the presence of the method if it wasn't 
explicitly declared in source code. If it was explicitly declared in 
source code, then it's not created by the compiler and is "just" an 
ordinary unflagged method (not synthetic, not mandated). This should all 
follow from 
http://cr.openjdk.java.net/~gbierman/jep359/jep359-20191121/specs/records-jls.html#jls-13.1 
... BTW, I see "Certain private fields and public methods of record 
types (8.10.3)" are not marked as mandated so I guess we're not 
introducing ACC_MANDATED to {field_info,method_info}.access_flags after all.


JCK tests which check the mapping from source code to class file can 
check that an explicitly declared method is neither ACC_MANDATED nor 
ACC_SYNTHETIC, and that source code without an explicitly declared 
method gets a method which is ACC_MANDATED and not ACC_SYNTHETIC.
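
As a rough illustration of what can be observed through Core Reflection without any ACC_MANDATED flag on methods, consider the Java sketch below (Java 16 or later assumed; the class, record, and helper names are hypothetical). It checks only the "not synthetic" half of the statement above, for both an implicitly and an explicitly declared accessor; `Method` offers `isSynthetic()` but no corresponding query for "mandated".

```java
import java.lang.reflect.Method;

final class AccessorFlagsDemo {

    // Accessor x() is implicitly declared by the compiler.
    record Implicit(int x) {}

    // Accessor x() is explicitly declared in source.
    record Explicit(int x) {
        public int x() {
            return x;
        }
    }

    // Reports whether the accessor for the named component is flagged synthetic.
    static boolean isSyntheticAccessor(Class<?> recordClass, String component) {
        try {
            Method accessor = recordClass.getDeclaredMethod(component);
            return accessor.isSynthetic();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isSyntheticAccessor(Implicit.class, "x"));
        System.out.println(isSyntheticAccessor(Explicit.class, "x"));
    }
}
```

Both calls report the same answer, which is the limitation being discussed: the two declarations cannot be told apart this way.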


Alex


Re: Spec: ACC_MANDATED

2019-11-22 Thread Alex Buckley
(Removing compiler-dev. Cross-posting to *-dev and *-spec-experts list 
is wrong. We're discussing a question driven mainly by Records, so let's 
treat it as Amber spec territory.)


On 11/22/2019 12:02 PM, Dan Smith wrote:

On Nov 22, 2019, at 12:22 PM, Leonid Kuskov wrote:
Does it make sense to add a definition of ACC_MANDATED to the
tables?


To clarify: are you saying javac is using the 0x8000 flags on fields
and methods, despite this flag being undefined in these contexts? Or
are you saying that you think we should *start* using the flag on
fields and methods, with supporting changes to the spec?

(The first would be a bug, the second would be a minor new feature.)


Leonid is asking for the second. Have I forgotten a discussion of how 
enums' and records' mandated members are represented in the class file? I see 
internal Slack questions every week about mandated members and 
reflection, but there's nothing in the JVMS draft to set expectations. 
(If we decided NOT to define ACC_MANDATED in 
{field_info,method_info}.access_flags, then the JVMS should discuss that 
in a note in 4.7.8, since "mandated members" are indicated there.)


Alex



Re: Spec: ACC_MANDATED

2019-11-22 Thread Alex Buckley
(This question was asked internally, and it wasn't clear if it was about 
JEP 359 (Records) specifically or ACC_MANDATED more broadly. Since the 
question is being driven by JEP 359 changes, and since JEP 359 is in any 
case the next feature which will change the JVMS, let's discuss on 
amber-spec-experts, NOT on compiler-dev. I have REMOVED compiler-dev 
from the header.)


On 11/22/2019 11:22 AM, Leonid Kuskov wrote:
So the specification allows the RI to mark final fields associated with 
components, and some methods, with the ACC_MANDATED flag (0x8000). The latest 
JVMS specification 
(https://docs.oracle.com/javase/specs/jvms/se13/html/index.html) permits 
this flag only in two attributes: MethodParameters_attribute and 
Module_attribute. The flag is not mentioned in either table: Table 
4.5-A. "Field access and property flags" or Table 4.6-A. "Method access 
and property flags".

...

Does it make sense to add a definition of ACC_MANDATED to the tables?


Good question. The mask 0x8000 is presently defined as ACC_MODULE in 
ClassFile.access_flags, but it would be legitimate to define the same 
mask as ACC_MANDATED in {field_info,method_info}.access_flags. This 
would mirror the situation for the mask 0x0020, which is defined as 
ACC_SUPER in ClassFile.access_flags but defined as ACC_SYNCHRONIZED in 
method_info.access_flags. I'm sure we have discussed ACC_MANDATED before 
so I'm not sure why the JEP 359 JVMS draft is silent on the matter.
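
The way one mask takes different meanings in different `access_flags` items can be sketched in Java as below. The class `AccessFlagContexts`, the `Context` enum, and `nameOf` are hypothetical, and the method-level reading of 0x8000 is labeled as hypothetical because it is only being proposed here; the only real API used is `java.lang.reflect.Modifier`.

```java
import java.lang.reflect.Modifier;

final class AccessFlagContexts {

    enum Context { CLASS, METHOD }

    // Interprets a single-bit mask according to the structure it appears in.
    static String nameOf(int mask, Context context) {
        switch (mask) {
            case 0x0020:
                return context == Context.CLASS ? "ACC_SUPER" : "ACC_SYNCHRONIZED";
            case 0x8000:
                // The METHOD reading is the proposal under discussion, not a defined flag.
                return context == Context.CLASS ? "ACC_MODULE" : "ACC_MANDATED (hypothetical)";
            default:
                return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOf(0x0020, Context.CLASS));
        System.out.println(nameOf(0x0020, Context.METHOD));
        // Core Reflection uses the method-level reading of 0x0020:
        System.out.println(Modifier.toString(0x0020));
    }
}
```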

And:

Earlier the spec stated that Enum.values() and Enum.valueOf() are 
exceptions to the requirement that a class member that does not appear 
in the source code must be marked using a Synthetic attribute, or have 
its ACC_SYNTHETIC flag set.
Now the names of the methods are removed and the looser statement "mandated 
members of enums and records" is used. Does it mean that the spec won't 
enumerate "mandated" methods anymore?


JVMS 4.7.8 should cross-ref to all JLS sections which define mandated 
members. JVMS 4.7.8 should not list them explicitly. (There are numerous 
"JLS §..." cross-refs in JVMS ch.4, they are quite legitimate.)


And the setting of this flag will be implementation-specific? 


No, the presence of mandated members is JLS-defined. Search for 
"mandated" in JLS 13.1 to dispel any notion that mandated members are 
implementation-specific.



Should the JCK signature test take into account the ACC_MANDATED flag?


If ACC_MANDATED is added to {field_info,method_info}.access_flags, then 
yes, its presence on certain members of enum and record types is required.


Alex


Re: Updated Draft specs for JEP 359 (Records)

2019-11-21 Thread Alex Buckley

On 11/21/2019 7:01 AM, Gavin Bierman wrote:

A hopefully final draft language spec for JEP 359 (Records) is available at:

http://cr.openjdk.java.net/~gbierman/jep359/jep359-20191121/specs/records-jls.html

This incorporates (I hope!) all the very helpful suggestions from everyone on 
these lists - many thanks.


1. 8.10.1 says "As a record component corresponds to an accessor method, 
restrictions on accessor methods (8.10.3) mean that it is always a 
compile-time error for a record header to declare a record component 
with the name finalize, getClass, hashCode, notify, notifyAll, or 
toString." -- `record Foo(int wait){}` is also in error because 8.10.3 
will see that the imp.decl. accessor method `int wait()` is 
override-equivalent with the non-private method `void wait()` in Object. 
Similarly for `record Bar(Object clone){}`. `equals` is a different 
story, and that's a surprise, so explain it -- show `record Quux(boolean 
equals){}` as legal because the accessor method is `public boolean 
equals()` which merely (albeit rather confusingly) overloads the 
inherited `public boolean equals(Object)` method. The relationship 
between each of Object's methods and the implicit record components 
should be spelled out explicitly in the note.
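
The `equals` case can be seen in a few lines of Java (Java 16 or later assumed; `ComponentNameDemo` is a hypothetical wrapper class, while `Quux`, `Foo`, and `Bar` are the records named above). The rejected declarations are left as comments so that the sketch compiles.

```java
final class ComponentNameDemo {

    // Legal: the accessor is `public boolean equals()`, which overloads
    // rather than overrides the inherited `equals(Object)`.
    record Quux(boolean equals) {}

    // Rejected at compile time, per the reasoning above:
    //   record Foo(int wait) {}      // accessor `int wait()` vs. `void wait()` in Object
    //   record Bar(Object clone) {}  // accessor `Object clone()` vs. `clone()` in Object

    public static void main(String[] args) {
        Quux q = new Quux(true);
        System.out.println(q.equals());               // the accessor
        System.out.println(q.equals(new Quux(true))); // the usual equality method
    }
}
```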


2. 8.10.3: Technical rewording: from "The implicitly declared accessor 
method is annotated with the annotation that appears on the 
corresponding record component, if this annotation type is applicable to 
a method declaration or type context." to "The implicitly declared 
accessor method is annotated with the annotations, if any, that appear 
on the corresponding record component and whose annotation types are 
applicable in the method declaration context or in type contexts or 
both. Note: This means that an annotation on a record component may not 
necessarily be carried over to the corresponding implicitly declared 
accessor method."
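
The effect of applicability on carry-over can be observed with a small Java sketch (Java 16 or later assumed). The annotation types `OnField` and `OnMethod`, the record `P`, and the helper methods are all hypothetical names chosen for illustration.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class ComponentAnnotationDemo {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface OnField {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface OnMethod {}

    // Both annotations are written on the record component.
    record P(@OnField @OnMethod int x) {}

    // Is the annotation present on the implicitly declared field x?
    static boolean fieldHas(Class<? extends Annotation> type) {
        try {
            return P.class.getDeclaredField("x").isAnnotationPresent(type);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Is the annotation present on the implicitly declared accessor x()?
    static boolean accessorHas(Class<? extends Annotation> type) {
        try {
            return P.class.getDeclaredMethod("x").isAnnotationPresent(type);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("field has OnField:     " + fieldHas(OnField.class));
        System.out.println("accessor has OnField:  " + accessorHas(OnField.class));
        System.out.println("accessor has OnMethod: " + accessorHas(OnMethod.class));
    }
}
```

Each annotation ends up only on the member whose declaration context it is applicable in, which is why the plural "annotation types" wording matters.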


3. 8.10.4: A canonical ctor is defined as a public ctor, and then 
there's an error if an explicit canonical ctor is not public. That error 
will never occur because a canonical ctor is always public. This was the 
discussion on November 4, which doesn't seem to have been considered. I 
think "It is a compile-time error if a record declaration contains more 
than one explicit declaration of the canonical constructor." is the 
catch-all solution, but then the JLS must explain in a note why THE 
canonical ctor can have multiple declarations.


4. 8.10.4: Technical rewording: from "The canonical constructor must not 
declare type variables." to "The canonical constructor must not be 
generic (8.8.4)."


I assume the November JVMS draft at 
http://cr.openjdk.java.net/~gbierman/jep359/jep359-20191121/specs/records-jvms.html 
is still being updated?


Alex


Re: [records] Non-compact canonical constructors

2019-11-12 Thread Alex Buckley

On 11/10/2019 8:07 PM, Tagir Valeev wrote:
1. It's not explicitly specified whether an explicitly declared 
canonical constructor must be 'public' like it's specified for
compact constructors. Does this mean that I can declare a
non-public canonical constructor?


The compact constructor _is_ a canonical constructor; it's just an 
alternate notation for it, and it's an error to declare it both
ways (because it's an error to declare the same member twice). The
canonical constructor should be public (yes, Remi, we see you
there), whether declared implicitly, explicitly with a full
argument list, or explicitly with a compact ctor.


Sure, this sounds consistent. I'm just saying that this part of the 
current spec draft is incomplete.


Yes, this was the issue @ 
https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-November/001760.html 
and 
https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-November/001761.html 
-- there is a slight misfactoring in how an explicitly declared 
canonical ctor is specified.


Alex


Re: Updated Draft specs for JEP 359 (Records)

2019-11-06 Thread Alex Buckley

On 10/31/2019 7:17 AM, Gavin Bierman wrote:

(Alongside is a draft JVM spec for this feature:
http://cr.openjdk.java.net/~gbierman/jep359/jep359-20191031/specs/records-jvms.html
)


I looked at this for the CSR JDK-8233595. The `component_info` structure 
which is mentioned all over the place really tripped me up. Unlike 
fields and methods, a component isn't a first-class JVM construct, so a 
simple (i.e. unqualified) name is not deserved. Even the JLS always uses 
the qualified name, "record component" (if nothing else, to distinguish 
from "array component").


It would be wrong to replace mentions of the `component_info` structure 
with mentions of the `Record_attribute` structure, because 
`Record_attribute` isn't literally the structure which holds attributes 
(whereas the oft-mentioned `Code_attribute` structure really is). It 
would also be clunky to spell out "the `component_info` structure of the 
`Record_attribute` structure" in many places.


For spec clarity, please rename `component_info` to 
`record_component_info`. (From a search of internal mail, I believe 
`component_info` was introduced around 7/24 in a discussion about 
annotations on record components, as an alternative to reusing the 
`field_info` structure in Record. Now that the term has spread 
throughout JVMS ch.4, it's time to name it properly.)


As an aside, please drop "We're being intentionally vague here about 
just what it means for a class to have a "component"." and strengthen 
the opener: "The Record attribute is a variable-length attribute in the 
attributes table of a ClassFile structure. ***A `Record` attribute 
indicates that this class is a _record type_ (JLS §8.10), declared with 
a list of _record components_.***"  [Almost certainly declared _in 
source code_, but maybe this class file was auto-generated, so no need 
to say how the record type was declared ... but it was, since here we 
are in its class file.]


Alex


Re: Updated Draft specs for JEP 359 (Records)

2019-11-04 Thread Alex Buckley

On 11/4/2019 12:50 PM, Florian Weimer wrote:

I think we are looking at different versions of the spec.  I don't see
either wording here:



But the updated wording works for me.


Doh, you're right, and the updated spec already reflects some of the 
re-structuring I talked about in mail. However, the thrust of my 
comments about detaching 'public' and 'throws' from the definition of 
"canonical" still apply. The 2019-10-31 spec says "If the canonical 
constructor is not explicitly declared, then it is implicitly declared." 
but it is not possible to implicitly declare `public R(int i)` if the 
pretender `R(int i)` has been explicitly declared in `record R(int i)`. 
Yes, there is an error mandated for the pretender -- "The erasure of the 
signature of the constructor must not be equal to the erasure of the 
signature of the canonical constructor." -- but (1) A compiler vendor 
now has an impossible thing mandated on the one hand and a 
must-not-be-equal error mandated on the other hand, so which should be 
reported first? and (2) While it's appropriate to mention erasure when 
contrasting List and List (e.g. the javac test case), it's 
confusing to mention erasure when dealing with int and int. The JLS 
should be more explicit about how mis-declared modifiers and 'throws' 
are handled.


Alex


Re: Updated Draft specs for JEP 359 (Records)

2019-11-04 Thread Alex Buckley

Florian,

Thanks for drawing attention to this part of the spec:

On 11/2/2019 3:21 AM, Florian Weimer wrote:

Is it allowed to declare a canonical constructor explicitly and make
it non-public?  I think the answer is no.  But it's not quite obvious
from the spec, I think.


JLS 8.10.4 defines a "canonical" ctor to be a public ctor whose formal 
parameter list is identical to the record header. If you explicitly 
declare a ctor whose formal parameter list is identical to the record 
header, but you do not make it public, then you have not declared a 
"canonical" ctor. You have declared a ctor that, by virtue of its 
signature, prevents an implicitly declared canonical ctor from being 
declared. This is so unfortunate that it deserves an error message, and 
you will get one.


However, you are right that the spec is slightly misworded. It says "It 
is a compile-time error if a record declaration contains a canonical 
constructor declaration that is not public." but a "canonical" ctor is 
public by definition. It should say "It is a compile-time error if a 
record declaration contains a constructor declaration whose formal 
parameter list is identical to the record header of R, but which is not 
public."


An alternative approach: JLS 8.10.4 could define a "canonical" ctor as 
follows: "Every record type R has a _canonical_ constructor, which is a 
constructor [note the silence on accessibility] whose formal parameter 
list is identical to the record header of R." ... and deem an implicitly 
declared canonical ctor to be public. Then, the compile-time error can 
be left alone, though I would reword it for clarity and for tonal 
agreement with '[exp|imp]licitly declared': "If a canonical constructor 
is explicitly declared, then it must be public, or a compile-time error 
occurs."


The alternative approach is best because the sense of "canonical" is 
dominated by the signature -- just look at javac's error message 
"canonical record constructor must be public" which assumes that 
canonical-ness is obvious (signature-driven) and that the access 
modifier is a separate thing, so to speak. That seems a pretty natural 
view of things for the JLS to embody.
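
For reference, an explicitly declared canonical constructor in the full (non-compact) form might look like the Java sketch below (Java 16 or later assumed; the wrapper class name is hypothetical, and `R` follows the `record R(int i)` example used in this thread). The constructor is declared `public`, as the draft under discussion requires; its canonical-ness comes from the signature matching the record header.

```java
final class CanonicalConstructorDemo {

    record R(int i) {
        // Canonical: the formal parameter list matches the record header.
        // The access modifier is a separate matter from canonical-ness.
        public R(int i) {
            this.i = Math.abs(i);
        }

        // A non-canonical constructor; it must delegate to another constructor.
        public R() {
            this(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(new R(-3));
        System.out.println(new R());
    }
}
```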


(Sidebar: Why allow a compact ctor declaration to be non-public, when we 
control its grammar that could be hard-coded to use `public`?)


(Sidebar: The compact ctor declaration should be introduced at the same 
time as "A canonical constructor can be explicitly declared ...". I 
recall privately discussing the flow of this section but can't find it 
easily. I suggest that 8.10.4 should be "Canonical Constructor of a 
Record" and that the sole non-canonical clause "A record declaration may 
contain constructor declarations." should live in 8.10.2 "Record Body" 
(not plural; yes, the title of 8.9.2 should drop the D-word too, which 
will happen auto-magically) ... 8.10.2 already says "may contain 
constructor and member declarations" but that opening paragraph should 
advertise the compact ctor's role in helping to initialize the member 
[because otherwise CompactConstructorDeclaration goes unexplained] ... 
the line will be hard to write and this mail is already long enough.)


Alex


Re: Fields and methods of a record are marked MANDATED

2019-10-10 Thread Alex Buckley
Enum types are specified such that a `values` method is always 
implicitly declared. (If you declare one explicitly, you have two method 
declarations with override-equivalent signatures, which is an error per 
JLS 8.4.) Accordingly, the corresponding method in the class file should 
always be marked as mandated. Sadly we don't have room for an 
ACC_MANDATED flag in `method_info`, but morally the method is mandated 
and Core Reflection should expose that fact.
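
A brief Java sketch of what Core Reflection shows today for the implicitly declared `values` method follows; the class `EnumValuesDemo`, the enum `Color`, and the helper `valuesMethod` are hypothetical names. The method is visible as an ordinary public static method and is not flagged synthetic, and `Method` has no query for "mandated".

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class EnumValuesDemo {

    enum Color { RED, GREEN }

    // Looks up the implicitly declared values() method of Color.
    static Method valuesMethod() {
        try {
            return Color.class.getDeclaredMethod("values");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Method values = valuesMethod();
        System.out.println("synthetic: " + values.isSynthetic());
        System.out.println("static:    " + Modifier.isStatic(values.getModifiers()));
        System.out.println("public:    " + Modifier.isPublic(values.getModifiers()));
    }
}
```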


Record types are specified more sensitively: a component's accessor is 
implicitly declared if and only if it isn't explicitly declared. (Same 
deal as the default constructor of a class.) If a component's accessor 
is explicitly declared, then there's nothing more to say; if it's 
implicitly declared, then it should be marked as mandated.


I wouldn't characterize this as member descriptor v. member 
implementation, because that sounds like "the signature" v. "the body". 
Fundamentally, the topic at hand is component accessors, which are 
non-abstract methods of non-abstract classes; for such methods of such 
classes, EITHER you declare both the signature and the body (in which 
case there's an explicit declaration of both signature and body) OR you 
declare neither (in which case there's an implicit declaration of both 
signature and body). Ordinary consumers of the record type are happy 
because they can be assured that component accessors are always declared 
(i.e., the compiler will always find the signature, the VM will always 
link the descriptor, and the subsequent execution of the linked method 
will always do something useful), while reflective consumers of the 
record type are happy because they can tell whether a component accessor 
(the signature and body as one undivided entity) was declared explicitly 
or implicitly. You say "random", I say "accurate".
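
A small sketch of the "ordinary consumer" half of that argument, assuming a JDK where records are a final feature (16 or later). The record `Point` and the helper `call` are hypothetical. One accessor is declared explicitly and the other is left implicit; both are found and invoked the same way. The sketch makes no attempt to read a mandated marker, which is the part under discussion in this thread.

```java
import java.lang.reflect.Method;

class AccessorDemo {
    // Hypothetical record: x() is declared explicitly, y() is left implicit.
    record Point(int x, int y) {
        public int x() { return x; }
    }

    // Calls a component accessor by name through Core Reflection.
    static int call(String accessorName, Point p) {
        try {
            Method m = Point.class.getDeclaredMethod(accessorName);
            return (Integer) m.invoke(p);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(3, 4);
        System.out.println(call("x", p)); // explicit declaration
        System.out.println(call("y", p)); // implicit declaration
    }
}
```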


Alex

On 10/10/2019 9:09 AM, Brian Goetz wrote:

Under that interpretation, that leaves record members in a funny place, since a 
given mandated member (e.g., an accessor for a component) _might_ have been 
explicit in the source, or might not have been.  Should ACC_MANDATED describe 
the member descriptor (“spec mandates a member with this descriptor”) or only 
the implementation (“the source didn’t have it, but it’s here in the byte 
code”)?  In the latter interpretation, the presence of ACC_MANDATED on a 
mandated member would basically be random, based on implementation-of-the-day, 
which seems wrong.


On Oct 10, 2019, at 12:06 PM, Joe Darcy  wrote:

A mandated construct is one that is mandated by the specification, but not 
explicitly declared. Constructs of that sort have been in the platform since 
the beginning, such as default constructors. ACC_MANDATED was added to the 
platform only more recently and has some exposure through javax.lang.model.

I recommend that, going forward, ACC_MANDATED be used more widely, on all the 
mandated structures, including the values methods on enum types, etc.

Cheers,

-Joe

On 10/10/2019 8:50 AM, Brian Goetz wrote:

We should match the behavior of methods like `Enum::values`.


On Oct 10, 2019, at 10:15 AM, Remi Forax  wrote:

Hi all,
fields and methods of a record are marked ACC_MANDATED, which contradicts JLS 
13.1.12, which explains that you cannot use ACC_MANDATED on fields and methods.

regards,
Rémi





Re: Exploring inference for sealed types

2019-10-02 Thread Alex Buckley
You speak of "compilation unit" as if it means the scope of work 
performed by javac and Maven. ("compiles each module separately as its 
own compilation unit")  That's not the meaning. The meaning is as given 
in https://docs.oracle.com/javase/specs/jls/se13/html/jls-7.html#jls-7.3


On 10/2/2019 1:43 PM, Peter Levart wrote:

Is compilation unit really the right choice to base inference on?

For example, a program may be composed of several modules compiled all 
at once in a single compilation unit (javac supports that). This same 
program may be compiled with a build system such as Maven, which 
compiles each module separately as its own compilation unit. Would we 
really want the semantics of a program (or successful compilation 
thereof) to depend on the choice of the build tool?


What about using (module, compilation unit) as the base to perform 
inference within? I understand that compiler may only infer things 
within a compilation unit and module is usually compiled as a whole in 
one compilation unit (possibly together with other modules).


Regards, Peter



Re: record components as a first class reflection element

2019-09-24 Thread Alex Buckley
At first glance, this is sensible because of the first-class status in 
the JLS of record components and their mapping to accessors.


Based on a check of other implementations of AnnotatedElement, consider 
`boolean isVarArgs()` (IIRC a varargs component will be allowed) and 
`String toGenericString()`.


Alex

On 9/24/2019 12:05 PM, Vicente Romero wrote:

Hi amber experts,

We are considering our next move in the reflection area for records. It 
will be hoisting record components to a first class status in the 
reflection engine. Our current proposal is to define a new class named: 
java.lang.reflect.RecordComponent which will be roughly defined as:


public final class RecordComponent implements AnnotatedElement {
     private String name;
     private Class type;
     private Method accessor;

     public String getName() { return name; }

     public Class getType() { return type; }

     public Type getGenericType() {...}

     public AnnotatedType getAnnotatedType() {}

     public Method getAccessor() { return accessor; }
}

Along with this change we are also proposing changes to java.lang.Class. 
Our proposal there is to remove current method: 
java.lang.Class::getRecordAccessors and add a new method named: 
java.lang.Class::getRecordComponents which will return an array of 
java.lang.reflect.RecordComponents. Thanks in advance for sharing any 
feedback on this proposal,


Vicente
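
For reference, a usage sketch of the proposal above, assuming a JDK where `java.lang.Class::getRecordComponents` and `java.lang.reflect.RecordComponent` are available as a final API (16 or later). Only `getName`, `getType` and `getAccessor` from the proposed class are used; the record `Range` and the helper `describe` are hypothetical names.

```java
import java.lang.reflect.RecordComponent;
import java.util.ArrayList;
import java.util.List;

class RecordComponentDemo {
    // Hypothetical record used only for illustration.
    record Range(int lo, int hi) {}

    // Lists "type name = value" for each component, in header order.
    static List<String> describe(Record r) {
        List<String> out = new ArrayList<>();
        try {
            for (RecordComponent rc : r.getClass().getRecordComponents()) {
                Object value = rc.getAccessor().invoke(r);
                out.add(rc.getType().getName() + " " + rc.getName() + " = " + value);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(describe(new Range(1, 5)));
    }
}
```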


Re: Draft JLS spec for records

2019-09-03 Thread Alex Buckley

Let me comment on Maurizio's comments, and add some more.

In 1.1, the spec for enum types has been used as a template, which is 
fine, but the summary of enum types has always been poor: the 
singular/plural construction is weird, and the mixing of kinds (class v. 
object) is horrible. The JLS is not the place to sling the term "enum" 
and sometimes have it mean the type and sometimes the instance of the 
type; similarly, for "record". Notice that 8.9 never says "enum" on its 
own. Still, since people talk about "enums" and "records" a lot, the JLS 
should claim those terms -- as the types. For the avoidance of doubt, 
it's fine for this new draft spec to clarify/align the specification of 
older features. So:


  Enums are a special kind of class that support the definition of 
small sets of values which can then be used in a type safe manner. 
Unlike enumerations in other languages, enums may have their own methods.


  Records are a special kind of class that support the compact 
expression of simple objects that serve as aggregates of values.


The intro to ch.8 is the place to say more about enums and records. It 
is unfortunate that enums aren't covered there, e.g., the only way you 
learn that you can switch over an enum type is via an example buried in 
8.9. For records, a description like "shallowly immutable" would be OK, 
since there's room to explain. The earlier the mention of components as 
a shorthand for fields, the easier and less dry will be 8.10 and 
especially 8.10.1.


On 9/3/2019 6:40 AM, Maurizio Cimadamore wrote:

* in 8.10.1:

"Each record component in the /RecordHeader/ declares one |private 
final| field in the record class whose name is same as the /Identifier/ 
in the record component."


This seems a bit early. Also, you repeat the same in 8.10.3, which is 
arguably a much better place?


Let's dial back the focus on the record header; compare with how 8.4 
introduces `MethodHeader` but never utters "header" in narrative text.


-
8.10.1  Record Components

[This paragraph has two normative sentences in it.] The _record 
components_ of a record type, if any, are specified in the header of the 
record declaration. [The X of the type being specified in the Y of the 
declaration comports with how 8.10 says that the declaration specifies 
the type.] [The grammar allows a component-less record type; I'm 
following that. See 8.4.1 for another "if any" usage.] Each record 
component corresponds to an implicitly declared field of the record type 
(8.10.3).


RecordHeader:  [Following style of 14.2 and 8.4/8.4.1]
`(` _[_RecordComponents_]_ `)`

RecordComponents:
RecordComponent _{_ `,` RecordComponent _}_

// Grammar should always be followed by basic well-formedness rules.

It is a compile-time error for a record header to specify two record 
components with the same name.


It is a compile-time error for a record header to specify a record 
component with the name clone, finalize, getClass, hashCode, notify, 
notifyAll, readObjectNoData, readResolve, serialPersistentFields, 
toString, wait, or writeReplace.  [It's right to say this here, where 
we're talking about components themselves -- we're hinting to compilers 
that the error should be about component with a bad name, not 
field-you-can't-see-in-source having a bad name.]

-

8.10.2 says "It is a compile-time error for the body of a record 
declaration to contain non-static field declarations. All non-static 
fields should be declared as record components in the record header."


Use "should" very very sparingly in normative text. This is the place 
for a note about how, "in the judgment of the designers of the Java 
programming language", the only per-instance state of a record type 
should be the components -- and yes, that state is final, so plainly a 
record type isn't syntactic sugar, it expresses something about the 
aggregate. Also a good place to compare enum types -- instance 
controlled, but state can be arbitrary -- versus record types -- not 
instance controlled, but state is locked down. (Comparison not because a 
developer is wondering whether to use an enum type or a record type; 
comparison because, well, this is the JLS -- it introduced two "special" 
kinds of class type, so needs to teach their outlines before anyone asks.)


8.10.4: Consider how 8.8.9 states that "If a class contains no 
constructor declarations, then a default constructor is implicitly 
declared." and how 8.9.2 modifies that with "In an enum declaration with 
no constructor declarations, a default constructor is implicitly 
declared. The default constructor is private, has no formal parameters, 
and has no throws clause." ... 8.10.4 should follow suit. At the same 
time, there is value in the term "canonical ctor" -- in a normal class, 
if you explicitly declare an arbitrary ctor, then you don't get a 
default ctor implicitly declared, whereas in a record type, if you 
explicitly declare an arbitrary ctor, then you still g

Re: Refinements for sealed types

2019-08-21 Thread Alex Buckley
// Declarations of RedFruit/BlueFruit corrected below from `extends 
Node` to `extends Fruit`


I don't know what I don't know about "aux classes". I do know that if 
you don't nest the concrete leaves, they can't be public in the same 
compilation unit:


--Fruit.java--
public sealed interface Fruit {  // permits RedFruit, BlueFruit
  /* public */ sealed interface RedFruit extends Fruit {}   // permits *berry
  /* public */ sealed interface BlueFruit extends Fruit {}  // permits Blueberry
  /* public sealed */ class FruitCake implements Fruit {}   // permits nothing; implicitly final
}
/* package-access */ class Strawberry implements Fruit.RedFruit {}
/* package-access */ class Raspberry  implements Fruit.RedFruit {}
/* package-access */ class Blueberry  implements Fruit.BlueFruit {}
--

Putting the concrete leaves in another compilation unit so they can be 
public (assume that's the right accessibility) isn't ceremony reduction. 
Am I missing something about how this hierarchy should be declared?
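
For comparison, a sketch of the same hierarchy in the form of sealed types that later shipped, assuming JDK 17 or later. In that form, an omitted `permits` clause is inferred from the direct subtypes in the same compilation unit, and each concrete leaf must say `final`, `sealed` or `non-sealed` explicitly, so nothing is implicitly final. The class `FruitDemo` and the helper `permittedCount` are hypothetical; all types are given package access so the file name does not matter.

```java
class FruitDemo {
    // Number of permitted direct subtypes recorded for a sealed type.
    static int permittedCount(Class<?> c) {
        return c.getPermittedSubclasses().length;
    }

    public static void main(String[] args) {
        System.out.println("Fruit permits " + permittedCount(Fruit.class));
        System.out.println("RedFruit permits " + permittedCount(Fruit.RedFruit.class));
        System.out.println("BlueFruit permits " + permittedCount(Fruit.BlueFruit.class));
    }
}

sealed interface Fruit {                         // inferred: RedFruit, BlueFruit, FruitCake
    sealed interface RedFruit extends Fruit {}   // inferred: Strawberry, Raspberry
    sealed interface BlueFruit extends Fruit {}  // inferred: Blueberry
    final class FruitCake implements Fruit {}    // explicit `final` is required
}

final class Strawberry implements Fruit.RedFruit {}
final class Raspberry  implements Fruit.RedFruit {}
final class Blueberry  implements Fruit.BlueFruit {}
```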


Alex

On 8/20/2019 5:05 PM, Brian Goetz wrote:

I do, because the other way to get a class into the same compilation unit, aux 
classes, have some limitations.  So we’re encouraging the pattern of nesting.  
But … I am not sure we want to push it all the way.  Consider a type like:

 sealed interface Fruit {
 interface RedFruit extends Fruit { }
 interface BlueFruit extends Fruit { }
 class Strawberry implements RedFruit { }
 class Raspberry implements RedFruit { }
 class Blueberry implements BlueFruit { }
 }

Do we want to force Blueberry to be Fruit.BlueFruit.Blueberry (or at least, 
twist the user’s arm into it by offering less ceremony?)  I think that would be 
lame — and worse than lame if the intermediate interfaces (RedFruit, BlueFruit) 
were not public.  Then we’d be nesting the public types in the nonpublic ones, 
and they’d be inaccessible.


You often show the concrete classes as members of a sealed interface. Interface 
members are already implicitly public and static; is this a precedent to build 
on for a sealed interface? That is, have the nested concrete classes be 
implicitly final, and have the interface's implicit `permits` list care about 
only nested concrete classes. Top level concrete classes in the same 
compilation unit would be handled like concrete classes in other compilation 
units: nothing different than today.

Layering sealing on top of nesting has the attraction that it avoids putting 
multiple public concrete classes in a single compilation unit. It's right that 
the concrete leaves are public, but javac dislikes compilation units with 
multiple public types.

Alex




Re: Refinements for sealed types

2019-08-20 Thread Alex Buckley

On 8/20/2019 3:45 PM, Brian Goetz wrote:

This seems reasonable too, because most concrete classes in such
hierarchies _will be_ leaves.  In this world, though, it feels a
little inconsistent to only do this when the subclass and the sealed
type are in the same compilation unit; we could extend this to all
concrete subclasses of sealed types (implicitly final, unless they
say sealed or non-sealed).  This is consistent, but feels a little
more action-at-a-distance-y.  So, it's pick your poison.


You often show the concrete classes as members of a sealed interface. 
Interface members are already implicitly public and static; is this a 
precedent to build on for a sealed interface? That is, have the nested 
concrete classes be implicitly final, and have the interface's implicit 
`permits` list care about only nested concrete classes. Top level 
concrete classes in the same compilation unit would be handled like 
concrete classes in other compilation units: nothing different than today.


Layering sealing on top of nesting has the attraction that it avoids 
putting multiple public concrete classes in a single compilation unit. 
It's right that the concrete leaves are public, but javac dislikes 
compilation units with multiple public types.


Alex


Re: Refinements for sealed types

2019-08-20 Thread Alex Buckley

On 8/20/2019 12:40 PM, Brian Goetz wrote:

Gathering the various threads in this discussion, here’s what seems a sensible 
landing place:

1.  A sealed _class_ with no permitted subtypes is a final class.  The 
distinction between declared-sealed-with-no-permits and declared-final is 
erased away.  [*]

2.  Sealed abstract classes and interfaces with no permitted subtypes are 
illegal.

3.  If a sealed type does not specify a permits list, it can be inferred by 
enumerating the subtypes in the same compilation unit.

4.  If a concrete class is a subtype of a sealed type declared in the same 
compilation unit, and none of the modifiers { sealed, non-sealed, final } are 
present, it is inferred to be sealed with an inferred permits list (which, 
according to (1), may be treated as final if there are none.)

5.  (optional) It is an error to specify an explicit permits list if `sealed` 
is not specified.

6.  Otherwise, a subtype of a sealed type must have exactly one of the 
following modifiers: sealed, non-sealed, final.


I see two committed principles here:

1. Permitted subtypes are common; their enumeration isn't. It's almost 
like explicit `permits` is a separable feature that could be set aside 
for now.


2. Ceremony reduction is important enough to treat concrete direct 
subclasses of a sealed type specially.


However, that special treatment comes at the expense of another 
principle: "leaf nodes [implicitly final classes] are obvious". I was 
hoping to have this because implicitly final classes [sealed with zero 
permitted subtypes] are The Show. Below, it's kinda hard to figure out 
that class C is NOT a leaf node:


--I.java--
sealed interface I {}    // implicitly `permits C`

class C implements I {}  // implicitly `sealed` and `permits D`

... lots of other stuff ...
class D extends C {}     // implicitly `sealed` w/ no permitted subtypes
--

whereas below, the leaf nodes are mostly obvious by virtue of how the 
intermediate hierarchy is explicitly `sealed`:


--I.java--
sealed interface I {}            // implicitly `permits J`
sealed interface J extends I {}  // plainly not a leaf
sealed abstract class AC implements J {}  // plainly not a leaf

class PolarData extends AC {}  // maybe a leaf
class CartesianData extends AC {}  // maybe a leaf
// EOF, so yes, PolarData and CartesianData are indeed leaves
--

tl;dr We're back to having both sealed-ness and the 
potentially-non-trivial permits list be implicit, which was earlier 
thought to strain the reader. If we said that concrete direct subclasses 
of a sealed type were implicitly final, then ceremony reduction would 
still be high, and if you really don't want them to be leaf nodes then 
say `sealed` or `non-sealed`.


Alex


Re: Refinements for sealed types

2019-08-19 Thread Alex Buckley

On 8/19/2019 3:22 PM, Brian Goetz wrote:
I don't have a strong opinion on whether "sealed but no permitted 
subtypes" is a habitable space separate from "final", but I'm fine to
say "that means final" for most purposes.  Not sure what reflection
should say; is it OK for a type that is clearly sealed in its 
declaration to report back as "not sealed, but final?"  Or does
that mean it is both sealed _and_ final (an empty PermittedSubtypes 
attribute, plus an ACC_FINAL)?


Are you referring here to a source declaration such as `sealed interface
X {}` where, per Vicente, X.class has ACC_FINAL set? I think 
X.class.isSealed() should return true and X.class.isFinal() [simplifying 
for space] should return false. Has there been any discussion of 
reclaiming ACC_SUPER, 0x0020, for ACC_SEALED in v58 class files? Using 
an empty attribute to denote sealing is pretty ugly.
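
A sketch of what Core Reflection reports in the form of sealed types that later shipped, assuming JDK 17 or later. That compiler rejects a sealed type with zero permitted subtypes, so `X` is given one hypothetical subclass `A` here. The shorthand `X.class.isFinal()` above is spelled out with `java.lang.reflect.Modifier`; `SealedVsFinalDemo` and `describe` are illustrative names.

```java
import java.lang.reflect.Modifier;

class SealedVsFinalDemo {
    sealed interface X permits A {}
    static final class A implements X {}

    // Reports the two properties discussed above for a given class.
    static String describe(Class<?> c) {
        return c.getSimpleName()
                + ": sealed=" + c.isSealed()
                + ", final=" + Modifier.isFinal(c.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(describe(X.class)); // X: sealed=true, final=false
        System.out.println(describe(A.class)); // A: sealed=false, final=true
    }
}
```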


Alex


Re: Refinements for sealed types

2019-08-19 Thread Alex Buckley

On 8/19/2019 11:52 AM, Brian Goetz wrote:
How do you know from `sealed class X {}` and the rest of its 
compilation unit that all X's subtypes are co-declared? 


By co-declared, I mean "in the same compilation unit."


The emphasis should be on the word "all", not "co-declared". How do 
you know that ALL of X's subtypes are declared in the same compilation 
unit?


Here's what I'm suggesting.

If you say

     sealed interface X { ... }

with no permits clause, then the permits clause is inferred from the 
contents of the compilation unit, which is _by definition_ all the 
permitted subtypes.  (If there are no subtypes in the current 
compilation unit, a warning may be in order.)


OK.


Similarly, if you have a subtype of X:

     sealed interface X {
     class A implements X { }
     }

that is _in the same compilation unit_, then we will infer `sealed` on A 
unless you say otherwise.


To be clear, infer `final` on A. We were already inferring `sealed` last 
week; the new thing this week is to infer A's `permits` list as empty 
(giving A an overall score of implicitly `final`) rather than "any 
co-declared subtypes of A". Given the code above, it is a compile-time 
error for any class in any compilation unit to attempt to subclass A.


Where this is all headed is that we never infer `sealed` with a 
non-empty `permits`. As you said, no need for `permits none`. It is good 
to draw a clear distinction between 
`sealed`-with-at-least-one-permitted-subtype versus 
`final`-with-zero-subtypes.


Alex


Re: Refinements for sealed types

2019-08-19 Thread Alex Buckley

On 8/19/2019 11:27 AM, Brian Goetz wrote:
So, given all this, we should focus all our ceremony-reduction on the 
case of co-declared sum types.  Which is mostly what I think I was 
suggesting:


  - Infer the permits clause when all the subtypes are co-declared;
  - Infer “final” for leaf classes in a sum type;
  - Require explicitness in both sealed/non-sealed, and permits 
clause, in other cases.


How do you know from `sealed class X {}` and the rest of its 
compilation unit that all X's subtypes are co-declared? 


By co-declared, I mean "in the same compilation unit."


The emphasis should be on the word "all", not "co-declared". How do you 
know that ALL of X's subtypes are declared in the same compilation unit?


Alex


Re: Refinements for sealed types

2019-08-19 Thread Alex Buckley

On 8/18/2019 12:25 PM, Brian Goetz wrote:
So, given all this, we should focus all our ceremony-reduction on the 
case of co-declared sum types.  Which is mostly what I think I was 
suggesting:


  - Infer the permits clause when all the subtypes are co-declared;
  - Infer “final” for leaf classes in a sum type;
  - Require explicitness in both sealed/non-sealed, and permits clause, 
in other cases.


How do you know from `sealed class X {}` and the rest of its compilation 
unit that all X's subtypes are co-declared? Maybe there's another class 
which extends X that someone forgot to pass to javac. Do you really mean 
to determine which subfeature is in use (and hence whether ceremony 
reduction is needed) based on what the host system can observe? I 
wondered if your intention is for a top-level sealed RECORD class to 
indicate sum-types code and a top-level sealed ABSTRACT class to 
indicate restricted-hierarchy code, but you downplayed 
abstract-superclass for restricted hierarchies earlier.


Alex


Re: Draft specification for java.lang.Record

2019-08-15 Thread Alex Buckley

On 8/15/2019 12:18 PM, Brian Goetz wrote:
Cloning: if a record class was to implement Cloneable, then the 
inherited implementation of Object::clone would not preserve copy 
equality (because, yes, cloning is not the same as copying). Recommend 
not implementing Cloneable? 


We have an opportunity to do any of the following:

  - Prohibit cloning by making Record::clone final;
  - Be clone-agnostic by saying nothing;
  - Promote cloning by making Record implement Cloneable, and having an 
implementation of clone() either in Record.java, or having the compiler 
generate it according to the obvious component-copying formula.


j.l.Record's copy-equality property suggests that copying a record 
instance should be done neither by shallow-copy nor by deep-copy. 
Bringing Cloneable's shallow-copy into the picture muddies things up. My 
view isn't worth much here, but prohibition of cloning looks good to me.


Alex


Re: Refinements for sealed types

2019-08-15 Thread Alex Buckley

On 8/15/2019 10:38 AM, Brian Goetz wrote:

     sealed interface X permits A, B { }
     class A implements X { }
     class B implements X { }

In the current design, we took the following path:

  - Subtypes of sealed types are implicitly sealed, unless marked 
non-sealed.
  - We infer a permits clause when it is omitted, which is possibly 
empty (in which case the type is effectively final.)


So, A is implicitly sealed, but (IIRC) its lack of `permits` means that 
any class which is in the same compilation unit as A and which says `... 
extends A` is a permitted subtype.


And, you are saying that it's not reasonable for A's author to have to 
oversee the whole compilation unit all the time, just in case some 
permitted subtype is lurking around with a `non-sealed` modifier that 
lets the X hierarchy be polluted yet further.



So, let me propose a simplification:

  - A concrete subtype A of a sealed type X, which has no permits 
clause, no known subtypes, and is not marked non-sealed, is implicitly 
sealed (with an empty permits clause).


Sounds good -- and where "implicitly sealed (with an empty permits 
clause)" === "implicitly final", right?


  - Any other subtype of a sealed type must either have a "sealed" 
modifier, or a "non-sealed" modifier.

  - Any type with a permits list must have a sealed modifier.


Alex


Re: Draft specification for java.lang.Record

2019-08-15 Thread Alex Buckley
I am reading this javadoc from the POV of someone in 2034 (15 years 
hence, like we are 15 years from Enum) who doesn't know anything about 
Amber.


On 8/15/2019 10:34 AM, Brian Goetz wrote:

/**
  * This is the common base class of all Java language record classes.


I know this borrows from Enum, but "base class" is a terrible un-Java 
phrase there and it's terrible here.  "This is the common superclass of 
all record classes in the Java language."



  *
  * More information about records, including descriptions of the
  * implicitly declared methods synthesized by the compiler, can be
  * found in section 8.10 of
  * The Java™ Language Specification.


Too much too soon; drop.


  *
  * A record class is a shallowly immutable, transparent carrier for
  * a fixed set of values, called the record components.  The Java™
  * language provides concise syntax for declaring record classes, whereby the
  * record components are declared in the record header.  The list of record
  * components declared in the record header form the record descriptor.


Are classes immutable, or objects? I think "shallowly immutable" belongs 
in a paragraph about "records" or "record instances" (not "record 
classes") that is not yet written, but should appear just before "For 
all record classes, ..."



  *
  * A record class has the following mandated members: a public canonical
  * constructor, whose descriptor is the same as the record descriptor;


"A record class has a public canonical constructor, whose signature is 
the same ...;"



  * a private ... field corresponding to each component, whose name and
  * type are the same as that of the component; a public accessor method
  * corresponding to each component, whose name and return type are the same as
  * that of the component.  If not explicitly declared in the body of the record,


Prefer "; and a public _no-args_ accessor method corresponding ...  Any 
or all of these elements may be declared explicitly; if one is not 
declared explicitly, then it is provided implicitly."



  * implicit implementations for these members are provided.
  *
  * The implicit declaration of the canonical constructor initializes the
  * component fields from the corresponding constructor arguments. The implicit
  * declaration of the accessor methods returns the value of the corresponding
  * component field.  The implicit declaration of the {@link Object#equals(Object)},
  * {@link Object#hashCode()}, and {@link Object#toString()} methods are derived
  * from all of the component fields.
  *
  * The primary reasons to provide an explicit declaration for the
  * canonical constructor or accessor methods are to validate constructor
  * arguments, perform defensive copies on mutable components, or normalize groups
  * of components (such as reducing a rational number to lowest terms.)  If any
  * of these are provided explicitly.


Can't grok "mutable components". Do you mean the (mutable) constructor 
arguments which correspond to components?


Prefer ", or cross-check the values of different components." instead of 
normalizing and reducing which again is hard to follow.


Expecting: "An instance of a record class is _shallowly immutable_, 
which means ..."



  *
  * For all record classes, the following invariant must hold: if a record R's
  * components are {@code c1, c2, ... cn}, then if a record instance is copied
  * as follows:


Calling `R` a "record" is obtuse since the invariant is about "record 
classes" and this whole spec outlines a "record class". There should be 
two kinds of entity -- "record class" and "record instance", or "record 
class" and "record" -- not three. Also, "Given a non-null instance `r` 
of record class `R`, whose components are ..., then if `r` is copied in 
the following way:"



  * 
  * R copy = new R(r.c1(), r.c2(), ..., r.cn());
  * 
  * then it must be the case that {@code r.equals(copy)}.


A name like "copy equality" or "by parts equality" or "component 
equality" would be helpful for this property, both for the spec of 
equals and for developers' general knowledge.
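
A minimal sketch of the invariant, assuming a JDK where records are a final feature (16 or later). The record `Pair` and the helper `copyByParts` are hypothetical; the helper is just the `new R(r.c1(), ..., r.cn())` formula written out for a two-component record.

```java
import java.util.List;

class CopyEqualityDemo {
    // Hypothetical record with one immutable and one reference-typed component.
    record Pair(String name, List<Integer> values) {}

    // Copies a record instance by parts: accessors in, canonical constructor out.
    static Pair copyByParts(Pair r) {
        return new Pair(r.name(), r.values());
    }

    public static void main(String[] args) {
        Pair r = new Pair("a", List.of(1, 2));
        Pair copy = copyByParts(r);
        System.out.println(r.equals(copy)); // true: the invariant holds
        System.out.println(r == copy);      // false: a distinct instance
    }
}
```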


Cloning: if a record class was to implement Cloneable, then the 
inherited implementation of Object::clone would not preserve copy 
equality (because, yes, cloning is not the same as copying). Recommend 
not implementing Cloneable?


Alex


Re: Escape Sequences For Managing Whitespace (Preview)

2019-08-13 Thread Alex Buckley
- Title: "Escape Sequences for Line Continuation and White Space 
(Preview)"  (the narrative term is "white space" per the JLS and JEP 
355; the only time the ` ` character after "white" is missing is in the 
name of the grammar production WhiteSpace)


- Goal: "Improve the observability of the space (U+0020) character 
in string literals." -- not sure that's ever been a huge problem, and it 
distracts from the real deal which is retaining white space in text 
blocks. Recommend deletion.


- \040 is introduced as the "space escape sequence". Please don't 
confuse people by making them look in JLS 3.10.6 for a non-existent 
sequence; please reuse how JEP 355 introduced \040.


- In the retaining white space section, the argument is slightly 
mis-ordered. You show the \040\040\040\040 example, then say it's arcane 
(yes) and the \040 escape is perplexing (yes) and that readability could 
be enhanced (yes) but then you double down on \040 by showing it as a 
fence in the `red   \040` example. Better to show the \040\040\040\040 
example, then say "don't worry, you don't need that whole ugly sequence, 
you only need one \040, it's called a fence, look:" then show the `red 
\040` example, THEN say that \040 is arcane and a better escape is needed.


- "Strings that require using backslash as a character can use the \\ 
escape sequence. This is also true at the end of line." -- please say 
that \\ works because Java does not do recursive processing of escape 
sequences -- once \\ has been processed to \, the \ and the following NL 
are NOT further processed to a line terminator. Being explicit about how 
escape processing works will keep us sane as we grow the "escape 
language" whose processing is split across JLS and API.


- General: example code shown in the Motivation should be reused in the 
Description but with the new escape sequences. You use lorem ipsum for a 
concatenated string literal in Motivation, use it again in Description! 
Same for the red green blue example, which is much better than x yy zzz.


- The Alternatives for Line Continuation talk about long string 
literals, then show text block examples. Since \ works in a string 
literal, I was expecting a story which ignores text blocks and talks 
only of improved string literals. Too many things varying at once.


- Reading "Replacing marker sequence (plus newline) with empty string", 
I realized the `...` is another kind of fence -- rather than preventing 
trailing white space from going beyond itself (the definition of a 
fence), it prevents the entire line going beyond itself. Consider saying 
"In a text block, the newline is an implicit fence; a more explicit 
fence can be made not just with \ but with any character sequence, 
e.g. `...` or `$`, which is then replaced along with the 
immediately-following newline."
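
A sketch of three of the behaviours discussed above, assuming a JDK where text blocks and the `\<line-terminator>` escape are final (15 or later). The class and method names are hypothetical. `fenced` uses one `\040` per line as a fence, `continued` uses `\` at end of line as a line continuation, and `literalBackslash` shows that `\\` at end of line yields a backslash followed by a real newline, because the `\` produced from `\\` is not processed again.

```java
class TextBlockEscapesDemo {
    // One \040 at the end of each line keeps the trailing spaces before it.
    static String fenced() {
        return """
            red  \040
            green\040
            blue \040
            """;
    }

    // A backslash before the line terminator joins the two lines.
    static String continued() {
        return """
            one \
            two""";
    }

    // \\ is an escaped backslash; the newline after it is kept as a newline.
    static String literalBackslash() {
        return """
            one \\
            two""";
    }

    public static void main(String[] args) {
        System.out.println(fenced().replace(' ', '.'));
        System.out.println(continued());
        System.out.println(literalBackslash());
    }
}
```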


Alex

On 8/13/2019 5:46 AM, Jim Laskey wrote:

https://bugs.openjdk.java.net/browse/JDK-8227870

Comment back to this list, thank you.

Cheers,

-- Jim





Re: Programmer's Guide To Text Blocks

2019-08-05 Thread Alex Buckley

On 8/5/2019 5:37 AM, Jim Laskey wrote:

http://cr.openjdk.java.net/~jlaskey/Strings/TextBlocksGuide_v8.html


- Please number the guidelines like in the var style guidelines.

- "Guideline: If a string literal fits on a single line" -- A string 
literal CAN ONLY fit on a single line; you mean "If a string fits ..."


- "Guideline: Most text blocks should be indented to align with 
neighbouring Java code."  should come after "Guideline: Avoid aligning 
the opening and closing delimiters"


- "Guideline: Avoid in-line text blocks within complex expressions" -- 
now your readers are wondering why a normal `for` loop is "complex". 
You're going for something about: be cautious using text blocks in 
nested expressions. A `for` loop which pushes the text block down one 
level, into the `for` header, is an example; another example would be 
passing a text block in a method call (which is probably OK); another 
example would be passing a text block in a truly madly deeply nested 
expression (which isn't OK; show one).


- IMO a variable declaration with a text block on the RHS cries out for 
a `var` on the LHS. The """ is as clear a marker as you'll ever get 
about the inferred type of the variable. Is there a reason to not use 
`var` almost everywhere here, and to explicitly recommend that?


- I think there should be a guideline that says it's OK to have \n 
sequences in a text block -- there may be times when that's more 
readable overall than physically introducing a newline.


- "Guideline: It is sometimes reasonable to fully left justify a wide 
string" -- I think it's not merely reasonable, I think it's recommended.


- "... when the closing delimiter is likely to scroll out of view." -- 
when does this happen? There's context here which is not stated in the 
guidelines, especially when an earlier guideline said to put the closing 
delimiter on its own line.


Alex


Re: Draft language spec for JEP 355: Text Blocks

2019-06-03 Thread Alex Buckley

On 5/21/2019 12:45 PM, Alex Buckley wrote:

On 5/21/2019 5:51 AM, Brian Goetz wrote:

As string literals get longer, the cost-benefit of interning gets
worse, and eventually turns negative; it is super-unlikely that two
compilation units will use the same 14-line snippet of JSON (no
benefit), and at the same time, we’re taking up much more space in
the intern table (more cost).

Surely today we’ll use Constant_String_info because that’s the
sensible translation target, and if the same string appears twice in
a single class, it’ll automatically get merged by the constant pool
writer.  But committing forever to interning seems likely to be
something we’ll eventually regret, without buying us very much.  Even
the migration benefit seems questionable.


OK, I have walked back the requirement to intern text blocks in 3.10.6
and 12.5. Spec updated in place
(http://cr.openjdk.java.net/~abuckley/jep355/text-blocks-jls.html), old
version available
(http://cr.openjdk.java.net/~abuckley/jep355/text-blocks-jls-20190520.html).


Thanks to the scrutiny of the CSR process, we realized the need to state 
plainly that all text blocks are constant expressions. And, since text 
blocks are of type String, and all String-typed constant expressions are 
interned, the outcome is that all text blocks must be interned. I have 
updated http://cr.openjdk.java.net/~abuckley/jep355/text-blocks-jls.html 
to reflect this.
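
A small sketch of what that outcome means in practice, assuming a compiler where text blocks are final (Java 15 or later). The class and member names are illustrative only.

```java
class TextBlockInterning {
    // Both initializers are String-typed constant expressions,
    // so both denote the same interned String object.
    static final String BLOCK = """
        Hello world""";
    static final String LITERAL = "Hello world";

    static boolean sameObject() {
        return BLOCK == LITERAL; // identity comparison, not equals()
    }

    public static void main(String[] args) {
        System.out.println(sameObject());
    }
}
```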


Alex


Re: Draft language spec for JEP 355: Text Blocks

2019-05-30 Thread Alex Buckley

On 5/29/2019 9:57 AM, Arthur Neufeld wrote:

String season = """
 winter
 """; // the six characters w i n t e r

Doesn’t “season” actually contain 7 characters?

w i n t e r \n


Good catch, thanks. Yes, seven characters. The final character is a line 
terminator per step 7 of the reindentation algorithm: "If the final line 
in the list from step 6 is empty [because the final line was all white 
space prior to stripping], then the joining LF from the previous line 
will be the last character in the result string."


There were other spec examples which had the closing delimiter on its 
own line, yet forgot to include the final LF in the result string. I 
have corrected the spec @ 
http://cr.openjdk.java.net/~abuckley/jep355/text-blocks-jls.html
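
The effect of the closing delimiter's position can be checked directly. This is a minimal sketch assuming Java 15 or later; the class and method names are made up for illustration.

```java
class SeasonLength {
    // Closing delimiter on its own line: the result ends with a line feed.
    static String withTrailingLf() {
        return """
            winter
            """;
    }

    // Closing delimiter directly after the content: no final line feed.
    static String withoutTrailingLf() {
        return """
            winter""";
    }

    public static void main(String[] args) {
        System.out.println(withTrailingLf().length());
        System.out.println(withoutTrailingLf().length());
    }
}
```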


Alex


Re: Yield as contextual keyword

2019-05-30 Thread Alex Buckley
To be clear, in the new approach, the lexeme `yield` is always tokenized 
as an identifier, and never as a keyword.


Gavin has already changed the MethodName production so that it uses 
UnqualifiedMethodIdentifier rather than Identifier. And since MethodName 
is used by MethodInvocation (15.12), ALL unqualified method invocations 
are now constrained by the "can't call `yield`" policy -- whether an 
invocation is top level (an expression statement) or nested (an 
expression). If you write `f(g(yield(1)))` then you will get a 
compile-time error due to g's argument not parsing as an Expression.
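
As a sketch of the resulting policy, assuming Java 14 or later: a method named `yield` can still be declared, and it can still be invoked as long as the invocation is qualified. The class name and the method body are hypothetical.

```java
class QualifiedYield {
    // Declaring a method called `yield` is still allowed.
    static int yield(int x) {
        return x + 1;
    }

    static int viaQualifiedCall(int x) {
        // An unqualified invocation of yield here would be rejected at compile time;
        // qualifying with the class name is the workaround.
        return QualifiedYield.yield(x);
    }

    static int insideSwitch(int x) {
        return switch (x) {
            // The first `yield` starts a yield statement; the second,
            // being qualified, is an ordinary method invocation.
            case 0 -> { yield QualifiedYield.yield(41); }
            default -> x;
        };
    }

    public static void main(String[] args) {
        System.out.println(viaQualifiedCall(1));
        System.out.println(insideSwitch(0));
    }
}
```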


Alex

On 5/29/2019 9:15 AM, Peter Levart wrote:

Even in expression context, unqualified yield could be tokenized as
keyword (and hence produce a compile-time error). What do we lose? If
it is a field, it can be qualified. If it is a local variable, it only
presents source incompatibility, which can easily be fixed at next
re-compile.

The treatment would be more regular (not dependent on expression vs.
statement context) this way.

Regards, Peter

On 5/29/19 4:21 PM, Gavin Bierman wrote:

Upon reflection, the simplest way out of this is to not go down the
path of trying to identify tokens so that the lexer knows something
about parsing, but rather follow the suggestion made by Dan earlier in
this thread. To wit, we treat `yield` much like we treat `var`. It’s a
"restricted identifier", which means that it can’t be used as a
*TypeIdentifier* nor as a *MethodName*. Thus any unqualified method
invocation needs to be qualified (or in the extreme corner case
involving an anonymous class spotted by Tagir, may need (local)
renaming). Without qualification, `yield (42);` will be *parsed* as a
`yield` statement and not an expression statement. Our corpus
analysis, as reported by Brian, shows this not to be a problem.
Tagir’s analysis of the Idea Ultimate sources suggests the same.

The revised JLS is available at:
http://cr.openjdk.java.net/~gbierman/jep354-jls-20190528.html

Thanks,
Gavin




On 24 May 2019, at 23:44, Alex Buckley wrote:

On 5/24/2019 1:19 PM, Tagir Valeev wrote:

Hello! Answering myself


The first token in a YieldStatement production is always preceded
by one of these separator tokens: ;, {, }, ), or ->.


Seems I'm missing something. Could you please illustrate in which
case YieldStatement could be preceded by ')'?


Never mind. if(foo) yield bar; is a good example. My other points
still apply.


Also what about '->'? In lambda '->' is followed by an expression
or block, but not a statement. In switch '->' is followed by block,
throw or expression plus semicolon. Also could YieldStatement be
preceded by ':' in old switch format? E.g.

System.out.println(switch(0) { default: yield 1; }); // seems
legit


You're right that `->` should not appear in the list. Any `yield`
which follows `->` is necessarily the start of an expression, so
`yield` should be tokenized as an identifier there.

`:` is tricky. On the one hand, the space after `:` is sometimes
desirous of a statement, so tokenize `yield` as a keyword:

- `default : yield (1);` in a switch expression (also `case ... :`)

- `L1 : yield (1);` in a switch expression (labeled statements are
legitimate in a switch-labeled block! If there was no label, we would
quickly say that this `yield` is a YieldStatement not an
ExpressionStatement, and that if you want an ExpressionStatement
which invokes a method, then qualify the invocation.)

On the other hand, the space after `:` is sometimes desirous of an
expression, so tokenize `yield` as an identifier: (and it might be the
name of a local variable, so no way to qualify)

- `for (String s : yield . f) ...`

- `m(a ? yield . f : yield . g)`

Alex






Re: Yield as contextual keyword

2019-05-24 Thread Alex Buckley

On 5/24/2019 1:19 PM, Tagir Valeev wrote:

Hello! Answering myself


The first token in a YieldStatement production is always preceded
by one of these separator tokens: ;, {, }, ), or ->.


Seems I'm missing something. Could you please illustrate in which
case YieldStatement could be preceded by ')'?


Never mind. if(foo) yield bar; is a good example. My other points
still apply.


Also what about '->'? In lambda '->' is followed by an expression
or block, but not a statement. In switch '->' is followed by block,
throw or expression plus semicolon. Also could YieldStatement be
preceded by ':' in old switch format? E.g.

System.out.println(switch(0) { default: yield 1; }); // seems
legit


You're right that `->` should not appear in the list. Any `yield` which 
follows `->` is necessarily the start of an expression, so `yield` 
should be tokenized as an identifier there.


`:` is tricky. On the one hand, the space after `:` is sometimes 
desirous of a statement, so tokenize `yield` as a keyword:


- `default : yield (1);` in a switch expression (also `case ... :`)

- `L1 : yield (1);` in a switch expression (labeled statements are 
legitimate in a switch-labeled block! If there was no label, we would 
quickly say that this `yield` is a YieldStatement not an 
ExpressionStatement, and that if you want an ExpressionStatement which 
invokes a method, then qualify the invocation.)


On the other hand, the space after `:` is sometimes desirous of an 
expression, so tokenize `yield` as an identifier: (and it might be the 
name of a local variable, so no way to qualify)


- `for (String s : yield . f) ...`

- `m(a ? yield . f : yield . g)`
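
The two situations after `:` can be put side by side in one compilable sketch, assuming Java 14 or later. `Holder` and its fields are hypothetical, chosen only to mirror the `yield . f` and `yield . g` shapes in the list above.

```java
class YieldAfterColon {
    // Hypothetical holder type, used only to give the variable `yield` some fields.
    static class Holder {
        String[] f = {"ab", "cde"};
        int g = 7;
        int h = 9;
    }

    // After `default :` a statement is expected, so this is a yield statement.
    static int statementAfterColon(int k) {
        return switch (k) {
            default: yield 1;
        };
    }

    // After the `:` of an enhanced for or a conditional, an expression is
    // expected, so `yield` is just the name of a local variable.
    static int expressionAfterColon(boolean a) {
        Holder yield = new Holder();
        int total = 0;
        for (String s : yield.f) total += s.length();
        return total + (a ? yield.g : yield.h);
    }

    public static void main(String[] args) {
        System.out.println(statementAfterColon(0));
        System.out.println(expressionAfterColon(true));
    }
}
```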

Alex


Re: Yield as contextual keyword

2019-05-23 Thread Alex Buckley

On 5/23/2019 2:29 PM, Dan Smith wrote:

2) Type names: 'yield' might be used as the name of a class, type of
a method parameter, type of a field, array component type, type of a
'final' local variable etc. Or we can prohibit it entirely as a type
name.

We went through this when designing 'var', and settled on the more
restrictive position: you can't declare classes/interfaces/type vars
or make reference to types with name 'var', regardless of context.
That way, there's no risk of confusion between subtly different
programs—wherever you see 'var' used as a type, you know it can only
mean the keyword.

I think it's best to treat 'yield' like 'var' in this case.

3) Method names: 'yield(' at the start of a statement means
YieldStatement, but what about other contexts in which method
invocations can appear?
Taking inspiration from the treatment of type names, my preference
here is to make a blanket restriction that's easy to visualize: an
*unqualified* method invocation must not use the name 'yield'.
Context is irrelevant. The workaround is always to add a qualifier.


This policy is "You can declare a method called `yield`, but you can 
only invoke the method by using qualified invocation syntax." OK, great.


Could the policy in SE 10 have been similar? -- "You can declare a type 
called `var`, but you can only declare a variable at the type by using a 
qualified name." -- `var x = ...` to always indicate LVTI, 
`com.example.api.var x = ...` to still be possible. The need for 
TypeIdentifier to kick `var` out of type names (such as the type name 
used in a LocalVariableDeclarationStatement) would be unnecessary, as 
the rules of 14.4.1 would special-case the `var` identifier like they do 
today.


OTOH, no-one has noticed that types called `var` can't be declared 
anymore, so maybe no-one will notice if types called `yield` can't be 
declared anymore.


Alex


Re: Draft language spec for JEP 355: Text Blocks

2019-05-21 Thread Alex Buckley

On 5/21/2019 5:51 AM, Brian Goetz wrote:

As string literals get longer, the cost-benefit of interning gets
worse, and eventually turns negative; it is super-unlikely that two
compilation units will use the same 14-line snippet of JSON (no
benefit), and at the same time, we’re taking up much more space in
the intern table (more cost).

Surely today we’ll use Constant_String_info because that’s the
sensible translation target, and if the same string appears twice in
a single class, it’ll automatically get merged by the constant pool
writer.  But committing forever to interning seems likely to be
something we’ll eventually regret, without buying us very much.  Even
the migration benefit seems questionable.


OK, I have walked back the requirement to intern text blocks in 3.10.6 
and 12.5. Spec updated in place 
(http://cr.openjdk.java.net/~abuckley/jep355/text-blocks-jls.html), old 
version available 
(http://cr.openjdk.java.net/~abuckley/jep355/text-blocks-jls-20190520.html).


Alex


Re: Draft language spec for JEP 355: Text Blocks

2019-05-21 Thread Alex Buckley

On 5/21/2019 5:30 AM, Jim Laskey wrote:

TextBlock:

" " " { the ASCII SP character } LineTerminator { TextBlockCharacter } " " "

  "the ASCII SP character" in the open delimiter is currently
implemented as any "white space" but not a line terminator. Later on you
state "zero or more white spaces".


Thank you for this clarification. I agree that a text-block-only form of 
"white space" -- spaces, tabs, form feeds -- can legitimately appear 
after the """. I have added a production and narrative to capture this.



The string represented by a text block is /not/ the literal sequence of
characters in the content. Instead, the string represented by a text
block is the result of processing the content, as follows:

I think this could be reworded so that the importance of order is made
clear. Later on you state "Interpreting escape sequences last allows",
but it's still not clear the order of 1 & 2 is important. In the JEP we
described them as "steps". Stages might work as well.


A numbered list in the JLS traditionally means in-order processing, but 
for the avoidance of doubt I have said "... the result of applying the 
following transformations to the content, in order:"
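
Why the order matters can be seen with an escaped space. In this sketch, which assumes Java 15 or later and uses made-up names, the octal escape `\040` is still the four characters backslash, 0, 4, 0 when incidental white space is stripped, so it does not count as indentation and survives into the result.

```java
class EscapesLast {
    static String indented() {
        // Indentation is stripped first; only afterwards does each \040
        // become a space, so the second line keeps two leading spaces.
        return """
            a
            \040\040b
            """;
    }

    public static void main(String[] args) {
        // Show spaces as dots to make the preserved indentation visible.
        System.out.print(indented().replace(' ', '.'));
    }
}
```

Had escapes been interpreted before stripping, the two spaces would have been treated as part of the line's indentation rather than as content.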


Alex


Re: Draft language spec for JEP 355: Text Blocks

2019-05-20 Thread Alex Buckley

We already know the migration incompatibility of how:

"SELECT ..." +
"FROM ..." +
"WHERE ..."

is not ever equals() to:

"""
SELECT ...
FROM ...
WHERE ..."""

because of the extra line terminators in the string derived from the 
text block. There will be a further migration incompatibility if:


"""
Hello world"""

is not always == to:

"Hello world"

because of the lack of guaranteed string interning. Are you saying that 
the freedom to compile text blocks as dynamically-computed constants 
(rather than as static constants; see JVMS12 5.1) is more important than 
the space savings and identity guarantees from interning? I understand 
that starting off loose allows tightening later, but the loose behavior 
is significant.
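
The first incompatibility is easy to reproduce. A minimal sketch, assuming Java 15 or later and illustrative names; the `...` inside the strings is literal text carried over from the example above, not elided code.

```java
class SqlMigration {
    static String concatenated() {
        return "SELECT ..." +
               "FROM ..." +
               "WHERE ...";
    }

    static String block() {
        return """
            SELECT ...
            FROM ...
            WHERE ...""";
    }

    public static void main(String[] args) {
        // The text block carries two line feeds that the concatenation lacks.
        System.out.println(concatenated().equals(block()));
        System.out.println(concatenated().equals(block().replace("\n", "")));
    }
}
```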


Alex

On 5/20/2019 4:46 PM, Brian Goetz wrote:

I wonder if we want to be cagey about committing to interning, which
is another way to say we must translate to a constant string info.
In the future, alternate condy- based representations may seem
desirable and we don’t want to be painted into a translation by
overspecification.



Sent from my iPad


On May 20, 2019, at 7:08 PM, Alex Buckley 
wrote:

Please see
http://cr.openjdk.java.net/~abuckley/jep355/text-blocks-jls.html
for JLS changes that align with the JEP.

Text blocks compile to the same class file construct as string
literals, namely CONSTANT_String_info entries in the constant pool.
Helpfully, the JVMS is already agnostic about the origin of a
CONSTANT_String_info, making no reference to "string literals".
Therefore, there are no JVMS changes for text blocks, save for a
tiny clarification w.r.t. annotation elements.

Alex




Draft language spec for JEP 355: Text Blocks

2019-05-20 Thread Alex Buckley
Please see 
http://cr.openjdk.java.net/~abuckley/jep355/text-blocks-jls.html for JLS 
changes that align with the JEP.


Text blocks compile to the same class file construct as string literals, 
namely CONSTANT_String_info entries in the constant pool. Helpfully, the 
JVMS is already agnostic about the origin of a CONSTANT_String_info, 
making no reference to "string literals". Therefore, there are no JVMS 
changes for text blocks, save for a tiny clarification w.r.t. annotation 
elements.


Alex


Re: Call for bikeshed -- break replacement in expression switch

2019-05-17 Thread Alex Buckley

On 5/17/2019 3:13 PM, John Rose wrote:

On May 17, 2019, at 2:56 PM, Alex Buckley 
wrote:


So, recognizing a hyphenated contextual keyword `yield-value` would
still require careful reasoning about context, about as much as
we're doing to recognize a unitary contextual keyword `yield`.


Much less so than the rules either Brian or I sketched. It’s a
statement, not an expression. And no expression statement begins with
ID - ID right? It’s not as ambiguous as ID (.


A single-expression lambda body has the flavor of a statement form, 
though yes, it has no `;` of its own and is not parsed as 
ExpressionStatement.


  `map(y -> yield-value +y);`

So, I agree that parsing `yield-value (1);` as a YieldStatement in a 
SwitchLabeledBlock does not have the ambiguity of parsing `yield (1);` 
as a YieldStatement|ExpressionStatement in a SwitchLabeledBlock ... but 
the decision still has to be taken about whether "in a 
SwitchLabeledBlock" or "in " is the proper context to 
recognize something new.


Alex


Re: Call for bikeshed -- break replacement in expression switch

2019-05-17 Thread Alex Buckley
Correction: `yield-value` is a hyphenated keyword. Specifically, a 
hyphenated contextual keyword, where each term is itself a unitary 
contextual keyword. This is discussed, with examples, in the JEP 
(https://openjdk.java.net/jeps/8223002).


Introducing `yield-value` as a hyphenated contextual keyword doesn't buy 
you much. Both `yield` and `value` would tokenize as identifiers 
everywhere, so that you can keep on subtracting your `value` variable 
and the result of your `value` method:


  `int x = yield-value +y;`
  `int x = yield -value(x);`

So, recognizing a hyphenated contextual keyword `yield-value` would 
still require careful reasoning about context, about as much as we're 
doing to recognize a unitary contextual keyword `yield`.
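
For reference, both expressions are ordinary subtractions in code that compiles today. A sketch assuming Java 14 or later, with hypothetical method names:

```java
class HyphenIsMinus {
    // A method that happens to be called `value`.
    static int value(int x) {
        return x * 2;
    }

    // `yield` and `value` are plain parameter names here,
    // so `yield-value +y` is (yield - value) + y.
    static int variables(int yield, int value, int y) {
        return yield-value +y;
    }

    // Here `value(x)` is a method invocation being subtracted from `yield`.
    static int methodCall(int yield, int x) {
        return yield -value(x);
    }

    public static void main(String[] args) {
        System.out.println(variables(10, 3, 1));
        System.out.println(methodCall(10, 2));
    }
}
```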


Alex

On 5/17/2019 8:40 AM, Remi Forax wrote:

Hi Manoj,
yield-value is not a hyphenated keyword; either the left part or the
right part has to be an existing keyword.

Remi

On May 17, 2019 2:55:14 PM UTC, Manoj Palat  wrote:

Hi,
I have a few points regarding this – since there was a flurry of
mails last night/day, I have given references below to specific
threads below:

-As Maurizio pointed out in

https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-May/001334.html,
“yield” is not really a _reserved_type_identifier_ like “var” –
“var” is correct only at places (at some places actually) where a
type can occur-
*Our view point*: At parsing time “var” is just taken as a type and
hence from a compiler implementation point of view, “var” is less of
a challenge than the proposed “yield”. If “yield” value is used
instead of “break” value, then again, the compiler needs to
disambiguate – the disambiguation problem just manifests in a
different avatar.

-Alex, in the discussion here

https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-May/001338.html has
pointed out that “The parsing of a `(` token has triggered
potentially unbounded lookahead for some time [1][2], and everything
worked out, so I don't see why the language should disallow any of
John's examples” where the reference [1] is “[1] See slides 9-11
from

https://www.eclipsecon.org/na2014/session/jdt-embraces-lambda-expressions.html”

*Our View point: *However, though the problem was resolved finally
for lambda, additions of new context sensitive keywords would make
our parsing more complicated with additional logic in lookaheads.
Although the problem was solved from a pure compiler perspective, we
are far from winning the battle as an IDE where one major value add
is code completion, which works on incomplete code. Due to these
hacks, code completion for lambdas still has unresolved issues for us.

- An additional input to this discussion is the proposal for
hyphenated keywords as described in
https://openjdk.java.net/jeps/8223002. “break-with” which was the
earlier proposed one, was one among these hyphenated keywords.
*Our View point: *We are fine with that as mentioned in the mailing
list sometime earlier in the context of switch expressions and
break-with, the hyphenated keyword. The more context-sensitive
keywords are introduced, the more hacks are needed, and the harder it
becomes to sustain and scale the Eclipse IDE.
- Based on the above, I believe “break-with” was a better candidate
with less or no disambiguation and it goes along with the future
direction of keywords. Here the assumption is break-with is not
context sensitive at any point in time. Given that “break-with” had
opposition, and “yield” was more popular candidate, planning to
reply with a new suggestion of hyphenated keyword “*yield-value*” or
any other hyphenated keyword.

Regards,
Manoj.
Eclipse Java Dev.


Re: Call for bikeshed -- break replacement in expression switch

2019-05-16 Thread Alex Buckley

On 5/16/2019 2:05 PM, Maurizio Cimadamore wrote:

There are other contexts in which we limit what can be done w/r/t/
parenthesized expressions (since these are ambiguous with cast to
generic types). So this looks like another case where the grammar has to
say - sorry no parens here.


If you're proposing to disallow a cast expression or a parenthesized 
expression after a `yield` token, then I think that's not right. The 
parsing of a `(` token has triggered potentially unbounded lookahead for 
some time [1][2], and everything worked out, so I don't see why the 
language should disallow any of John's examples:


yield (String)("answer is "+x);
yield ("answer is "+x).trim();
yield new String[]{ "answer is "+x }[0];
yield Arrays.asList("answer is "+x).get(0);
yield false ? 0 : ("answer is "+x).trim();

Alex

[1] See slides 9-11 from 
https://www.eclipsecon.org/na2014/session/jdt-embraces-lambda-expressions.html


[2] JLS 15.27 on the choice of `(...)` for lambda parameters :

The syntax has some parsing challenges. The Java programming language 
has always required arbitrary lookahead to distinguish between types and 
expressions after a '(' token: what follows may be a cast or a 
parenthesized expression. This was made worse when generics reused the 
binary operators '<' and '>' in types. Lambda expressions introduce a 
new possibility: the tokens following '(' may describe a type, an 
expression, or a lambda parameter list. Some tokens immediately indicate 
a parameter list (annotations, final); in other cases there are certain 
patterns that must be interpreted as parameter lists (two names in a 
row, a ',' not nested inside of '<' and '>'); and sometimes, the 
decision cannot be made until a '->' is encountered after a ')'. The 
simplest way to think of how this might be efficiently parsed is with a 
state machine: each state represents a subset of possible 
interpretations (type, expression, or parameters), and when the machine 
transitions to a state in which the set is a singleton, the parser knows 
which case it is. This does not map very elegantly to a fixed-lookahead 
grammar, however.


Re: Call for bikeshed -- break replacement in expression switch

2019-05-16 Thread Alex Buckley

On 5/16/2019 8:24 AM, Brian Goetz wrote:

We’ve probably pretty much explored the options at this point;  time to
converge around one of the choices...


I am very happy with `yield` as the new construct for concluding the 
evaluation of a switch expression and leaving a value on the stack for 
consumption within the method.


I think a statement form for the new construct is ideal. The purpose of 
the new construct is to complete abruptly in an attempt to transfer 
control back to the switch expression, which then completes normally 
with a value. Abrupt completion and an attempt to transfer control are 
the hallmarks of `break`, `continue`, and `return`; having `yield` as 
the junior member of that club is quite natural. Putting the junior and 
senior members side by side shows both similarity and difference:


-
A `yield` statement attempts to transfer control to the innermost 
enclosing switch expression; this expression ... then immediately 
completes normally and the value of the _Expression_ becomes the value 
of the switch expression.


A `return` statement attempts to transfer control to the invoker of the 
innermost enclosing constructor, method, or lambda expression ... In the 
case of a return statement with value _Expression_, the value of the 
_Expression_ becomes the value of the invocation.

-

Note that the aspect of _attempting_ to transfer control applies to 
`yield` just as much as to `break`, `continue`, and `return`. Below, the 
`finally` block "intercepts" the transfer of control started by `yield`. 
The `finally` block then completes normally, so the transfer of control 
proceeds and the switch expression completes normally, leaving 5 or 6 on 
the stack.


```
int result = switch (x) {
    case 0 -> {
        try {
            ...
            if (...) yield 5;
            ...
            yield 6;
        }
        finally {
            cleanUp();
        }
    }

    default -> 42;
};
```
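
A runnable variant of the same shape, as a sketch assuming Java 14 or later. The `LOG` buffer and the `early` flag are additions for illustration; they stand in for the elided parts so that the order of events can be observed.

```java
class YieldFinally {
    // Records the order of events so the effect of the finally block is visible.
    static final StringBuilder LOG = new StringBuilder();

    static void cleanUp() {
        LOG.append("cleanUp;");
    }

    static int compute(int x, boolean early) {
        int result = switch (x) {
            case 0 -> {
                try {
                    if (early) yield 5;
                    yield 6;
                } finally {
                    // Runs after yield starts the transfer of control,
                    // and before the switch expression completes.
                    cleanUp();
                }
            }
            default -> 42;
        };
        LOG.append("result=").append(result).append(';');
        return result;
    }

    public static void main(String[] args) {
        compute(0, true);
        compute(0, false);
        compute(1, false);
        System.out.println(LOG);
    }
}
```

In the log, each `cleanUp;` entry for `case 0` appears before the corresponding `result=` entry, and the `default` arm produces no `cleanUp;` entry at all.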

Abrupt completion and transfer of control are not the hallmarks of 
operators. The purpose of an operator is to indicate the kind of 
expression to be evaluated (numeric addition, method invocation, etc), 
so an operator-like syntax such as `^` would suggest the imminent 
evaluation of a NEW expression. However, we are ALREADY in the process 
of evaluating a switch expression; in fact we would like to finish it up 
by transferring control from the {...} block (which has been happily 
executing statements sequentially) to the switch expression itself (so 
it can complete normally). So, I think an operator-like syntax is 
inappropriate.


Alex


Re: RFR: Multi-line String Literal (Preview) JEP [EG Draft]

2019-05-15 Thread Alex Buckley

On 5/15/2019 10:17 AM, Dan Smith wrote:

I think this:

~~~
String code = """
   public void print(""" + type + """
o) {
   System.out.println(Objects.toString(o));
   }
   """;
~~~

should be presented like this:

~~~
String code = """
   public void print(""" +
   type +
   """
o) {
   System.out.println(Objects.toString(o));
   }
   """;
~~~

It's not great, and replace/format is the "right" solution, but if
somebody wants to do concatenation, this style does a better job of
indicating where the indent prefix ends and the content begins. The
delimiter gives a visual indication of where the "block" is located.


I appreciate that you want to position an opening delimiter to the left 
of its content, but can you say why you want `type +` on its own line? 
What's the big deal with `...""" + type +\n` and then the next text 
block? (You don't seem to object to the closing delimiter sharing a line 
with content, since you have ` + ` after the first closing delimiter.)


Alex


Re: String reboot - (1a) incidental whitespace

2019-04-22 Thread Alex Buckley

On 4/22/2019 12:16 PM, Guy Steele wrote:

On Apr 22, 2019, at 3:04 PM, Alex Buckley 
wrote:

Nope, I don't think multi-line string literals are an attractive
nuisance in any way. We should NOT deem it incorrect to refactor a
sequence of concatenations into a single multi-line string
literal.


I didn’t say (or mean to imply that).  I think it’s a great thing to
refactor concatenations into a single multi-line string literal WHEN
IT IS DONE CORRECTLY.

However, if you blindly pull out the concatenations and thereby
introduce newlines into the string when they were not there before
and doing so violates some contract downstream, THAT IS AN INCORRECT
TRANSFORMATION.


Literally, yes, it's an incorrect transformation for the caller to 
perform if it violates the contract offered by the callee.



We certainly agree that it would be a good thing if everything that
might be downstream were in fact reasonably tolerant of newlines.


Yes.


BUT IF YOU DON’T KNOW FOR SURE THAT WHAT IS DOWNSTREAM IS TOLERANT OF
NEWLINES, AND YOU BLINDLY TRANSFORM A STRING CONCATENATION INTO A
MULTI-LINE STRING LITERAL THAT INCLUDES NEWLINES WHERE THERE WERE
NONE BEFORE, THAT IS A BAD THING.


If the callee's contract says "No newlines in the string argument to 
Customer::setName", then the caller would be doing a bad thing.
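
To picture such a contract, here is a purely hypothetical sketch: no real `Customer` class is implied, the rejection rule is an assumption, and text blocks require Java 15 or later.

```java
class StrictCustomer {
    private String name;

    // Hypothetical contract: the name must not contain line terminators.
    void setName(String name) {
        if (name.contains("\n") || name.contains("\r")) {
            throw new IllegalArgumentException("No newlines in the name");
        }
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Returns true if the callee accepted the argument.
    static boolean accepts(String candidate) {
        try {
            new StrictCustomer().setName(candidate);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    static String fromConcatenation() {
        return "Augusta Ada King, " +
               "Countess of Lovelace";
    }

    static String fromTextBlock() {
        return """
            Augusta Ada King,
            Countess of Lovelace""";
    }

    public static void main(String[] args) {
        System.out.println(accepts(fromConcatenation()));
        System.out.println(accepts(fromTextBlock()));
    }
}
```

The caller that blindly swaps the concatenation for the text block is the one that breaks the contract here, which is the situation being described.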


But the reason this topic is interesting(ish) is because we're dealing 
with something that the vast majority of callees never thought to specify.


(Well, maybe not "never". I browsed the Java SE API Specification to 
find a method that takes a String, and randomly clicked on something in 
JNDI -- 
https://docs.oracle.com/en/java/javase/12/docs/api/java.naming/javax/naming/Name.html#add(java.lang.String) 
-- which happens to be strict about the string passed to it, so perhaps 
someone is about to get an InvalidNameException when they try to lay out 
a long LDAP query string over multiple lines.)



And a feature that makes it too easy to accidentally do a bad thing
_might_ be considered an attractive nuisance, AS OPPOSED TO MY
SCREAMING ALL-CAPS, WHICH ARE A REPULSIVE NUISANCE.


I can get 90% of the way to saying "OK, multi-line string literals 
_might_ be considered an attractive nuisance", but I can't get 100% of 
the way there because it's such a callee-centric view to take when the 
purpose of the feature is to simplify the life of the caller. If you 
crack open the door to give callees a hearing, you'll get requests to 
statically reject multi-line string literals (such as via a java.* 
annotation that programmatically indicates "not multi-line safe", or a 
java.lang.MultilineString type that's a sibling of String) and we don't 
want to go anywhere near there.


(I recall a library that took Runnable or somesuch, and fell over when 
the argument was a lambda expression; the library expected an anonymous 
inner class instance in order to do some peculiar introspection, which 
failed on the opaque object reifying a lambda expression. The library 
developer _might_ have considered lambda expressions an attractive 
nuisance for a few minutes, but who would have sympathy?)


Alex


Re: String reboot - (1a) incidental whitespace

2019-04-22 Thread Alex Buckley
Nope, I don't think multi-line string literals are an attractive 
nuisance in any way. We should NOT deem it incorrect to refactor a 
sequence of concatenations into a single multi-line string literal. 
Developers are chomping at the bit to do it, and if we cast doubt on the 
ability then we're wasting everyone's time. We should deem it correct, 
and 99% of the time no-one will care that newline characters exist in 
the string. The rare library that subtly misbehaves or (and this is the 
better option) actually blows up when seeing newlines will feel great 
pressure to become more liberal in what it accepts, and that is a good 
thing.


Alex

On 4/19/2019 7:42 PM, Guy Steele wrote:

So is your point that multiline string literals may be an “attractive nuisance” 
in that they may make it too convenient for inattentive programmers to perform 
_incorrect_ refactoring?



On Apr 19, 2019, at 8:16 PM, Alex Buckley  wrote:


On 4/10/2019 8:22 AM, Jim Laskey wrote:
Line terminators:  When strings span lines, they do so using the line
terminators present in the source file, which may vary depending on what
operating system the file was authored on.  Should this be an aspect of
multi-line-ness, or should we normalize these to a standard line
terminator?  It seems a little weird to treat string literals quite so
literally; the choice of line terminator is surely an incidental one.  I
think we're all comfortable saying "these should be normalized", but it's
worth bringing this up because it is merely one way in which incidental
artifacts of how the string is embedded in the source program force us
to interpret what the user meant.


No-one has commented on this, but it's important because some libraries are 
going to be surprised by the presence of line terminators, of any kind, in 
strings denoted by multi-line string literals.

To be clear, I agree with normalizing line terminators. And, I understand that 
any string could have contained line terminators thanks to escape sequences in 
traditional string literals. But, it was not common to see a \n except where 
multi-line-ness was expected or harmless. Going forward, who can guarantee that 
refactoring the argument of `prepareStatement` from a sequence of 
concatenations:

  try (PreparedStatement s = connection.prepareStatement(
  "SELECT * "
+ "FROM my_table "
+ "WHERE a = b "
  )) {
  ...
  }

to a multi-line string literal:

  try (PreparedStatement s = connection.prepareStatement(
  """SELECT *
 FROM my_table
 WHERE a = b"""
  )) {
  ...
  }

is behaviorally compatible for `prepareStatement`? It had no reason to expect 
\n in its string argument before.

(Hat tip: 
https://blog.jooq.org/2015/12/29/please-java-do-finally-support-multiline-strings/)

Maybe `prepareStatement` will work fine. But someone somewhere is going to take a program 
with a sequence of 2000 concatenations and turn them into a huge multi-line string 
literal, and the inserted line terminators are going to cause memory pressure, and GC is 
going to take a little longer, and eventually this bug will be filed: "My system 
runs 5% slower because the source code changed a teeny tiny bit."

In reality, a few libraries will need fixing, and that will happen quickly 
because developers are very keen to use multi-line string literals. But it's 
fair to point out that while everyone is worrying about whitespace on the left 
of the literal, the line terminators to the right are a novel artifact too.

Alex




Re: To align, or not to align?

2019-04-19 Thread Alex Buckley

On 4/18/2019 11:32 AM, Brian Goetz wrote:

One view is that a string literal is the sequence of characters between
the delimiters, and a multi-line string literal is just a string literal
that happens to be able to span lines.  This is also the simplest
extension of existing string literals to multi-line; adding only the
ability to span lines.   In this view, implicit alignment can feel like
conflating two things.

An alternate view is that a multi-line string is a literal that is
embedded spatially in the Java source code; therefore it inherently has
some 2D structure to it, which gives us permission to muck with it in
certain ways that are consistent with that structure.

...

So I think the question really comes down to: what _is_ a multi-line
string literal.


I have a lot of time for the "alternate" view. Multi-line string literals 
are not meant to be raw; some inference about the developer's intent for 
the sea of whitespace on the left is fine (such as, "the developer is 
not interested in it at all").


I do, however, think that a box-of-quotes (or even a lighterweight 
marker for margins) makes the 2D denotation of a string overwhelming.


Alex


Re: String reboot - (1a) incidental whitespace

2019-04-19 Thread Alex Buckley

On 4/10/2019 8:22 AM, Jim Laskey wrote:

Line terminators:  When strings span lines, they do so using the line
terminators present in the source file, which may vary depending on what
operating system the file was authored.  Should this be an aspect of
multi-line-ness, or should we normalize these to a standard line
terminator?  It seems a little weird to treat string literals quite so
literally; the choice of line terminator is surely an incidental one.  I
think we're all comfortable saying "these should be normalized", but it's
worth bringing this up because it is merely one way in which incidental
artifacts of how the string is embedded in the source program force us
to interpret what the user meant.


No-one has commented on this, but it's important because some libraries 
are going to be surprised by the presence of line terminators, of any 
kind, in strings denoted by multi-line string literals.


To be clear, I agree with normalizing line terminators. And, I 
understand that any string could have contained line terminators thanks 
to escape sequences in traditional string literals. But, it was not 
common to see a \n except where multi-line-ness was expected or 
harmless. Going forward, who can guarantee that refactoring the argument 
of `prepareStatement` from a sequence of concatenations:


  try (PreparedStatement s = connection.prepareStatement(
  "SELECT * "
+ "FROM my_table "
+ "WHERE a = b "
  )) {
  ...
  }

to a multi-line string literal:

  try (PreparedStatement s = connection.prepareStatement(
  """SELECT *
 FROM my_table
 WHERE a = b"""
  )) {
  ...
  }

is behaviorally compatible for `prepareStatement`? It had no reason to 
expect \n in its string argument before.
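
A minimal sketch of the difference, assuming the text block syntax that later shipped in Java 15 (where content starts on the line after the opening delimiter, unlike the draft syntax shown above). `NewlineDemo` and `countNewlines` are hypothetical names used only for illustration:

```java
class NewlineDemo {
    // Hypothetical helper: counts '\n' characters in a string.
    static long countNewlines(String s) {
        return s.chars().filter(c -> c == '\n').count();
    }

    // Traditional form: three literals joined with '+', no line terminators.
    static String concatenated() {
        return "SELECT * "
             + "FROM my_table "
             + "WHERE a = b ";
    }

    // Refactored form: a line terminator separates each source line.
    static String multiLine() {
        return """
            SELECT *
            FROM my_table
            WHERE a = b""";
    }

    public static void main(String[] args) {
        System.out.println(countNewlines(concatenated()));      // 0
        System.out.println(countNewlines(multiLine()));         // 2
        System.out.println(concatenated().equals(multiLine())); // false
    }
}
```

The two strings denote the same SQL to a human reader, but a library that compares, hashes, or tokenizes its argument sees different input.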


(Hat tip: 
https://blog.jooq.org/2015/12/29/please-java-do-finally-support-multiline-strings/)


Maybe `prepareStatement` will work fine. But someone somewhere is going 
to take a program with a sequence of 2000 concatenations and turn them 
into a huge multi-line string literal, and the inserted line terminators 
are going to cause memory pressure, and GC is going to take a little 
longer, and eventually this bug will be filed: "My system runs 5% slower 
because the source code changed a teeny tiny bit."


In reality, a few libraries will need fixing, and that will happen 
quickly because developers are very keen to use multi-line string 
literals. But it's fair to point out that while everyone is worrying 
about whitespace on the left of the literal, the line terminators to the 
right are a novel artifact too.


Alex


Re: String reboot (plain text)

2019-04-05 Thread Alex Buckley

On 4/5/2019 7:15 AM, Jim Laskey wrote:

Following example works as expected:

public class Test {
    public static void main(String... args) {
        String result = """
            public class Main {
                public static void main(String... args) {
                    System.out.println("Hello World!");
                }
            }
            """.align();
        System.out.println(result);
    }
}

Empty string is both "" and "".  Escape sequences and unicode escapes are 
always translated.


As someone who was nervous about how raw string literals effectively 
sidelined Unicode, I'm pleased that \u escapes are back. It's also 
great that the traditional escape sequence \" will be interpreted as a 
single " like it would be in a traditional string literal. Because, as 
we all know, the code above started life as this painful noisy code:


String result = "public class Main {\n" +
                "  public static void main(String... args) {\n" +
                "    System.out.println(\"Hello World!\");\n" +
                "  }\n" +
                "}\n";

Now a developer can move forward in steps: today remove all the 
end-of-line cruft involving \n and + that multi-line strings do for 
free, and don't worry about mid-line escape sequences such as \" -- 
convert them to " tomorrow, or next week, or not at all, your choice.
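
A sketch of that stepwise migration, assuming the text block syntax as finalized in Java 15 (where incidental indentation is stripped automatically rather than through the `.align()` call in the draft above). The class and method names are hypothetical:

```java
class EscapeDemo {
    // The original form, with \n and + on every line.
    static String concatenated() {
        return "public class Main {\n" +
               "  public static void main(String... args) {\n" +
               "    System.out.println(\"Hello World!\");\n" +
               "  }\n" +
               "}\n";
    }

    // Step one: drop the \n and + cruft, leave the \" escapes untouched.
    static String withEscapes() {
        return """
            public class Main {
              public static void main(String... args) {
                System.out.println(\"Hello World!\");
              }
            }
            """;
    }

    // Step two (optional): replace \" with a plain ".
    static String withoutEscapes() {
        return """
            public class Main {
              public static void main(String... args) {
                System.out.println("Hello World!");
              }
            }
            """;
    }

    public static void main(String[] args) {
        System.out.println(concatenated().equals(withEscapes()));    // true
        System.out.println(withEscapes().equals(withoutEscapes()));  // true
    }
}
```

All three methods return the same string, so each step can be taken independently.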


Alex


Re: Switch expressions spec

2019-03-21 Thread Alex Buckley

Leonid, thanks for checking JCK12, and for looking ahead to JCK13.

Alex

On 3/21/2019 8:25 AM, Leonid Arbuzov wrote:

Hi Manoj,

Thanks for the example.
JCK12 doesn't have this particular test case; it can be added in the
next release.

Regards,
-leonid

On 3/20/2019 8:07 PM, Manoj Palat wrote:


Hi Leonid,
The original query was based on this case:


Consider the following code:
public class X {
    @SuppressWarnings("preview")
    public static int foo(int i) throws MyException {
        int v = switch (i) {
            default -> throw new MyException(); // no error?
        };
        return v;
    }
    public static void main(String argv[]) {
        try {
            System.out.println(X.foo(1));
        } catch (MyException e) {
            System.out.println("Exception thrown as expected");
        }
    }
}
class MyException extends Exception {
    private static final long serialVersionUID = 3461899582505930473L;
}

As per spec, JLS 15.28.1

It is a compile-time error if a switch expression has no result
expressions.
but javac (12) does not flag an error.

Regards,
Manoj



From: Leonid Arbuzov 
To: Alex Buckley , amber-spec-experts

Cc: Stephan Herrmann 
Date: 03/21/2019 06:35 AM
Subject: Re: Switch expressions spec
Sent by: "amber-spec-experts"






Hi Alex,

There are negative tests with missed result expressions:

int a = switch (selectorExpr) {
   case 0 -> 1;
   case 1 -> 1;
   default -> ;
   };
int a = switch (selectorExpr) {
   case 0 -> { break 1; }
   case 1 -> { break 1; }
   default -> { fun(); }
   };

If you meant no result expressions at all then I couldn't find such
test yet.
It can be added in JCK13.

Thanks,
Leonid

On 3/19/2019 5:28 PM, Alex Buckley wrote:

Hi Leonid,

So there are no negative tests that check what happens if a
switch expression has no result expressions?

Alex

On 3/19/2019 5:24 PM, leonid.arbouzov@oracle.com wrote:
There are tests on switch expressions with cases that throw exceptions,
cause division by zero, index out of range, etc.
These are all positive tests, i.e. they compile fine.

Thanks,
    Leonid


On 3/15/19 1:20 PM, Alex Buckley wrote:
OK, we intend at least one result expression to be required, so the
spec is correct as is.

(I should have been clearer that my belief was about the intent of the
spec, rather than about how I personally think completion should occur.)

Manoj didn't say what javac build he is testing with, but this is a
substantial discrepancy between compiler and spec. I hope that Leonid
Arbouzov (cc'd) can tell us what conformance tests exist in this area.

Alex

On 3/15/2019 12:09 PM, Brian Goetz wrote:
At the same time, we also reaffirmed our choice to _not_ allow throw
from one half of a conditional:

 int x = foo ? 3 : throw new FooException()

But John has this right -- the high order bit is that every expression
should have a defined normal completion, and a type, even if
computing sub-expressions (or in this case, sub-statements) might
throw.  And without at least one arm yielding a value, it would be
impossible to infer the type of the expression.

On Mar 15, 2019, at 3:01 PM, John Rose <john.r.r...@oracle.com> wrote:

On Mar 15, 2019, at 11:39 AM, Alex Buckley <alex.buck...@oracle.com>
  

Re: Switch expressions spec

2019-03-19 Thread Alex Buckley

Hi Leonid,

So there are no negative tests that check what happens if a switch 
expression has no result expressions?


Alex

On 3/19/2019 5:24 PM, leonid.arbou...@oracle.com wrote:

There are tests on switch expressions with cases that throw exceptions,
cause division by zero, index out of range, etc.
These are all positive tests, i.e. they compile fine.

Thanks,
Leonid


On 3/15/19 1:20 PM, Alex Buckley wrote:

OK, we intend at least one result expression to be required, so the
spec is correct as is.

(I should have been clearer that my belief was about the intent of the
spec, rather than about how I personally think completion should occur.)

Manoj didn't say what javac build he is testing with, but this is a
substantial discrepancy between compiler and spec. I hope that Leonid
Arbouzov (cc'd) can tell us what conformance tests exist in this area.

Alex

On 3/15/2019 12:09 PM, Brian Goetz wrote:

At the same time, we also reaffirmed our choice to _not_ allow throw
from one half of a conditional:

 int x = foo ? 3 : throw new FooException()

But John has this right — the high order bit is that every expression
should have a defined normal completion, and a type, even if
computing sub-expressions (or in this case, sub-statements) might
throw.  And without at least one arm yielding a value, it would be
impossible to infer the type of the expression.


On Mar 15, 2019, at 3:01 PM, John Rose  wrote:

On Mar 15, 2019, at 11:39 AM, Alex Buckley 
wrote:


In a switch expression, I believe it should be legal for every
`case`/`default` arm to complete abruptly _for a reason other than
a break with value_.


My reading of Gavin's draft is that he is doing something very
subtle there, which is to retain an existing feature in the language
that an expression always has a defined normal completion.

We also don't have expressions of the form "throw e". Allowing
a switch expression to complete without a value on *every* arm
raises the same question as "throw e" as an expression.  How do
you type "f(throw e)"?  If you can answer that, then you can also
have switch expressions that refuse to break with any values.

BTW, if an expression has a defined normal completion, it also
has a possible type.  By possible type I mean at least one correct
typing (poly-expressions can have many).  So one obvious
result of Gavin's draft is that you derive possible types from
the arms of the switch expression that break with values.

But the root requirement, I think, is to preserve the possible
normal completion of every expression.

"What about some form of 1/0?"  That's a good question.
What about it?  It completes normally with a type of int.
Dynamically, the normal completion is never taken.
Gavin might call that a "notional normal completion"
(I like that word) provided to uphold the general principle
even where static analysis proves that the Turing machine
fails to return normally.

— John






Re: Switch expressions spec

2019-03-15 Thread Alex Buckley
OK, we intend at least one result expression to be required, so the spec 
is correct as is.
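
A minimal sketch of a switch expression that satisfies this requirement, written against the syntax as finalized in Java 14, which uses `yield` where the draft under discussion used a value `break`. `SwitchResultDemo` and `classify` are hypothetical names:

```java
class SwitchResultDemo {
    // Two arms produce a value, so the switch expression has result
    // expressions and a type (int); the remaining arm may throw.
    static int classify(int i) {
        return switch (i) {
            case 0 -> 10;                 // switch labeled expression
            case 1 -> { yield 20; }       // switch labeled block yielding a value
            default -> throw new IllegalArgumentException("unexpected: " + i);
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(0)); // 10
        System.out.println(classify(1)); // 20
    }
}
```

Removing the first two arms would leave only the throwing `default`, which is the shape Manoj's test case exercises.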


(I should have been clearer that my belief was about the intent of the 
spec, rather than about how I personally think completion should occur.)


Manoj didn't say what javac build he is testing with, but this is a 
substantial discrepancy between compiler and spec. I hope that Leonid 
Arbouzov (cc'd) can tell us what conformance tests exist in this area.


Alex

On 3/15/2019 12:09 PM, Brian Goetz wrote:

At the same time, we also reaffirmed our choice to _not_ allow throw from one 
half of a conditional:

 int x = foo ? 3 : throw new FooException()

But John has this right — the high order bit is that every expression should 
have a defined normal completion, and a type, even if computing sub-expressions 
(or in this case, sub-statements) might throw.  And without at least one arm 
yielding a value, it would be impossible to infer the type of the expression.


On Mar 15, 2019, at 3:01 PM, John Rose  wrote:

On Mar 15, 2019, at 11:39 AM, Alex Buckley  wrote:


In a switch expression, I believe it should be legal for every `case`/`default` 
arm to complete abruptly _for a reason other than a break with value_.


My reading of Gavin's draft is that he is doing something very
subtle there, which is to retain an existing feature in the language
that an expression always has a defined normal completion.

We also don't have expressions of the form "throw e".  Allowing
a switch expression to complete without a value on *every* arm
raises the same question as "throw e" as an expression.  How do
you type "f(throw e)"?  If you can answer that, then you can also
have switch expressions that refuse to break with any values.

BTW, if an expression has a defined normal completion, it also
has a possible type.  By possible type I mean at least one correct
typing (poly-expressions can have many).  So one obvious
result of Gavin's draft is that you derive possible types from
the arms of the switch expression that break with values.

But the root requirement, I think, is to preserve the possible
normal completion of every expression.

"What about some form of 1/0?"  That's a good question.
What about it?  It completes normally with a type of int.
Dynamically, the normal completion is never taken.
Gavin might call that a "notional normal completion"
(I like that word) provided to uphold the general principle
even where static analysis proves that the Turing machine
fails to return normally.

— John




Re: Switch expressions spec

2019-03-15 Thread Alex Buckley
// The mail below doesn't appear to have made it to the 
amber-spec-experts web archive, even though Manoj is a member of the list.


Hey Gavin,

In a switch expression, I believe it should be legal for every 
`case`/`default` arm to complete abruptly _for a reason other than a 
break with value_.


That is, a legal switch expression may have zero rules like `case a -> 
5;` (completes normally) or `case b -> { break 6; }` (completes abruptly 
for reason of break with value). Instead, it only has rules like `case c 
-> { throw new Exc(); }` or `case d -> throw new Exc();`, both of which 
complete abruptly for reason other than break with value. (Extend to 
switch labeled statement groups.)


I suspect that the strong rule flagged by Manoj:

It is a compile-time error if a switch expression has no result 
expressions.


is trying to require a value `break` statement being present in every 
switch labeled block, because a rule earlier in 15.28.1 did not quite go 
that far:


If the switch block consists of switch labeled rules, then any 
switch labeled block (14.11.1) must complete abruptly.


Alex

On 3/14/2019 4:14 PM, Manoj Palat wrote:

Hi Alex, Gavin,

One more clarification of the spec:

Consider the following code:
public class X {
    @SuppressWarnings("preview")
    public static int foo(int i) throws MyException {
        int v = switch (i) {
            default -> throw new MyException(); // error or no error?
        };
        return v;
    }
    public static void main(String argv[]) {
        try {
            System.out.println(X.foo(1));
        } catch (MyException e) {
            System.out.println("Exception thrown as expected");
        }
    }
}
class MyException extends Exception {
    private static final long serialVersionUID = 3461899582505930473L;
}

As per spec, JLS 15.28.1

It is a compile-time error if a switch expression has no result expressions.



Throw statement is not a result expression and hence as per the spec we
should be giving this error.

Is this an omission in the spec? Should we be flagging an error?

In Eclipse ECJ we are flagging an error, but I observed that javac does not.
I want to get clarity on what the spec means.

Regards,

Manoj

Eclipse Java Dev,

IBM.


Re: Switch expressions spec

2019-03-06 Thread Alex Buckley

Hi Gavin,

On 3/6/2019 1:51 AM, Manoj Palat wrote:

1. In section 14.15, The break Statement:

A break statement transfers control out of an enclosing statement_, or
causes an enclosing switch expression to produce a specified value_.


BreakStatement:
    break [~~Identifier~~] ;
    _break Expression ;_
    _break ;_

the Identifier is dropped. That looks like a typographical issue (since
it was mentioned that there was no functional difference); Identifier
is mentioned in the statements following the above paragraph as well. A
similar issue appears in the "continue" section.


The dropping of the `break [Identifier]` alternative looks like an 
editing error when the spec document was being reformatted; compare:


old format: 
http://cr.openjdk.java.net/~gbierman/switch-expressions-2019-01.html#jep325-14.15


new format: 
http://cr.openjdk.java.net/~gbierman/switch-expressions.html#jep325-14.15



2. A related query, though a bit late, but better late than never:) - :
In the Eclipse Compiler implementation we assume expression encompasses
identifier (in the syntax context), and then deduce whether this is a
label or an expression later in the resolution context. From the grammar
above, it does not look like we can distinguish whether an identifier is
a label or an expression in the first place? An explicit statement in
the spec about how to distinguish would be helpful.


This will become moot if the change anticipated by Brian happens (change 
“break value” to “break-with value”). Until then, Manoj is asking a 
great question. Per 6.2, a label is not a name, but per 14.7, a label 
does have scope, and:


"There is no restriction against using the same identifier as a label 
and as the name of a package, class, interface, method, field, 
parameter, or local variable. Use of an identifier to label a statement 
does not obscure (§6.4.2) a package, class, interface, method, field, 
parameter, or local variable with the same name. Use of an identifier as 
a class, interface, method, field, local variable or as the parameter of 
an exception handler (§14.20) does not obscure a statement label with 
the same name."


I seem to recall a discussion recognizing and accepting the source 
incompatibility of recasting `break X;` from "Jump to label X" to 
"Evaluate X and yield the result". Such acceptance would suggest an edit 
to the last sentence quoted above.
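
A small sketch of the 14.7 rule quoted above: the same identifier can serve as a statement label and as a local variable without either obscuring the other. `LabelDemo` and `firstNegative` are hypothetical names:

```java
class LabelDemo {
    // Returns the index of the first negative element, or -1 if there is none.
    static int firstNegative(int[] xs) {
        int found = -1;          // local variable named "found"
        found:                   // statement label also named "found"
        for (int i = 0; i < xs.length; i++) {
            if (xs[i] < 0) {
                found = i;       // refers to the variable
                break found;     // refers to the label
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(firstNegative(new int[] {3, -1, 4})); // 1
    }
}
```

Under a reading where `break X;` evaluates `X` as an expression, the `break found;` line is exactly where a parser has to decide between the label and the variable.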



3. In section 5.6: "A unary numeric promotion applies numeric promotion
to an operand expression and a notional non-constant expression of type
int." It would be nice to explain in the spec a little more what is meant
by "a notional non-constant expression".


I believe more polishing is already on the way for the recast definition 
of numeric promotion?
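
For readers unfamiliar with the term, a minimal sketch of the long-standing observable effect of unary numeric promotion (not of the recast wording itself): applying unary `+` to a `byte` yields an `int`. `PromotionDemo` and its methods are hypothetical names:

```java
class PromotionDemo {
    // +b undergoes unary numeric promotion, so the expression has type int
    // and boxes to Integer.
    static String typeAfterUnaryPlus(byte b) {
        Object boxed = +b;
        return boxed.getClass().getSimpleName();
    }

    // Without the unary operator, the byte boxes to Byte.
    static String typeWithoutPromotion(byte b) {
        Object boxed = b;
        return boxed.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(typeAfterUnaryPlus((byte) 1));   // Integer
        System.out.println(typeWithoutPromotion((byte) 1)); // Byte
    }
}
```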


Alex


Re: Switch expressions spec

2019-03-04 Thread Alex Buckley

For clarity, we have renamed the January 2019 version from:

http://cr.openjdk.java.net/~gbierman/switch-expressions-old.html

to:

http://cr.openjdk.java.net/~gbierman/switch-expressions-2019-01.html

The CSR for switch expressions (JDK-8207241) has the spec as an 
attachment, but also links to an online version for reader convenience. 
Originally it linked to `switch-expressions.html`, but that is a living 
document and hence unsuitable for a CSR, so now it links to 
`switch-expressions-2019-01.html`.


Alex

On 2/27/2019 4:43 AM, Gavin Bierman wrote:

I have uploaded a revised switch expressions spec at:

http://cr.openjdk.java.net/~gbierman/switch-expressions.html

This is functionally equivalent to the spec uploaded last month. The change is 
in how we specify the type checking of switch expressions. We have made 
simplifications to make it more consistent with the specification of 
conditional expressions. The behaviour of type checking is unchanged.

Thanks,
Gavin

PS: I have left the January version at 
http://cr.openjdk.java.net/~gbierman/switch-expressions-old.html for reference.


On 17 Jan 2019, at 10:14, Gavin Bierman  wrote:

Thank you Alex and Tagir. I have uploaded a new version of the spec at:

http://cr.openjdk.java.net/~gbierman/switch-expressions.html

This contains all the changes you suggested below. In addition, there is a 
small bug fix in 5.6.3 concerning widening 
(https://bugs.openjdk.java.net/browse/JDK-8213180). I have also taken the 
opportunity to reorder chapter 15 slightly, so switch expressions are now 
section 15.28 and constant expressions are now section 15.29 (the last section 
in the chapter).

Comments welcome!
Gavin




Re: Updated document on data classes and sealed types

2019-03-01 Thread Alex Buckley

On 3/1/2019 12:14 PM, Brian Goetz wrote:

While the previous version was mostly about tradeoffs, this version
takes a much more opinionated interpretation of the feature, offering
more examples of use cases of where it is intended to be used (and not
used).


(Setting aside value records throughout.) A record type "codes like a 
class", not "codes like an int" -- you can `new` it, and the resulting 
record object has identity and can refer directly or indirectly to other 
record objects of the same type. (Contrast with LW1, "Value types may 
not declare fields of its own type directly or indirectly".) A variable 
of record type may even be null. The on-ramp to records looks smooth -- 
but can I compatibly turn a class type into a record type? It's 
potentially a source-compatible change (make the state description match 
the class's ctor), but binary-compatible?
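
A sketch of those properties, assuming the record syntax as finalized in Java 16 rather than the draft design discussed here. `RecordIdentityDemo` and `Node` are hypothetical names:

```java
class RecordIdentityDemo {
    // A record whose state description refers to its own type.
    record Node(int value, Node next) {}

    // Two records with the same state are equal...
    static boolean sameState() {
        return new Node(1, null).equals(new Node(1, null));
    }

    // ...but each `new` produces a distinct object with its own identity.
    static boolean sameIdentity() {
        Node a = new Node(1, null);
        Node b = new Node(1, null);
        return a == b;
    }

    // A record object may refer to another record object of the same type.
    static Node chain() {
        return new Node(1, new Node(2, null));
    }

    public static void main(String[] args) {
        Node none = null;                       // a variable of record type may be null
        System.out.println(none == null);       // true
        System.out.println(sameState());        // true
        System.out.println(sameIdentity());     // false
        System.out.println(chain().next().value()); // 2
    }
}
```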


Similarly, if a record type is chafing at the restrictions of "state 
only!", then can I (source|binary)-compatibly turn it into a class type?


Alex


Re: Hyphenated keywords and switch expressions

2019-01-17 Thread Alex Buckley

Thanks Gavin. The "Jan 2019" edition looks good.

The relative shapes of switch statements and switch expressions can be 
easily discerned by reading [1] and [2] side by side.


The renumbering, which fits with my plans for the JLS, is also welcome 
in advance of the public commentary that we can expect on this spec come 
JDK 12 GA.


Alex

[1] 
http://cr.openjdk.java.net/~gbierman/switch-expressions.html#jep325-14.11.2


[2] 
http://cr.openjdk.java.net/~gbierman/switch-expressions.html#jep325-15.28.1


On 1/17/2019 1:14 AM, Gavin Bierman wrote:

Thank you Alex and Tagir. I have uploaded a new version of the spec
at:

http://cr.openjdk.java.net/~gbierman/switch-expressions.html

This contains all the changes you suggested below. In addition, there
is a small bug fix in 5.6.3 concerning widening
(https://bugs.openjdk.java.net/browse/JDK-8213180). I have also taken
the opportunity to reorder chapter 15 slightly, so switch expressions
are now section 15.28 and constant expressions are now section 15.29
(the last section in the chapter).

Comments welcome! Gavin



On 14 Jan 2019, at 21:40, Alex Buckley 
wrote:

Hi Gavin,

Some points driven partly by the discussion with Tagir:

1. In 14.11.1, SwitchLabeledBlock should not end with a `;` --
there is no indication in JEP 325 that a semicolon is desired after
`-> {...}` and javac in JDK 12 does not accept one there. Also,
SwitchLabeledThrowStatement should not end with a `;` because
ThrowStatement includes a `;`.

2. In 14.11.1, "This block can either be empty, or take one of two
forms:" is wrong for switch expressions. The emptiness allowed by
the grammar will be banned semantically in 15.29.1, so 14.11.1
should avoid trouble by speaking broadly of the forms in an
educational tone: "A switch block can consist of either: - _Switch
labeled rules_, which use `->` to introduce either a _switch
labeled expression_, ..." Also, "optionally followed by switch
labels." is wrong for switch expressions, so prefer: "- _Switch
labeled statement groups_, which use `:` to introduce block
statements."

3. In 15.29.1: (this is mainly driven by eyeballing against
14.11.2)

- Incorrect Markdown in section header.

- The error clause in the following bullet is redundant because the
list header already called for an error: "The switch block must be
compatible with the type of the selector expression, *or a
compile-time error occurs*."

- I would prefer to pull the choice of {default label, enum typed
selector expression} into a fourth bullet of the prior list, to
align how 14.11.2's list has a bullet concerning default label.

- The significant rule from 14.11.2 that "If the switch block
consists of switch labeled rules, then any switch labeled
expression must be a statement expression (14.8)." has no parallel
in 15.29.1. Instead, for switch labeled rules, 15.29.1 has a rule
for switch labeled blocks. (1) We haven't seen switch labeled
blocks for ages, so a cross-ref to 14.11.1 is due. (2) A note that
switch exprs allow `-> ANY_EXPRESSION` while switch statements
allow `-> NOT_ANY_EXPRESSION` is due in both sections; grep ch.8
for "In this respect" to see what I mean. (3) The semantic
constraints on switch labeled rules+statement groups in 15.29.1
should be easily contrastable with those in 14.11.2 -- one approach
is to pull the following constraints into 15.29.1's "all conditions
true, or error" list:

-
- If the switch block consists of switch labeled rules, then
any switch labeled block (14.11.1) MUST COMPLETE ABRUPTLY.
- If the switch block consists of switch labeled statement groups, then
the last statement in the switch block MUST COMPLETE ABRUPTLY, and the
switch block MUST NOT HAVE ANY SWITCH LABELS AFTER THE LAST SWITCH
LABELED STATEMENT GROUP.
-

If you prefer to keep these semantic constraints standalone so that
they have negative polarity, then 14.11.2 should do the same for
its significant-but-easily-missed "must be a statement expression"
constraint.

Alex

On 1/13/2019 2:53 AM, Tagir Valeev wrote:

Hello!


I'm concerned about any claim of ambiguity in the grammar,
though I'm not sure I'm following you correctly. I agree that
your first fragment is parsed as two statements -- a switch
statement and an empty statement -- but I don't know what you
mean about "inside switch expression rule" for your second
fragment. A switch expression is not an expression statement
(JLS 14.8). In your second fragment, the leftmost default label
is followed not by a block or a throw statement but by an
expression (`switch (0) {...}`, a unary expression) and a
semicolon.


Ah, ok, we moved away slightly from the spec draft [1]. I was
not aware, because I haven't written the parser myself. The draft
says:

SwitchLabeledRule: SwitchLabeledExpression SwitchLabeledBlock
SwitchLabeledThrowStatement

SwitchLabeledEx

Re: Hyphenated keywords and switch expressions

2019-01-14 Thread Alex Buckley

Hi Gavin,

Some points driven partly by the discussion with Tagir:

1. In 14.11.1, SwitchLabeledBlock should not end with a `;` -- there is 
no indication in JEP 325 that a semicolon is desired after `-> {...}` 
and javac in JDK 12 does not accept one there. Also, 
SwitchLabeledThrowStatement should not end with a `;` because 
ThrowStatement includes a `;`.
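
A sketch of the three rule forms as they ended up in released Java (14 and later), using a switch statement. `ArrowFormsDemo` and `describe` are hypothetical names:

```java
class ArrowFormsDemo {
    static String describe(int n) {
        StringBuilder sb = new StringBuilder();
        switch (n) {
            // Switch labeled expression: an expression followed by ';'.
            case 0 -> sb.append("zero");
            // Switch labeled block: no ';' after the closing brace.
            case 1 -> { sb.append("one"); }
            // Switch labeled throw statement: the only ';' is the one
            // that belongs to the throw statement itself.
            default -> throw new IllegalArgumentException("n=" + n);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(0)); // zero
        System.out.println(describe(1)); // one
    }
}
```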


2. In 14.11.1, "This block can either be empty, or take one of two 
forms:" is wrong for switch expressions. The emptiness allowed by the 
grammar will be banned semantically in 15.29.1, so 14.11.1 should avoid 
trouble by speaking broadly of the forms in an educational tone: "A 
switch block can consist of either: - _Switch labeled rules_, which use 
`->` to introduce either a _switch labeled expression_, ..." Also, 
"optionally followed by switch labels." is wrong for switch expressions, 
so prefer: "- _Switch labeled statement groups_, which use `:` to 
introduce block statements."


3. In 15.29.1: (this is mainly driven by eyeballing against 14.11.2)

- Incorrect Markdown in section header.

- The error clause in the following bullet is redundant because the list 
header already called for an error: "The switch block must be compatible 
with the type of the selector expression, *or a compile-time error 
occurs*."


- I would prefer to pull the choice of {default label, enum typed 
selector expression} into a fourth bullet of the prior list, to align 
how 14.11.2's list has a bullet concerning default label.


- The significant rule from 14.11.2 that "If the switch block consists 
of switch labeled rules, then any switch labeled expression must be a 
statement expression (14.8)." has no parallel in 15.29.1. Instead, for 
switch labeled rules, 15.29.1 has a rule for switch labeled blocks. (1) 
We haven't seen switch labeled blocks for ages, so a cross-ref to 
14.11.1 is due. (2) A note that switch exprs allow `-> ANY_EXPRESSION` 
while switch statements allow `-> NOT_ANY_EXPRESSION` is due in both 
sections; grep ch.8 for "In this respect" to see what I mean. (3) The 
semantic constraints on switch labeled rules+statement groups in 15.29.1 
should be easily contrastable with those in 14.11.2 -- one approach is 
to pull the following constraints into 15.29.1's "all conditions true, 
or error" list:


-
- If the switch block consists of switch labeled rules, then any switch 
labeled block (14.11.1) MUST COMPLETE ABRUPTLY.
- If the switch block consists of switch labeled statement groups, then 
the last statement in the switch block MUST COMPLETE ABRUPTLY, and the 
switch block MUST NOT HAVE ANY SWITCH LABELS AFTER THE LAST SWITCH 
LABELED STATEMENT GROUP.

-

If you prefer to keep these semantic constraints standalone so that they 
have negative polarity, then 14.11.2 should do the same for its 
significant-but-easily-missed "must be a statement expression" constraint.


Alex

On 1/13/2019 2:53 AM, Tagir Valeev wrote:

Hello!


I'm concerned about any claim of ambiguity in the grammar, though I'm
not sure I'm following you correctly. I agree that your first fragment
is parsed as two statements -- a switch statement and an empty statement
-- but I don't know what you mean about "inside switch expression rule"
for your second fragment. A switch expression is not an expression
statement (JLS 14.8). In your second fragment, the leftmost default
label is followed not by a block or a throw statement but by an
expression (`switch (0) {...}`, a unary expression) and a semicolon.


Ah, ok, we moved away slightly from the spec draft [1]. I was not
aware, because I haven't written the parser myself. The draft says:

SwitchLabeledRule:
   SwitchLabeledExpression
   SwitchLabeledBlock
   SwitchLabeledThrowStatement

SwitchLabeledExpression:
   SwitchLabel -> Expression ;
SwitchLabeledBlock:
   SwitchLabel -> Block ;
SwitchLabeledThrowStatement:
   SwitchLabel -> ThrowStatement ;

(by the way I think that ; after block and throw should not be
present: current implementation does not require it after the block
and throw statement already includes a ; inside it).

Instead we implement it like:

SwitchLabeledRule:
   SwitchLabel -> SwitchLabeledRuleStatement
SwitchLabeledRuleStatement:
   ExpressionStatement
   Block
   ThrowStatement

So we assume that the right part of SwitchLabeledRule is always a
statement and reused ExpressionStatement to express Expression plus
semicolon, because syntactically it looks the same. Strictly following
a spec draft here looks even more ugly, because it requires more
object types in our code model and reduces the flexibility when we
need to perform code transformation. E.g. if we want to wrap
expression into block, currently we just need to replace an
ExpressionStatement with a Block not touching a SwitchLabel at all.
Had we mirrored the spec in our code model, we would need to replace
SwitchLabeledExpression with SwitchLabeledBlock which looks more
annoying.

With best regards,
Tagir Valeev

[1] http://cr.openjdk.jav

Re: Hyphenated keywords and switch expressions

2019-01-14 Thread Alex Buckley

Hi Tagir,

On 1/13/2019 2:53 AM, Tagir Valeev wrote:

Ah, ok, we moved away slightly from the spec draft [1]. I was not
aware, because I haven't written the parser myself. The draft says:

SwitchLabeledRule:
   SwitchLabeledExpression
   SwitchLabeledBlock
   SwitchLabeledThrowStatement

SwitchLabeledExpression:
   SwitchLabel -> Expression ;
SwitchLabeledBlock:
   SwitchLabel -> Block ;
SwitchLabeledThrowStatement:
   SwitchLabel -> ThrowStatement ;

Instead we implement it like:

SwitchLabeledRule:
   SwitchLabel -> SwitchLabeledRuleStatement
SwitchLabeledRuleStatement:
   ExpressionStatement
   Block
   ThrowStatement

So we assume that the right part of SwitchLabeledRule is always a
statement and reused ExpressionStatement to express Expression plus
semicolon, because syntactically it looks the same.


That's an odd assumption, because SwitchLabeledRule appears in the 
SwitchBlock of a SwitchExpression, and we obviously intend any kind of 
expression to be allowed after the -> in a switch expression. That is, 
in a switch expression, what comes after the -> is not just an 
expression statement (x=y, ++x, --x, x++, x--, x.m(), new X()) but any 
expression (including this, X.class, x.f, x[i], x::m). Only in a switch 
statement do we restrict the kind of expression allowed after the -> but 
that's a semantic rule (14.11.2), not syntactic, in order to share the 
grammar between switch expressions and switch statements.
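
To make the distinction concrete, here is a minimal sketch in the syntax that eventually shipped (Java 14 or later). The class and method names are made up for illustration; the commented-out case shows the kind of rule body that the semantic rule in 14.11.2 rejects for a switch statement.

```java
// Hypothetical illustration; class and method names are not from the JEP or the JLS.
class ArrowRules {
    static int calls = 0;

    static void log() { calls++; }

    // Switch expression: any expression may follow the arrow.
    static int pick(int[] xs, int i) {
        return switch (i) {
            case 0 -> xs[0];        // array access: not a statement expression
            case 1 -> xs.length;    // array length: not a statement expression
            default -> -1;          // literal
        };
    }

    // Switch statement: an expression after the arrow must be a statement expression.
    static void run(int i) {
        switch (i) {
            case 0 -> log();        // method invocation
            case 1 -> calls += 10;  // assignment
            // case 2 -> calls;     // rejected by javac: not a statement
            default -> { }          // a block is also allowed
        }
    }
}
```

Both methods use the same `SwitchLabel -> ...` grammar; only the semantic rule differs between them.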



Strictly following a spec draft here looks even more ugly, because it
requires more object types in our code model and reduces the
flexibility when we need to perform code transformation. E.g. if we
want to wrap expression into block, currently we just need to replace
an ExpressionStatement with a Block not touching a SwitchLabel at
all. Had we mirrored the spec in our code model, we would need to
replace SwitchLabeledExpression with SwitchLabeledBlock which looks
more annoying.


Understood. The grammar is specified like it is in order to introduce 
the critical terms "switch labeled expression", "switch labeled block", 
and "switch labeled throw statement". Aligning production names with 
critical terms is longstanding JLS style.


Alex


Re: Hyphenated keywords and switch expressions

2019-01-11 Thread Alex Buckley

Hi Tagir,

On 1/11/2019 5:32 AM, Tagir Valeev wrote:

On the other hand, from an IDE developer's point of view, having an
expression and a statement with such similar syntax definitely adds
confusion to the parsing (and probably for users). E.g. suppose we want
to parse a fragment which consists of a number of statements, isolated
from other code:

switch(0) { default -> throw new Exception(); };

In normal context it's two statements: switch-statement followed by an
empty statement. However inside switch expression rule it's one
statement: an expression statement containing a switch expression:

int x = switch(0) { default -> switch(0) { default -> throw new Exception(); }; };

Normally if we take a textual representation of single statement, it
could be parsed back into the same single statement, when isolated
from other code (the same works for expressions). Here this rule is
violated: the expression statement taken from switch expression rule
could be reparsed in isolation as two statements.


I'm concerned about any claim of ambiguity in the grammar, though I'm 
not sure I'm following you correctly. I agree that your first fragment 
is parsed as two statements -- a switch statement and an empty statement 
-- but I don't know what you mean about "inside switch expression rule" 
for your second fragment. A switch expression is not an expression 
statement (JLS 14.8). In your second fragment, the leftmost default 
label is followed not by a block or a throw statement but by an 
expression (`switch (0) {...}`, a unary expression) and a semicolon. 
Yes, the phrase `switch (0) {...}` is parsed as a switch statement in 
one context and as a unary expression in another context. Is that the 
ambiguity you wished to highlight?
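
A small sketch of the two contexts, written in the syntax that later shipped (Java 14 or later). The class and method names are hypothetical, and the inner switch yields a value rather than throwing, since a switch expression needs at least one result expression.

```java
// Hypothetical illustration of the two parses discussed above.
class SwitchContexts {
    // Statement context: a switch statement followed by an empty statement.
    static int asStatement() {
        int hits = 0;
        switch (0) { default -> hits++; };
        return hits;
    }

    // Expression context: the inner switch is the Expression of a switch
    // labeled expression, and the semicolon after it belongs to that rule.
    static int asExpression() {
        int x = switch (0) { default -> switch (0) { default -> 42; }; };
        return x;
    }
}
```

The text `switch (0) { ... };` is the same shape in both methods; what it denotes depends on where it appears.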


Alex


Re: Concise method body + type inference ?

2018-10-03 Thread Alex Buckley

On 10/3/2018 4:06 PM, John Rose wrote:

On Oct 3, 2018, at 3:56 PM, Alex Buckley <alex.buck...@oracle.com> wrote:


Let's say that Java 5 had the right idea by coupling an overriding
method to an overridden method, via @Override. Then, your proposal is
at odds with Java 5, because omitting the method signature of the
overriding method also means omitting @Override. (I assume you
intended for there to be no annotations on your lambda-like method
bodies such as `hasNext() -> index < end;`)


Put another way, Remi's suggestion is for a coupling stronger
than @Override, which is purely advisory.  An 'override'
modifier would upgrade the advice to something mandatory,
at which point types might flow from the super.


If you write @Override, then it's not just advice to the compiler that 
you intend to override something; it's a statement that you DO override 
something. You get a compile-time error if you don't. It's an `override` 
modifier in all but spelling.
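
For reference, a minimal sketch of that behavior in today's Java. The `RangeIterator` class is made up for illustration; the commented-out method shows the misspelling that `@Override` would turn into a compile-time error.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical iterator over the half-open range [start, end).
class RangeIterator implements Iterator<Integer> {
    private int index;
    private final int end;

    RangeIterator(int start, int end) {
        this.index = start;
        this.end = end;
    }

    @Override
    public boolean hasNext() { return index < end; }

    @Override
    public Integer next() {
        if (!hasNext()) throw new NoSuchElementException();
        return index++;
    }

    // @Override
    // public boolean hashNext() { return index < end; }
    // Rejected by javac: the method does not override or implement
    // a method from a supertype.
}
```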


But yes, Remi is saying that some method declarations have become so 
full of redundant information -- the signature types, the accessibility 
modifier, the @Override annotation, the {} and `return` for the body -- 
that it's time to reject the redundancy and go for concision not only in 
the body but in the declaration as a whole.


Brian has already made the argument to infer only within the method's 
implementation -- meaning within the method's body, not for the method 
declaration as a whole. I think Remi views the whole method declaration 
as "implementation", of the overridden method, and concludes that 
inferring the signature et al is reasonable. But it's rather arbitrary 
to give an overriding method the sole right to a concise declaration, 
and not a `private` method, a `final` method, a `default` method, etc.


Alex


Re: Concise method body + type inference ?

2018-10-03 Thread Alex Buckley

On 10/3/2018 3:32 PM, fo...@univ-mlv.fr wrote:

Here the coupling is not between accessibility and inference, it's
between overridden methods and inference, this coupling already
exists, we have even introduced the annotation @Override in Java 5
and 6 to make the coupling stronger, to be sure that people see that
an overridden method and its implementation are strongly linked.


Let's say that Java 5 had the right idea by coupling an overriding 
method to an overridden method, via @Override. Then, your proposal is at 
odds with Java 5, because omitting the method signature of the 
overriding method also means omitting @Override. (I assume you intended 
for there to be no annotations on your lambda-like method bodies such as 
`hasNext() -> index < end;`)


Alex


Re: New JEP: Concise Method Bodies

2018-09-24 Thread Alex Buckley

On 9/21/2018 4:07 PM, Kevin Bourrillion wrote:

"The method reference form can use most kinds of method references after
the = sign: static and unbound method references, bound method
references (if the receiver *variable* is in scope for the method
declaration)"

Can we still bind to any expression?

   private String a(String b, int c)
   throws SomeCheckedException = *d.e()*::f;


No limitation is intended on the receiver, as long as the method 
reference expression would be legal if it appeared in the traditional 
body of method `a`. (Details of this translation remain to be worked out.)


Scope isn't the right concept to appeal to; for example, an instance 
variable is in scope throughout a static method, but that doesn't mean 
the variable can be used in the body of the static method. I'll remove 
the troublesome clause. Array creation references should also be 
mentioned alongside constructor references.
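
The scope point can be seen in ordinary Java today. A minimal sketch, with a made-up class name; the commented-out method is the case where the variable is in scope yet unusable.

```java
// Hypothetical illustration: being in scope is not the same as being usable.
class ScopeVsUse {
    int count = 5;  // instance variable, in scope throughout the class body

    // An instance method can use the simple name directly.
    int direct() { return count; }

    // A static method can reach the field only through an explicit receiver.
    static int viaInstance(ScopeVsUse s) { return s.count; }

    // static int broken() { return count; }
    // Rejected by javac: a non-static variable cannot be referenced
    // from a static context, even though the name is in scope.
}
```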



... and would this be valid whether it is `e()` or `f()` (or both) that
throws SomeCheckedException?


I expect so. Do you have a situation that suggests otherwise?

Alex


Re: JEP draft: Concise Method Bodies - extend this to local functions?

2018-09-20 Thread Alex Buckley

On 9/20/2018 2:16 PM, Remi Forax wrote:

Yes, but in your example the return type is not the same; I prefer mine:

   class Utils {
     Function fun() = this::bar;
     Function fun2() -> this::bar;

     Function bar() { return null; }
     String bar(String s) { return null; }
   }


Yes, it's good to omit parameters for the methods with concise bodies. 
I've updated the JEP. (Hope I got it right!)


Alex


Re: JEP draft: Concise Method Bodies - extend this to local functions?

2018-09-20 Thread Alex Buckley

On 9/20/2018 1:28 PM, Maurizio Cimadamore wrote:

Function fun() = Utils::bar;
Function fun = Utils::bar;

(first is method body, second is variable initializer)


I think Remi is noting the fact that, when using `->`, the single
expression can be a method reference expression. I have already
recorded this situation near the end of the JEP.

Ok - then I added another :-) [not sure we should be worried about it,
but perhaps worth mentioning in the JEP]


The puzzler practically writes itself. We distinguish:

1.  boolean isEmpty(String s)   =  String::isEmpty;
2.  Predicate isEmpty   =  String::isEmpty;
3.  Predicate isEmpty(String s) -> String::isEmpty;

1.  Method declaration, using method reference form.
2.  Field declaration,  initialized with a method reference expression.
3.  Method declaration, using single expression form.
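
Since the concise forms are only proposed syntax, here is a rough sketch of what the three declarations would correspond to in current Java. The member names are invented so the three can coexist in one class, and the type argument `Predicate<String>` is an assumption, because the archive shows only the bare `Predicate`.

```java
import java.util.function.Predicate;

// Hypothetical present-day equivalents of the three declarations.
class IsEmptyForms {
    // 1. Method reference form: the method passes its parameter through.
    static boolean isEmptyMethod(String s) { return s.isEmpty(); }

    // 2. Field initialized with a method reference expression.
    static final Predicate<String> IS_EMPTY = String::isEmpty;

    // 3. Single expression form: the method returns the method reference
    //    itself and ignores its parameter.
    static Predicate<String> isEmptyFactory(String s) { return String::isEmpty; }
}
```

Form 1 answers the question directly, while form 3 hands back a predicate that still has to be applied, which is what makes the trio easy to misread.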

Alex


Re: JEP draft: Concise Method Bodies - extend this to local functions?

2018-09-20 Thread Alex Buckley

On 9/20/2018 1:08 PM, Maurizio Cimadamore wrote:

On 20/09/18 17:32, Remi Forax wrote:

There is also a potential confusion between
  Function fun() = Utils::bar;
and
  Function fun() -> Utils::bar;


You meant between

Function fun() = Utils::bar;

and

Function fun = Utils::bar;

?

(first is method body, second is variable initializer)


I think Remi is noting the fact that, when using `->`, the single 
expression can be a method reference expression. I have already recorded 
this situation near the end of the JEP.


Alex


Re: New JEP: Concise Method Bodies

2018-09-20 Thread Alex Buckley

On 9/20/2018 12:05 PM, Kevin Bourrillion wrote:

In this case, I think the `=` form /might/ also clear that bar because
of the automatic parameter pass-through. But I cannot currently see how
the `->` form comes close to clearing it.


(Data welcome.)


Ooh, you know the magic words! Okay, we will analyze cases amenable to
the `=` form, but I don't currently see a reason to gather stats on `->`
form applicability.


As ever, thank you for any and all analysis that you can provide. FWIW 
I've updated the JEP with an example of using `->` in anonymous classes 
-- it's quite fun.


Alex


Re: New JEP: Concise Method Bodies

2018-09-19 Thread Alex Buckley

Hi Kevin,

On 9/19/2018 12:31 PM, Kevin Bourrillion wrote:

In other cases, it looks like you're gaining a very /small/ amount of
syntactic conciseness (mostly omitting `return`) and not much else? Is
there any actual /conceptual/ simplicity or clarity being gained?


Let me focus on the single expression form. The argument we've made is: 
statement lambdas have a concise form, expression lambdas, that improves 
clarity, and method bodies are like statement lambdas, so they should 
have a concise form too. Maybe that's stretching the connection between 
statement lambdas and method bodies too thin, but it's a fair starting 
point. Perhaps the outcome will be classes with a mishmash of method 
body forms, where the jumping between {} and -> acts to hurt readability 
overall even if individual methods are simpler. (Shades of switch 
expressions having both : and -> cases.) And, empirically, perhaps there 
are very few method bodies in your codebase simple enough to adopt the 
-> form. (Data welcome.)
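
As a reminder of the analogy being drawn, a minimal sketch in current Java with made-up names: the two lambda forms that already exist, next to the traditional method body that the JEP proposes to give a concise counterpart.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical illustration of the existing lambda forms.
class LambdaForms {
    // Statement lambda: a block body with an explicit return.
    static final IntBinaryOperator STATEMENT = (a, b) -> { return a + b; };

    // Expression lambda: the concise form of the same function.
    static final IntBinaryOperator EXPRESSION = (a, b) -> a + b;

    // Traditional method body: today this is the only form a method has.
    static int add(int a, int b) { return a + b; }
}
```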


Alex


JEPs don't often seem to include any discussion of the costs of the
feature change, so evaluating benefit vs. cost is not easy. For example,
in this case, it becomes harder to understand and explain what a method
reference even /is/. I've been saying "it's just a lambda expression",
but either that's gone, or it's now becoming harder to understand and
explain what a lambda expression is.

I think moral-hazard arguments also deserve a bit of thought.

   public A b(C c, D d, E e, F f) { return g.h(c, d, e, f); }

If I forgot an "if e is empty, throw" check in here, I'll just insert
it. But if it was this:

   public A b(C c, D d, E e, F f) = g::h;

I'm probably less likely to do that.

This is a bit similar to why our style guide requires braces around
single-statement if blocks. It's too annoying to deal with inserting
them and removing them all the time as conditions change.

Perhaps these costs don't add up to anything massive, but we should
still get a fix on them because if it turns out the benefits
/also/ don't add up to something massive then?

I hope this is helpful.


On Wed, Sep 19, 2018 at 11:58 AM <mark.reinh...@oracle.com> wrote:

2018/9/19 11:42:16 -0700, alex.buck...@oracle.com:
 > https://bugs.openjdk.java.net/browse/JDK-8209434

Or, more readably: http://openjdk.java.net/jeps/8209434

- Mark



--
Kevin Bourrillion | Java Librarian | Google, Inc. | kev...@google.com



Re: New JEP: Concise Method Bodies

2018-09-19 Thread Alex Buckley
Right. A concise method body is introduced with an arrow or equals sign, 
and concluded with a semicolon. It just so happens that the concise 
method body below features a switch expression, which (unusually among 
expressions) has a block of its own. (And that's why I chose it as an 
example, to stress the concise-method-body feature.)


Alex

On 9/19/2018 12:49 PM, Remi Forax wrote:

oops,
no, the ';' is necessary here, sorry.

Rémi

- Original Message -

From: "Remi Forax" 
To: "mark reinhold" 
Cc: "amber-spec-experts" 
Sent: Wednesday, 19 September 2018 21:40:41
Subject: Re: New JEP: Concise Method Bodies



There is a ';' at the end of dayOfWeek that should not be there.

  String dayOfWeek(int d) -> switch (d) {
    case 1 -> "SUNDAY";
    case 2 -> "MONDAY";
    ...
  };  // <---- should be removed


Rémi

- Original Message -

From: "mark reinhold" 
To: "Alex Buckley" 
Cc: "amber-spec-experts" 
Sent: Wednesday, 19 September 2018 20:54:46
Subject: Re: New JEP: Concise Method Bodies



2018/9/19 11:42:16 -0700, alex.buck...@oracle.com:

https://bugs.openjdk.java.net/browse/JDK-8209434


Or, more readably: http://openjdk.java.net/jeps/8209434

- Mark


New JEP: Concise Method Bodies

2018-09-19 Thread Alex Buckley

https://bugs.openjdk.java.net/browse/JDK-8209434


Re: JEP325: Switch expressions spec

2018-05-10 Thread Alex Buckley

On 5/10/2018 10:36 AM, Dan Smith wrote:

On May 10, 2018, at 6:28 AM, Gavin Bierman wrote:

15.29 "the switch expression completes normally": More
conventionally, "the value of the switch expression is …"


That phrase occurs in several places, so you’ll have to tell me
which one you don’t like.


"If execution of the Statement completes abruptly for the reason of a
break with a value, then the switch expression completes normally
with that value."

I'd suggest changing to "for the reason of a break with a value _V_,
then the value of the switch expression is _V_."

For comparison, in all of Chapter 15, I only find four usages of the
phrase "complete[s] normally". Lots of usages of "complete[s]
abruptly", though. Switch _bodies_ are special, because they contain
statements, so it makes sense to say "completes normally" here. But
once we've left the body and we're talking about the switch
expression as a whole, it's better to use the expression-oriented
terminology.


I agree that expressions completing normally has rarely been spelled out 
in Ch.15, and that the value resulting from evaluation of the expression 
is usually more interesting.


However, JEP 325 is the first time that the JLS will explicitly pend an 
expression's evaluation on a statement's completion. Usually it's the 
other way round, per 14.1: "If a statement evaluates an expression, 
abrupt completion of the expression always causes ...". And lambda 
expressions, with bodies containing statements, didn't need to do it 
explicitly in 15.27.2 or 15.27.4.


So, since a value-break statement is a common way out of a switch 
expression, I would like the JLS to put the completes-abruptly side by 
side with a completes-normally. The abrupt completion of `break e;` is 
"swallowed" by the normal completion of the enclosing switch expression. 
(This appeals to the disrupted completion that `catch` and `finally` 
clauses arrange.)


AND, the JLS should specify the value of the switch expression, based on 
evaluating the argument of the value-break statement, as you have proposed.
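
For readers on later releases: when switch expressions were finalized in Java 14 (JEP 361), the value-break statement was respelled `yield`. A minimal sketch in that final syntax, with hypothetical names, showing a block whose abrupt completion supplies the value of the enclosing switch expression:

```java
// Hypothetical illustration using the finalized `yield` spelling.
class YieldValue {
    static int score(String s) {
        return switch (s) {
            case "a" -> 1;
            default -> {
                int n = s.length();
                // The block completes abruptly here; the enclosing switch
                // expression completes normally with the value n * 10.
                yield n * 10;
            }
        };
    }
}
```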


Alex


Re: JEP325: Switch expressions spec

2018-04-27 Thread Alex Buckley

On 4/27/2018 8:03 AM, Gavin Bierman wrote:

I have uploaded the latest draft of the spec for JEP 325 at 
http://cr.openjdk.java.net/~gbierman/switch-expressions.html


14.16 is right to say that:

  A break statement with value Expression ***attempts to cause the
  evaluation of the immediately enclosing switch expression***
  to complete normally ...

because the following is legal (x will become 200) :

  int x = switch (e) {
    case 1  -> {
      try { break 100; } finally { break 200; }
    }
    default -> 0;
  };

Therefore, in the discussion section, please say that:

  The preceding descriptions say "attempts to transfer control"
  ***and "attempts to cause evaluation to complete normally",***
  rather than just "transfers control" ***and "causes evaluation
  to complete normally",*** because if there are any try statements ...

  ... innermost to outermost, before control is transferred to the
  break target ***or evaluation of the break target completes***.

  [Notice we don't yet know if evaluation of the break target
   will complete normally or abruptly. If the finally clause above
   was to throw an exception instead of break-200, then the
   switch expression would complete abruptly by reason of the
   exception, rather than completing normally with the value 100.]
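
The same "attempts to" behavior already exists for `return`, so it can be observed in any Java release. The sketch below is an analogy using `return` in ordinary methods, not the draft's value-break syntax, and the names are made up.

```java
// Hypothetical analogy: a finally clause superseding a pending completion.
class FinallyWins {
    @SuppressWarnings("finally")
    static int overridden() {
        try {
            return 100;   // attempts to complete the method with 100
        } finally {
            return 200;   // supersedes the pending return
        }
    }

    @SuppressWarnings("finally")
    static int thrown() {
        try {
            return 100;   // attempts to complete the method with 100
        } finally {
            // The method completes abruptly by reason of this exception.
            throw new IllegalStateException("from finally");
        }
    }
}
```

Calling `overridden()` gives 200, and calling `thrown()` never delivers 100 to the caller.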

(Separately: Please flag the new text in 15.15's opening line.)

Alex


Re: JEP325: Switch expressions spec

2018-04-18 Thread Alex Buckley

On 4/18/2018 11:16 AM, Kevin Bourrillion wrote:

Evaluation of an expression can produce side effects, because
expressions may contain embedded assignments, increment operators,
decrement operators, and method invocations. *In addition, lambda
expressions and switch expressions have bodies that may contain
arbitrary statements.*

A lambda "contains" statements /physically/, but nothing gets
executed. If anything, it is anonymous /classes/ that belong here
(though maybe, arguably, that would be covered if "method invocations"
was changed to "method or constructor invocations"?).


The goal was to highlight that a lambda/switch expression is not like 
(say) a field access expression, because of the ability to have a body 
of statements rather than merely a tree of subexpressions ... but you're 
right, "Evaluation of a lambda expression is distinct from execution of 
the lambda body." (JLS 15.27.4)
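
A minimal sketch of that distinction, with hypothetical names: evaluating the lambda expression produces an object, and the side effect in its body happens only when the object is invoked.

```java
// Hypothetical illustration: evaluation of a lambda vs. execution of its body.
class LambdaEvaluation {
    static int sideEffects = 0;

    static Runnable make() {
        // Evaluating the lambda expression creates a Runnable;
        // the body is not executed at this point.
        return () -> sideEffects++;
    }
}
```

After `make()` returns, `sideEffects` is unchanged; it is incremented only once `run()` is called on the result.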



Suggestion: "... because expressions may contain embedded assignments,
increment operators, decrement operators, and method or constructor
invocations, as well as arbitrary statements nested inside a switch
expression."


Yes, limiting the arbitrariness to switch expressions (the sole "home" 
for something-resembling-block-expressions) is right.


Alex


Re: Raw string literals and Unicode escapes

2018-02-26 Thread Alex Buckley

On 2/25/2018 4:19 AM, Remi Forax wrote:

I'm late to the game, but why not use the same system as Perl, PHP, and
Ruby to solve the Lts [1], i.e.
you have a sequence that says this is the start of a raw string (%Q,
qq, m), then a character (from a predefined list), then the raw string,
and at the end of the raw string the same character as at the beginning
(or its mirror).

For example, with 'raw' as the prefix for a raw string:
raw`this is a raw string`
raw'this is another raw string'
raw[yet another raw string]


See "Choice of Delimiters" in the "Alternatives" section of the JEP.

Alex


Re: Raw string literals and Unicode escapes

2018-02-14 Thread Alex Buckley

On 2/14/2018 1:48 PM, John Rose wrote:

P.S. I posted another version that takes a slightly different
tack on the restriction of "cannot begin with a backquote".
It basically lifts the whole design of Markdown code quotes.

http://cr.openjdk.java.net/~jrose/jls/raw-string-pages-v5.pdf


The inclusion of RawSP means that you are fully delivering on your 
trailer from Jan 30: "Spoiler: I think I can prove that Markdown code 
quoting is appropriately minimal in its design, in a way Jim's is not."


Let me first recognize the power of RawSP in lifting TWO restrictions: 
cannot begin with a backtick, and cannot end with a backtick:


  String s = ``Hi `Bob```;   // Error, unbalanced delimiters
  String s = ``Hi `Bob`` + "`";  // OK
  String s = `` Hi `Bob` ``; // OK with RawSP trick

However, since the JEP's goal is to allow copy-paste of arbitrary text 
without interpretation, I think the RawSP trick of assigning meaning to 
whitespace is out of place. To most people, the raw string literal:


  ` and `

denotes a perfectly good five-character string that will probably be 
inserted between two other strings. Explaining that, no, it's really a 
three-character string will not be popular.


Also, the inclusion of RawSP makes the lexing of RawStringLiteral 
ambiguous, since RawStringBody allows opening and closing whitespace. No 
doubt this can be fixed with rules involving "If the first character 
after RawSP is a backtick ...", but now being like Markdown is getting 
expensive.


Alex


Re: Raw string literals and Unicode escapes

2018-02-14 Thread Alex Buckley

On 2/14/2018 12:42 PM, John Rose wrote:

On Feb 14, 2018, at 12:24 PM, Alex Buckley <alex.buck...@oracle.com> wrote:


There is plenty of precedent for semantic rules


In my draft version this is done with "where" clauses on the
grammar rules:


RawStringLiteral:

  RawQuote RawStringBody RawQuote
  where the two raw-quotes are constrained to be identical

RawQuote:
  ` {`}
  where the preimage is constrained to be unescaped


We're dancing on the head of a pin now, but as a matter of 
specificational style I'm wary of too many rules in the grammar itself, 
especially a context-sensitive rule like raw-quotes-must-balance.


JLS 3.10.5 is a good specimen to study: there is a context-free rule in 
the grammar:


  StringCharacter:
    InputCharacter but not " or \

and a context-sensitive semantic rule:

  It is a compile-time error for a line terminator to appear
  after the opening " and before the closing matching ".

Strictly speaking, the semantic rule is unnecessary because 
InputCharacter is DEFINED to exclude the CR and LF line terminators! But 
the semantic rule makes the intent very very clear. Writing rules in 
this form also prevents the spec from becoming a soup of statements that 
are more than just observations but less than full-throated assertions.


Anyway, the draft was very useful, thanks!

Alex


Re: Raw string literals and Unicode escapes

2018-02-14 Thread Alex Buckley

On 2/13/2018 2:19 PM, Jim Laskey wrote:

10a. String s = `abc`;
10b. String s = \u0060abc`;
...
So, change the scanner to

A) Peek back to make sure the first open backtick was exactly a
   backtick.
B) Turn off Unicode escapes immediately so that only backtick
   characters can be part of the delimiter.
C) Turn on Unicode escapes only after a valid closing delimiter is
   encountered.

Based on this all your examples are illegal.


I am not opposed to saying that a delimiter must be constructed from 
actual ` characters (that is, the RawInputCharacter ` rather than the 
UnicodeEscape \u0060). It would be silly if the opening delimiter was 
\u0060 because the closing delimiter cannot be identical -- that hurts 
readability. (Clearly the six characters \ u 0 0 6 0 inside a raw string 
literal get no special processing.)


Unfortunately, there is nothing in the lexical grammar that prevents 
\u0060Hello` or \u0060Hello\u0060 or in fact any of the examples below
from being lexed as a RawStringLiteral. The JLS will need a semantic 
rule to force each RawStringDelimiter to be composed of actual ` 
characters. As you say, this will make all the examples below illegal.


There is plenty of precedent for semantic rules ("It is a compile-time 
error ...") in the interpretation of Literal tokens, so that's fine. In 
fact, JLS 3.10.4 already has a semantic rule that appears to constrain a 
delimiter in a CharacterLiteral token:


  It is a compile-time error for the character following the
  SingleCharacter or EscapeSequence to be other than a '.

although it doesn't mean to force an actual ' character (that is, the 
RawInputCharacter ' and not the UnicodeEscape \u0027). It means:


  It is a compile-time error for the character following the
  SingleCharacter or EscapeSequence to be other than a ' (or the
  Unicode escape thereof).
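
That parenthetical matches how traditional literals behave today: Unicode escapes are translated before tokenization (JLS 3.3), so an escaped delimiter is accepted by the compiler. A small sketch, with a made-up class name:

```java
// Hypothetical illustration: escaped delimiters on traditional literals.
class UnicodeDelimiters {
    // Both delimiters of the character literal are written as Unicode escapes.
    static char ch() { return \u0027a\u0027; }

    // Likewise for the double quotes that delimit a string literal.
    static String str() { return \u0022hi\u0022; }
}
```

This is the behavior that a "delimiters must be actual ` characters" rule would deliberately not extend to raw string literals.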

Alex


On Feb 13, 2018, at 1:58 PM, Alex Buckley wrote:

I suspect the trickiest part of specifying raw string literals will
be the lexer's modal behavior for Unicode escapes. As such, I am
going to put the behavior under the microscope. Here is what the
JEP has to say:

- Unicode escapes, in the form \u, are processed as part of
character input prior to interpretation by the lexer. To support
the raw string literal as-is requirement, Unicode escape processing
is disabled when the lexer encounters an opening backtick and
reenabled when encountering a closing backtick. -

I would like to assume that if the lexer comes across the six
tokens \ u 0 0 6 0  then it should interpret them as a Unicode
escape representing a backtick _and then continue as if consuming
the tokens of a raw string literal_. However, the mention of _an_
opening backtick and _a_ closing backtick gave me pause, given that
repeated backticks can serve as the opening delimiter and the
closing delimiter. For absolute clarity, let's write out examples
to confirm intent: (Jim, please confirm or deny as you see fit!)

1.  String s = \u0060`;

Illegal. The RHS is lexed as ``;   which is disallowed by the
grammar.

2.  String s = \u0060Hello\u0060;

Illegal. The RHS is lexed as `Hello\u0060;   and so on for the rest
of the compilation unit -- the six tokens \ u 0 0 6 0 are not
treated as a Unicode escape since we're lexing a raw string
literal. And without a closing delimiter before the end of the
compilation unit, a compile-time error occurs.

3a.  String s = \u0060Hello`;

Legal. The RHS is lexed as `Hello`;   which is well formed.

3b.  String s = \u0060\u0060Hello`;

Depends! If you take the JEP literally, then just the Unicode
escape which serves as the first opening backtick ("_an_ opening
backtick") is enough to enter raw-string mode. That makes the code
legal: the RHS is lexed as `\u0060Hello`;   which is well formed.
On the other hand, you might think that we shouldn't enter
raw-string mode until the lexer in traditional mode has lexed the
opening delimiter fully (i.e. ALL the opening backticks). Then, the
code in 3b is illegal, because the opening delimiter (``) and the
closing delimiter (`) are not symmetric.

I think we should take the JEP literally, so that 3b is legal. And
then, some more examples:

4a.  String s = \u0060`Hello``;

Legal. The RHS is lexed as ``Hello``;   which is well formed.

4b.  String s = \u0060\u0060Hello``;

Illegal. The RHS is lexed as `\u0060Hello``;   which is disallowed
by the grammar. A raw string literal containing 11 tokens is
immediately followed by a ` token and a ; token which are not
expected.

4c.  String s = \u0060\u0060Hello`\u0060;

Depends! If you take the JEP literally, where _a_ closing backtick
is enough to re-enable Unicode escape processing, then the RHS is
lexed as `\u0060Hello``;  which is illegal per 4b. On the other
hand, if you think that we shouldn't re-enter traditional mode
until the lexer in raw-string mode has lexed the closing delimiter
fully (i.e. ALL the closing backticks), [...]

Re: Raw string literals and Unicode escapes

2018-02-14 Thread Alex Buckley

On 2/13/2018 2:11 PM, John Rose wrote:

On Feb 13, 2018, at 9:58 AM, Alex Buckley wrote:


I suspect the trickiest part of specifying raw string literals will be
the lexer's modal behavior for Unicode escapes. As such, I am going to
put the behavior under the microscope.


For an approach to this see:
http://cr.openjdk.java.net/~jrose/jls/raw-string-pages-v4.pdf

In short:  We define a so-called "preimage" for each token,
which is the unambiguously defined sequence of UTF-16
code points that translate to that token via \u substitution
and line terminator normalization.

For raw strings (only) the preimage of a token is significant.
The backticks of a raw string (both opening and closing)
are required to be their own preimage (no \u0060 allowed).
And the raw string body contents are the preimage of the
string token, not the normal token image.

I think preimage is the trick we need here, and it settles
a number of questions, such as those you raised.
All of the tricky examples you raised are uniformly illegal,
under the preimage rule for raw-string quotes.


I agree that holding on to the preimage of each InputElement (JLS 3.5) 
is necessary because ` can legitimately appear in some kinds of 
InputElement as an ordinary InputCharacter (derived from either the 
RawInputCharacter ` or the UnicodeEscape \u0060):


1.  Comment

// This Markdown processor treats ` specially.
/* This Markdown processor treats \u0060 specially. */

2.  Token (and more specifically, StringLiteral)

"Hi `Bob`"
"Hi \u0060Bob\u0060"

Only if the InputElement is a Token, and more specifically a 
RawStringLiteral, do we need to take the sequence of InputCharacters and 
LineTerminators that constitute its RawStringBody and replace that 
sequence with its preimage.
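
To illustrate the idea of holding on to a preimage, here is a minimal Java sketch of Unicode escape translation that remembers, for every translated character, the source characters it came from. It is a simplification of JLS 3.3 (for instance, a malformed escape is passed through rather than reported as an error), and the names Preimage, Ch, translate, image and preimage are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class Preimage {

    /** A translated character together with the source characters it came from. */
    static final class Ch {
        final char value;
        final String preimage;

        Ch(char value, String preimage) {
            this.value = value;
            this.preimage = preimage;
        }
    }

    private static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    /** Translates Unicode escapes, remembering the preimage of every resulting character. */
    static List<Ch> translate(String source) {
        List<Ch> out = new ArrayList<>();
        int precedingBackslashes = 0; // contiguous backslashes just before index i
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            // A backslash can start an escape only after an even number of backslashes.
            if (c == '\\' && precedingBackslashes % 2 == 0) {
                int j = i + 1;
                while (j < source.length() && source.charAt(j) == 'u') {
                    j++;
                }
                boolean hasU = j > i + 1;
                boolean hasHex = j + 4 <= source.length();
                for (int k = j; hasHex && k < j + 4; k++) {
                    hasHex = isHexDigit(source.charAt(k));
                }
                if (hasU && hasHex) {
                    char value = (char) Integer.parseInt(source.substring(j, j + 4), 16);
                    out.add(new Ch(value, source.substring(i, j + 4)));
                    i = j + 4;
                    precedingBackslashes = 0;
                    continue;
                }
            }
            precedingBackslashes = (c == '\\') ? precedingBackslashes + 1 : 0;
            out.add(new Ch(c, String.valueOf(c)));
            i++;
        }
        return out;
    }

    /** The translated text, which is what a StringLiteral or Comment is built from. */
    static String image(List<Ch> chars) {
        StringBuilder sb = new StringBuilder();
        for (Ch ch : chars) {
            sb.append(ch.value);
        }
        return sb.toString();
    }

    /** The original source text, which is what a RawStringBody would keep. */
    static String preimage(List<Ch> chars) {
        StringBuilder sb = new StringBuilder();
        for (Ch ch : chars) {
            sb.append(ch.preimage);
        }
        return sb.toString();
    }
}
```

With this sketch, the two StringLiteral examples above have the same image but different preimages; only a RawStringLiteral would ever need to ask for the latter.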


I want to say something about the delimiters of the raw string literal 
now, but I'll do that in response to Jim's mail.


Alex


Raw string literals and Unicode escapes

2018-02-13 Thread Alex Buckley
I suspect the trickiest part of specifying raw string literals will be 
the lexer's modal behavior for Unicode escapes. As such, I am going to 
put the behavior under the microscope. Here is what the JEP has to say:


-
Unicode escapes, in the form \u, are processed as part of character 
input prior to interpretation by the lexer. To support the raw string 
literal as-is requirement, Unicode escape processing is disabled when 
the lexer encounters an opening backtick and reenabled when encountering 
a closing backtick.

-

I would like to assume that if the lexer comes across the six tokens \ u 
0 0 6 0  then it should interpret them as a Unicode escape representing 
a backtick _and then continue as if consuming the tokens of a raw string 
literal_. However, the mention of _an_ opening backtick and _a_ closing 
backtick gave me pause, given that repeated backticks can serve as the 
opening delimiter and the closing delimiter. For absolute clarity, let's 
write out examples to confirm intent: (Jim, please confirm or deny as 
you see fit!)


1.  String s = \u0060`;

Illegal. The RHS is lexed as ``;   which is disallowed by the grammar.

2.  String s = \u0060Hello\u0060;

Illegal. The RHS is lexed as `Hello\u0060;   and so on for the rest of 
the compilation unit -- the six tokens \ u 0 0 6 0 are not treated as a 
Unicode escape since we're lexing a raw string literal. And without a 
closing delimiter before the end of the compilation unit, a compile-time 
error occurs.


3a.  String s = \u0060Hello`;

Legal. The RHS is lexed as `Hello`;   which is well formed.

3b.  String s = \u0060\u0060Hello`;

Depends! If you take the JEP literally, then just the Unicode escape 
which serves as the first opening backtick ("_an_ opening backtick") is 
enough to enter raw-string mode. That makes the code legal: the RHS is 
lexed as `\u0060Hello`;   which is well formed. On the other hand, you 
might think that we shouldn't enter raw-string mode until the lexer in 
traditional mode has lexed the opening delimiter fully (i.e. ALL the 
opening backticks). Then, the code in 3b is illegal, because the opening 
delimiter (``) and the closing delimiter (`) are not symmetric.


I think we should take the JEP literally, so that 3b is legal. And then, 
some more examples:


4a.  String s = \u0060`Hello``;

Legal. The RHS is lexed as ``Hello``;   which is well formed.

4b.  String s = \u0060\u0060Hello``;

Illegal. The RHS is lexed as `\u0060Hello``;   which is disallowed by 
the grammar. A raw string literal containing 11 tokens is immediately 
followed by a ` token and a ; token which are not expected.


4c.  String s = \u0060\u0060Hello`\u0060;

Depends! If you take the JEP literally, where _a_ closing backtick is 
enough to re-enable Unicode escape processing, then the RHS is lexed as 
`\u0060Hello``;  which is illegal per 4b. On the other hand, if you 
think that we shouldn't re-enter traditional mode until the lexer in 
raw-string mode has lexed the closing delimiter fully (i.e. ALL the 
closing backticks), then presumably you think analogously about the 
opening delimiter, so the RHS would be lexed as ``Hello`\u0060;   which 
is illegal per 2 (no closing delimiter `` before the end of the 
compilation unit).


5.  String s = \u0060`Hello`\u0060;

I put this here because it looks nice. It hits the same issues as 3b and 4c.

Alex


Re: Fix Parameter Runtime*ParameterAnnotations spec

2017-10-31 Thread Alex Buckley

On 10/31/2017 1:47 AM, Remi Forax wrote:

Hi all, the spec of the Runtime*ParameterAnnotations attributes [1]
allows the number of parameter annotations to differ from the
number of parameters in the method descriptor, but it fails to
provide a way to retrieve or compute the mapping between a parameter
and a parameter annotation.

So people try to guess and fail; see, for example, the ASM bug we hit
when we tried to provide such a mapping to our users [2].

We (the ASM team) believe the only way to fix that is to require that,
if the number of parameters in the descriptor and the number of
parameter annotations don't match, compilers should also emit a
MethodParameters attribute, which already indicates whether a
parameter is synthetic or not.


I recognize that constructor parameters are a painful cause of 
discrepancy between the method descriptor and various attributes -- not 
just Runtime*ParameterAnnotations but also Signature (see the note "A 
method signature encoded by ..." in 
https://docs.oracle.com/javase/specs/jvms/se9/html/jvms-4.html#jvms-4.7.9.1).


However, the decision in JDK-8067975 was to loosen the JVMS' description 
so that it admitted the class files emitted by javac and ecj. If you 
want javac and ecj to emit something different than they do today, then 
your best bet is to write up a "State of the Parameters" page that shows 
the "interesting" programs, and what javac and ecj emit. Only then can 
suggestions like "Emit a MethodParameters attribute if ..." be evaluated.
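
As a sketch of one "interesting" program such a page might start from, consider the constructor of a non-static member class. The Java below uses core reflection rather than reading the class file, so the second number it prints is what java.lang.reflect reports, not necessarily the num_parameters value stored in the attribute; it also assumes a compiler that, like javac, puts the enclosing instance first in the constructor descriptor. The names ParameterCounts, Tag and Inner are hypothetical.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Constructor;

final class ParameterCounts {

    @Retention(RetentionPolicy.RUNTIME)
    @interface Tag {}

    // A non-static member class: its constructor takes the enclosing instance implicitly.
    class Inner {
        Inner(@Tag String name) {}
    }

    static Constructor<?> innerConstructor() {
        try {
            // Assumption: the descriptor lists the enclosing instance first,
            // then the one declared parameter.
            return Inner.class.getDeclaredConstructor(ParameterCounts.class, String.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("unexpected constructor descriptor", e);
        }
    }

    public static void main(String[] args) {
        Constructor<?> c = innerConstructor();
        System.out.println("parameters in descriptor: " + c.getParameterCount());
        // What reflection reports here has varied between JDK releases.
        System.out.println("parameter annotation arrays: " + c.getParameterAnnotations().length);
    }
}
```

The source declares one parameter, but the descriptor has two, which is exactly the kind of discrepancy a tool has to map back onto the annotations.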


Alex

