Re: An alternative to "restricted keywords" + helping automatic modules

2017-05-19 Thread Stephan Herrmann

Meanwhile we seem to have (at least) 4 proposals on the table.

Here's my biased summary:

(A) JLS up until 2017-05-18
PRO:
+ we have a specification
+ spec can be interpreted as allowing all module words to be mentionable
  in all relevant directives.
CON:
- the specification is interpreted differently by different experts
- the interpretation that Alex intends requires parsing to be done
  ahead of scanning, which breaks established compiler technology.


(B) Remi: in case of ambiguity interpret as keyword, not identifier
PRO:
+ removes the need to parse more than you have scanned.
CON:
- makes "transitive" unmentionable as the first segment in a module name,
  and will cause the same effect on any modifier that may be added in
  the future.
- still requires stateful scanning, so typically syntax highlighting
  will be partly broken, still. (Isn't that an aesthetic aspect, too?)
  Other IDE functions are affected, too.


(C) Remi + Stephen: disambiguate by adding "module" before each module reference.
PRO:
+ removes the need to parse more than you have scanned.
+ avoids restriction regarding "transitive" and future modifiers.
CON:
- still requires stateful scanning; for the implications, see (B)


(D) Stephan: Escape module words to use them as identifier
PRO:
+ all module words can be used in package & module references
+ allows referring to modules which have Java keywords in their name
+ avoids adding new technical complexity
+ clearly specified in the proposal, mostly using standard concepts
CON:
- some find the occasional escape character (aesthetically) unpleasant
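
To illustrate why (D) needs no parser context, here is a minimal sketch of a word classifier with an escape character. It assumes '^' as the escape (the character discussed in this thread); the class and method names are hypothetical and not taken from the proposal text.

```java
import java.util.Set;

final class EscapeAwareScanner {
    // The ten restricted keywords ("module words") of the JLS draft.
    static final Set<String> MODULE_WORDS = Set.of(
            "open", "module", "requires", "transitive", "exports",
            "opens", "to", "uses", "provides", "with");

    /** Classifies one word without any parser state (hypothetical helper). */
    static String classify(String word) {
        if (word.startsWith("^")) {
            return "IDENTIFIER"; // assumed escape: ^transitive is always an identifier
        }
        return MODULE_WORDS.contains(word) ? "KEYWORD" : "IDENTIFIER";
    }

    /** Returns the identifier text with the assumed escape character removed. */
    static String identifierText(String word) {
        return word.startsWith("^") ? word.substring(1) : word;
    }

    public static void main(String[] args) {
        System.out.println(classify("transitive"));
        System.out.println(classify("^transitive") + " " + identifierText("^transitive"));
    }
}
```

Because the decision depends only on the word itself, such a scanner could be started at any offset in the file, which is the property the IDE arguments in this thread rely on.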


People favoring (B) or (C) could further promote their case by
providing a (near) formal specification.


For anybody interested in further technical implications, I'd be happy
to provide pointers to plenty of IDE functions that would be broken
(to different degrees) by (A) - (C).
When I spoke about bad error recovery, I wasn't complaining about
a few more man days of engineering work, but about a conceptual
impossibility to achieve the quality that users should expect.


Let me add my interpretation of why we are in this strange situation
in the first place:
The language of module-info is an unusual mix of:
- something like a DSL for declaring API and dependencies of modules
- a subset of Java
If we took away the complexities of Java annotations and Java comments,
nobody would mind hand-coding an arbitrarily tricky parser that easily meets
all relevant goals. But nobody will hand-code a parser that is able to parse
a significant subset of Java.
It's the mix of both natures in one language that creates the conflict.

Finally, please don't take this as an issue of
language design *vs.* tool implementation.
We can only make our users happy, if both aspects smoothly integrate,

Stephan


On 19.05.2017 18:51, fo...@univ-mlv.fr wrote:

- Original message -

From: "Stephan Herrmann" <stephan.herrm...@berlin.de>
To: fo...@univ-mlv.fr, jigsaw-dev@openjdk.java.net
Sent: Friday, 19 May 2017 17:26:02
Subject: Re: An alternative to "restricted keywords" + helping automatic modules



Inline

On 19.05.2017 15:53, fo...@univ-mlv.fr wrote:





*From: *"Stephan Herrmann" <stephan.herrm...@berlin.de>
*To: *"John Rose" <john.r.r...@oracle.com>, jigsaw-dev@openjdk.java.net
*Cc: *"Rémi Forax" <fo...@univ-mlv.fr>
*Sent: *Friday, 19 May 2017 12:37:07
*Subject: *Re: Re: An alternative to "restricted keywords" + helping automatic modules

A quick question to keep the ball rolling:

Do we agree on the following assessment of the status quo?

  The definition of "restricted keywords" implies (without explicitly 
saying so),
  that classification of a word as keyword vs. identifier can only be made
  *after* parsing has accepted the enclosing ModuleDeclaration.
  (With some tweaks, this can be narrowed down to
   "after the enclosing ModuleDirective has been accepted")

  This definition is not acceptable.


I agree that this is not acceptable but this is not what we are proposing.


Who is "we"?

Note that your proposal lets me conclude that "transitive" is not a legal
start of a module reference. If that is not what you intend, please provide
a specification-like description of what you have in mind.
Probably Stephen's proposal will come in handy for this issue?


transitive is not a valid start of a module name if you want to use it in a
requires directive in Java, but it is a valid module name for the JVM: you can
create a module-info.class in a language other than Java.



Your notes about possible implementation may help when we come to implementing,
but right now they may also distract from understandin

Re: An alternative to "restricted keywords" + helping automatic modules

2017-05-19 Thread forax
- Original message -
> From: "Stephan Herrmann" <stephan.herrm...@berlin.de>
> To: fo...@univ-mlv.fr, jigsaw-dev@openjdk.java.net
> Sent: Friday, 19 May 2017 17:26:02
> Subject: Re: An alternative to "restricted keywords" + helping automatic modules

> Inline
> 
> On 19.05.2017 15:53, fo...@univ-mlv.fr wrote:
>>
>>
>> 
>>
>> *From: *"Stephan Herrmann" <stephan.herrm...@berlin.de>
>> *To: *"John Rose" <john.r.r...@oracle.com>, jigsaw-dev@openjdk.java.net
>> *Cc: *"Rémi Forax" <fo...@univ-mlv.fr>
>> *Sent: *Friday, 19 May 2017 12:37:07
>> *Subject: *Re: Re: An alternative to "restricted keywords" + helping automatic modules
>>
>> A quick question to keep the ball rolling:
>>
>> Do we agree on the following assessment of the status quo?
>>
>>   The definition of "restricted keywords" implies (without explicitly saying so),
>>   that classification of a word as keyword vs. identifier can only be made
>>   *after* parsing has accepted the enclosing ModuleDeclaration.
>>   (With some tweaks, this can be narrowed down to
>>    "after the enclosing ModuleDirective has been accepted")
>>
>>   This definition is not acceptable.
>>
>>
>> I agree that this is not acceptable but this is not what we are proposing.
> 
> Who is "we"?
> 
> Note that your proposal lets me conclude that "transitive" is not a legal
> start of a module reference. If that is not what you intend, please provide
> a specification-like description of what you have in mind.
> Probably Stephen's proposal will come in handy for this issue?

transitive is not a valid start of a module name if you want to use it in a
requires directive in Java, but it is a valid module name for the JVM: you can
create a module-info.class in a language other than Java.

> 
> Your notes about possible implementation may help when we come to 
> implementing,
> but right now they may also distract from understanding the intention.

We went down the rabbit hole of talking about implementation because you asked
us to: it is your point (3), '"restricted keywords" pose three problems to tool
implementations'.
The intention is to introduce restricted keywords (I prefer "local keywords").
To quote the current draft of the JLS: "They are keywords solely where they
appear as terminals in the ModuleDeclaration production (§7.7), and are
identifiers everywhere else". So developers will not have to change all their
Java code, because open, module, requires, transitive, exports, opens, to,
uses, provides, and with are keywords only locally, in module-info.java.
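
As a small illustration of "identifiers everywhere else", the following ordinary class uses all ten restricted keywords as local variable names and compiles on Java 9 and later. The class and method names are made up for this example.

```java
import java.util.List;

final class RestrictedWordsAsIdentifiers {
    // Outside module-info.java, each restricted keyword is a plain identifier.
    static List<String> names() {
        String open = "open", module = "module", requires = "requires",
               transitive = "transitive", exports = "exports", opens = "opens",
               to = "to", uses = "uses", provides = "provides", with = "with";
        return List.of(open, module, requires, transitive, exports,
                       opens, to, uses, provides, with);
    }

    public static void main(String[] args) {
        System.out.println(names().size() + " restricted keywords used as identifiers");
    }
}
```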

> 
> Stephan
> 
> 

Rémi

>>
>> You do not have to wait for the reduction of ModuleDeclaration (or
>> ModuleDirective): the parser knows its parsing state (the LR item)
>> during parsing, not only at the end.
>> The LR analysis cannot know, at some point during parsing, which
>> production will be reduced later, but it does know
>> which terminals will not lead to an error when shifting the next terminal.
>>
>> In the middle of parsing, the parser shifts a terminal to go
>> from one state to another, so for each state the parser
>> knows whether it can shift a terminal that is among the set of restricted
>> keywords. It can then either instruct the lexer,
>> before scanning the token, to activate the restricted-keyword automaton, or,
>> after the token has been scanned, classify it as
>> a keyword instead of an identifier.
>>
>> The idea is that the parser reports not only when it reduces a production
>> but also when it is about to shift a restricted keyword.
>> So you can classify a token as an identifier or as a keyword because the
>> parser is able to bubble up that its parser state (the LR
>> item) may recognize a keyword.
>>
>>
>>
>> comments?
>> Stephan
>>
>>
>> Rémi
>>
>>
>> - original message -
>>
>> Subject: Re: An alternative to "restricted keywords" + helping automatic modules
>> Date: Fri, 19 May 2017 07:27:31 CEST
>> From: John Rose<john.r.r...@oracle.com>
>> To: Stephan Herrmann<stephan.herrm...@berlin.de>
>>
>> On May 18, 20

Re: An alternative to "restricted keywords" + helping automatic modules

2017-05-19 Thread Stephan Herrmann

I haven't seen any reaction to this sub-topic:

- If an automatic module would have a name containing a Java keyword,
  is it OK to simply refuse handling this artifact as an automatic module?

Stephan

On 18.05.2017 10:59, Stephan Herrmann wrote:

Remi,

I see your proposal as a minimal compromise, avoiding the worst
of difficulties, but I think we can do better.

Trade-off:
In all posts I could not find a real reason against escaping,
aside from aesthetics. I don't see this as sufficient motivation
for a less-than-perfect solution.


Clarity:
I'm still not completely following your explanations, partly because
of the jargon you are using. I'll leave it to Alex to decide if he
likes the idea that JLS would have to explain terms like dotted
production.

Compare this to just adding a few more rules to the grammar,
where no hand-waving is needed for an explanation.
No, I did not say that escaping is a pervasive change.
I never said that the grammar for ordinary compilation units
should be changed.
If you like we only need to extend one rule for the scope of
modular compilation units: Identifier. It can't get simpler.


Completeness:
I understand you as saying, module names cannot start with
"transitive". Mind you, that every modifier that will be added
to the grammar for modules in the future will cause conflicts for
names that are now legal, and you won't have a means to resolve this.

By contrast, we can use the escaping approach even to solve one
more problem that has been briefly touched on this list before:

Automatic modules suffer from the fact that some artifact names may
have Java keywords in their name, which means that these artifacts
simply cannot be used as automatic modules, right?
Why not apply escaping also here? *Any* dot-separated sequence
of words could be used as module name, as long as module references
have a means to escape any keywords in that sequence.
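
A minimal sketch of what such escaping could look like for a name derived from an artifact. It assumes '^' as the escape character and only handles ordinary Java keywords (via the standard javax.lang.model.SourceVersion.isKeyword); the class name and the exact escaping rule are illustrative assumptions, not part of any specification.

```java
import java.util.Arrays;
import java.util.stream.Collectors;
import javax.lang.model.SourceVersion;

final class ModuleNameEscaper {
    /** Prefixes every segment that is a Java keyword with the assumed escape '^'. */
    static String escape(String dottedName) {
        return Arrays.stream(dottedName.split("\\."))
                .map(segment -> SourceVersion.isKeyword(segment) ? "^" + segment : segment)
                .collect(Collectors.joining("."));
    }

    public static void main(String[] args) {
        // "native" is a Java keyword, so only that segment gets escaped.
        System.out.println(escape("org.example.native.lib"));
    }
}
```

With this rule a module reference such as org.example.^native.lib would stay writable in a requires directive even though "native" is a keyword.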


Suitability for implementation:
As said, your proposal resolves one problem, but still IDE
functionality suffers from restricted keywords, because scanning
and parsing need more context information than normal.
- Recovery after a syntax error will regress.
- Scanning arbitrary regions of code is not possible.
Remember:
In an IDE code with syntax errors is the norm, not an exception,
as the IDE provides functionality to work on incomplete code.


Stephan


On 18.05.2017 00:34, Remi Forax wrote:

I want to answer this before we start the meetings because I really think that
restricted keywords, as I propose them, solve the issues
Stephan raised.


- Original message -

From: "Stephan Herrmann" <stephan.herrm...@berlin.de>
To: jigsaw-dev@openjdk.java.net
Sent: Tuesday, 16 May 2017 11:49:45
Subject: Re: An alternative to "restricted keywords"



Thanks, Remi, for taking this to the EG list.

Some collected responses:


Remi: "from the user point of view, '^' looks like a hack"

This is, of course, a subjective statement. I don't share this view
and in years of experience with Xtext-languages (where this concept
is used by default) I never heard any user complain about this.

More importantly, I hold that such aesthetic considerations are of
much lesser significance than the question, whether we can explain
- unambiguously explain - the concept in a few simple sentences.
Explaining must be possible at two levels: in a rigorous specification
and in simple words for users of the language.


I'm not against ^, or `, which has already been requested as a way to escape an
identifier, but as you said, it is a pervasive change that applies to
the whole grammar, whereas I think that with restricted keywords (which really
should be called local keywords) the changes only affect
the grammar that specifies module-info.java.



Remi: "a keyword which is activated if you are at a position in the
 grammar where it can be recognized".

I don't think 'being at a position in the grammar' is a good way of
explaining. Parsing doesn't generally have one position in a grammar,
multiple productions can be active in the same parser state.
Also speaking of a "loop" for modifiers seems to complicate matters
more than necessary.

Under these considerations I still see '^' as the clearest of all
solutions. Clear as a specification, simple to explain to users.


Eclipse uses an LR parser, and for an LR parser, position == dotted production,
as I have written earlier, so there is no problem because it
corresponds to only one parser state. Note that even when one does not use an LR
or an LL parser, most hand-written parsers I have seen
(javac is one of them) also refer to dotted productions in the comments of the
corresponding methods.





Peter spoke about module names vs. package names.

I think we agree, that module names cannot use "module words",
whereas package names should be expected to contain them.


Yes, that is the main issue: package names may contain unqualified names like
'transitive', 'with' or 'to'.
but i think people will a

Re: An alternative to "restricted keywords" + helping automatic modules

2017-05-19 Thread Stephan Herrmann

Inline

On 19.05.2017 15:53, fo...@univ-mlv.fr wrote:





*From: *"Stephan Herrmann" <stephan.herrm...@berlin.de>
*To: *"John Rose" <john.r.r...@oracle.com>, jigsaw-dev@openjdk.java.net
*Cc: *"Rémi Forax" <fo...@univ-mlv.fr>
*Sent: *Friday, 19 May 2017 12:37:07
*Subject: *Re: Re: An alternative to "restricted keywords" + helping automatic modules

A quick question to keep the ball rolling:

Do we agree on the following assessment of the status quo?

  The definition of "restricted keywords" implies (without explicitly 
saying so),
  that classification of a word as keyword vs. identifier can only be made
  *after* parsing has accepted the enclosing ModuleDeclaration.
  (With some tweaks, this can be narrowed down to
   "after the enclosing ModuleDirective has been accepted")

  This definition is not acceptable.


I agree that this is not acceptable but this is not what we are proposing.


Who is "we"?

Note that your proposal lets me conclude that "transitive" is not a legal
start of a module reference. If that is not what you intend, please provide
a specification-like description of what you have in mind.
Probably Stephen's proposal will come in handy for this issue?

Your notes about possible implementation may help when we come to implementing,
but right now they may also distract from understanding the intention.

Stephan




You do not have to wait for the reduction of ModuleDeclaration (or
ModuleDirective): the parser knows its parsing state (the LR item)
during parsing, not only at the end.
The LR analysis cannot know, at some point during parsing, which
production will be reduced later, but it does know
which terminals will not lead to an error when shifting the next terminal.

In the middle of parsing, the parser shifts a terminal to go
from one state to another, so for each state the parser
knows whether it can shift a terminal that is among the set of restricted
keywords. It can then either instruct the lexer,
before scanning the token, to activate the restricted-keyword automaton, or,
after the token has been scanned, classify it as
a keyword instead of an identifier.

The idea is that the parser reports not only when it reduces a production but
also when it is about to shift a restricted keyword.
So you can classify a token as an identifier or as a keyword because the parser
is able to bubble up that its parser state (the LR
item) may recognize a keyword.



    comments?
Stephan


Rémi


    - original message -

Subject: Re: An alternative to "restricted keywords" + helping automatic modules
Date: Fri, 19 May 2017 07:27:31 CEST
From: John Rose<john.r.r...@oracle.com>
To: Stephan Herrmann<stephan.herrm...@berlin.de>

On May 18, 2017, at 1:59 AM, Stephan Herrmann <stephan.herrm...@berlin.de 
<mailto:stephan.herrm...@berlin.de>> wrote:


In all posts I could not find a real reason against escaping,
aside from aesthetics. I don't see this as sufficient motivation
for a less-than-perfect solution.


So, by disregarding esthetics...


Clarity:
I'm still not completely following your explanations, partly because
of the jargon you are using. I'll leave it to Alex to decide if he
likes the idea that JLS would have to explain terms like dotted
production.

Compare this to just adding a few more rules to the grammar,
where no hand-waving is needed for an explanation.
No, I did not say that escaping is a pervasive change.
I never said that the grammar for ordinary compilation units
should be changed.
If you like we only need to extend one rule for the scope of
modular compilation units: Identifier. It can't get simpler.


Completeness:
I understand you as saying, module names cannot start with
"transitive". Mind you, that every modifier that will be added
to the grammar for modules in the future will cause conflicts for
names that are now legal, and you won't have a means to resolve this.

By contrast, we can use the escaping approach even to solve one
more problem that has been briefly touched on this list before:

Automatic modules suffer from the fact that some artifact names may
have Java keywords in their name, which means that these artifacts
simply cannot be used as automatic modules, right?
Why not apply escaping also here? *Any* dot-separated sequence
of words could be used as module name, as long as module references
have a means to escape any keywords in tha

Re: An alternative to "restricted keywords" + helping automatic modules

2017-05-19 Thread forax
> From: "Stephan Herrmann" <stephan.herrm...@berlin.de>
> To: "John Rose" <john.r.r...@oracle.com>, jigsaw-dev@openjdk.java.net
> Cc: "Rémi Forax" <fo...@univ-mlv.fr>
> Sent: Friday, 19 May 2017 12:37:07
> Subject: Re: Re: An alternative to "restricted keywords" + helping automatic modules

> A quick question to keep the ball rolling:

> Do we agree on the following assessment of the status quo?

> The definition of "restricted keywords" implies (without explicitly saying 
> so),
> that classification of a word as keyword vs. identifier can only be made
> *after* parsing has accepted the enclosing ModuleDeclaration.
> (With some tweaks, this can be narrowed down to
> "after the enclosing ModuleDirective has been accepted")

> This definition is not acceptable.
I agree that this is not acceptable but this is not what we are proposing. 

You do not have to wait for the reduction of ModuleDeclaration (or
ModuleDirective): the parser knows its parsing state (the LR item) during
parsing, not only at the end.
The LR analysis cannot know, at some point during parsing, which
production will be reduced later, but it does know which terminals will
not lead to an error when shifting the next terminal.

In the middle of parsing, the parser shifts a terminal to go from one state
to another, so for each state the parser knows whether it can shift a terminal
that is among the set of restricted keywords. It can then either instruct the
lexer, before scanning the token, to activate the restricted-keyword automaton,
or, after the token has been scanned, classify it as a keyword instead of an
identifier.

The idea is that the parser reports not only when it reduces a production but
also when it is about to shift a restricted keyword.
So you can classify a token as an identifier or as a keyword because the parser
is able to bubble up that its parser state (the LR item) may recognize a
keyword.
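
A rough sketch of that idea, reduced to a single requires directive. It is not an LR parser: the "parser state" is simulated by a set of restricted keywords that could be shifted next, and all names are hypothetical. "static" is treated as a keyword unconditionally because it is an ordinary Java keyword.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class StateDrivenClassifier {
    enum Kind { KEYWORD, IDENTIFIER, PUNCTUATION }

    /** Classifies the tokens of one "requires" directive, given as separate words. */
    static List<Kind> classifyRequires(List<String> words) {
        List<Kind> kinds = new ArrayList<>();
        // Restricted keywords the simulated parser state can shift next.
        Set<String> shiftable = Set.of("requires");
        for (String word : words) {
            if (word.equals(".") || word.equals(";")) {
                kinds.add(Kind.PUNCTUATION);
            } else if (word.equals("static") || shiftable.contains(word)) {
                kinds.add(Kind.KEYWORD);
                // After "requires" or a modifier, the modifier "transitive" may follow.
                shiftable = Set.of("transitive");
            } else {
                kinds.add(Kind.IDENTIFIER);
                // Inside the module name no restricted keyword can be shifted.
                shiftable = Set.of();
            }
        }
        return kinds;
    }

    public static void main(String[] args) {
        // requires transitive foo.transitive;
        System.out.println(classifyRequires(
                List.of("requires", "transitive", "foo", ".", "transitive", ";")));
    }
}
```

The second "transitive" is classified as an identifier because the state reached after "foo" cannot shift a restricted keyword, while a "transitive" directly after "requires" is always taken as a keyword, which is why it cannot start a module name under this scheme.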

> comments?
> Stephan

Rémi 

> - original message -

> Subject: Re: An alternative to "restricted keywords" + helping automatic modules
> Date: Fri, 19 May 2017 07:27:31 CEST
> From: John Rose<john.r.r...@oracle.com>
> To: Stephan Herrmann<stephan.herrm...@berlin.de>

> On May 18, 2017, at 1:59 AM, Stephan Herrmann < stephan.herrm...@berlin.de >
> wrote:

>> In all posts I could not find a real reason against escaping,
>> aside from aesthetics. I don't see this as sufficient motivation
>> for a less-than-perfect solution.

> So, by disregarding esthetics...

>> Clarity:
>> I'm still not completely following your explanations, partly because
>> of the jargon you are using. I'll leave it to Alex to decide if he
>> likes the idea that JLS would have to explain terms like dotted
>> production.

>> Compare this to just adding a few more rules to the grammar,
>> where no hand-waving is needed for an explanation.
>> No, I did not say that escaping is a pervasive change.
>> I never said that the grammar for ordinary compilation units
>> should be changed.
>> If you like we only need to extend one rule for the scope of
>> modular compilation units: Identifier. It can't get simpler.

>> Completeness:
>> I understand you as saying, module names cannot start with
>> "transitive". Mind you, that every modifier that will be added
>> to the grammar for modules in the future will cause conflicts for
>> names that are now legal, and you won't have a means to resolve this.

>> By contrast, we can use the escaping approach even to solve one
>> more problem that has been briefly touched on this list before:

>> Automatic modules suffer from the fact that some artifact names may
>> have Java keywords in their name, which means that these artifacts
>> simply cannot be used as automatic modules, right?
>> Why not apply escaping also here? *Any* dot-separated sequence
>> of words could be used as module name, as long as module references
>> have a means to escape any keywords in that sequence.

>> Suitability for implementation:
>> As said, your proposal resolves one problem, but still IDE
>> functionality suffers from restricted keywords, because scanning
>> and parsing need more context information than normal.

> …we obtain the freedom for IDEs to disregard abnormal
> amounts of context, saving uncounted machine cycles,

>> - Recovery after a syntax error will regress.

> …and we make life easier for all ten writers of error recovery
> functions,

>> - Scanning arbitrary regions of code is not possible.

> …we unleash the power of an army of grad students to study
> bidirectional parsing of module files,

>> Remember:
>> In an IDE code with syntax errors is the norm, not an exception,
>> as the IDE provides functionality to work on incomplete code.

> …and ease the burdens of the thousands who must spend their
> time looking at syntax errors for their broken module files.

> Nope, not for me. Give me esthetics, please. Really.

> — John

>  end of original message


Re: Re: An alternative to "restricted keywords" + helping automatic modules

2017-05-19 Thread Stephan Herrmann
A quick question to keep the ball rolling:

Do we agree on the following assessment of the status quo?

  The definition of "restricted keywords" implies (without explicitly saying 
so),
  that classification of a word as keyword vs. identifier can only be made
  *after* parsing has accepted the enclosing ModuleDeclaration.
  (With some tweaks, this can be narrowed down to 
   "after the enclosing ModuleDirective has been accepted")

  This definition is not acceptable.

comments?
Stephan




- original message -

Subject: Re: An alternative to "restricted keywords" + helping automatic modules
Date: Fri, 19 May 2017 07:27:31 CEST
From: John Rose<john.r.r...@oracle.com>
To: Stephan Herrmann<stephan.herrm...@berlin.de>

On May 18, 2017, at 1:59 AM, Stephan Herrmann <stephan.herrm...@berlin.de> 
wrote:

In all posts I could not find a real reason against escaping,
aside from aesthetics. I don't see this as sufficient motivation
for a less-than-perfect solution.

So, by disregarding esthetics...

Clarity:
I'm still not completely following your explanations, partly because
of the jargon you are using. I'll leave it to Alex to decide if he
likes the idea that JLS would have to explain terms like dotted
production.

Compare this to just adding a few more rules to the grammar,
where no hand-waving is needed for an explanation.
No, I did not say that escaping is a pervasive change.
I never said that the grammar for ordinary compilation units
should be changed.
If you like we only need to extend one rule for the scope of
modular compilation units: Identifier. It can't get simpler.


Completeness:
I understand you as saying, module names cannot start with
"transitive". Mind you, that every modifier that will be added
to the grammar for modules in the future will cause conflicts for
names that are now legal, and you won't have a means to resolve this.

By contrast, we can use the escaping approach even to solve one
more problem that has been briefly touched on this list before:

Automatic modules suffer from the fact that some artifact names may
have Java keywords in their name, which means that these artifacts
simply cannot be used as automatic modules, right?
Why not apply escaping also here? *Any* dot-separated sequence
of words could be used as module name, as long as module references
have a means to escape any keywords in that sequence.


Suitability for implementation:
As said, your proposal resolves one problem, but still IDE
functionality suffers from restricted keywords, because scanning
and parsing need more context information than normal.

…we obtain the freedom for IDEs to disregard abnormal
amounts of context, saving uncounted machine cycles,

- Recovery after a syntax error will regress.

…and we make life easier for all ten writers of error recovery
functions,

- Scanning arbitrary regions of code is not possible.

…we unleash the power of an army of grad students to study
bidirectional parsing of module files,

Remember:
In an IDE code with syntax errors is the norm, not an exception,
as the IDE provides functionality to work on incomplete code.

…and ease the burdens of the thousands who must spend their
time looking at syntax errors for their broken module files.

Nope, not for me.  Give me esthetics, please.  Really.

— John

 end of original message



Re: An alternative to "restricted keywords" + helping automatic modules

2017-05-19 Thread Stephen Colebourne
I don't support the ^ element or escaping like that either.

However, would adding a "module" keyword help?

module com.foo.lib {
  requires module com.foo.bar;
}

thus:

module com.foo.lib {
  requires static module blah;
  requires transitive module transitive;
}

ie. the module name is always prefixed by "module" in a "requires"
statement. But does this help?
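
A minimal sketch of how a tool might split such a directive, assuming (as in the examples above) that modifiers come before the "module" word and the name follows it as a single whitespace-delimited word. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class ModulePrefixedRequires {
    /** Extracts the module name from "requires {modifier} module <name>;". */
    static String moduleName(String directive) {
        String[] words = directive.replace(";", "").trim().split("\\s+");
        // The first "module" word marks the end of the modifiers.
        int marker = Arrays.asList(words).indexOf("module");
        if (!words[0].equals("requires") || marker < 1 || marker != words.length - 2) {
            throw new IllegalArgumentException("expected: requires {modifier} module <name>;");
        }
        // Everything after the marker is the name, even if it is spelled "transitive".
        return words[marker + 1];
    }

    public static void main(String[] args) {
        System.out.println(moduleName("requires transitive module transitive;"));
    }
}
```

The position of a word relative to "module" decides its role here, so "transitive" before the marker is a modifier and "transitive" after it is a name.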

Stephen


On 18 May 2017 at 09:59, Stephan Herrmann <stephan.herrm...@berlin.de> wrote:
> Remi,
>
> I see your proposal as a minimal compromise, avoiding the worst
> of difficulties, but I think we can do better.
>
> Trade-off:
> In all posts I could not find a real reason against escaping,
> aside from aesthetics. I don't see this as sufficient motivation
> for a less-than-perfect solution.
>
>
> Clarity:
> I'm still not completely following your explanations, partly because
> of the jargon you are using. I'll leave it to Alex to decide if he
> likes the idea that JLS would have to explain terms like dotted
> production.
>
> Compare this to just adding a few more rules to the grammar,
> where no hand-waving is needed for an explanation.
> No, I did not say that escaping is a pervasive change.
> I never said that the grammar for ordinary compilation units
> should be changed.
> If you like we only need to extend one rule for the scope of
> modular compilation units: Identifier. It can't get simpler.
>
>
> Completeness:
> I understand you as saying, module names cannot start with
> "transitive". Mind you, that every modifier that will be added
> to the grammar for modules in the future will cause conflicts for
> names that are now legal, and you won't have a means to resolve this.
>
> By contrast, we can use the escaping approach even to solve one
> more problem that has been briefly touched on this list before:
>
> Automatic modules suffer from the fact that some artifact names may
> have Java keywords in their name, which means that these artifacts
> simply cannot be used as automatic modules, right?
> Why not apply escaping also here? *Any* dot-separated sequence
> of words could be used as module name, as long as module references
> have a means to escape any keywords in that sequence.
>
>
> Suitability for implementation:
> As said, your proposal resolves one problem, but still IDE
> functionality suffers from restricted keywords, because scanning
> and parsing need more context information than normal.
> - Recovery after a syntax error will regress.
> - Scanning arbitrary regions of code is not possible.
> Remember:
> In an IDE code with syntax errors is the norm, not an exception,
> as the IDE provides functionality to work on incomplete code.
>
>
> Stephan
>
>
> On 18.05.2017 00:34, Remi Forax wrote:
>>
>> I want to answer this before we start the meetings because I really think
>> that restricted keywords, as I propose them, solve the issues Stephan raised.
>>
>>
>> - Original message -
>>>
>>> From: "Stephan Herrmann" <stephan.herrm...@berlin.de>
>>> To: jigsaw-dev@openjdk.java.net
>>> Sent: Tuesday, 16 May 2017 11:49:45
>>> Subject: Re: An alternative to "restricted keywords"
>>
>>
>>> Thanks, Remi, for taking this to the EG list.
>>>
>>> Some collected responses:
>>>
>>>
>>> Remi: "from the user point of view, '^' looks like a hack"
>>>
>>> This is, of course, a subjective statement. I don't share this view
>>> and in years of experience with Xtext-languages (where this concept
>>> is used by default) I never heard any user complain about this.
>>>
>>> More importantly, I hold that such aesthetic considerations are of
>>> much lesser significance than the question, whether we can explain
>>> - unambiguously explain - the concept in a few simple sentences.
>>> Explaining must be possible at two levels: in a rigorous specification
>>> and in simple words for users of the language.
>>
>>
>> I'm not against ^, or `, which has already been requested as a way to escape
>> an identifier, but as you said, it is a pervasive change that applies to the
>> whole grammar, whereas I think that with restricted keywords (which really
>> should be called local keywords) the changes only affect the grammar that
>> specifies module-info.java.
>>
>>>
>>> Remi: "a keyword which is activated if you are at a position in the
>>>  grammar where it can be recognized".
>>>
>>> I don't think 'being at a position in the grammar' is a good way of
>>> explaining. Parsing doesn't generally have one position

Re: An alternative to "restricted keywords" + helping automatic modules

2017-05-19 Thread forax
- Original message -
> From: "Stephan Herrmann" <stephan.herrm...@berlin.de>
> To: "Remi Forax" <fo...@univ-mlv.fr>, jigsaw-dev@openjdk.java.net
> Sent: Thursday, 18 May 2017 10:59:09
> Subject: Re: An alternative to "restricted keywords" + helping automatic modules

> Remi,

Stephan,

>
> I see your proposal as a minimal compromise, avoiding the worst
> of difficulties, but I think we can do better.

better is usually a bitter enemy

>
> Trade-off:
> In all posts I could not find a real reason against escaping,
> aside from aesthetics. I don't see this as sufficient motivation
> for a less-than-perfect solution.
>
>
> Clarity:
> I'm still not completely following your explanations, partly because
> of the jargon you are using. I'll leave it to Alex to decide if he
> likes the idea that JLS would have to explain terms like dotted
> production.

Sorry for the jargon: a dotted production is the same thing as an LR parser
item [1]; the dot marks the parsing position inside a production.
I used 'dotted production' instead of 'parser item' because it's usually
clearer for my students.

>
> Compare this to just adding a few more rules to the grammar,
> where no hand-waving is needed for an explanation.
> No, I did not say that escaping is a pervasive change.
> I never said that the grammar for ordinary compilation units
> should be changed.
> If you like we only need to extend one rule for the scope of
> modular compilation units: Identifier. It can't get simpler.
>

I do not like ^ because
- as John said, esthetics is important
- it pushes the burden onto the developers and not onto the people who
implement the grammar.
  I'm OK with making your life, and the life of the people who implement the
Java grammar (me included), less fun if for all other Java developers it just
works; given the scale of the Java community, that seems to be a good compromise.
- ^ has to be a pervasive change. I mean, it can be specified as a change only
for module-info, but from the developer's point of view it will be weird if you
introduce ^ in module-info and do not introduce it in the whole grammar,
  so it's a global solution to a local problem.

So in my opinion, it's not that ^ does not work (as you said, it works in
Xtend); it's that ^ is an escape hatch, and it's better to use it only when all
other solutions do not work.

>
> Completeness:
> I understand you as saying, module names cannot start with
> "transitive". Mind you, that every modifier that will be added
> to the grammar for modules in the future will cause conflicts for
> names that are now legal, and you won't have a means to resolve this.
>
> By contrast, we can use the escaping approach even to solve one
> more problem that has been briefly touched on this list before:
>
> Automatic modules suffer from the fact that some artifact names may
> have Java keywords in their name, which means that these artifacts
> simply cannot be used as automatic modules, right?
> Why not apply escaping also here? *Any* dot-separated sequence
> of words could be used as module name, as long as module references
> have a means to escape any keywords in that sequence.
>
>
> Suitability for implementation:
> As said, your proposal resolves one problem, but still IDE
> functionality suffers from restricted keywords, because scanning
> and parsing need more context information than normal.
> - Recovery after a syntax error will regress.

Error recovery will not regress in any existing Java file, because restricted
keywords only apply when parsing a module-info.
And technically, no regression is possible because module-info did not exist
before.
So error recovery after a syntax error in a module-info may be less fun to
handle; as I said above, I'm OK with that.


> - Scanning arbitrary regions of code is not possible.

Scanning an arbitrary region is not easy in general: for example, you have to
know whether you are inside or outside a string, so you have to keep some
information to be able to scan a region. Why not try to keep the parser state
when necessary?
As John said, it seems to be a nice problem for grad students, and at worst you
can use the existing code; it will display a restricted keyword in bold in the
middle of a package name, that's all.

> Remember:
> In an IDE code with syntax errors is the norm, not an exception,
> as the IDE provides functionality to work on incomplete code.
>
>
> Stephan

Rémi

[1] https://en.wikipedia.org/wiki/Canonical_LR_parser

>
>
> On 18.05.2017 00:34, Remi Forax wrote:
>> I want to answer this before we start the meetings because I really think that
>> restricted keywords, as I propose them, solve the issues Stephan raised.
>>
>>
>> - Mail original -
>>> De: "Stephan Herrm

Re: An alternative to "restricted keywords" + helping automatic modules

2017-05-18 Thread John Rose
On May 18, 2017, at 1:59 AM, Stephan Herrmann wrote:
> 
> In all posts I could not find a real reason against escaping,
> aside from aesthetics. I don't see this as sufficient motivation
> for a less-than-perfect solution.

So, by disregarding esthetics...
> 
> Clarity:
> I'm still not completely following your explanations, partly because
> of the jargon you are using. I'll leave it to Alex to decide if he
> likes the idea that JLS would have to explain terms like dotted
> production.
> 
> Compare this to just adding a few more rules to the grammar,
> where no hand-waving is needed for an explanation.
> No, I did not say that escaping is a pervasive change.
> I never said that the grammar for ordinary compilation units
> should be changed.
> If you like we only need to extend one rule for the scope of
> modular compilation units: Identifier. It can't get simpler.
> 
> 
> Completeness:
> I understand you as saying, module names cannot start with
> "transitive". Mind you, that every modifier that will be added
> to the grammar for modules in the future will cause conflicts for
> names that are now legal, and you won't have a means to resolve this.
> 
> By contrast, we can use the escaping approach even to solve one
> more problem that has been briefly touched on this list before:
> 
> Automatic modules suffer from the fact that some artifact names may
> have Java keywords in their name, which means that these artifacts
> simply cannot be used as automatic modules, right?
> Why not apply escaping also here? *Any* dot-separated sequence
> of words could be used as module name, as long as module references
> have a means to escape any keywords in that sequence.
> 
> 
> Suitability for implementation:
> As said, your proposal resolves one problem, but still IDE
> functionality suffers from restricted keywords, because scanning
> and parsing need more context information than normal.

…we obtain the freedom for IDEs to disregard abnormal
amounts of context, saving uncounted machine cycles,

> - Recovery after a syntax error will regress.

…and we make life easier for all ten writers of error recovery
functions,

> - Scanning arbitrary regions of code is not possible.

…we unleash the power of an army of grad students to study
bidirectional parsing of module files,

> Remember:
> In an IDE code with syntax errors is the norm, not an exception,
> as the IDE provides functionality to work on incomplete code.

…and ease the burdens of the thousands who must spend their
time looking at syntax errors for their broken module files.

Nope, not for me.  Give me esthetics, please.  Really.

— John

Re: An alternative to "restricted keywords" + helping automatic modules

2017-05-18 Thread Stephan Herrmann

Remi,

I see your proposal as a minimal compromise, avoiding the worst
of difficulties, but I think we can do better.

Trade-off:
In all posts I could not find a real reason against escaping,
aside from aesthetics. I don't see this as sufficient motivation
for a less-than-perfect solution.


Clarity:
I'm still not completely following your explanations, partly because
of the jargon you are using. I'll leave it to Alex to decide if he
likes the idea that JLS would have to explain terms like dotted
production.

Compare this to just adding a few more rules to the grammar,
where no hand-waving is needed for an explanation.
No, I did not say that escaping is a pervasive change.
I never said that the grammar for ordinary compilation units
should be changed.
If you like we only need to extend one rule for the scope of
modular compilation units: Identifier. It can't get simpler.


Completeness:
I understand you as saying, module names cannot start with
"transitive". Mind you, that every modifier that will be added
to the grammar for modules in the future will cause conflicts for
names that are now legal, and you won't have a means to resolve this.

By contrast, we can use the escaping approach even to solve one
more problem that has been briefly touched on this list before:

Automatic modules suffer from the fact that some artifact names may
have Java keywords in their name, which means that these artifacts
simply cannot be used as automatic modules, right?
Why not apply escaping also here? *Any* dot-separated sequence
of words could be used as module name, as long as module references
have a means to escape any keywords in that sequence.


Suitability for implementation:
As said, your proposal resolves one problem, but still IDE
functionality suffers from restricted keywords, because scanning
and parsing need more context information than normal.
- Recovery after a syntax error will regress.
- Scanning arbitrary regions of code is not possible.
Remember:
In an IDE code with syntax errors is the norm, not an exception,
as the IDE provides functionality to work on incomplete code.


Stephan


On 18.05.2017 00:34, Remi Forax wrote:

I want to answer this before we start the meetings because i really think that 
restricted keyword as i propose solve the issues Stephan raised.


- Original message -

From: "Stephan Herrmann" <stephan.herrm...@berlin.de>
To: jigsaw-dev@openjdk.java.net
Sent: Tuesday, 16 May 2017 11:49:45
Subject: Re: An alternative to "restricted keywords"



Thanks, Remi, for taking this to the EG list.

Some collected responses:


Remi: "from the user point of view, '^' looks like a hack"

This is, of course, a subjective statement. I don't share this view
and in years of experience with Xtext-languages (where this concept
is used by default) I never heard any user complain about this.

More importantly, I hold that such aesthetic considerations are of
much lesser significance than the question, whether we can explain
- unambiguously explain - the concept in a few simple sentences.
Explaining must be possible at two levels: in a rigorous specification
and in simple words for users of the language.


I'm not against ^, or ` as it has already asked to escape an identifier, but as 
you said it's a pervasive change that applies on the whole grammar while i 
think that with restricted keyword (that really should be called local 
keywords) the changes only impact the grammar that specifies a module-info.java



Remi: "a keyword which is activated if you are at a position in the
 grammar where it can be recognized".

I don't think 'being at a position in the grammar' is a good way of
explaining. Parsing doesn't generally have one position in a grammar,
multiple productions can be active in the same parser state.
Also speaking of a "loop" for modifiers seems to complicate matters
more than necessary.

Under these considerations I still see '^' as the clearest of all
solutions. Clear as a specification, simple to explain to users.


Eclipse uses a LR parser, for a LR parser, position == dotted production as i 
have written earlier, so no problem because it corresponds to only one parser 
state.  Note that even if one do not use an LR or a LL parser, most hand 
written parser i've seen, javac is one of them, also refers to dotted 
production in the comments of the corresponding methods.





Peter spoke about module names vs. package names.

I think we agree, that module names cannot use "module words",
whereas package names should be expected to contain them.


yes, that the main issue, package names may contains unqualified name like 
'transitive, ''with' or 'to'.
but i think people will also want to use existing package or more exactly 
prefix of existing package as module name, so we should also support having 
restricted keyword name as part of a module name.

The grammar is:

  open? module module_name {
requires (transitiv

Re: An alternative to "restricted keywords"

2017-05-17 Thread Remi Forax
I want to answer this before we start the meetings because I really think that
restricted keywords, as I propose them, solve the issues Stephan raised.


- Original message -
> From: "Stephan Herrmann" <stephan.herrm...@berlin.de>
> To: jigsaw-dev@openjdk.java.net
> Sent: Tuesday, 16 May 2017 11:49:45
> Subject: Re: An alternative to "restricted keywords"

> Thanks, Remi, for taking this to the EG list.
> 
> Some collected responses:
> 
> 
> Remi: "from the user point of view, '^' looks like a hack"
> 
> This is, of course, a subjective statement. I don't share this view
> and in years of experience with Xtext-languages (where this concept
> is used by default) I never heard any user complain about this.
> 
> More importantly, I hold that such aesthetic considerations are of
> much lesser significance than the question, whether we can explain
> - unambiguously explain - the concept in a few simple sentences.
> Explaining must be possible at two levels: in a rigorous specification
> and in simple words for users of the language.

I'm not against ^, or ` (as has already been asked for) to escape an identifier,
but as you said it's a pervasive change that applies to the whole grammar, while
I think that with restricted keywords (which really should be called local
keywords) the changes only impact the grammar that specifies a module-info.java.

> 
> Remi: "a keyword which is activated if you are at a position in the
>  grammar where it can be recognized".
> 
> I don't think 'being at a position in the grammar' is a good way of
> explaining. Parsing doesn't generally have one position in a grammar,
> multiple productions can be active in the same parser state.
> Also speaking of a "loop" for modifiers seems to complicate matters
> more than necessary.
> 
> Under these considerations I still see '^' as the clearest of all
> solutions. Clear as a specification, simple to explain to users.

Eclipse uses an LR parser; for an LR parser, position == dotted production, as
I have written earlier, so there is no problem because it corresponds to only
one parser state.  Note that even if one does not use an LR or LL parser, most
hand-written parsers I've seen (javac is one of them) also refer to dotted
productions in the comments of the corresponding methods.

> 
> 
> 
> Peter spoke about module names vs. package names.
> 
> I think we agree, that module names cannot use "module words",
> whereas package names should be expected to contain them.

Yes, that's the main issue: package names may contain unqualified names like
'transitive', 'with' or 'to'.
But I think people will also want to use an existing package, or more exactly a
prefix of an existing package, as a module name, so we should also support
having a restricted keyword as part of a module name.

The grammar is:

  open? module module_name {
    requires (transitive | static)* module_name;
    exports package_name;
    exports package_name to module_name1, module_name2;
    opens package_name;
    opens package_name to module_name1, module_name2;
    uses xxx;
    provides xxx with xxx, yyy;
  }

If we just consider package names, only 'opens' and 'exports' are followed by a
package name, and a package name can only be followed by ';' or 'to'. So once
'opens' is parsed, you know that you can only have an identifier; so if the
next token is not an identifier but one of the restricted keywords, it should
be considered as an identifier.

As I said earlier, the scanner can see the restricted keyword as a keyword, and
before feeding the token to the parser, you can check the parser state to see
if the keyword has to be lowered to an identifier or not.
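
The lowering step described above could be sketched in Java roughly as follows.
This is only an illustration, not how javac or Eclipse do it: the state names,
the `acceptable` helper and the `word:kind` labels are all hypothetical, and
only a single `exports` directive is modelled. Every restricted word is first
proposed as a keyword, and it stays a keyword only if the current parser state
can accept it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class RestrictedKeywordSketch {
    // Hypothetical parser states while reading one "exports" directive.
    enum State { START, EXPECT_NAME_SEGMENT, AFTER_PACKAGE_SEGMENT, AFTER_MODULE_SEGMENT }

    static final Set<String> RESTRICTED = Set.of("open", "module", "requires", "transitive",
            "exports", "opens", "to", "uses", "provides", "with");

    // Keywords the parser is able to accept in the given state (assumption: exports only).
    static Set<String> acceptable(State state) {
        switch (state) {
            case START:                 return Set.of("exports");
            case AFTER_PACKAGE_SEGMENT: return Set.of("to");
            default:                    return Set.of();
        }
    }

    // Labels each token; a restricted word is lowered to an identifier
    // whenever the current state cannot accept it as a keyword.
    static List<String> classify(List<String> tokens) {
        List<String> out = new ArrayList<>();
        State state = State.START;
        boolean afterTo = false; // true once the keyword "to" has been consumed
        for (String t : tokens) {
            if (t.equals(".") || t.equals(",")) {
                out.add(t + ":punct");
                state = State.EXPECT_NAME_SEGMENT;
            } else if (t.equals(";")) {
                out.add(t + ":punct");
                state = State.START;
                afterTo = false;
            } else if (RESTRICTED.contains(t) && acceptable(state).contains(t)) {
                out.add(t + ":keyword");
                afterTo = t.equals("to");
                state = State.EXPECT_NAME_SEGMENT;
            } else {
                out.add(t + ":identifier");
                state = afterTo ? State.AFTER_MODULE_SEGMENT : State.AFTER_PACKAGE_SEGMENT;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Tokens of: exports to.transitive to m;
        System.out.println(classify(List.of("exports", "to", ".", "transitive", "to", "m", ";")));
    }
}
```

For `exports to.transitive to m;` only the leading `exports` and the second
`to` come out as keywords; the first `to` and `transitive` are lowered to
identifiers because no keyword is acceptable at the start of a name segment.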

For module names, there is the additional problem of transitive, because if a
module name starts with transitive, you can have a conflict. As I said earlier,
instead of using the next token to know whether transitive is the keyword or
part of the module name, I think we should consider it a keyword; as the JLS
says, a restricted keyword is activated when it can appear, so
"requires transitive" is not a valid directive.
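
Under this reading, a `requires` directive could be handled roughly as in the
Java sketch below. It is written under assumptions: the `Result` holder, the
error strings and the pre-split token list are invented for illustration and
are not taken from javac or from the JLS.

```java
import java.util.ArrayList;
import java.util.List;

final class RequiresDirectiveSketch {
    // Hypothetical holder for what the parser found.
    static final class Result {
        final List<String> modifiers = new ArrayList<>();
        final List<String> errors = new ArrayList<>();
        String moduleName; // stays null when no module name follows the modifiers
    }

    // tokens: the words between "requires" and ';' (module name as a single token).
    static Result parse(List<String> tokens) {
        Result r = new Result();
        int i = 0;
        // The modifier "loop": here transitive and static are always keywords.
        while (i < tokens.size()
                && (tokens.get(i).equals("transitive") || tokens.get(i).equals("static"))) {
            r.modifiers.add(tokens.get(i));
            i++;
        }
        if (i < tokens.size()) {
            r.moduleName = tokens.get(i);
        } else {
            r.errors.add("missing module name");
        }
        // A later phase would report repeated modifiers.
        for (int k = 0; k < r.modifiers.size(); k++) {
            String m = r.modifiers.get(k);
            if (r.modifiers.indexOf(m) < k) {
                r.errors.add("duplicate modifier '" + m + "'");
            }
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println(parse(List.of("transitive")).errors);
        System.out.println(parse(List.of("transitive", "transitive")).errors);
        System.out.println(parse(List.of("transitive", "m")).moduleName);
    }
}
```

With this sketch, both `requires transitive;` and `requires transitive
transitive;` end with a "missing module name" error, and the second one
additionally gets a duplicate-modifier error; no lookahead is needed to decide
what `transitive` means.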

> 
> Remi: "you should use reverse DNS naming for package so no problem :)"
> 
> "to" is a "module word" and a TLD.
> I think we should be very careful in judging that an existing conflict
> is not a real problem. Better to clearly and rigorously avoid the
> conflict in the first place.

'to' as the first part of a package/module name and 'to' as in exports ... to
cannot be present in the same dotted production, because exports has to be
followed by a package_name, so 'to' here means the start of a package name; and
then, because a package name cannot end with '.', you always know whether you
are inside the production recognizing the package_name or outside it, matching
the 'to' of the exports directive.

> 
> 
> 

Re: An alternative to "restricted keywords"

2017-05-16 Thread Stephan Herrmann

Thanks, Remi, for taking this to the EG list.

Some collected responses:


Remi: "from the user point of view, '^' looks like a hack"

This is, of course, a subjective statement. I don't share this view
and in years of experience with Xtext-languages (where this concept
is used by default) I never heard any user complain about this.

More importantly, I hold that such aesthetic considerations are of
much lesser significance than the question, whether we can explain
- unambiguously explain - the concept in a few simple sentences.
Explaining must be possible at two levels: in a rigorous specification
and in simple words for users of the language.

Remi: "a keyword which is activated if you are at a position in the
 grammar where it can be recognized".

I don't think 'being at a position in the grammar' is a good way of
explaining. Parsing doesn't generally have one position in a grammar,
multiple productions can be active in the same parser state.
Also speaking of a "loop" for modifiers seems to complicate matters
more than necessary.

Under these considerations I still see '^' as the clearest of all
solutions. Clear as a specification, simple to explain to users.



Peter spoke about module names vs. package names.

I think we agree, that module names cannot use "module words",
whereas package names should be expected to contain them.

Remi: "you should use reverse DNS naming for package so no problem :)"

"to" is a "module word" and a TLD.
I think we should be very careful in judging that an existing conflict
is not a real problem. Better to clearly and rigorously avoid the
conflict in the first place.



Some additional notes from my side:

In the escape-approach, it may be prudent to technically allow
escaping even words that are identifiers in Java 9, but could become
keywords in a future version. This ensures that modules which need
more escaping in Java 9+X can still be parsed in Java 9.
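
To illustrate why escaping needs no parser state, here is a small Java sketch.
It assumes the Xtext-style prefix form `^word`; the class name, the word list
and the `word:kind` labels are made up for the example and are not taken from
the proposal text. Note that escaping a word that is not (yet) a module word is
accepted as well, which is the forward-compatibility point made above.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class EscapedIdentifierSketch {
    // Assumed list of module words; a future Java version might add more.
    static final Set<String> MODULE_WORDS = Set.of("open", "module", "requires", "transitive",
            "exports", "opens", "to", "uses", "provides", "with");

    // Classifies one word purely from its spelling, without any parser state.
    static String classify(String word) {
        if (word.length() > 1 && word.charAt(0) == '^') {
            // An escaped word is always an identifier, whether or not it is a module word today.
            return word.substring(1) + ":identifier";
        }
        return word + (MODULE_WORDS.contains(word) ? ":keyword" : ":identifier");
    }

    public static void main(String[] args) {
        // Words of: requires transitive ^transitive, plus an escaped ordinary word.
        List<String> words = List.of("requires", "transitive", "^transitive", "^foo");
        System.out.println(words.stream()
                .map(EscapedIdentifierSketch::classify)
                .collect(Collectors.toList()));
    }
}
```

Because the decision depends only on the word itself, the same function can be
applied to any region of a module declaration, including one that contains
syntax errors.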


Current focus was on names of modules, packages and types.
A complete solution must also give an answer for annotations on modules.
Some possible solutions:
a. Assume that annotations for modules are designed with modules in mind
   and thus have to avoid any module words in their names.
b. Support escaping also in annotations
c. Refine the scope where "module words" are keywords, let it start only
   when the word "module" or the group "open module" has been consumed.
   This would make the words "module" and "open" special, as being
   switch words, where we switch from one language to another.
   (For this I previously coined the term "scoped keywords" [1])


I think we all agree that the conflicts we are solving here are rare
corner cases. Most names do not contain module words. Still, from a
conceptual and technical p.o.v. the solution must be bulletproof.
But there's no need to be afraid of module declarations being spammed
with dozens of '^' characters. Realistically, this will not happen.

Stephan

[1] http://www.objectteams.org/def/1.3/sA.html#sA.0.1

On 12.05.2017 21:21, Remi Forax wrote:

Hi Peter,

On May 12, 2017 6:08:58 PM GMT+02:00, Peter Levart wrote:

Hi Remi,

On 05/12/2017 08:17 AM, Remi Forax wrote:

[CC JPMS expert mailing list because, it's an important issue IMO]

I've a counter proposition.

I do not like your proposal because from the user point of view, '^'
looks like a hack; it's not used anywhere else in the grammar.

I agree that restricted keywords are not properly specified in JLS.
Reading your mail, I've discovered that what I was calling restricted
keywords is not what javac implements :(

I agree that restricted keywords should only be enabled when parsing
module-info.java.

I agree that doing error recovery on the way the grammar for
module-info is currently implemented in javac leads to less than ideal
error messages.


In my opinion, both
   module m { requires transitive transitive; }
   module m { requires transitive; }
should be rejected, because what javac implements is something closer
to the JavaScript ASI rules than to restricted keywords as currently
specified by Alex.


For me, a restricted keyword is a keyword which is activated if you
are at a position in the grammar where it can be recognized, and because
it's a keyword, it takes precedence over an identifier.

For example, for
   module m {
if the next token is 'requires', it should be recognized as a keyword
because you can parse a directive 'requires ...', so there is a
production that starts with the 'requires' keyword.


So
   module m { requires transitive; }
should be rejected because transitive should be recognized as a
keyword after requires, and the compiler should report a missing module
name.


And
   module m { requires transitive transitive; }
should be rejected because the grammar that parses the modifiers is
defined as "a loop", so from the grammar point of view it's like

   module m { requires Modifier Modifier; }
so the front end of the compiler should report a missing 

Re: An alternative to "restricted keywords"

2017-05-12 Thread Remi Forax
Hi Peter,

On May 12, 2017 6:08:58 PM GMT+02:00, Peter Levart wrote:
>Hi Remi,
>
>On 05/12/2017 08:17 AM, Remi Forax wrote:
>> [CC JPMS expert mailing list because it's an important issue IMO]
>>
>> I've a counter proposition.
>>
>> I do not like your proposal because from the user point of view, '^'
>> looks like a hack; it's not used anywhere else in the grammar.
>> I agree that restricted keywords are not properly specified in JLS.
>> Reading your mail, I've discovered that what I was calling restricted
>> keywords is not what javac implements :(
>> I agree that restricted keywords should only be enabled when parsing
>> module-info.java.
>> I agree that doing error recovery on the way the grammar for
>> module-info is currently implemented in javac leads to less than ideal
>> error messages.
>>
>> In my opinion, both
>>   module m { requires transitive transitive; }
>>   module m { requires transitive; }
>> should be rejected, because what javac implements is something closer
>> to the JavaScript ASI rules than to restricted keywords as currently
>> specified by Alex.
>>
>> For me, a restricted keyword is a keyword which is activated if you
>> are at a position in the grammar where it can be recognized, and because
>> it's a keyword, it takes precedence over an identifier.
>> For example, for
>>   module m {
>> if the next token is 'requires', it should be recognized as a keyword
>> because you can parse a directive 'requires ...', so there is a
>> production that starts with the 'requires' keyword.
>>
>> So
>>   module m { requires transitive; }
>> should be rejected because transitive should be recognized as a
>> keyword after requires, and the compiler should report a missing module
>> name.
>>
>> And
>>   module m { requires transitive transitive; }
>> should be rejected because the grammar that parses the modifiers is
>> defined as "a loop", so from the grammar point of view it's like
>>   module m { requires Modifier Modifier; }
>> so the front end of the compiler should report a missing module
>> name and a later phase should report that the same modifier
>> 'transitive' appears twice.
>>
>> I believe that with this definition of 'restricted keyword', the compiler
>> can recover from errors more easily and offer meaningful error messages,
>> and the module-info part of the grammar is LR(1).
>
>This will make "requires", "uses", "provides", "with", "to", "static", 
>"transitive", "exports", etc  all illegal module names. Ok, no big 
>deal, because there are no module names yet (apart from JDK modules and
>those are named differently). But...

you should use reverse DNS naming for module name, so no problem. 

>
>What about:
>
>module m { exports transitive; }
>
>Here 'transitive' is an existing package name for example. Who 
>guarantees that there are no packages out there with names matching 
>restricted keywords? Current restriction for modules is that they can 
>not have an unnamed package. Do we want to restrict package names a 
>module can export too?

you should use reverse DNS naming for package so no problem :)

>
>Stephan's solution does not have this problem.
>
>Regards, Peter

I don't think those issues are real problems.

Rémi 

>
>>
>> regards,
>> Rémi
>>
>> - Original message -
>>> From: "Stephan Herrmann" 
>>> To: jigsaw-dev@openjdk.java.net
>>> Sent: Tuesday, 9 May 2017 16:56:11
>>> Subject: An alternative to "restricted keywords"
>>> (1) I understand the need for avoiding that new module-related
>>> keywords conflict with existing code, where these words may be used
>>> as identifiers. Moreover, it must be possible for a module declaration
>>> to refer to packages or types thusly named.
>>>
>>> However,
>>>
>>> (2) The currently proposed "restricted keywords" are not appropriately
>>> specified in JLS.
>>>
>>> (3) The currently proposed "restricted keywords" pose difficulties to
>>> the implementation of all tools that need to parse a module declaration.
>>>
>>> (4) A simple alternative to "restricted keywords" exists, which has not
>>> received the attention it deserves.
>>>
>>> Details:
>>>
>>> (2) The current specification implicitly violates the assumption that
>>> parsing can be performed on the basis of a token stream produced by
>>> a scanner (aka lexer). From discussion on this list we learned that
>>> the following examples are intended to be syntactically legal:
>>>   module m { requires transitive transitive; }
>>>   module m { requires transitive; }
>>> (Please for the moment disregard heuristic solutions, while we are
>>>   investigating whether generally "restricted keywords" is a well-defined
>>>   concept, or not.)
>>> Of the three occurrences of "transitive", #1 is a keyword, the others
>>> are identifiers. At the point when the parser has consumed "requires"
>>> and now asks about classification of the word "transitive", the scanner
>>> cannot possibly answer this classification. It can only answer for sure,
>>> after the *parser* has 

Re: An alternative to "restricted keywords"

2017-05-12 Thread Peter Levart



On 05/12/2017 06:08 PM, Peter Levart wrote:
For me, a restricted keyword is a keyword which is activated if you 
are at a position in the grammar where it can be recognized and 
because it's a keyword, it tooks over an identifier.

by example for
   module m {
if the next token is 'requires', it should be recognized as a keyword 
because you can parse a directive 'required ...' so there is a 
production that will starts with the 'required' keyword.


so
   module m { requires transitive; }
should be rejected because transitive should be recognized as a 
keyword after requires and the compiler should report a missing 
module name.

  and
   module m { requires transitive transitive; }
should be rejected because the grammar that parse the modifiers is 
defined as "a loop" so from the grammar point of view it's like

   module m { requires Modifier Modifier; }
so the the front end of the compiler should report a missing module 
name and a later phase should report that there is twice the same 
modifier 'transitive'.


I believe that with this definition of 'restricted keyword', compiler 
can recover error more easily and offers meaningful error message and 
the module-info part of the grammar is LR(1).


This will make "requires", "uses", "provides", "with", "to", "static", 
"transitive", "exports", etc  all illegal module names. Ok, no big 
deal, because there are no module names yet (apart from JDK modules 
and those are named differently). But...


What about:

module m { exports transitive; } 


...ok, I realized there's no problem in exports or opens, as there are no
exports/opens modifiers in the current syntax, so what follows 'exports'
or 'opens' can always be interpreted as a package name only. But what if
some future extension of the language wants to define an exports or
opens modifier? Or some new, not yet existing requires modifier?


Peter


Re: An alternative to "restricted keywords"

2017-05-12 Thread Gregg Wonderly
Or, would it make sense to make the module name require quotes around it?  The
subtlety of this notation looking JSON-like, and yet being something new, makes
me wonder if it should not just be a JSON-based structure:

module : [
  {
    m : {
      requires : {
        transitive : [ "transitive" ]
      }
    }
  },
  {
    n : {
      requires : {
        transitive : [ "m" ]
      }
    }
  }
]

It would certainly provide a huge change in tooling possibilities.  But, it 
would add some additional requirements for a JSON parser.

Gregg

> On May 12, 2017, at 11:08 AM, Peter Levart  wrote:
> 
> Hi Remi,
> 
> On 05/12/2017 08:17 AM, Remi Forax wrote:
>> [CC JPMS expert mailing list because, it's an important issue IMO]
>> 
>> I've a counter proposition.
>> 
>> I do not like your proposal because from the user point of view, '^' looks 
>> like a hack, it's not used anywhere else in the grammar.
>> I agree that restricted keywords are not properly specified in JLS. Reading 
>> your mail, i've discovered that what i was calling restricted keywords is 
>> not what javac implements :(
>> I agree that restricted keywords should be only enabled when parsing 
>> module-info.java
>> I agree that doing error recovery on the way the grammar for module-info is 
>> currently implemented in javac leads to less than ideal error messages.
>> 
>> In my opinion, both
>>module m { requires transitive transitive; }
>>module m { requires transitive; }
>> should be rejected because what javac implements something more close to the 
>> javascript ASI rules than restricted keywords as currently specified by Alex.
>> 
>> For me, a restricted keyword is a keyword which is activated if you are at a 
>> position in the grammar where it can be recognized and because it's a 
>> keyword, it tooks over an identifier.
>> by example for
>>   module m {
>> if the next token is 'requires', it should be recognized as a keyword 
>> because you can parse a directive 'required ...' so there is a production 
>> that will starts with the 'required' keyword.
>> 
>> so
>>   module m { requires transitive; }
>> should be rejected because transitive should be recognized as a keyword 
>> after requires and the compiler should report a missing module name.
>>  and
>>   module m { requires transitive transitive; }
>> should be rejected because the grammar that parse the modifiers is defined 
>> as "a loop" so from the grammar point of view it's like
>>   module m { requires Modifier Modifier; }
>> so the the front end of the compiler should report a missing module name and 
>> a later phase should report that there is twice the same modifier 
>> 'transitive'.
>> 
>> I believe that with this definition of 'restricted keyword', compiler can 
>> recover error more easily and offers meaningful error message and the 
>> module-info part of the grammar is LR(1).
> 
> This will make "requires", "uses", "provides", "with", "to", "static", 
> "transitive", "exports", etc  all illegal module names. Ok, no big deal, 
> because there are no module names yet (apart from JDK modules and those are 
> named differently). But...
> 
> What about:
> 
> module m { exports transitive; }
> 
> Here 'transitive' is, for example, an existing package name. Who guarantees 
> that there are no packages out there with names matching restricted keywords? 
> The current restriction for modules is that they cannot have an unnamed package. 
> Do we want to restrict the package names a module can export, too?
> 
> Stephan's solution does not have this problem.
> 
> Regards, Peter
> 
>> 
>> regards,
>> Rémi
>> 
>> - Original message -
>>> From: "Stephan Herrmann" 
>>> To: jigsaw-dev@openjdk.java.net
>>> Sent: Tuesday, 9 May 2017 16:56:11
>>> Subject: An alternative to "restricted keywords"
>>> (1) I understand the need for avoiding that new module-related
>>> keywords conflict with existing code, where these words may be used
>>> as identifiers. Moreover, it must be possible for a module declaration
>>> to refer to packages or types thusly named.
>>> 
>>> However,
>>> 
>>> (2) The currently proposed "restricted keywords" are not appropriately
>>> specified in JLS.
>>> 
>>> (3) The currently proposed "restricted keywords" pose difficulties to
>>> the implementation of all tools that need to parse a module declaration.
>>> 
>>> (4) A simple alternative to "restricted keywords" exists, which has not
>>> received the attention it deserves.
>>> 
>>> Details:
>>> 
>>> (2) The current specification implicitly violates the assumption that
>>> parsing can be performed on the basis of a token stream produced by
>>> a scanner 

Re: An alternative to "restricted keywords"

2017-05-12 Thread Peter Levart

Hi Remi,

On 05/12/2017 08:17 AM, Remi Forax wrote:

[CC'ing the JPMS expert mailing list because it's an important issue, IMO]

I have a counter-proposal.

I do not like your proposal because, from the user's point of view, '^' looks like 
a hack; it's not used anywhere else in the grammar.
I agree that restricted keywords are not properly specified in the JLS. Reading 
your mail, I discovered that what I was calling restricted keywords is not 
what javac implements :(
I agree that restricted keywords should only be enabled when parsing 
module-info.java.
I agree that error recovery with the module-info grammar as it is 
currently implemented in javac leads to less-than-ideal error messages.

In my opinion, both
   module m { requires transitive transitive; }
   module m { requires transitive; }
should be rejected, because what javac implements is closer to the 
JavaScript ASI rules than to restricted keywords as currently specified by Alex.

For me, a restricted keyword is a keyword that is activated when you are at a 
position in the grammar where it can be recognized and, because it is a keyword, 
it takes precedence over an identifier.
For example, after
   module m {
if the next token is 'requires', it should be recognized as a keyword, because 
you can parse a directive 'requires ...', so there is a production that 
starts with the 'requires' keyword.

so
   module m { requires transitive; }
should be rejected because transitive should be recognized as a keyword after 
requires and the compiler should report a missing module name.
  
and

   module m { requires transitive transitive; }
should be rejected because the grammar that parses the modifiers is defined as "a 
loop", so from the grammar's point of view it is like
   module m { requires Modifier Modifier; }
so the front end of the compiler should report a missing module name and a 
later phase should report that the same modifier 'transitive' appears twice.

I believe that with this definition of 'restricted keyword', a compiler can 
recover from errors more easily and offer meaningful error messages, and the 
module-info part of the grammar is LR(1).


This will make "requires", "uses", "provides", "with", "to", "static", 
"transitive", "exports", etc. all illegal module names. OK, no big 
deal, because there are no module names yet (apart from JDK modules, and 
those are named differently). But...


What about:

module m { exports transitive; }

Here 'transitive' is, for example, an existing package name. Who 
guarantees that there are no packages out there with names matching 
restricted keywords? The current restriction for modules is that they 
cannot have an unnamed package. Do we want to restrict the package names a 
module can export, too?


Stephan's solution does not have this problem.

Regards, Peter



regards,
Rémi

- Original message -

From: "Stephan Herrmann" 
To: jigsaw-dev@openjdk.java.net
Sent: Tuesday, 9 May 2017 16:56:11
Subject: An alternative to "restricted keywords"
(1) I understand the need for avoiding that new module-related
keywords conflict with existing code, where these words may be used
as identifiers. Moreover, it must be possible for a module declaration
to refer to packages or types thusly named.

However,

(2) The currently proposed "restricted keywords" are not appropriately
specified in JLS.

(3) The currently proposed "restricted keywords" pose difficulties to
the implementation of all tools that need to parse a module declaration.

(4) A simple alternative to "restricted keywords" exists, which has not
received the attention it deserves.

Details:

(2) The current specification implicitly violates the assumption that
parsing can be performed on the basis of a token stream produced by
a scanner (aka lexer). From discussion on this list we learned that
the following examples are intended to be syntactically legal:
    module m { requires transitive transitive; }
    module m { requires transitive; }
(Please for the moment disregard heuristic solutions, while we are
  investigating whether generally "restricted keywords" is a well-defined
  concept, or not.)
Of the three occurrences of "transitive", the first is a keyword, the others
are identifiers. At the point when the parser has consumed "requires"
and now asks about the classification of the word "transitive", the scanner
cannot possibly answer. It can only answer for sure
after the *parser* has accepted the full declaration. Put differently,
the parser must consume more tokens than have been classified by the
scanner. In other words, to faithfully parse arbitrary grammars using
a concept of "restricted keywords", scanners must provide speculative
answers, which may later need to be revised by backtracking or similar
exhaustive exploration of the space of possible interpretations.

The specification is totally silent about this fundamental change.
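
[Editorial illustration, not part of the original mail.] The classification
problem described above can be made concrete with a minimal sketch. The class
name LookaheadSketch, the method classifyInRequires and the rule it applies
are assumptions made for illustration only; they are not taken from the JLS
or from any compiler. The sketch treats "transitive" as a modifier only when
a module name can still follow it, so the answer for one word depends on a
token that comes after it.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: classifying "transitive" needs one token of look-ahead. */
final class LookaheadSketch {

    enum Kind { KEYWORD, IDENTIFIER }

    /**
     * Classifies the word at index i inside a 'requires' directive.
     * Assumed heuristic (not from the JLS): "transitive" is a modifier only
     * if the following token is neither ";" nor ".", i.e. a name still follows.
     */
    static Kind classifyInRequires(List<String> tokens, int i) {
        if (!tokens.get(i).equals("transitive")) {
            return Kind.IDENTIFIER;
        }
        // The decision depends on a token that lies beyond the current word.
        String next = i + 1 < tokens.size() ? tokens.get(i + 1) : ";";
        boolean nameFollows = !next.equals(";") && !next.equals(".");
        return nameFollows ? Kind.KEYWORD : Kind.IDENTIFIER;
    }

    public static void main(String[] args) {
        List<String> twice = Arrays.asList("requires", "transitive", "transitive", ";");
        List<String> once = Arrays.asList("requires", "transitive", ";");
        System.out.println(classifyInRequires(twice, 1));
        System.out.println(classifyInRequires(twice, 2));
        System.out.println(classifyInRequires(once, 1));
    }
}
```

A rule like this is tied to the current shape of the 'requires' production;
a new modifier or a new directive form could invalidate it.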


(3) "restricted keywords" pose three problems to tool implementations:

(3.a) Any known 

Re: An alternative to "restricted keywords"

2017-05-12 Thread Remi Forax
[CC'ing the JPMS expert mailing list because it's an important issue, IMO]

I have a counter-proposal.

I do not like your proposal because, from the user's point of view, '^' looks like 
a hack; it's not used anywhere else in the grammar. 
I agree that restricted keywords are not properly specified in the JLS. Reading 
your mail, I discovered that what I was calling restricted keywords is not 
what javac implements :(
I agree that restricted keywords should only be enabled when parsing 
module-info.java.
I agree that error recovery with the module-info grammar as it is 
currently implemented in javac leads to less-than-ideal error messages.

In my opinion, both
   module m { requires transitive transitive; }
   module m { requires transitive; }
should be rejected, because what javac implements is closer to the 
JavaScript ASI rules than to restricted keywords as currently specified by Alex.

For me, a restricted keyword is a keyword that is activated when you are at a 
position in the grammar where it can be recognized and, because it is a keyword, 
it takes precedence over an identifier.
For example, after 
  module m { 
if the next token is 'requires', it should be recognized as a keyword, because 
you can parse a directive 'requires ...', so there is a production that 
starts with the 'requires' keyword.

so 
  module m { requires transitive; }
should be rejected because transitive should be recognized as a keyword after 
requires and the compiler should report a missing module name.
 
and
  module m { requires transitive transitive; }
should be rejected because the grammar that parses the modifiers is defined as 
"a loop", so from the grammar's point of view it is like
  module m { requires Modifier Modifier; }
so the front end of the compiler should report a missing module name and a 
later phase should report that the same modifier 'transitive' appears twice.

I believe that with this definition of 'restricted keyword', a compiler can 
recover from errors more easily and offer meaningful error messages, and the 
module-info part of the grammar is LR(1).
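
[Editorial illustration, not part of the original mail.] A minimal sketch of
the definition above, assuming a pre-split token list and a hypothetical
class RequiresSketch; it is not how javac is implemented. In modifier
position the words 'transitive' and 'static' are always consumed as
keywords, so both examples above fail with a missing module name and no
look-ahead is needed.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: in modifier position a restricted keyword always wins. */
final class RequiresSketch {

    static boolean isModifier(String token) {
        return token.equals("transitive") || token.equals("static");
    }

    /** Parses the tokens of one 'requires' directive and returns the module name. */
    static String parseRequires(List<String> tokens) {
        int pos = 0;
        if (tokens.isEmpty() || !tokens.get(pos).equals("requires")) {
            throw new IllegalArgumentException("expected 'requires'");
        }
        pos++;
        // The modifier "loop": every modifier word seen here is taken as a keyword.
        while (pos < tokens.size() && isModifier(tokens.get(pos))) {
            pos++;
        }
        if (pos >= tokens.size() || tokens.get(pos).equals(";")) {
            throw new IllegalArgumentException("missing module name");
        }
        String moduleName = tokens.get(pos++);
        if (pos >= tokens.size() || !tokens.get(pos).equals(";")) {
            throw new IllegalArgumentException("expected ';'");
        }
        return moduleName;
    }

    public static void main(String[] args) {
        System.out.println(parseRequires(Arrays.asList("requires", "transitive", "foo", ";")));
        try {
            parseRequires(Arrays.asList("requires", "transitive", ";"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The sketch handles only single-segment module names and leaves the
duplicate-modifier check to a later phase, as described above.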

regards,
Rémi

- Original message -
> From: "Stephan Herrmann" 
> To: jigsaw-dev@openjdk.java.net
> Sent: Tuesday, 9 May 2017 16:56:11
> Subject: An alternative to "restricted keywords"

> (1) I understand the need for avoiding that new module-related
> keywords conflict with existing code, where these words may be used
> as identifiers. Moreover, it must be possible for a module declaration
> to refer to packages or types thusly named.
> 
> However,
> 
> (2) The currently proposed "restricted keywords" are not appropriately
> specified in JLS.
> 
> (3) The currently proposed "restricted keywords" pose difficulties to
> the implementation of all tools that need to parse a module declaration.
> 
> (4) A simple alternative to "restricted keywords" exists, which has not
> received the attention it deserves.
> 
> Details:
> 
> (2) The current specification implicitly violates the assumption that
> parsing can be performed on the basis of a token stream produced by
> a scanner (aka lexer). From discussion on this list we learned that
> the following examples are intended to be syntactically legal:
>    module m { requires transitive transitive; }
>    module m { requires transitive; }
> (Please for the moment disregard heuristic solutions, while we are
>  investigating whether generally "restricted keywords" is a well-defined
>  concept, or not.)
> Of the three occurrences of "transitive", the first is a keyword, the others
> are identifiers. At the point when the parser has consumed "requires"
> and now asks about the classification of the word "transitive", the scanner
> cannot possibly answer. It can only answer for sure
> after the *parser* has accepted the full declaration. Put differently,
> the parser must consume more tokens than have been classified by the
> scanner. In other words, to faithfully parse arbitrary grammars using
> a concept of "restricted keywords", scanners must provide speculative
> answers, which may later need to be revised by backtracking or similar
> exhaustive exploration of the space of possible interpretations.
> 
> The specification is totally silent about this fundamental change.
> 
> 
> (3) "restricted keywords" pose three problems to tool implementations:
> 
> (3.a) Any known practical approach to implementing a parser with
> "restricted keywords" requires heuristics that are based
> on the exact set of rules defined in the grammar. Such heuristics
> reduce the look-ahead that needs to be performed by the scanner,
> in order to avoid the full exhaustive exploration mentioned above.
> A set of such heuristics is extremely fragile and can easily break when
> more rules are later added to the grammar. This means small future
> language changes can easily break any chosen strategy.
> 
> (3.b) If parsing works for error-free input, this doesn't imply that
> a parser will be able to give any