Hi Peter,

On May 12, 2017 6:08:58 PM GMT+02:00, Peter Levart <peter.lev...@gmail.com> wrote:
>Hi Remi,
>
>On 05/12/2017 08:17 AM, Remi Forax wrote:
>> [CC JPMS expert mailing list because it's an important issue IMO]
>>
>> I have a counter-proposal.
>>
>> I do not like your proposal because, from the user's point of view,
>> '^' looks like a hack; it's not used anywhere else in the grammar.
>> I agree that restricted keywords are not properly specified in the JLS.
>> Reading your mail, I've discovered that what I was calling restricted
>> keywords is not what javac implements :(
>> I agree that restricted keywords should only be enabled when parsing
>> module-info.java.
>> I agree that doing error recovery on the grammar for module-info, the
>> way it is currently implemented in javac, leads to less than ideal
>> error messages.
>>
>> In my opinion, both
>>   module m { requires transitive transitive; }
>>   module m { requires transitive; }
>> should be rejected, because what javac implements is something closer
>> to the JavaScript ASI rules than to restricted keywords as currently
>> specified by Alex.
>>
>> For me, a restricted keyword is a keyword which is activated if you
>> are at a position in the grammar where it can be recognized, and
>> because it's a keyword, it takes precedence over an identifier.
>> For example, for
>>   module m {
>> if the next token is 'requires', it should be recognized as a keyword,
>> because you can parse a directive 'requires ...', so there is a
>> production that starts with the 'requires' keyword.
>>
>> so
>>   module m { requires transitive; }
>> should be rejected, because transitive should be recognized as a
>> keyword after requires and the compiler should report a missing
>> module name.
>>
>> and
>>   module m { requires transitive transitive; }
>> should be rejected, because the grammar that parses the modifiers is
>> defined as "a loop", so from the grammar point of view it's like
>>   module m { requires Modifier Modifier; }
>> so the front end of the compiler should report a missing module name
>> and a later phase should report that the modifier 'transitive'
>> appears twice.
>>
>> I believe that with this definition of 'restricted keyword', the
>> compiler can recover from errors more easily and offer meaningful
>> error messages, and the module-info part of the grammar is LR(1).
>
>This will make "requires", "uses", "provides", "with", "to", "static",
>"transitive", "exports", etc. all illegal module names. OK, no big
>deal, because there are no module names yet (apart from JDK modules,
>and those are named differently). But...

You should use reverse DNS naming for module names, so no problem.

>
>What about:
>
>module m { exports transitive; }
>
>Here 'transitive' is an existing package name, for example. Who
>guarantees that there are no packages out there with names matching
>restricted keywords? The current restriction for modules is that they
>cannot have an unnamed package. Do we want to restrict the package
>names a module can export too?

You should use reverse DNS naming for packages, so no problem :)

>
>Stephan's solution does not have this problem.
>
>Regards, Peter

I think those issues are not real problems.

Rémi

>
>>
>> regards,
>> Rémi
>>
>> ----- Original Message -----
>>> From: "Stephan Herrmann" <stephan.herrm...@berlin.de>
>>> To: jigsaw-dev@openjdk.java.net
>>> Sent: Tuesday, May 9, 2017 16:56:11
>>> Subject: An alternative to "restricted keywords"
>>>
>>> (1) I understand the need for avoiding that new module-related
>>> keywords conflict with existing code, where these words may be used
>>> as identifiers. Moreover, it must be possible for a module
>>> declaration to refer to packages or types thusly named.
>>>
>>> However,
>>>
>>> (2) The currently proposed "restricted keywords" are not
>>> appropriately specified in JLS.
>>>
>>> (3) The currently proposed "restricted keywords" pose difficulties
>>> to the implementation of all tools that need to parse a module
>>> declaration.
>>>
>>> (4) A simple alternative to "restricted keywords" exists, which has
>>> not received the attention it deserves.
>>>
>>> Details:
>>>
>>> (2) The current specification implicitly violates the assumption
>>> that parsing can be performed on the basis of a token stream
>>> produced by a scanner (aka lexer).
>>> From discussion on this list we learned that
>>> the following examples are intended to be syntactically legal:
>>>   module m { requires transitive transitive; }
>>>   module m { requires transitive; }
>>> (Please for the moment disregard heuristic solutions, while we are
>>> investigating whether generally "restricted keywords" is a
>>> well-defined concept, or not.)
>>> Of the three occurrences of "transitive", #1 is a keyword, the
>>> others are identifiers. At the point when the parser has consumed
>>> "requires" and now asks about the classification of the word
>>> "transitive", the scanner cannot possibly answer. It can only answer
>>> for sure after the *parser* has accepted the full declaration. Put
>>> differently, the parser must consume more tokens than have been
>>> classified by the scanner. Put differently again, to faithfully
>>> parse arbitrary grammars using a concept of "restricted keywords",
>>> scanners must provide speculative answers, which may later need to
>>> be revised by backtracking or similar exhaustive exploration of the
>>> space of possible interpretations.
>>>
>>> The specification is totally silent about this fundamental change.
>>>
>>>
>>> (3) "restricted keywords" pose three problems to tool
>>> implementations:
>>>
>>> (3.a) Any known practical approach to implementing a parser with
>>> "restricted keywords" requires leveraging heuristics, which are
>>> based on the exact set of rules defined in the grammar. Such
>>> heuristics reduce the look-ahead that needs to be performed by the
>>> scanner, in order to avoid the full exhaustive exploration mentioned
>>> above. A set of such heuristics is extremely fragile and can easily
>>> break when more rules are later added to the grammar. This means
>>> small future language changes can easily break any chosen strategy.
>>>
>>> (3.b) If parsing works for error-free input, this doesn't imply that
>>> a parser will be able to give any useful answer for input with
>>> syntax errors. As a worst-case example consider an arbitrary input
>>> sequence consisting of just the two words "requires" and
>>> "transitive" in random order and with no punctuation.
>>> A parser will not be able to detect any structure in this sequence.
>>> By comparison, normal keywords serve as a baseline, where parsing
>>> typically can resume regardless of any leading garbage.
>>> While this is not relevant for normal compilation, it is paramount
>>> for assistive functions, which most of the time operate on
>>> incomplete text, likely to contain even syntax errors.
>>> Strictly speaking, any "module declaration" with syntax errors is
>>> not a ModuleDeclaration, and thus none of the "restricted keywords"
>>> can be interpreted as keywords (which per JLS can only happen inside
>>> a ModuleDeclaration).
>>> All this means that functionality like code completion is
>>> systematically broken in a language using "restricted keywords".
>>>
>>> (3.c) Other IDE functionality assumes that small fragments of the
>>> input text can be scanned out of context. The classical example here
>>> is syntax highlighting, but there are more examples.
>>> Any such functionality has to be re-implemented, replacing the
>>> highly efficient local scanning with full parsing of the input text.
>>> For functionality that is implicitly invoked per keystroke, or on
>>> mouse hover etc., this difference in efficiency negatively affects
>>> the overall user experience of an IDE.
>>>
>>>
>>> (4) The following proposal avoids all difficulties described above:
>>>
>>> * open, module, requires, transitive, exports, opens, to, uses,
>>>   provides, and with are "module words", to which the following
>>>   interpretation is applied:
>>>   * within any ordinary compilation unit, a module word is a normal
>>>     identifier.
>>>   * within a modular compilation unit, all module words are
>>>     (unconditional) keywords.
>>> * We introduce three new auxiliary non-terminals:
>>>     LegacyPackageName:
>>>         LegacyIdentifier
>>>         LegacyPackageName . LegacyIdentifier
>>>     LegacyTypeName:
>>>         LegacyIdentifier
>>>         LegacyTypeName . LegacyIdentifier
>>>     LegacyIdentifier:
>>>         Identifier
>>>         ^open
>>>         ^module
>>>         ...
>>>         ^with
>>> * We modify all productions in 7.7, replacing PackageName with
>>>   LegacyPackageName and replacing TypeName with LegacyTypeName.
>>> * After parsing, each of the words '^open', '^module' etc.
>>>   is interpreted by removing the leading '^' (escape character).
>>>
>>> Here, '^' is chosen as the escape character following the precedent
>>> of Xtext. Plenty of other options for this purpose are possible,
>>> too.
>>>
>>>
>>>
>>> This proposal completely satisfies the requirements (1), and avoids
>>> all of the problems (2) and (3). There's an obvious price to pay:
>>> users will have to add the escape character when referring to code
>>> that uses a module word as a package name or type name.
>>>
>>> Not only is this a very low price compared to the benefits; one can
>>> even argue that it also helps the human reader of a module
>>> declaration, because it clearly marks which occurrences of a module
>>> word are indeed identifiers.
>>>
>>> An IDE can easily help in interactively adding escapes where
>>> necessary.
>>>
>>> Finally, in this trade-off it is relevant to consider the expected
>>> frequencies: legacy names (needing escape) will surely be the
>>> exception, by orders of magnitude. So the small price to be paid
>>> will only affect a comparatively small number of locations.
>>>
>>>
>>> Stephan

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
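
For illustration only, here is a minimal sketch of how a scanner could classify words under proposal (4) without any feedback from the parser. The class name ModuleWordScanner, the scan method and the "KIND:text" output format are hypothetical; none of them comes from the JLS, javac or Stephan's mail. The sketch assumes the input is a modular compilation unit and ignores comments and literals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: context-free classification of words in a module declaration. */
final class ModuleWordScanner {

    // The ten "module words" listed in proposal (4).
    static final Set<String> MODULE_WORDS = Set.of(
            "open", "module", "requires", "transitive", "exports",
            "opens", "to", "uses", "provides", "with");

    /** Returns one "KIND:text" entry per token; a leading '^' escape is stripped. */
    static List<String> scan(String source) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '^' || Character.isJavaIdentifierStart(c)) {
                boolean escaped = (c == '^');
                int start = escaped ? i + 1 : i;
                int end = start;
                while (end < source.length()
                        && Character.isJavaIdentifierPart(source.charAt(end))) {
                    end++;
                }
                String word = source.substring(start, end);
                // LegacyIdentifier only allows '^' in front of a module word.
                if (escaped && !MODULE_WORDS.contains(word)) {
                    throw new IllegalArgumentException(
                            "'^' must precede a module word: '" + word + "'");
                }
                // An unescaped module word is always a keyword; no look-ahead needed.
                boolean keyword = !escaped && MODULE_WORDS.contains(word);
                tokens.add((keyword ? "KEYWORD:" : "IDENT:") + word);
                i = end;
            } else {
                // Everything else ('{', '}', ';', '.', ',') is treated as punctuation.
                tokens.add("PUNCT:" + c);
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("module m { requires transitive ^transitive; }"));
        System.out.println(scan("module m { exports ^transitive; }"));
    }
}
```

Because each word is classified from its own text alone, the same routine could in principle be run on an isolated fragment, which is the property (3.c) asks for. A scanner for restricted keywords as discussed in (2) would instead need to know the parser state before it could answer.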