My favorite hyphen keyword is short-circuit. I don't know where to use it, but it's so good that we have to find a new feature to introduce it :)
As I said, I really like this proposal. Hyphen keywords nicely solve the problem of introducing a keyword in the middle of the code, at a place where an identifier may occur. For a keyword at a declaration site (class, field, method), we can use either a contextual keyword or a hyphen keyword.

(other comments inlined)

----- Original Message -----
> From: "Brian Goetz" <brian.go...@oracle.com>
> To: "amber-spec-experts" <amber-spec-experts@openjdk.java.net>
> Sent: Tuesday, January 8, 2019 16:22:17
> Subject: We need more keywords, captain!

> This document proposes a possible move that will buy us some breathing
> room in the perpetual problem where the keyword-management tail wags the
> programming-model dog.
>
> ## We need more keywords, captain!
>
> Java has a fixed set of _keywords_ (JLS 3.9) which are not allowed to
> be used as identifiers. This set has remained quite stable over the
> years (for good reason), with the exceptions of `assert` added in 1.4,
> `enum` added in 5, and `_` added in 9. In addition, there are also
> several _reserved identifiers_ (`true`, `false`, and `null`) which
> behave almost like keywords.
>
> Over time, as the language evolves, language designers face a
> challenge; the set of keywords imagined in version 1.0 are rarely
> suitable for expressing all the things we might ever want our language
> to express. We have several tools at our disposal for addressing this
> problem:
>
> - Eminent domain. Take words that were previously identifiers, and
>   turn them into keywords, as we did with `assert` in 1.4.
>
> - Recycle. Repurpose an existing keyword for something that it was
>   never really meant for (such as using `default` for annotation
>   values or default methods).
>
> - Do without. Find a way to pick a syntax that doesn't require a
>   new keyword, such as using `@interface` for annotations instead of
>   `annotation` -- or don't do the feature at all.
>
> - Smoke and mirrors.
>   Create the illusion of context-dependent
>   keywords through various linguistic heroics (restricted keywords,
>   reserved type names.)
>
> In any given situation, all of these options are on the table -- but
> most of the time, none of these options are very good. The lack of
> reasonable options for extending the syntax of the language threatens
> to become a significant impediment to language evolution.
>
> #### Why not "just" make new keywords?
>
> While it may be legal for us to declare `i` to be a keyword in a
> future version of Java, this would likely break every program in the
> world, since `i` is used so commonly as an identifier. (When the
> `assert` keyword was added in 1.4, it broke every testing framework.)
> The cost of remediating the effect of such incompatible changes varies
> as well; invalidating a name choice for a local variable has a local
> fix, but invalidating the name of a public type or an interface
> method might well be fatal.
>
> Additionally, the keywords we're likely to want to reclaim are often
> those that are popular as identifiers (e.g., `value`, `var`,
> `method`), making such fatal collisions more likely. In some cases,
> if the keyword candidate in question is sufficiently rarely used as an
> identifier, we might still opt to take that source-compatibility hit
> -- but names that are less likely to collide (e.g.,
> `usually_but_not_always_final`) are likely not the ones we want in our
> language. Realistically, this is unlikely to be a well we can go to
> very often, and the bar must be very high.
>
> #### Why not "just" live with the keywords we have?
>
> Reusing keywords in multiple contexts has ample precedent in
> programming languages, including Java. (For example, we (ab)use `final`
> for "not mutable", "not overridable", and "not extensible".)
> Sometimes, using an existing keyword in a new context is natural and
> sensible, but usually it's not our first choice.
> Over time, as the
> range of demands we place on our keyword set expands, this may well
> descend into the ridiculous; no one wants to use `null final` as a way
> of negating finality. (While one might think such things are too
> ridiculous to consider, note that we received serious-seeming
> suggestions during JEP 325 to use `new switch` to describe a switch
> with different semantics. Presumably to be followed by
> `new new switch` in ten years.)
>
> Of course, one way to live without making new keywords is to stop
> evolving the language entirely. While there are some who think this
> is a fine idea, doing so because of the lack of available tokens would
> be a silly reason. We are convinced that Java has a long life ahead of
> it, and developers are excited about new features that enable them
> to write more expressive and reliable code.
>
> #### Why not "just" make contextual keywords?
>
> At first glance, contextual keywords (and their friends, such as
> reserved type identifiers) may appear to be a magic wand; they let us
> create the illusion of adding new keywords without breaking existing
> programs. But the positive track record of contextual keywords hides
> a great deal of complexity and distortion.
>
> Each grammar position is its own story; contextual keywords that might
> be used as modifiers (e.g., `readonly`) have different ambiguity
> considerations than those that might be used in code (e.g., a `matches`
> expression). The process of selecting a contextual keyword is not a
> simple matter of adding it to the grammar; each one requires an
> analysis of potential current and future interactions. Similarly,
> each token we try to repurpose may have its own special
> considerations; for example, we could justify the use of `var` as a
> reserved type name because the naming conventions are so
> broadly adhered to.
> Finally, the use of contextual keywords in
> certain syntactic positions can create additional considerations for
> extending the syntax later.
>
> Contextual keywords create complexity for specifications, compilers,
> and IDEs. With one or two special cases, we can often deal well
> enough, but if special cases were to become more pervasive, this would
> likely result in more significant maintenance costs or bug tail. While
> it is easy to dismiss this as “not my problem”, in reality, this is
> everybody’s problem. IDEs often have to guess whether a use of a
> contextual keyword is a keyword or identifier, and it may not have
> enough information to make a good guess until it’s seen more input.
> This results in worse user highlighting, auto-completion, and
> refactoring abilities -- or worse. These problems quickly become
> everyone's problems.

I fully agree about the specification cost, but on the implementation side, contextual keywords mostly have a single cost that you pay once for all contextual keywords. Once a lexer/parser has paid that cost (which is not negligible), the cost of each new keyword is small, and very close to zero if you use a parser generator. Because a contextual keyword is recognized by the parser, it doesn't degrade any IDE/compiler feature that is built on top of the parser, so auto-completion, refactoring, etc. are not impacted. Syntax highlighting can be impacted, depending on how it's implemented (on top of the lexer vs. on top of the parser).

And I don't get what the "additional considerations for extending the syntax later" are?

>
> So, while contextual keywords are one of the tools in our toolbox,
> they should also be used sparingly.

yes!

>
> #### Why is this a problem?
>
> Aside from the obvious consequences of these problems (clunky syntax,
> complexity, bugs), there is a more insidious hidden cost --
> distortion. The accidental details of keyword management pose a
> constant risk of distortion in language design.
>
> One could consider the choice to use `@interface` instead of
> `annotation` for annotations to be a distortion; having a descriptive
> name rather than a funky combination of punctuation and keyword would
> surely have made it easier for people to become familiar with
> annotations.
>
> In another example, the set of modifiers (`public`, `private`,
> `static`, `final`, etc) is not complete; there is no way to say “not
> final” or “not static”. This, in turn, means that we cannot create
> features where variables or classes are `final` by default, or members
> are `static` by default, because there’s no way to denote the desire
> to opt out of it. While there may be reasons to justify a locally
> suboptimal default anyway (such as global consistency), we want to
> make these choices deliberately, not have them made for us by the
> accidental details of keyword management. Choosing to leave out a
> feature for reasons of simplicity is fine; leaving it out because we
> don't have a way to denote the obvious semantics is not.
>
> It may not be obvious from the outside, but this is a constant problem
> in evolving the language, and an ongoing tax that we all pay, directly
> or indirectly.
>
> ## We need a new source of keyword candidates
>
> Every time we confront this problem, the overwhelming tendency is to
> punt and pick one of the bad options, because the problem only comes
> along every once in a while. But, with the features in the pipeline, I
> expect it will continue to come along with some frequency, and I’d
> rather get ahead of it. Given that all of these current options are
> problematic, and there is not even a least-problematic move that
> applies across all situations, my inclination is to try to expand the
> set of lexical forms that can be used as keywords.
>
> As a not-serious example, take the convention that we’ve used for
> experimental features, where we prefix provisional keywords in
> prototypes with two underscores, as we did with `__ByValue` in the
> Valhalla prototype. (We commonly do this in feature proposals and
> prototypes, mostly to signify “this keyword is a placeholder for a
> syntax decision to be made later”, but also because it permits a
> simple implementation that is unlikely to collide with existing code.)
> We could, for example, carve out the space of identifiers that begin
> with underscore as being reserved for keywords. Of course, this isn’t
> so pretty, and it also means we'd have a mix of underscore and
> non-underscore keywords, so it’s not a serious suggestion, as much as
> an example of the sort of move we are looking for.
>
> But I do have a serious suggestion: allow _hyphenated_ keywords where
> one or more of the terms are already keywords or reserved identifiers.
> Unlike restricted keywords, this creates much less trouble for
> parsing, as (for example) `non-null` cannot be confused for a
> subtraction expression, and the lexer can always tell with fixed
> lookahead whether `a-b` is three tokens or one. This gives us a lot
> more room for creating new, less-conflicting keywords. And these new
> keywords are likely to be good names, too, as many of the missing
> concepts we want to add describe their relationship to existing
> language constructs -- such as `non-null`.

Technically, it's not lookahead, which is a parser thing; it's the lexer being greedy.

>
> Here’s some examples where this approach might yield credible
> candidates. (Note: none of these are being proposed here; this is
> merely an illustrative list of examples of how this mechanism could
> form keywords that might, in some particular possible future, be
> useful and better than the alternatives we have now.)
>
> - `non-null`
> - `non-final`
> - `package-private` (the default accessibility for class members,
>   currently not denotable)
> - `public-read` (publicly readable, privately writable)
> - `null-checked`
> - `type-static` (a concept needed in Valhalla, which is static
>   relative to a particular specialization of a class, rather than the
>   class itself)
> - `default-value`
> - `eventually-final` (what the `@Stable` annotation currently suggests)
> - `semi-final` (an alternative to `sealed`)
> - `exhaustive-switch` (opting into exhaustiveness checking for statement
>   switches)
> - `enum-class`, `annotation-class`, `record-class` (we might have
>   chosen these as an alternative to `enum` and `@interface`, had we
>   had the option)
> - `this-class` (to describe the class literal for the current class)
> - `this-return` (a common request is a way to mark a setter or
>   builder method as returning its receiver)
>
> (Again, the point is not to debate the merits of any of these specific
> examples; the point is merely to illustrate what we might be able to do
> with such a mechanism.)
>
> Having this as an option doesn't mean we can't also use the other
> approaches when they are suitable; it just means we have more, and
> likely less fraught, options with which to make better decisions.
>
> There are likely to be other lexical schemes by which new keywords can
> be created without impinging on existing code; this one seems credible
> and reasonably parsable by both machines and humans.
>
> #### "But that's ugly"
>
> Invariably, some percentage of readers will have an immediate and
> visceral reaction to this idea. Let's stipulate for the record that
> some people will find this ugly. (At least, at first. Many such
> reactions are possibly-transient (see what I did there?) responses
> to unfamiliarity.)

Rémi
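
To make the "greedy lexer" remark above concrete, here is a minimal sketch in Java. It is a toy tokenizer, not javac's scanner, and the names `HyphenLexer`, `HYPHEN_KEYWORDS`, `tokenize`, and `wordEnd` are hypothetical. The keyword set is borrowed from the illustrative list in the proposal, and the sketch assumes two-part keywords only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class HyphenLexer {
    // Hypothetical keyword set, taken from the proposal's illustrative list.
    static final Set<String> HYPHEN_KEYWORDS =
            Set.of("non-null", "non-final", "package-private");

    // Splits src into words and single-character tokens, skipping whitespace.
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isJavaIdentifierStart(c)) {
                int end = wordEnd(src, i);
                // Greedy step: if the word is directly followed by '-' and a
                // second word, and the joined text is a known hyphenated
                // keyword, consume all of it as one token.
                if (end < src.length() - 1 && src.charAt(end) == '-'
                        && Character.isJavaIdentifierStart(src.charAt(end + 1))) {
                    int end2 = wordEnd(src, end + 1);
                    String joined = src.substring(i, end2);
                    if (HYPHEN_KEYWORDS.contains(joined)) {
                        tokens.add(joined);
                        i = end2;
                        continue;
                    }
                }
                tokens.add(src.substring(i, end));
                i = end;
            } else {
                // Everything else (operators, digits, punctuation) is emitted
                // one character at a time in this sketch.
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    // Returns the index just past the identifier-like word starting at start.
    static int wordEnd(String src, int start) {
        int end = start;
        while (end < src.length() && Character.isJavaIdentifierPart(src.charAt(end))) {
            end++;
        }
        return end;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a-b"));               // [a, -, b]
        System.out.println(tokenize("non-null String s")); // [non-null, String, s]
    }
}
```

In this sketch the decision is made entirely in the tokenizer: `a-b` stays three tokens because the joined text is not in the keyword set, while `non-null` comes out as one token. A spaced `non - null` would also stay three tokens, since only an unbroken `word-word` sequence is tried.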