On Dec 9, 2013, at 3:53 AM, Carsten Bormann wrote:
>> So what's the reason you talk about two levels?
>
> If you interpret the first three racetracks as generating a sequence of
> characters, or the last two as generating a sequence of tokens, you get the
> wrong result.
>
>>> RFC 4627 does that implicitly by saying "The representation of numbers is
>>> similar to that used in most programming languages.".)
>>
>> That's not very precise either, but it's at least telling the reader where
>> to look further if s/he doesn't understand what's intended.
>
> Actually, to the extent that RFC 4627 does define JSON's data model, the
> result of this simple statement is surprisingly precise.
> It only stops helping you much when you reach the limits of precision or
> range (e.g., what to do with 1e400.)
>
>> Another problem is that it's not scalable, in the sense that it won't work
>> anymore if everybody would do it.
>
> Right. But then, section 11.8.3.1 of the ES6 draft is an example for why it
> is tedious to do this.
> (It is also, I believe, a nice example how easy it would be to get this wrong
> and that nobody would actually notice a mistake buried in there, unless they
> do the work to systematically check every detail or to translate it into a
> machine-checkable form. Fortunately, our number system is relatively stable;
> I’d hate to maintain a spec that has this level of tedium on something that
> actually evolves. For added fun, compare with 7.1.3.1.1, which is mostly
> saying the same thing, but does it in a subtly different way. That’s why ES6
> is 531 pages...)
>
>> I'm not planning to do any work. I was just trying to point out that the
>> technical work is not that difficult (after some leaps of faith to take the
>> 'most obvious' interpretation of racetracks,…).
>
> Yep. But if nobody does that work (or, more precisely, admits to having done
> that work), we simply don’t know whether the statement that triggered this
> little subthread is true or not. I have made too many stupid mistakes in
> seemingly simple specs that became obvious only as soon as I used a tool to
> check the spec.
>
> Grüße, Carsten

I want to address a few points brought up in this subthread, primarily between Carsten and Martin.

First, Syntax Diagrams (aka Railroad Diagrams, and called racetracks in this thread) are a well-known formalism for expressing a context-free grammar (see, for example, http://en.wikipedia.org/wiki/Syntax_diagram). Any competent software engineer should be able to recognize and read a syntax diagram of this sort. There is no mystery about them. Any grammar that can be expressed using BNF can also be expressed using a Syntax Diagram, although I think most would agree that BNF is a better alternative for large grammars.

This whole issue of the use of Syntax Diagrams rather than BNF is a stylistic debate that is hard to take seriously. If TC39 informed you that we are converting the notation used in ECMA-404 to a BNF formalism, would that end the objections to normatively referencing ECMA-404 from 4627bis? Unfortunately, I'm pretty sure it wouldn't.

Regarding the use of a multi-level definition within ECMA-404: that is standard practice in language specifications, where the "tokens" of a language are often described using an FSM-level formalism and the syntactic structure is described using a PDA-level formalism. However, nothing prevents a PDA-level abstraction such as BNF from being used to describe "tokens", even when the full power of a PDA isn't used. The ECMA-262 specification is an example of a language specification that uses a BNF to describe both its lexical and syntactic structure.

In the case of ECMA-404, clause 4 is clearly defining the lexical level of the language (it is talking about "tokens"), and it clearly states that numbers and strings are tokens. Hence there is no ambiguity about how to interpret the syntax diagrams for number and string in clauses 8 and 9. None of the subelements of those diagrams are "tokens", so there is no plausible way they could be misconstrued as generating or recognizing a sequence of tokens.

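
To make the lexical-level point concrete, here is a minimal sketch, assuming the JSON number diagram can be transcribed as a regular expression (i.e., at the FSM level). The names NUMBER_TOKEN and is_number_token are hypothetical, chosen only for illustration; nothing here is taken from the text of ECMA-404 itself.

```python
import re

# Assumed transcription of the JSON number grammar as a regular expression:
# optional minus, integer part without leading zeros, optional fraction,
# optional exponent. A regular expression is an FSM-level formalism.
NUMBER_TOKEN = re.compile(
    r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"
)


def is_number_token(text: str) -> bool:
    """Return True if the whole string is a single JSON number token."""
    return NUMBER_TOKEN.fullmatch(text) is not None


if __name__ == "__main__":
    # The check is purely lexical: it says nothing about what value
    # a token such as "1e400" should denote.
    for candidate in ["0", "-12.5e+3", "1e400", "01", "1.", "+1"]:
        print(candidate, is_number_token(candidate))
```

The digits, sign, and exponent marker are consumed as characters inside one token; the recognizer never sees them as separate tokens, which is the reading clause 4 calls for.
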
The only normative purpose of the first paragraph in clause 8 (Numbers) is to identify the code points that are symbolically referenced by the Syntax Diagram. Everything else in that paragraph is either redundant (described by the diagram) or pseudo-semantics that are outside the scope of what ECMA-404 defines. This is a common problem in many specifications that try to clarify a formalism with supplementary prose and instead end up sowing confusion. If a bug were filed against this for ECMA-404, it would probably be cleaned up in the next edition.

Note that the current 4627bis draft is very similar in this regard. It talks about an "exponent part" without defining that term (it doesn't appear in the grammar). It doesn't specify how to actually interpret a number token as a mathematical value or how to generate one from a mathematical value. It only says that JSON numbers are similar to those in most programming languages (which includes a very wide range of possibilities).

Specs can have both technical and editorial bugs. If you think there are bugs in ECMA-404, the best thing to do is to submit a bug ticket at bugs.ecmascript.org. If there is a critical bug that you think prevents 4627bis from normatively referencing ECMA-404, say so and assign the bug a high priority in the initial ticket. But please, start with actual errors, ambiguities, inconsistencies, or similar substantive issues. Stylistic issues won't be ignored, but they are less important and harder to reach agreement on.

Allen
_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss

