Re: [Json] Response to Statement from W3C TAG

Allen Wirfs-Brock Mon, 09 Dec 2013 16:33:19 -0800

On Dec 9, 2013, at 3:53 AM, Carsten Bormann wrote:

>> So what's the reason you talk about two levels?
> 
> If you interpret the first three racetracks as generating a sequence of 
> characters, or the last two as generating a sequence of tokens, you get the 
> wrong result.
> 
>>> RFC 4627 does that implicitly by saying "The representation of numbers is 
>>> similar to that used in most programming languages.".)
>> 
>> That's not very precise either, but it's at least telling the reader where 
>> to look further if s/he doesn't understand what's intended.
> 
> Actually, to the extent that RFC 4627 does define JSON's data model, the 
> result of this simple statement is surprisingly precise.
> It only stops helping you much when you reach the limits of precision or 
> range (e.g., what to do with 1e400.)
> 
>> Another problem is that it's not scalable, in the sense that it won't work 
>> anymore if everybody would do it.
> 
> Right.  But then, section 11.8.3.1 of the ES6 draft is an example for why it 
> is tedious to do this.
> (It is also, I believe, a nice example how easy it would be to get this wrong 
> and that nobody would actually notice a mistake buried in there, unless they 
> do the work to systematically check every detail or to translate it into a 
> machine-checkable form.  Fortunately, our number system is relatively stable; 
> I’d hate to maintain a spec that has this level of tedium on something that 
> actually evolves.  For added fun, compare with 7.1.3.1.1, which is mostly 
> saying the same thing, but does it in a subtly different way.  That’s why ES6 
> is 531 pages...)
> 
>> I'm not planning to do any work. I was just trying to point out that the 
>> technical work is not that difficult (after some leaps of faith to take the 
>> 'most obvious' interpretation of racetracks,…).
> 
> Yep.  But if nobody does that work (or, more precisely, admits to having done 
> that work), we simply don’t know whether the statement that triggered this 
> little subthread is true or not.  I have made too many stupid mistakes in 
> seemingly simple specs that became obvious only as soon as I used a tool to 
> check the spec.
> 
> Grüße, Carsten


I want to address a few points brought up in this subthread, primarily between 
Carsten and Martin.

First, Syntax Diagrams (aka Railroad Diagrams, and called racetracks in this 
thread) are a well-known formalism for expressing a context-free grammar.  For 
example, see <http://en.wikipedia.org/wiki/Syntax_diagram>.  Any competent 
software engineer should be able to recognize and read a syntax diagram of this 
sort.  There is no mystery about them.  Any grammar that can be expressed using 
BNF can also be expressed using a Syntax Diagram, although I think most would 
agree that BNF is a better alternative for large grammars.

This whole issue of the use of Syntax Diagrams rather than BNF is a stylistic 
debate that is hard to take seriously. If TC39 informed you that we are 
converting the notation used in ECMA-404 to a BNF formalism, would that end the 
objections to normatively referencing ECMA-404 from 4627bis?  Unfortunately, 
I'm pretty sure it wouldn't.

Regarding the use of a multi-level definition within ECMA-404: that is 
standard practice within language specification, where the "tokens" of a 
language are often described using an FSM-level formalism and the syntactic 
structure is described using a PDA-level formalism.  However, there is nothing 
that prevents a PDA-level abstraction such as a BNF from being used to 
describe "tokens", even when the full power of a PDA isn't used.  The ECMA-262 
specification is an example of a language specification that uses a BNF to 
describe both its lexical and syntactic structure.

In the case of ECMA-404, clause 4 is clearly defining the lexical level of the 
language (it is talking about "tokens") and it clearly states that numbers and 
strings are tokens. Hence there is no ambiguity about how to interpret the 
syntax diagrams for number and string in clauses 8 and 9.  None of the 
subelements of those diagrams are "tokens", so there is no plausible way they 
could be misconstrued as generating or recognizing a sequence of tokens.
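
To make the lexical-level reading concrete, here is a minimal Python sketch. 
It is not taken from ECMA-404; the regular expression is a hand transcription 
of the number syntax diagram into an FSM-level pattern over characters, and 
the names NUMBER_RE and is_number_token are hypothetical.

```python
import re

# Character-level (lexical) pattern for a single JSON number token,
# transcribed by hand from the number syntax diagram (an assumption of
# this sketch, not text from the standard): optional minus sign, integer
# part without leading zeros, optional fraction, optional exponent.
NUMBER_RE = re.compile(
    r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"
)


def is_number_token(text: str) -> bool:
    """Return True if the whole string is exactly one JSON number token."""
    return NUMBER_RE.fullmatch(text) is not None
```

Because the pattern works on characters rather than tokens, a string such as 
"1 e3" is rejected: the pieces of a number are not tokens themselves, so no 
whitespace can appear between them.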

The only normative purpose of the first paragraph in clause 8 (Numbers) is to 
identify the code points that are symbolically referenced by the Syntax 
diagram. Everything else in that paragraph is either redundant (described by 
the diagram) or pseudo-semantics that are outside the scope of what ECMA-404 
defines.

This is a common problem seen in many specifications that try to clarify a 
formalism with supplementary prose and instead end up sowing confusion.  If a 
bug is filed against this for ECMA-404, it will probably be cleaned up in the 
next edition. Note that the current 4627bis draft is very similar in this 
regard.  It talks about an "exponent part" without defining that term (it 
doesn't appear in the grammar).  It doesn't specify how to actually interpret a 
number token as a mathematical value or how to generate one from a 
mathematical value.  It only says that JSON numbers are similar to those in 
most programming languages (which includes a very wide range of possibilities).
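
As an illustration of how much that leaves open, the following Python sketch 
feeds one syntactically valid number token to the standard library json 
module twice. The choice of Python and of the token 1e400 (taken from the 
quoted text above) is an assumption made for the example; neither 
specification mandates either result.

```python
import json
from decimal import Decimal

text = "1e400"  # a syntactically valid JSON number token

# Default behaviour: the token is converted to an IEEE 754 double,
# which cannot represent a value this large.
as_float = json.loads(text)

# Same token, but converted with an arbitrary-precision decimal type.
as_decimal = json.loads(text, parse_float=Decimal)

print(as_float)
print(as_decimal)
```

The first conversion overflows to infinity, while the second keeps the exact 
value. Both readings accept the same token; they differ only in the 
interpretation step that the grammar does not define.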

Specs can have both technical and editorial bugs.  If you think there are bugs 
in ECMA-404, the best thing to do is to submit a bug ticket at 
bugs.ecmascript.org. If there is a critical bug that you think prevents 4627bis 
from normatively referencing ECMA-404, say so and assign the bug a high priority 
in the initial ticket.  But please, start with actual errors, ambiguities, 
inconsistencies, or similar substantive issues.  Stylistic issues won't be 
ignored, but they are less important and harder to reach agreement on.

Allen

_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss
