On 25.01.12 21:22, Michael Haberler wrote:
> [this should move to emc-developers, which is why I'm cc'ing there]

It would be a pity if the rest of us were to be excluded. It is a very interesting discussion, and the "EMC" issue was secretive enough. Doing the same with possible user syntax improvements seems very unhelpful.

> it just occurred to me that a decent parser would give us the
> opportunity for a significant language simplification while retaining
> backwards compatibility.
> An example for the current RS274NGC language with variable references,
> expressions and control structures:
> ----------------------
> #<var1> = [#<foo> + 1]
> #<var2> = 10
>
> o#<label1> if [#<var1> lt #<var2>]
> ....
> o#<label1> else
> ....
> o#<label1> endif
> ----------------------
>
> Note the pathetic amount of syntactic noise

Yes, it is ridiculously primitive, presumably to keep parser code small, back when memory cost money. A greater weirdness is that it obstructs our ability to write readable gcode, in the 21st century.

> - wouldn't it be more readable to write:
>
> ----------------------
> $var1 = $foo + 1
> $var2 = 10
>
> if $var1 < $var2
> ...
> else
> ...
> endif
> ----------------------

Indeedy, but even the '$' is unnecessary. My experience is also with lex/bison, and ditching the '$' would not challenge the parser. (Keywords such as "lt", "else", etc. would be identified in the lexer, obviating the need for a variable identifier. Various keyword classes would issue individual tokens, simplifying the parser.)

> We have several noise chars per variable (#<>), useless labels
> including noise (o#<label1>) which do not help in disambiguating, and
> useless brackets around expressions, plus, well, fortranesque
> operators

+1

> now the major reason why this is so is that the current scanner only
> does lookahead 1 character, and the parser is inadequate; if even
> Perl can do it, so should RS274NGC

Natively, yacc/bison is also limited, in its case to a single token of lookahead. If more is required, then I have found it necessary to do that manually, in explicit code.
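
A minimal sketch of the keyword-in-the-lexer idea mentioned above, written in Python rather than flex for brevity. All names here (`KEYWORDS`, `TOKEN_RE`, `tokenize`, and the token kinds) are hypothetical and not taken from the EMC sources; the point is only that bare names can be split into keywords and variables by a table lookup, with no '$' needed for that particular job.

```python
import re

# Hypothetical keyword classes: each class maps to its own token kind,
# the way a flex scanner would return distinct tokens to bison.
KEYWORDS = {
    "if": "IF", "else": "ELSE", "endif": "ENDIF",
    "lt": "RELOP", "gt": "RELOP", "eq": "RELOP",
}

# One alternative per token class; the first alternative that matches wins.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+(?:\.\d*)?|\.\d+)
  | (?P<NAME>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<RELOP><=|>=|==|!=|<|>)
  | (?P<OP>[-+*/=()])
  | (?P<SKIP>[ \t]+)
  | (?P<ERROR>.)
""", re.VERBOSE)


def tokenize(line):
    """Return a list of (kind, text) pairs for one input line."""
    tokens = []
    for match in TOKEN_RE.finditer(line):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace carries no meaning in this sketch
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {text!r}")
        if kind == "NAME":
            # Keywords are recognised here, so the parser never has to
            # guess whether a bare name is a keyword or a variable.
            kind = KEYWORDS.get(text.lower(), "NAME")
        tokens.append((kind, text))
    return tokens
```

With this, `tokenize("if var1 lt var2")` and `tokenize("if var1 < var2")` yield the same sequence of token kinds. Note that ordinary words in a block, such as `G1 X10`, would also come out as `NAME` tokens here; that is the ambiguity question raised in the quoted text, and this sketch does not attempt to settle it.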

With the right amount of work done in the lexer, resulting in a useful set of tokens accompanying the values sent to the parser, virtually all practical problems are fairly easily soluble with a LALR(1) parser. It's not just my experience and O'Reilly's "Lex & Yacc" book; Wikipedia also seems to agree:

» Real computer languages can often be expressed as LALR(1) grammars, and in cases where a LALR(1) grammar is insufficient, usually an LALR(2) grammar is adequate. If the parser generator handles only LALR(1) grammars, then the LALR parser will have to interface with some hand-written code when it encounters the special LALR(2) situation in the input language. «

It would be interesting to go up against the knottiest "improved gcode" example that we can think of.

> A combination of a say flex scanner, bison parser should be able to
> parse both examples unambiguously. Moreover, it should tell during the
> bison run whether there are any ambiguities or conflicts when such a
> language simplification is introduced - it would give a reduce/reduce
> message. For instance, one could experiment whether the '$' as variable
> introducer is actually necessary (it probably is due to ambiguities
> with words in a block).

Betcha it's not. ;-)

> I understand this is quite different from your pretty printer/lint goal
>
> If we were to go about this, I think the way to do this is:
>
> - have both parsers as alternatives

Both in the code, selected in the config file or on the command line, perhaps? Or mandate a "gcode+" keyword on the first line of input, to allow either type at run-time?

> - add a flag to sai/rs274 to parse a file with old and new parser

That's perhaps what is meant here?

> - compare outputs for regression tests

Manually, with a custom regression testing framework, or by moving to DejaGnu? If we could move to a BNF specification of our permissible grammar, then the problem would diminish, I think.
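
The run-time selection and the output comparison discussed above could look roughly like the following Python sketch. It assumes the "gcode+" marker sits alone on the first non-blank line, and that each parser run can be captured as a list of text lines; the function names and the parser labels "classic" and "plus" are hypothetical, not anything that exists in sai/rs274.

```python
from itertools import zip_longest


def select_parser(source, default="classic"):
    """Choose a parser label from the first non-blank line of the input.

    A first non-blank line consisting only of "gcode+" selects the new
    parser; anything else falls back to the default.
    """
    for line in source.splitlines():
        stripped = line.strip()
        if stripped:
            return "plus" if stripped.lower() == "gcode+" else default
    return default  # empty input: nothing to decide on


def diff_outputs(old_lines, new_lines):
    """Return (line_number, old, new) for every line where the runs differ.

    A missing line on either side is reported as None, so a run that
    stops early also shows up as a difference.
    """
    return [
        (number, old, new)
        for number, (old, new) in enumerate(
            zip_longest(old_lines, new_lines), start=1
        )
        if old != new
    ]
```

An empty result from `diff_outputs` for every file in a test corpus would be the regression criterion; any non-empty result points at the first line where the two parsers disagree.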

> - when it is clear that no ambiguities are left, move it to mainline as the
> default parser

A journey of many steps. But interesting. (Perhaps more interesting than making swarf, but please don't tell anyone. ;)

Erik

--
Programs must be written for people to read, and only incidentally for
machines to execute. - Abelson and Sussman

_______________________________________________
Emc-users mailing list
Emc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/emc-users