On 25.01.12 21:22, Michael Haberler wrote:
> [this should move to emc-developers, which is why I'm cc'ing there]

It would be a pity if the rest of us were to be excluded. It is a very
interesting discussion, and the "EMC" issue was secretive enough. Doing
the same with possible user syntax improvements seems very unhelpful.

> it just occurred to me that a decent parser would give us the
> opportunity for a significant language simplification while retaining
> backwards compatibility.

> An example for the current RS274NGC language with variable references,
> expressions and control structures:

> ----------------------
> #<var1> = [#<foo> + 1]
> #<var2> = 10
> 
> o#<label1> if [#<var1> lt #<var2>]
>     .... 
> o#<label1> else
>     ....
> o#<label1> endif
> ----------------------
> 
> Note the pathetic amount of syntactic noise

Yes, it is ridiculously primitive, presumably to keep parser code small,
back when memory cost money. A greater weirdness is that it obstructs
our ability to write readable gcode, in the 21st century.

> - wouldn't it be more readable to write:
> 
> ----------------------
> $var1 = $foo + 1
> $var2 = 10
> 
> if $var1 < $var2
>     ...
> else
>     ...
> endif
> ----------------------

Indeedy, but even the '$' is unnecessary.

My experience is also with lex/bison, and ditching the '$' would not
challenge the parser. (Keywords such as "lt", "else", etc. would be
identified in the lexer, obviating the need for a character that
introduces variables. Each keyword class would get its own token type,
simplifying the parser.)
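
A rough sketch of that idea, in Python rather than flex purely for
brevity. The keyword table and token names are my own assumptions, not
anything in the current interpreter. It only shows keywords being
separated from plain names in the lexer; it does not settle how a name
would be told apart from an ordinary word such as X10 in a block.

```python
import re

# Hypothetical keyword table: each keyword class gets its own token type.
KEYWORDS = {
    "if": "IF", "else": "ELSE", "endif": "ENDIF",
    "lt": "RELOP", "gt": "RELOP", "eq": "RELOP",
}

TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+(?:\.\d*)?|\.\d+)
  | (?P<NAME>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<RELOP><=|>=|==|<|>)
  | (?P<OP>[-+*/=()])
  | (?P<SKIP>\s+)
""", re.VERBOSE)


def tokenize(line):
    """Return (token_type, text) pairs for one line of input."""
    tokens = []
    pos = 0
    while pos < len(line):
        m = TOKEN_RE.match(line, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {line[pos]!r} at column {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "NAME":
            # A name that is a known keyword becomes that keyword's token;
            # anything else stays a plain NAME, so no '$' is needed here.
            kind = KEYWORDS.get(text.lower(), "NAME")
        tokens.append((kind, text))
    return tokens
```

The parser then never sees "lt" or "else" as a name at all, only as a
RELOP or ELSE token.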

> We have several noise chars per variable (#<>), useless labels
> including noise (o#<label1>) which do not help in disambiguating, and
> useless brackets around expressions, plus, well, fortranesque
> operators

+1

> now the major reason why this is so is that the current scanner only
> does lookahead 1 character, and the parser is inadequate; if even
> Perl can do it, so should RS274NGC

Natively, yacc/bison is limited too: the generated parser sees only a
single token of lookahead. If more is required, then I have found it
necessary to do that manually, in explicit code.
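
The sort of explicit code I mean is a small buffer between lexer and
parser, so that hand-written code can look further ahead before
deciding what to hand on. A minimal sketch in Python (the class and
method names are made up for illustration; in a real flex/bison setup
this would live around yylex()):

```python
from collections import deque


class LookaheadTokens:
    """Wrap a token iterator so the caller can peek k tokens ahead."""

    def __init__(self, tokens):
        self._source = iter(tokens)
        self._buffer = deque()

    def peek(self, k=1):
        """Return the k-th upcoming token without consuming it (None at end of input)."""
        while len(self._buffer) < k:
            try:
                self._buffer.append(next(self._source))
            except StopIteration:
                return None
        return self._buffer[k - 1]

    def next(self):
        """Consume and return the next token (None at end of input)."""
        if self._buffer:
            return self._buffer.popleft()
        return next(self._source, None)
```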

With the right amount of work done in the lexer, so that a useful set of
tokens accompanies the values sent to the parser, virtually all
practical problems are fairly easily soluble with an LALR(1) parser.
That is not just my experience and the view of O'Reilly's "Lex & Yacc"
book; Wikipedia seems to agree:

» Real computer languages can often be expressed as LALR(1) grammars,
and in cases where a LALR(1) grammar is insufficient, usually an LALR(2)
grammar is adequate. If the parser generator handles only LALR(1)
grammars, then the LALR parser will have to interface with some
hand-written code when it encounters the special LALR(2) situation in
the input language. «

It would be interesting to go up against the knottiest "improved gcode"
example that we can think of.

> A combination of, say, a flex scanner and a bison parser should be able
> to parse both examples unambiguously. Moreover, it should tell during
> the bison run whether there are any ambiguities or conflicts when such
> a language simplification is introduced - it would give a reduce/reduce
> message. For instance, one could experiment whether the '$' as variable
> introducer is actually necessary (it probably is due to ambiguities
> with words in a block).

Betcha it's not. ;-)

> I understand this is quite different from your pretty printer/lint goal
> 
> If we were to go about this, I think the way to do this is:
> 
> - have both parsers as alternatives

Both in the code, selected in the config file or on the command line,
perhaps? Or mandate a "gcode+" keyword on the first line of input, to
allow either type at run-time?
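
The run-time variant could be as simple as the sketch below (Python for
brevity). The "classic"/"extended" names, the function name, and the
rule that the marker is the bare word "gcode+" alone on the first line
are all assumptions of mine; "override" stands in for a config-file or
command-line setting.

```python
def select_dialect(first_line, override=None):
    """Return "extended" or "classic" for a program, given its first line."""
    # An explicit setting (config file or command line) wins if present.
    if override is not None:
        return override
    # Assumption: the marker is the bare word "gcode+" alone on the first line.
    if first_line.strip().lower() == "gcode+":
        return "extended"
    return "classic"
```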

> - add a flag to sai/rs274 to parse a file with old and new parser

That's perhaps what is meant here?

> - compare outputs for regression tests

Manually, with a custom regression testing framework, or by moving to
DejaGnu?
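
Whatever the framework, the core of the comparison is small. A sketch
in Python, assuming each parser can be driven as a function that takes
the source lines and returns a list of output lines (old_parse and
new_parse are hypothetical stand-ins, not existing entry points):

```python
import difflib


def compare_parsers(source_lines, old_parse, new_parse):
    """Run both parsers on the same input and return a unified diff of their outputs.

    An empty list means the two parsers produced identical output.
    """
    old_out = old_parse(source_lines)
    new_out = new_parse(source_lines)
    return list(difflib.unified_diff(old_out, new_out,
                                     fromfile="old", tofile="new", lineterm=""))
```

A test run would then pass if the returned list is empty for every file
in the test suite, and print the diff otherwise.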

If we could move to a BNF specification of our permissible grammar, then
the problem would diminish, I think.

> - when it is clear that no ambiguities are left, move it to mainline as the 
> default parser

A journey of many steps. But interesting. (Perhaps more interesting than
making swarf, but please don't tell anyone. ;)


Erik

-- 
Programs must be written for people to read, and only incidentally for        
machines to execute.                            - Abelson and Sussman

_______________________________________________
Emc-users mailing list
Emc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/emc-users