"Ellery Newcomer" <[email protected]> wrote in message
news:[email protected]...
> Nick Sabalausky wrote:
>>
>> That mentioned, in the section about LL(k), "If both A<B and A<B> are
>> legal
>> expressions, where B can be of arbitrary length, then no finite amount of
>> look-ahead will allow this to be parsed."
>>
>> I did a quick test in gold (LALR), and this handles the above just fine:
>>
>> ------------------------------
>> "Name" = 'Test'
>> "Author" = 'Test'
>> "Version" = 'Test'
>> "About" = 'Test'
>>
>> "Start Symbol" = <Type1Or2>
>>
>> <Type1Or2> ::= <Type1> | <Type2>
>> <Type1> ::= 'A' '<' <Bs>
>> <Type2> ::= 'A' '<' <Bs> '>'
>> <Bs> ::= 'B' | <Bs> 'B'
>>
>> ------------------------------
>>
>> However, that one example alone doesn't necessarily prove that it's
>> always
>> doable.
>>
>
> Bad example. <Bs> is regular. Try making it nested parens or something
> like that. (It isn't LL(k), but it is pretty trivial anyways. ANTLR3 can
> handle it without fuss)

If you mean it's being handled by the DFA/regex lexing engine instead of the
parser, it isn't. GOLD uses a different syntax for terminals that are to be
lexed: anything written as "<Foo> ::= whatever" is a nonterminal handled by
the LALR parser, while a terminal handled by the DFA/regex lexer is written
as "Foo = whatever".

In any case, here's one that successfully handles nested parens in place of
<Bs>:
-------------------------------------------------------
"Start Symbol" = <Type1Or2>
<Type1Or2> ::= <Type1> | <Type2>
<Type1> ::= 'A' '<' <Parens>
<Type2> ::= 'A' '<' <Parens> '>'
<Parens> ::= '(' ')' | '(' <Parens> ')'
-------------------------------------------------------
Or nested sequences of parens:
-------------------------------------------------------
"Start Symbol" = <Type1Or2>
<Type1Or2> ::= <Type1> | <Type2>
<Type1> ::= 'A' '<' <ParensList>
<Type2> ::= 'A' '<' <ParensList> '>'
<ParensList> ::= <Parens> | <ParensList> <Parens>
<Parens> ::= '(' ')' | '(' <ParensList> ')'
-------------------------------------------------------
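
For what it's worth, here's a rough hand-written sketch (Python, not GOLD
output, and not a real LALR parser) of why I'd expect this to work: the
nesting is tracked with a stack (just a depth counter here), and the choice
between <Type1> and <Type2> is only made after the whole <ParensList> has
been consumed, by looking at the single next token. The function name
classify() is made up for illustration, and it assumes no whitespace between
tokens.

```python
def classify(text):
    """Classify input against the <ParensList> grammar above.

    Returns 'Type1' for  A < ParensList,
            'Type2' for  A < ParensList >,
            None if the input matches neither.
    """
    if not text.startswith("A<"):
        return None

    i = 2       # position just past 'A' '<'
    depth = 0   # stand-in for the parser stack: number of unclosed '('

    # Consume <ParensList>: a non-empty, balanced run of parens.
    while i < len(text) and text[i] in "()":
        if text[i] == "(":
            depth += 1
        else:
            if depth == 0:
                return None  # ')' with nothing open
            depth -= 1
        i += 1

    if i == 2 or depth != 0:
        return None  # empty list, or something left unclosed

    # Only now decide between the two rules, using one token of lookahead.
    rest = text[i:]
    if rest == "":
        return "Type1"
    if rest == ">":
        return "Type2"
    return None


if __name__ == "__main__":
    for sample in ["A<()", "A<()>", "A<(()())()", "A<(()())()>", "A<(()", "A<>"]:
        print(repr(sample), "->", classify(sample))
```

The point is that the amount of input before the deciding token is unbounded,
but the decision itself never needs more than that one token once the parens
have been reduced.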