Re: why is this string not recognized? trouble with complex L0 and quotes, I think.

Jeffrey Kegler Tue, 20 Feb 2018 17:27:06 -0800

The model for the relationship between G1 and L0 is the classic
parser/lexer divide of yacc, bison and the 1970s textbooks.  A lexer
divides the input into tokens and a higher-level parser parses the token
stream.


Marpa adds a new wrinkle.  The classic lexer was "blind" -- it had no idea
of the parsing context.  Marpa's L0 lexer looks only for tokens that the G1
parser actually expects at the current position.
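
To make the difference concrete, here is a toy sketch in Python.  It is not
how Marpa is implemented; the token table, the NoLexeme exception and the
next_token helper are all made up for illustration.  Passing expected=None
models the "blind" lexer, and passing a set of names models a lexer that is
told what the parser expects.

```python
import re

# Hypothetical token table for illustration; these names are not Marpa internals.
TOKEN_PATTERNS = {
    "OPTION": re.compile(r'"[^"\n]*"|[^\s"{}]+'),
    "LBRACE": re.compile(r"\{"),
}


class NoLexeme(Exception):
    """No candidate token matches at the current position."""


def next_token(text, pos, expected=None):
    """Return (name, lexeme, end) for the longest match at `pos`.

    expected=None models a "blind" lexer that tries every token type;
    otherwise only the token types named in `expected` are tried.
    """
    candidates = TOKEN_PATTERNS if expected is None else expected
    best = None
    for name in candidates:
        m = TOKEN_PATTERNS[name].match(text, pos)
        # Keep the longest match among the candidates.
        if m and (best is None or m.end() > best[2]):
            best = (name, m.group(), m.end())
    if best is None:
        raise NoLexeme(f"No lexeme found at offset {pos}")
    return best
```

With the input '"-R" {', calling next_token(text, 0) finds the quoted OPTION,
while next_token(text, 0, {"LBRACE"}) raises NoLexeme, because a quoted string
was simply not among the tokens being looked for.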

If you look back at your grammar and the error message you got with this in
mind, it might give you some insight.  Your grammar did not allow for
multiple sublevel options -- only one.  So the first sublevel option was
read, but when what you intended as the second sublevel option was
encountered, the G1 grammar was not expecting it, so the L0 lexer did not
look for it.  What L0 encountered at that point did not, in fact, match
anything it was looking for, so it reported "No lexeme".
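
The failure described above can be imitated with a small Python simulation.
This is only a sketch under simplifying assumptions: PATTERNS, lex and
parse_header are hypothetical names, each option is treated as a single
token, and nothing here reflects Marpa's real algorithm.  The allow_many
flag stands in for the difference between a grammar that permits one
sublevel option and one that permits a sequence of them.

```python
import re

# Hypothetical stand-ins for the lexemes in the thread; not Marpa's implementation.
PATTERNS = {
    "sublevel:": re.compile(r"sublevel:"),
    "OPTION": re.compile(r'"[^"\n]*"|[^\s"{}]+'),
    "{": re.compile(r"\{"),
}


def lex(text, pos, expected):
    """Match only the token types the parser currently expects."""
    # Skip whitespace, in the spirit of a :discard rule.
    while pos < len(text) and text[pos].isspace():
        pos += 1
    for name in expected:
        m = PATTERNS[name].match(text, pos)
        if m:
            return name, m.group(), m.end()
    raise SyntaxError(f"No lexeme found at offset {pos}: {text[pos:pos + 10]!r}")


def parse_header(text, allow_many):
    """Parse 'sublevel:' OPTION... '{' and return the list of options read."""
    _, _, pos = lex(text, 0, ["sublevel:"])
    options = []
    while True:
        # If only one option is allowed, only '{' is expected after the first.
        if options and not allow_many:
            expected = ["{"]
        else:
            expected = ["{", "OPTION"]
        name, lexeme, pos = lex(text, pos, expected)
        if name == "{":
            return options
        options.append(lexeme)
```

Given the header line from the sample input followed by an open curly,
parse_header(text, allow_many=True) returns all four options, while
parse_header(text, allow_many=False) reads -only and then raises a
"No lexeme" error at the double quote, since only '{' was being looked for.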

I hope this helps, jeffrey

On Tue, Feb 20, 2018 at 12:09 PM, <stefan.gottsch...@gmail.com> wrote:

> Okay, I tried that, and it mostly worked!  But I don't understand why.  I
> think I may have to study this a bit more - my mental model for Marpa's
> handling of the L0 grammar is probably not right.
>
> -stefan
>
>
> On Monday, February 19, 2018 at 10:18:34 PM UTC-5, Stefan Gottschalk
> wrote:
>>
>> I thought that
>>
>> SUBLEVELOPTIONS             ~ SUBLEVELOPTIONS_STRING+
>>
>> would cause the input to be chopped up into a series of quoted and
>> unquoted segments, where the quoted segments were allowed to contain spaces
>> (see CHAR_INSIDE_SQUOTES and CHAR_INSIDE_DQUOTES), but the unquoted
>> segments were not (see CHAR_UNQUOTED).
>>
>> Anyway, I will give your suggestion a try.
>>
>> Thank you!
>>
>> -stefan
>>
>>
>> On Monday, February 19, 2018 at 4:34:13 PM UTC-5, Jeffrey Kegler wrote:
>>>
>>> [ This is off the top of my head and untested. ]
>>>
>>> SublevelOptions, despite its name, only allows for one sublevel option.
>>> You perhaps want something more like:
>>>
>>> SublevelOptionsMaybe ::= SublevelOptions
>>> SublevelOptions ::= SublevelOption+
>>> SublevelOptionsMaybe ::=
>>>
>>> SublevelOption      ::=  SUBLEVELOPTIONS
>>>
>>> There are alternative ways to write the above that are more elegant and
>>> probably better, but I think it conveys the idea.  Again, untested.
>>>
>>> I hope this helps, jeffrey
>>>
>>> On Mon, Feb 19, 2018 at 11:49 AM, <stefan.g...@gmail.com> wrote:
>>>
>>>> I'm scarcely more than a novice with Marpa, so please forgive me if I'm
>>>> asking for too much or being naive.
>>>>
>>>> A sample of my legacy DSL looks like this:
>>>>
>>>> sublevel: -only "-R[SUNLF]{0,1}\d+\s" -testargsmore -foo
>>>>
>>>> {
>>>>
>>>>   < test { foo } >
>>>>
>>>> }
>>>>
>>>>
>>>> The *sublevel:* is supposed to be a key word, introducing a kind of
>>>> statement.
>>>>
>>>> Following the key word is a bunch of nearly arbitrary text (containing
>>>> options and parameters), terminated by a {}-delimited body.  So the overall
>>>> structure is this:
>>>>
>>>> *sublevel:* *<options>*
>>>> *{*
>>>>     *<more statements>*
>>>> *}*
>>>>
>>>>
>>>> So, the open curly signals the end of the *<options>* and the start of
>>>> the body.
>>>>
>>>> This legacy DSL allows curlies inside the *<options>* provided they
>>>> are quoted (single or double) or escaped.  So, the curlies in *{0,1}*
>>>> should not be interpreted as special, but taken verbatim.
>>>>
>>>> I studied the string grammar listed in
>>>> https://gist.github.com/jddurand/8d3238c22731a85eb890 and used it as a
>>>> guide for my development for the grammar I list below.  In particular,
>>>> that example taught me that
>>>> L0 rules permit alternative productions, and also allow sequences.  Anyway,
>>>> the portion of the grammar in ALL CAPS was derived from that example.
>>>>
>>>> But, I get this "No lexeme" error when it hits the first dquote, and I
>>>> cannot figure out why!
>>>>
>>>> Setting trace_terminals option
>>>> Setting trace_values option
>>>> Discarded lexeme L1c1: whitespace
>>>> Accepted lexeme L2c1-9 e1: 'sublevel:'; value="sublevel:"
>>>> Accepted lexeme L2c1-9 e1: 'sublevel:'; value="sublevel:"
>>>> Accepted lexeme L2c10-16 e2: SUBLEVELOPTIONS; value=" -only "
>>>> ****** FAILED TO PARSE ******
>>>> MSG:
>>>> Error in SLIF parse: No lexeme found at line 2, column 17
>>>> * String before error: \nsublevel: -only\s
>>>> * The error was at line 2, column 17, and at character 0x0022 '"', ...
>>>> * here: "-R[SUNLF]{0,1}\\d+\\s" -testargsmore -foo\n{\n  <
>>>> Marpa::R2 exception at ./marpa_bnf_1.pl line 31.
>>>>
>>>>
>>>> I would be grateful for any insights.
>>>>
>>>> I intended that my sample input would have been interpreted, at some
>>>> depth of productions, as
>>>>
>>>> *sublevel:* *SublevelOptions*
>>>> *{*
>>>> *  NamedBlockList*
>>>> *}*
>>>>
>>>>
>>>> and I thought that the SublevelOptions
>>>>
>>>>  -only "-R[SUNLF]{0,1}\d+\s" -testargsmore -foo
>>>>
>>>> would decompose into
>>>>
>>>> STRING_UNQUOTED = ( -only )
>>>> STRING_DQUOTED = ("-R[SUNLF]{0,1}\d+\s")
>>>> STRING_UNQUOTED = ( -testargsmore -foo)
>>>>
>>>>
>>>> and I failed to see why it doesn't do so.  Instead, Marpa tells me it
>>>> doesn't know what to do when it sees that dquote.
>>>>
>>>> Below is my full grammar.
>>>>
>>>> :default ::= action => [name, start, length, values]
>>>> lexeme default = latm => 1
>>>>
>>>> File ::= BodyStatements
>>>> File ::=
>>>>
>>>> BodyStatements ::= BodyStatement+
>>>>
>>>> BodyStatement ::=
>>>>     Sublevel
>>>>   | SingleTest
>>>>
>>>>
>>>> Sublevel ::= ('sublevel:') SublevelOptionsMaybe ('{') BodyStatements ('}')
>>>> Sublevel ::= ('sublevel:') SublevelOptionsMaybe ('{') ('}')
>>>> SublevelOptionsMaybe ::= SublevelOptions
>>>> SublevelOptionsMaybe ::=
>>>>
>>>> SublevelOptions      ::=  SUBLEVELOPTIONS
>>>>
>>>> SUBLEVELOPTIONS             ~ SUBLEVELOPTIONS_STRING+
>>>>
>>>> SUBLEVELOPTIONS_STRING      ~ STRING_UNQUOTED
>>>>                             | STRING_SQUOTED
>>>>                             | STRING_DQUOTED
>>>>
>>>> STRING_UNQUOTED             ~ CHAR_UNQUOTED+
>>>> CHAR_UNQUOTED               ~ [^"'\}\{;\\\n]
>>>> CHAR_UNQUOTED               ~ ES
>>>>
>>>> STRING_SQUOTED              ~ SQUOTE STRING_INSIDE_SQUOTES SQUOTE
>>>> STRING_INSIDE_SQUOTES       ~ CHAR_INSIDE_SQUOTES*
>>>> CHAR_INSIDE_SQUOTES         ~ [^'\\]
>>>> CHAR_INSIDE_SQUOTES         ~ [\\] [']
>>>> SQUOTE                      ~ [']
>>>>
>>>> STRING_DQUOTED              ~ DQUOTE STRING_INSIDE_DQUOTES DQUOTE
>>>> STRING_INSIDE_DQUOTES       ~ CHAR_INSIDE_DQUOTES*
>>>> CHAR_INSIDE_DQUOTES         ~ [^"\\\n]
>>>> CHAR_INSIDE_DQUOTES         ~ [\\] [^#]
>>>> DQUOTE                      ~ ["]
>>>>
>>>> ES                          ~ [\\] [\\'"\{\};]
>>>>
>>>> NamedBlockList ::= NamedBlock+
>>>> NamedBlock ::= ArgTag ('{') ArgBodyMaybe ('}')
>>>> ArgBodyMaybe ::= ArgBody
>>>> ArgBodyMaybe ::=
>>>> ArgBody ~ [^\{\}]+
>>>> ArgTag ~ [\w]+
>>>>
>>>> SingleTest ::=
>>>>     SingleSimpleTest
>>>>
>>>> SingleSimpleTest ::= ('<') NamedBlockList ('>')
>>>>
>>>> # whitespace
>>>> :discard ~ whitespace
>>>> whitespace ~ [\s]+
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "marpa parser" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to marpa-parser...@googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
