Re: How to "trap" unacceptable lines.

F. Li Mon, 21 Feb 2022 11:02:03 -0800

Thank you for your response.

I tried stripping the example in your third link here
<https://metacpan.org/dist/Marpa-R2/view/pod/Semantics/Rank.pod> to a bare
minimum, ending up with the attached ranking_01.t, but the test strings get
caught as "JUNK". (The first, 'a = 1', is acceptable.) Obviously I'm missing
something!
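
For readers following the thread, the selection rule described in the quoted
reply below (a lower-ranked catch-all should win only when no non-error
alternative covers the same line) can be sketched outside Marpa. The Python
below is only an illustration under stated assumptions: it is not Marpa, not
the attached test, and not how Marpa's ranking is implemented. The names
ALTERNATIVES and classify are hypothetical, and the regular expressions are a
loose transcription of the lexemes in the quoted grammar.

```python
import re

# Loose transcriptions of the lexemes in the quoted grammar (assumption:
# whitespace is only allowed around '=', which is simpler than the grammar).
BOOLEAN = r"(?:true|false|t|f|0|1)"
NAME = r'(?:[a-z]\w*|"\w+")'
REGEXP = r"[$(|)\w^]+"
EQ = r"[ \t]*=[ \t]*"

# (label, rank, pattern): the catch-all gets the lowest rank.
ALTERNATIVES = [
    ("name", 0, re.compile(rf"{NAME}{EQ}{BOOLEAN}", re.IGNORECASE)),
    ("regexp", 0, re.compile(rf"/{REGEXP}/{EQ}{BOOLEAN}", re.IGNORECASE)),
    ("error", -1, re.compile(r".+")),
]


def classify(line):
    """Return the label of the highest-ranked alternative spanning the line."""
    # Drop an end-of-line '--' comment and surrounding whitespace.
    text = re.sub(r"--.*", "", line).strip(" \t\r\n")
    # Every candidate must cover the whole line, so all candidates share
    # the same start and end position.
    candidates = [
        (rank, label)
        for label, rank, pattern in ALTERNATIVES
        if pattern.fullmatch(text)
    ]
    if not candidates:
        return None  # blank or comment-only line
    return max(candidates, key=lambda c: c[0])[1]
```

Here classify("a = 1") is matched by both the "name" and the "error"
alternatives, and the higher rank decides in favour of "name"; a line such as
"a = maybe" is matched only by the catch-all.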


On Mon, Feb 21, 2022 at 12:00 PM Jeffrey Kegler <[email protected]>
wrote:

> One matter which requires getting used to with Marpa is that you are
> working with BNF, so the core logic is non-procedural.  This is why most
> programmers seem to want to suffer endlessly with recursive descent rather
> than consider stronger parsers.  You can understand recursive descent with
> purely procedural thinking.
>
> The idea of "do this on error" is procedural thinking.  Procedural stuff
> can be added to Marpa via events, but the programmer needs to bear in mind
> the engine is being driven descriptively, not procedurally.
>
> One solution to your problem might be rule ranking.  See here
> <https://metacpan.org/dist/Marpa-R2/view/pod/Scanless/DSL.pod#rank>, here
> <https://metacpan.org/dist/Marpa-R2/view/pod/Semantics/Order.pod> and here
> <https://metacpan.org/dist/Marpa-R2/view/pod/Semantics/Rank.pod>.  Rule
> ranking turns Marpa into a better PEG.
>
> The docs I linked are a bit daunting at first glance, especially if you
> don't skim over the more technical parts.  The basic idea in your case
> might be to define a "catch all" line as an error case, ranking it below
> the non-error alternatives.
>
> People working with ranking can find it tricky because ranking is applied
> only in very specific circumstances -- the alternatives must be at the
> same dotted position of a parent rule (which implies they will have the
> same LHS), the same start position and the same end position.  If any of
> these 3 is not the case, ranking will not be done.  This means, for
> example, that you can't use rule ranking for things which might differ in
> length.  But this seems to be OK in your case.  All line alternatives,
> error or non-error, should start at the same position and end at the same
> position.
>
> What I think might work is to give the error lines a lower rank than the
> non-error lines.  Then they will be seen only if there is no non-error
> parse of the line.
>
> The docs contain examples, and I hope looking at these will help make
> things clear.
>
> I hope this helps,
>
> jeffrey
>
> On Mon, Feb 21, 2022 at 7:36 AM clueless newbie <[email protected]>
> wrote:
>
>> Marpa brings back the feeling of being as a child listening to my father
>> giving a lecture to a graduate class in partial differential equations - I
>> could see the x's and the y's, but how things worked was far beyond my
>> comprehension. I'm sure my head would be a lot less sore and Jeffrey richer
>> if instead of bouncing my head against the wall I were to donate another
>> collar but I am hardly the Duke of Brunswick-Lüneburg.
>> (Maybe I should just say "I'm just too stupid!", give up and see if I can
>> successfully twiddle my thumbs.)
>>
>>
>> The data consists of (physical) lines terminated by a newline. A line may
>> be:
>>  1) <name> = <boolean>
>>  2) '/'<regexp>'/' = <boolean>
>> Comments begin with '--' and are end of line type comments.
>>
>> Shouldn't I be able to say that anything else is an error? I.e.:
>>
>>     :default ::= action => [values]
>>     lexeme default = latm => 1
>>     :start ::= lines
>>     lines ::= line+
>>     line  ::= <name> ('=') <boolean> (NEWLINE) action => doName
>>             | ('/') <regexp> ('/') ('=') <boolean> (NEWLINE) action => doRegexp
>>     # would like the following to catch everything else
>>            || <bad stuff> (NEWLINE) action => doError rank => -1
>>     #
>>
>>     <name>    ~ <unquoted name> | <quoted name>
>>     <unquoted name> ~ ALPHA  | ALPHA ALPHANUMERICS
>>     <quoted name> ~ '"' <quoted name body> '"'
>>     <quoted name body> ~ [\w]+            # for now
>>
>>     <regexp> ~ [$(|)\w^]+
>>
>>     #
>>     <bad stuff> ~ ANYTHING+
>>     #
>>
>>     <boolean>   ~ TRUE | FALSE
>>     FALSE       ~ 'FALSE':i | 'F':i | '0'
>>     TRUE        ~ 'TRUE':i | 'T':i |'1'
>>
>>     ALPHA         ~ [a-z]:i
>>     ALPHANUMERICS ~ [\w]*
>>
>>     :discard       ~ COMMENT
>>     COMMENT        ~ '--' <comment body>
>>     <comment body> ~ ANYTHING*
>>
>>     ANYTHING       ~ [^\x{A}\x{B}\x{C}\x{D}\x{2028}\x{2029}]
>>     :discard       ~ WHITESPACE
>>     WHITESPACE     ~ [ \t]+
>>
>>     NEWLINE        ~ [\n]
>>
>> CAVEAT: <name> is going to be an Oracle identifier and they are weird!
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "marpa parser" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/marpa-parser/d07a1e10-3244-4899-b73f-ba7deb0369e7n%40googlegroups.com
>> <https://groups.google.com/d/msgid/marpa-parser/d07a1e10-3244-4899-b73f-ba7deb0369e7n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>


ranking_01.t
Description: Binary data
