Yeah, that line is definitely the problematic line. It's also the reason
I'm rebuilding the parser from my current line-by-line methodology. Or
attempting to :) I actually wrote this grammar up in Regexp::Grammars
first, but the resource requirements were far too high. I figured I'd take
the time to learn Marpa as the capabilities and performance seem more in
line with what I needed.
I believe event parsing the comments myself might be the way to go. I was
also reading ranking documentation this morning, but I didn't get a good
handle on it at all. Maybe I'll play with it and see what happens.
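
To make the "parse the comments myself" idea concrete, here is a minimal sketch of the three comment forms described further down the thread. It is only an illustration under assumptions: the function name `classify_comment` and the context labels ('after_open_brace', 'after_pattern', 'other') are hypothetical, integers in the base list are assumed to be unsigned, and no whitespace is allowed inside either list. It is plain Python, not Marpa code.

```python
import re

# Form 1: '#base=<list>', where <list> is comma-delimited integers
# (assumed unsigned, with no whitespace inside the list).
BASE_RE = re.compile(r"#base=(\d+(?:,\d+)*)")

# Form 2: '#' + comma-delimited runs of \w characters + '#',
# with no whitespace between the two '#' symbols.
TAG_RE = re.compile(r"#(\w+(?:,\w+)*)#")


def classify_comment(text, context):
    """Classify the text that starts at a '#'.

    `context` is a hypothetical label the caller supplies:
    'after_open_brace', 'after_pattern', or 'other'.
    Returns a (kind, payload) tuple.
    """
    text = text.strip()
    if context == "after_open_brace":
        m = BASE_RE.search(text)
        if m:
            return ("base", [int(n) for n in m.group(1).split(",")])
    if context == "after_pattern":
        # Only a tag string at the very start counts; anything
        # after the closing '#' is ignored in this sketch.
        m = TAG_RE.match(text)
        if m:
            return ("tags", m.group(1).split(","))
    # Anything else is a non-information comment.
    return ("comment", None)
```

With this, "#HOT# # Not so hot" after a pattern declaration classifies as a tag string, while "# Not so hot" on its own falls through to a plain comment. The location test is left to the caller, which is the part a grammar would otherwise handle.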
Thanks for your time and insight here, Jeffrey. I appreciate it :)
On Friday, May 9, 2014 12:55:07 PM UTC-7, Jeffrey Kegler wrote:
>
> I just took a second look at this one
>
> GlobalPList plist4 { Pat n8000000g0000008; #KEEP# } }
>
> Ouch! The solution in the face of stuff like this may be to not treat
> comments at the lexical level, but at the G1 level. That is, treat the
> '#', ',', tags, etc. as lexemes and parse comments as if they were
> statements. In your situation, that seems in effect to be the case. Your
> comments seem to have more structure and variety than some of the
> "statements". They are not just whitespace equivalents.
>
> At the G1 level you can use the rule "rank" adverb (
> https://metacpan.org/pod/distribution/Marpa-R2/pod/Scanless/DSL.pod#rank),
> and Marpa can help with the internal semantics of the comments, etc.
>
> I notice, by the way, that my documentation of the "rank" adverb could be
> improved.
>
> -- jeffrey
>
> On 05/09/2014 12:09 PM, [email protected] wrote:
>
> You have the right idea. Unfortunately, I do not get to dictate the
> syntax of this file I get to parse and there is considerable ambiguity in
> comments. There are essentially three forms of a comment. Two forms of
> this comment include information I need to parse. One form
> (non-information comment) does not contain useful information.
>
> 1) embedded base number --> Matches OptEmbeddedBase --> Actual information
> I need. Discernible from a non-information comment by its location
> immediately after the opening of a pattern list brace and in that it must
> contain '#base=<list>', where <list> is a comma delimited list of integers.
>
> 2) tag string --> Matches TagStr --> Again, information I need.
> Discernible from a non-information comment by location after a pattern
> declaration and by the fact that it is bookended by '#' symbols and can
> only contain a comma delimited list of word (\w) characters. Technically,
> whitespace is not allowed inside these strings either. I figured I'd sort
> that out once I had it matching as is.
>
> 3) Non-information comment --> Matches COMMENT --> Can be discarded. This
> is any comment that does not match one of the first two forms.
>
> Hopefully that's helpful. When you say that you'd 'simply say that in the
> grammar', I'm confused. Is this not what I'm saying in the grammar in the
> TagStr rule by setting '#' characters before and after the TagList rule?
> Is there a better way to resolve this ambiguity?
>
> On Friday, May 9, 2014 11:46:16 AM UTC-7, Jeffrey Kegler wrote:
>>
>> Trying to get the idea, is it that tags use '#' as a delimiter, much in
>> the same way that strings use quotes? And that it's a comment if
>> there's a '#' that is not matched before the newline? That is, that in
>>
>> Pat n2000000g0000002; #HOT# # Not so hot
>>
>> "#HOT#" is a tag, and "# Not so hot" is a comment?
>>
>> If that's the case, I'd simply say that in the grammar. I'd give more
>> detail, but I'm not 100% clear on the intent at this point.
>>
>> -- jeffrey
>>
> --
> You received this message because you are subscribed to the Google Groups
> "marpa parser" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
>
>