Yeah, that line is definitely the problematic line. It's also
the reason I'm rebuilding the parser from my current line by line
methodology. Or attempting to :) I actually wrote this grammar
up in Regexp::Grammars first, but the resource requirements were
far too high. I figured I'd take the time to learn Marpa as the
capabilities and performance seem more in line with what I needed.
I believe event parsing the comments myself might be the way to
go. I was also reading ranking documentation this morning, but I
didn't get a good handle on it at all. Maybe I'll play with it
and see what happens.
Thanks for your time and insight here Jeffrey, I appreciate it :)
On Friday, May 9, 2014 12:55:07 PM UTC-7, Jeffrey Kegler wrote:
I just took a second look at this one
GlobalPList plist4 { Pat n8000000g0000008; #KEEP# } }
Ouch! The solution in the face of stuff like this may be to
not treat comments at the lexical level, but at the G1
level. That is, treat the '#', ',', tags, etc. as lexemes
and parse comments as if they were statements. In your
situation, that seems in effect to be the case. Your
comments seem to have more structure and variety than some of
the "statements". They are not just whitespace equivalents.
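To make the idea concrete, here is a minimal sketch of treating '#' and ',' as ordinary lexemes and then recognizing a tag comment as a statement over those lexemes. It is written in Python purely for illustration and is not Marpa code; the token names (WORD, HASH, COMMA, and so on) and the helper names lex and parse_tag_statement are hypothetical, chosen only for this example.

```python
import re

# Hypothetical lexeme set, just large enough for the sample line above.
TOKEN_RE = re.compile(r"""
    (?P<WORD>\w+)
  | (?P<HASH>\#)
  | (?P<COMMA>,)
  | (?P<SEMI>;)
  | (?P<LBRACE>\{)
  | (?P<RBRACE>\})
  | (?P<WS>\s+)
""", re.VERBOSE)


def lex(text):
    """Split text into (kind, value) lexemes, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse_tag_statement(tokens, i):
    """Try to match HASH WORD (COMMA WORD)* HASH starting at tokens[i].

    Returns (tags, next_index) on success, or None if no tag statement
    starts at position i.
    """
    def kind(j):
        return tokens[j][0] if j < len(tokens) else None

    if kind(i) != "HASH" or kind(i + 1) != "WORD":
        return None
    tags = [tokens[i + 1][1]]
    j = i + 2
    while kind(j) == "COMMA" and kind(j + 1) == "WORD":
        tags.append(tokens[j + 1][1])
        j += 2
    if kind(j) != "HASH":
        return None
    return tags, j + 1
```

On the sample line, lex yields a HASH lexeme at index 6, and parse_tag_statement(tokens, 6) recognizes the KEEP tag and stops just before the closing braces. Note that this sketch discards whitespace in the lexer, so it cannot enforce a "no whitespace inside a tag" rule; that would need whitespace kept as a lexeme.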
At the G1 level you can use the rule "rank" adverb
(https://metacpan.org/pod/distribution/Marpa-R2/pod/Scanless/DSL.pod#rank),
and Marpa can help with the internal semantics of the comments, etc.
I notice, by the way, that my documentation of the "rank"
adverb could be improved.
-- jeffrey
On 05/09/2014 12:09 PM, [email protected] wrote:
You have the right idea. Unfortunately, I do not get to
dictate the syntax of this file I get to parse and there is
considerable ambiguity in comments. There are essentially
three forms of a comment. Two forms of this comment include
information I need to parse. One form (non-information
comment) does not contain useful information.
1) embedded base number --> Matches OptEmbeddedBase -->
Actual information I need. Discernible from a
non-information comment by its location immediately after
the opening brace of a pattern list and by the fact that it must
contain '#base=<list>', where <list> is a comma-delimited list of
integers.
2) tag string --> Matches TagStr --> Again, information I
need. Discernible from a non-information comment by its
location after a pattern declaration and by the fact that it
is bookended by '#' symbols and can only contain a
comma-delimited list of word (\w) characters. Technically,
whitespace is not allowed inside these strings either. I
figured I'd sort that out once I had it matching as is.
3) Non-information comment --> Matches COMMENT --> Can be
discarded. This is any comment that does not match one of
the first two forms.
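The three forms above can be sketched as a small classifier. This is a Python illustration only, not the Marpa grammar under discussion; the function name classify_comment, the two context flags, and both regular expressions are assumptions written from the descriptions in the list, and they require the whole comment to match (no trailing text).

```python
import re

# Hypothetical patterns derived from the prose descriptions above.
BASE_RE = re.compile(r"#base=(\d+(?:,\d+)*)")   # form 1: '#base=<list>'
TAG_RE = re.compile(r"#(\w+(?:,\w+)*)#")        # form 2: '#' word list '#'


def classify_comment(text, after_open_brace=False, after_pattern=False):
    """Classify one isolated comment string.

    after_open_brace: the comment directly follows a pattern list's '{'.
    after_pattern:    the comment follows a pattern declaration.
    Returns (kind, payload), where kind is 'base', 'tags', or 'comment'.
    """
    text = text.strip()
    if after_open_brace:
        m = BASE_RE.fullmatch(text)
        if m:
            return ("base", [int(n) for n in m.group(1).split(",")])
    if after_pattern:
        m = TAG_RE.fullmatch(text)
        if m:
            return ("tags", m.group(1).split(","))
    # Anything that is not form 1 or form 2 is a discardable comment.
    return ("comment", None)
```

For instance, classify_comment("#HOT,KEEP#", after_pattern=True) gives ("tags", ["HOT", "KEEP"]), while the same text with a space after the comma falls through to ("comment", None) because whitespace is not allowed inside a tag string.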
Hopefully that's helpful. When you say that you'd 'simply
say that in the grammar', I'm confused. Is this not what
I'm saying in the grammar in the TagStr rule by setting '#'
characters before and after the TagList rule? Is there a
better way to resolve this ambiguity?
On Friday, May 9, 2014 11:46:16 AM UTC-7, Jeffrey Kegler wrote:
Trying to get the idea, is it that tags use '#' as a
delimiter, much in the same way that strings use quotes?
And that it's a comment if there's a '#' that is not
matched before the newline? That is, that in
Pat n2000000g0000002; #HOT# # Not so hot
"#HOT#" is a tag, and "# Not so hot" is a comment?
If that's the case, I'd simply say that in the grammar.
I'd give more detail, but I'm not 100% clear on the intent
at this point.
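A minimal sketch of that reading, in Python rather than Marpa and only for illustration: a '#' whose partner '#' appears later on the same line delimits a tag, and a '#' with no valid partner starts a comment that runs to the end of the line. The function name split_trailer and the TAG_BODY pattern are hypothetical, and the extra requirement that the text between the '#' marks be a comma-separated list of word characters is an assumption added here to decide what counts as a valid tag.

```python
import re

# Assumption: a tag body is a comma-separated list of word characters.
TAG_BODY = re.compile(r"\w+(?:,\w+)*")


def split_trailer(rest):
    """Split the text following a statement into (tags, comment).

    rest is the remainder of a single line. tags is a flat list of tag
    names; comment is the trailing comment text, or None if there is none.
    """
    tags = []
    comment = None
    i = 0
    while i < len(rest):
        if rest[i] != "#":
            i += 1
            continue
        close = rest.find("#", i + 1)
        body = rest[i + 1:close] if close != -1 else None
        if body is not None and TAG_BODY.fullmatch(body):
            # Matched pair of '#' around a valid tag list.
            tags.extend(body.split(","))
            i = close + 1
        else:
            # Unmatched '#': everything to end of line is a comment.
            comment = rest[i:].rstrip("\n")
            break
    return tags, comment
```

Applied to the trailer of the example line, split_trailer("#HOT# # Not so hot") returns (["HOT"], "# Not so hot").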
-- jeffrey
--
You received this message because you are subscribed to the
Google Groups "marpa parser" group.
To unsubscribe from this group and stop receiving emails
from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout
<https://groups.google.com/d/optout>.