Oops!  Ignore that last!  It suffers from the same problem -- the longest 
token captures the match, and all the rest of that hack-ish apparatus I 
talked about would be ignored.
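
A toy sketch of the selection rule, in Python rather than Marpa itself, may 
make the failure concrete. It assumes <varchar> is an ASCII letter, reuses 
the lexeme names from the quoted suggestion below, and ignores the 
"acceptable" part of LATM (every lexeme is treated as acceptable at the 
current position). The function name latm_pick is made up for illustration 
and is not part of any Marpa API.

```python
import re

# Hypothetical lexeme table: (name, pattern, priority).
# The patterns assume <varchar> is an ASCII letter.
LEXEMES = [
    ("PA_command", re.compile(r"PA"), 2),
    ("PR_command", re.compile(r"PR"), 2),
    ("var2", re.compile(r"[A-Za-z]{2}"), 1),
    ("var_catch", re.compile(r"[A-Za-z]+"), 0),
]


def latm_pick(text, pos=0):
    """Return (name, matched_text) for the winning lexeme at pos, or None."""
    candidates = []
    for name, pattern, priority in LEXEMES:
        m = pattern.match(text, pos)
        if m:
            candidates.append((len(m.group()), priority, name, m.group()))
    if not candidates:
        return None
    # Length is compared first; priority only breaks ties at equal length.
    _length, _priority, name, matched = max(candidates, key=lambda c: (c[0], c[1]))
    return name, matched


print(latm_pick("PA"))   # ('PA_command', 'PA')
print(latm_pick("PAx"))  # ('var_catch', 'PAx')
```

For "PAx" the catch-all is the only candidate of length 3, so the priorities 
on the length-2 lexemes are never consulted.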

I've been doing mainly LaTeX these days and obviously it is turning my 
brain into mush.

Sorry!

On Wednesday, December 2, 2020 at 5:55:24 PM UTC-5 Jeffrey Kegler wrote:

> How many commands?  One approach that I just thought of and have never 
> tested is to have fixed length variables, prioritized versus the commands 
> of that length.
>
> Off the top of my head:
>
> var2 ~ <varchar><varchar> priority=>1
> PA_command ~ 'PA' priority=>2
> PR_command ~ 'PR' priority=>2
>
> include a catch-all var for lengths not specifically accounted for
>
> var_catch ~ <varchar>+ priority=>0
>
> <var_catch> will always lose to lexemes of the same length, but will catch 
> those variables whose length is not the same as any command.
>
> A little hack-ish, but should be very fast, and perhaps no more hack-ish 
> than alternatives.
>
> Again, not tested, so you'll be a pioneer!  If you try it, let me know!
>
> Hope this helps, jeffrey
>
>
> On Wed, Dec 2, 2020 at 5:42 PM Dean S <[email protected]> wrote:
>
>>
>> Thank you very much for your response!
>>
>> 1) Multiline: The language does have a "_" line-joining character, but 
>> the grammar wouldn't have to support that - it could be done with a trivial 
>> preprocessor. Once joined, commands may not span multiple lines.
>>
>> 2) Command/variable upper-case: Commands are always upper case, but there 
>> are no case restrictions on variables.
>>
>>
>> So it sounds, however, like there isn't a straight-forward grammar or 
>> option tweak. That's ok. The language has fancy expressions (algebraic 
>> expressions, function calls, strings, comments, and arrays), but its 
>> statement structure is extremely simplistic. The terminators (newline and 
>> semi-colon) are not allowed anywhere except as terminators (no escapes, not 
>> in strings, not in comments). So, as a practical solution, I should be able 
>> to dumb-split a program on terminators, look at the first characters of 
>> each statement, strip off the command or variable assignment part and parse 
>> the rest as an expression - which follows more reasonable rules that the 
>> LATM will like.
>>
>> So, I guess this falls to the "handled in easier faster ways" approach 
>> which I guess should have been obvious but I failed to think of.
>>
>> Thank you for your time, and a great library!
>>
>>   - Dean
>>
>>
>> On 12/2/20 4:28 PM, Jeffrey Kegler wrote:
>> > I'll first describe your immediate problem, then ask a couple Q's.
>> > 
>> > The problem: Lexing is LATM -- *Longest* Acceptable Token Matching.
>> > The lexeme priority is a tie breaker, used when tokens are the same
>> > length.  When your grammar fails, "PAx" is your longest token, and the
>> > only choice at length 3.  "PA" is only 2 chars long, and lexemes of
>> > different length are not compared for priority.
>> > 
>> > (Btw the reason for this is, as implemented, lexeme priorities can be
>> > (and are) tested in a few machine instructions.  If Marpa needed to look
>> > at earlier possibilities, the logic would get vastly more complex,
>> > efficiency would go out the window, and you would get into the territory
>> > where the grammar can often be handled in easier faster ways.)
>> > 
>> > Now the questions:
>> > 
>> > 1.) I notice statements cannot be multiline.  Is that the intent going
>> > forward?
>> > 
>> > 2.) In the example, commands always begin with a capital letter,
>> > variables never do.  Will that continue to be the case?  (If so, it
>> > points to an easy, fast solution.)
>> > 
>> > Possible solutions, depending, include finding something that
>> > distinguishes commands from variables in the lexer; custom lexers; using
>> > events to guide custom lexing; and character-by-character lexing, whereby
>> > you handle your own whitespace.
>> > 
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "marpa parser" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>>
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/marpa-parser/270ecda3-6917-f717-593b-051ded20629d%40gmail.com
>> .
>>
>
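
The split-first approach Dean describes above could be sketched roughly as 
follows. This is only an illustration under stated assumptions: the command 
list holds just the two names that appear in this thread, the assignment 
syntax "name = expression" is a guess (the thread does not spell it out), 
assignments are tried before commands, and the helper names 
split_statements and classify are hypothetical. The expression text that 
comes back would then be handed to the real expression parser.

```python
import re

# Assumed command names, taken from the examples in this thread.
COMMANDS = ("PA", "PR")

# Hypothetical assignment syntax: identifier, "=", then the expression.
ASSIGN = re.compile(r"([A-Za-z][A-Za-z0-9]*)\s*=\s*(.*)")


def split_statements(program):
    """Split on newline or semicolon, which only ever act as terminators."""
    return [s.strip() for s in re.split(r"[;\n]", program) if s.strip()]


def classify(statement):
    """Return (kind, head, expression_text) for one statement."""
    m = ASSIGN.fullmatch(statement)
    if m:
        return ("assign", m.group(1), m.group(2))
    for cmd in COMMANDS:
        if statement.startswith(cmd):
            # Strip the command; the rest is parsed as an expression.
            return ("command", cmd, statement[len(cmd):].strip())
    return ("unknown", "", statement)


for stmt in split_statements("PAx;y = 1+2\nPR 3"):
    print(classify(stmt))
```

Because the command prefix is stripped before any lexing happens, "PAx" is 
read as the command PA followed by the expression "x", which is the case 
that longest-match lexing gets wrong.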

