How many commands are there?  One approach that I just thought of, and have
never tested, is to define fixed-length variable lexemes and prioritize them
against the commands of the same length.

Off the top of my head:

:lexeme ~ <var2> priority => 1
var2 ~ varchar varchar
:lexeme ~ <PA_command> priority => 2
PA_command ~ 'PA'
:lexeme ~ <PR_command> priority => 2
PR_command ~ 'PR'

Then include a catch-all variable lexeme for lengths not specifically
accounted for:

:lexeme ~ <var_catch> priority => 0
var_catch ~ varchar+

<var_catch> will always lose to higher-priority lexemes of the same length,
but will catch those variables whose length does not match that of any command.
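
To make the length-then-priority rule concrete, here is a toy model in
Python. It is not Marpa's API, only a sketch of the selection logic: the
longest match wins, and priority breaks ties among matches of that length.
The lexeme names and priorities mirror the untested sketch above; treating
<varchar> as an ASCII letter is my assumption, since the thread never
defines it.

```python
# Toy model of LATM lexing with priority tie-breaking; this is NOT Marpa's API.
import re

# (name, regex, priority). Assumption: a varchar is a single ASCII letter.
LEXEMES = [
    ("var2", r"[A-Za-z]{2}", 1),
    ("PA_command", r"PA", 2),
    ("PR_command", r"PR", 2),
    ("var_catch", r"[A-Za-z]+", 0),
]


def longest_match(text, pos=0):
    """Return (name, lexeme) for the winning lexeme at pos, or None."""
    candidates = []
    for name, pattern, priority in LEXEMES:
        m = re.compile(pattern).match(text, pos)
        if m:
            candidates.append((len(m.group()), priority, name, m.group()))
    if not candidates:
        return None
    # Length is compared first; priority only breaks ties at that length.
    _, _, name, lexeme = max(candidates, key=lambda c: (c[0], c[1]))
    return name, lexeme


for sample in ("PA", "PR", "xy", "PAx"):
    print(sample, "->", longest_match(sample))
```

In this model "PA" and "PR" go to the commands, "xy" goes to var2, and
"PAx" still goes to var_catch, because it is the only match at length 3.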

A little hack-ish, but it should be very fast, and perhaps no more hack-ish
than the alternatives.

Again, not tested, so you'll be a pioneer!  If you try it, let me know!

Hope this helps, jeffrey


On Wed, Dec 2, 2020 at 5:42 PM Dean S <[email protected]> wrote:

>
> Thank you very much for your response!
>
> 1) Multiline: The language does have a "_" line-joining character, but the
> grammar wouldn't have to support that - it could be done with a trivial
> preprocessor. Once joined, commands may not span multiple lines.
>
> 2) Command/variable upper-case: Commands are always upper case, but there
> are no case restrictions on variables.
>
>
> So it sounds, however, like there isn't a straightforward grammar or
> option tweak. That's ok. The language has fancy expressions (algebraic
> expressions, function calls, strings, comments, and arrays), but its
> statement structure is extremely simplistic. The terminators (newline and
> semi-colon) are not allowed anywhere except as terminators (no escapes, not
> in strings, not in comments). So, as a practical solution, I should be able
> to dumb-split a program on terminators, look at the first characters of
> each statement, strip off the command or variable assignment part and parse
> the rest as an expression - which follows more reasonable rules that the
> LATM will like.
>
> So, I guess this falls under the "handled in easier faster ways" approach,
> which I guess should have been obvious but which I failed to think of.
>
> Thank you for your time, and a great library!
>
>   - Dean
>
>
> On 12/2/20 4:28 PM, Jeffrey Kegler wrote:
> > I'll first describe your immediate problem, then ask a couple Q's.
> >
> > The problem: Lexing is LATM -- *Longest* Acceptable Token Matching.  The
> > lexeme priority is a tie breaker, used when tokens are the same length.
> > When your grammar fails, "PAx" is your longest token, and the only choice
> > at length 3.  "PA" is only 2 chars long, and lexemes of different length
> > are not compared for priority.
> >
> > (Btw the reason for this is that, as implemented, lexeme priorities can be
> > (and are) tested in a few machine instructions.  If Marpa needed to look at
> > earlier possibilities, the logic would get vastly more complex, efficiency
> > would go out the window, and you get into the territory where the grammar
> > can often be handled in easier faster ways.)
> >
> > Now the questions:
> >
> > 1.) I notice statements cannot be multiline.  Is that the intent going
> > forward?
> >
> > 2.) In the example, commands always begin with a capital letter,
> > variables never do.  Will that continue to be the case?  (If so, it points
> > to an easy, fast solution.)
> >
> > Possible solutions, depending, include finding something that
> > distinguishes commands from variables in the lexer; custom lexers; using
> > events to guide custom lexing; and character-by-character lexing, whereby
> > you handle your own whitespace.
> >
>
> --
> You received this message because you are subscribed to the Google Groups
> "marpa parser" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/marpa-parser/270ecda3-6917-f717-593b-051ded20629d%40gmail.com
> .
>
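
The split-on-terminators approach Dean describes in the quoted message can
be sketched roughly as follows, in Python for illustration only. Several
details are assumptions of mine and not stated in the thread: that "_"
joins lines when it is the last character on a line, that an assignment
looks like "name = expression", that variable names are ASCII letters, and
that a statement starting with a command name is always a command. Only
the two commands named in this thread are listed.

```python
import re

# Only the two commands named in this thread; a real list would be longer.
COMMANDS = ("PA", "PR")


def join_lines(source):
    # Assumption: a trailing "_" joins a line with the one that follows.
    return re.sub(r"_[ \t]*\n", "", source)


def split_statements(source):
    # Newline and semicolon are terminators and never occur inside a statement.
    parts = re.split(r"[;\n]", join_lines(source))
    return [p.strip() for p in parts if p.strip()]


def classify(statement):
    """Return (kind, head, rest); rest is the expression text to parse separately."""
    # Assumption: commands are checked first, so a command prefix always wins.
    for cmd in COMMANDS:
        if statement.startswith(cmd):
            return "command", cmd, statement[len(cmd):].strip()
    # Assumption: assignments have the hypothetical form "name = expression".
    m = re.match(r"([A-Za-z]+)\s*=\s*(.*)", statement)
    if m:
        return "assign", m.group(1), m.group(2)
    return "unknown", "", statement


program = "PA10;PR5\nx = 1 + _\n2"
for stmt in split_statements(program):
    print(classify(stmt))
```

The "rest" field is what would then be handed to the expression grammar,
which is the part Dean expects LATM to handle well.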

