Oops! Ignore that last! It suffers from the same problem -- the longest lexeme captures the match, so all the rest of that hack-ish apparatus I talked about would be ignored.
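For readers following along, here is a toy Python model of the "longest first, priority only breaks ties" rule, applied to the lexemes from the quoted suggestion. It is a sketch, not Marpa code: the `LEXEMES` table, the `latm_pick` function, and the assumption that `<varchar>` means an ASCII letter are all made up for illustration, and it ignores the "acceptable" part of LATM by treating every lexeme as acceptable at the current position.

```python
import re

# Hypothetical lexeme table mirroring the quoted suggestion: (name, pattern, priority).
# Assumption: <varchar> is an ASCII letter; the thread does not define it.
LEXEMES = [
    ("var2", r"[A-Za-z]{2}", 1),
    ("PA_command", r"PA", 2),
    ("PR_command", r"PR", 2),
    ("var_catch", r"[A-Za-z]+", 0),
]


def latm_pick(text, pos=0):
    """Pick a lexeme at pos: the longest match wins, priority only breaks ties."""
    candidates = []
    for name, pattern, priority in LEXEMES:
        m = re.compile(pattern).match(text, pos)
        if m:
            candidates.append((len(m.group()), priority, name, m.group()))
    if not candidates:
        return None
    longest = max(c[0] for c in candidates)
    # Only candidates of the winning length are ever compared on priority.
    same_length = [c for c in candidates if c[0] == longest]
    _, _, name, lexeme = max(same_length, key=lambda c: c[1])
    return name, lexeme


print(latm_pick("PA"))   # ('PA_command', 'PA')
print(latm_pick("PAx"))  # ('var_catch', 'PAx')
```

With input "PA" the three length-2 candidates tie and the command's priority wins. With "PAx" the catch-all is the only candidate at length 3, so the priorities on the shorter lexemes are never consulted.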
I've been doing mainly LaTeX these days and obviously it is turning my brain into mush. Sorry!

On Wednesday, December 2, 2020 at 5:55:24 PM UTC-5 Jeffrey Kegler wrote:

> How many commands? One approach that I just thought of and have never
> tested is to have fixed length variables, prioritized versus the commands
> of that length.
>
> Off the top of my head:
>
> var2 ~ <varchar><varchar> priority=>1
> PA_command ~ 'PA' priority=>2
> PR_command ~ 'PR' priority=>2
>
> include a catch-all var for lengths not specifically accounted for
>
> var_catch ~ <varchar>+ priority=>0
>
> <var_catch> will always lose to lexemes of the same length, but will catch
> those variables whose length is not the same as any command.
>
> A little hack-ish, but should be very fast, and perhaps no more hack-ish
> than alternatives.
>
> Again, not tested, so you'll be a pioneer! If you try it, let me know!
>
> Hope this helps, jeffrey
>
>
> On Wed, Dec 2, 2020 at 5:42 PM Dean S <[email protected]> wrote:
>
>>
>> Thank you very much for your response!
>>
>> 1) Multiline: The language does have a "_" line-joining character, but
>> the grammar wouldn't have to support that - it could be done with a trivial
>> preprocessor. Once joined, commands may not span multiple lines.
>>
>> 2) Command/variable upper-case: Commands are always upper case, but there
>> are no case restrictions on variables.
>>
>>
>> So it sounds, however, like there isn't a straight-forward grammar or
>> option tweak. That's ok. The language has fancy expressions (algebraic
>> expressions, function calls, strings, comments, and arrays), but its
>> statement structure is extremely simplistic. The terminators (newline and
>> semi-colon) are not allowed anywhere except as terminators (no escapes, not
>> in strings, not in comments).
>> So, as a practical solution, I should be able
>> to dumb-split a program on terminators, look at the first characters of
>> each statement, strip off the command or variable assignment part and parse
>> the rest as an expression - which follows more reasonable rules that the
>> LATM will like.
>>
>> So, I guess this falls to the "handled in easier faster ways" approach
>> which I guess should have been obvious but I failed to think of.
>>
>> Thank you for your time, and a great library!
>>
>> - Dean
>>
>>
>> On 12/2/20 4:28 PM, Jeffrey Kegler wrote:
>> > I'll first describe your immediate problem, then ask a couple Q's.
>> >
>> > The problem: Lexing is LATM -- *Longest* Acceptable Token Matching.
>> > The lexeme priority is a tie breaker, used when tokens are the same
>> > length. When your grammar fails, "PAx" is your longest token, and the only
>> > choice at length 3. "PA" is only 2 chars long, and lexemes of different
>> > length are not compared for priority.
>> >
>> > (Btw the reason for this is, as implemented, lexeme priorities can be
>> > (and are) tested in a few machine instructions. If Marpa needed to look at
>> > earlier possibilities, the logic gets vastly more complex, efficiency goes
>> > out the window, and you get into the territory where the grammar can often
>> > be handled in easier faster ways.)
>> >
>> > Now the questions:
>> >
>> > 1.) I notice statements cannot be multiline. Is that the intent going
>> > forward?
>> >
>> > 2.) In the example, commands always begin with a capital letter,
>> > variables never do. Will that continue to be the case? (If so, it points
>> > to an easy, fast solution.)
>> >
>> > Possible solutions, depending, include finding something that
>> > distinguishes commands from variables in the lexer; custom lexers; using
>> > events to guide custom lexing; and character-by-character lexing, whereby
>> > you handle your own whitespace.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "marpa parser" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/marpa-parser/270ecda3-6917-f717-593b-051ded20629d%40gmail.com
>> .
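The "dumb-split" workaround Dean describes in the quoted message could look roughly like the Python sketch below. Everything concrete in it is an assumption: the thread names only PA and PR as commands, the `name = expression` assignment form and the sample program are invented for illustration, and the rule that a leading command name always wins over a variable with the same prefix is a guess at the intended behaviour. The "_" line joining is left to a separate preprocessor, as the message suggests.

```python
import re

# Hypothetical command set; the thread only names PA and PR.
COMMANDS = ("PA", "PR")


def split_statements(program):
    """Split on newline or semicolon, which per the thread occur only as terminators."""
    return [s.strip() for s in re.split(r"[;\n]", program) if s.strip()]


def classify(statement):
    """Return (kind, head, rest); rest is what would go to the expression parser."""
    # Assumption: a statement that starts with a command name is that command,
    # even if the following characters could extend it into a variable name.
    for cmd in COMMANDS:
        if statement.startswith(cmd):
            return ("command", cmd, statement[len(cmd):].strip())
    # Assumed assignment syntax: name = expression
    m = re.match(r"([A-Za-z_]\w*)\s*=\s*(.*)", statement)
    if m:
        return ("assign", m.group(1), m.group(2))
    return ("unknown", "", statement)


program = "PAx,y;total = a + 1\nPR1,2"  # made-up sample input
for stmt in split_statements(program):
    print(classify(stmt))
```

Only the `rest` field would then be handed to a Marpa grammar for expressions, where longest-match lexing no longer competes with the command names.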
