> For scripting language implementations, I tend to disagree.

The rule that lexing costs outweigh parsing costs is very old and pretty consistent. I recall getting a ~40% speedup from changing one assembly-language instruction in the tokenizer of a text processor on the old Honeywell mainframe. However, the rule assumes you're comparing just the raw parsing (i.e., no actions). Obviously, a real implementation could be doing arbitrary amounts of work "in the parser", but that doesn't obviate the rule that handwriting the lexer is almost always a better source of speedup than handwriting the parser.
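
To make the per-character cost concrete, here is a minimal sketch in C of the kind of handwritten scanner loop the rule is about. It is not the tokenizer from the anecdote above; the token kinds, the `char_class` table, and the names `init_char_class` and `lex_all` are all hypothetical, and it assumes ASCII input in a NUL-terminated buffer. The point is only that every input byte passes through the inner loop once, so a single table lookup or branch there is paid for the whole file.

```c
#include <stddef.h>

/* Hypothetical token kinds for this sketch. */
enum tok_kind { TOK_IDENT, TOK_NUMBER, TOK_PUNCT };

struct token {
    enum tok_kind kind;
    const char *start;   /* points into the source buffer; nothing is copied */
    size_t len;
};

/* Character classes, looked up once per input byte. */
enum { C_OTHER, C_SPACE, C_ALPHA, C_DIGIT };

static unsigned char char_class[256];

/* Fill the lookup table. Assumes ASCII, where letters and digits are contiguous. */
static void init_char_class(void)
{
    int c;

    for (c = 0; c < 256; c++) char_class[c] = C_OTHER;
    for (c = 'a'; c <= 'z'; c++) char_class[c] = C_ALPHA;
    for (c = 'A'; c <= 'Z'; c++) char_class[c] = C_ALPHA;
    for (c = '0'; c <= '9'; c++) char_class[c] = C_DIGIT;
    char_class['_'] = C_ALPHA;
    char_class[' '] = C_SPACE;
    char_class['\t'] = C_SPACE;
    char_class['\n'] = C_SPACE;
    char_class['\r'] = C_SPACE;
}

/* Scan a NUL-terminated string into out[0..max) and return the token count.
 * NUL is classed as C_OTHER, so the inner loops stop at the end of input. */
static size_t lex_all(const char *src, struct token *out, size_t max)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;

    while (*p != '\0' && n < max) {
        const unsigned char *s = p;

        switch (char_class[*p]) {
        case C_SPACE:
            p++;
            continue;               /* skip whitespace, emit nothing */
        case C_ALPHA:
            do p++; while (char_class[*p] == C_ALPHA || char_class[*p] == C_DIGIT);
            out[n].kind = TOK_IDENT;
            break;
        case C_DIGIT:
            do p++; while (char_class[*p] == C_DIGIT);
            out[n].kind = TOK_NUMBER;
            break;
        default:
            p++;                    /* any other byte is a one-character token */
            out[n].kind = TOK_PUNCT;
            break;
        }
        out[n].start = (const char *)s;
        out[n].len = (size_t)(p - s);
        n++;
    }
    return n;
}
```

Call `init_char_class()` once, then `lex_all(source, tokens, capacity)`; the tokens come back as a plain array that a consumer can walk without any further calls into the scanner.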

Also, of course, if you are reading your source code from a file, *that* invokes the rule that I/O swamps everything else, which has only gotten more true over the decades. The person wanting to have fun handwriting for efficiency on a modern Intel architecture should really consider making the I/O, lexing, and parsing take place in concurrent threads. This requires making the lexing stateless and buffered (so the tokens tend to migrate out of the lexer CPU's cache naturally before the parser CPU tries to fetch them), so that most of the time the parser fetches a token by incrementing a pointer. Speaking from experience in C, any per-token function call in the parser can be a disaster for really high-speed operation. IMHO.

On Fri, Oct 2, 2015 at 3:22 PM, Eric Wong <normalper...@yhbt.net> wrote:

> John Levine <jo...@iecc.com> wrote:
> > In article <camk0+vvsk082jz_c_uc7moforfmk+katvx423rs4xvkt7wh...@mail.gmail.com> you write:
> > >- yes, after tweaking, your manual parser will probably be faster.
> > >- but that assumes you put all the necessary time into tweaking
> > >- and you put in all the necessary time to get it functionally correct in
> > >the first place
> > >- but re-implementing Bison's nifty error unrolling is considered Extremely
> > >Nontrivial.
> >
> > A more important point is that the time spent in the parser is never
> > significant. If your compiler is simple, the bulk of the time is
> > in the lexer since it has to touch each character in the input. If
> > your compiler is sophisticated, it'll spend its time in analysis and
> > optimization.
>
> For scripting language implementations, I tend to disagree.
>
> With the C implementation of Ruby, the parser usually shows up at
> or near the top of profiles for short-lived scripts.
>
> Hoping to learn more about Bison and speeding up Ruby's parsing
> is why I started following this list, anyhow :)
>
> git-svn (Perl) startup time is atrocious, too, profiling showed much of
> that coming from the parser as well. I am not at all familiar with Perl
> internals, however.
>
> _______________________________________________
> help-bison@gnu.org https://lists.gnu.org/mailman/listinfo/help-bison