Hi Adrian,

Just a quick note about this part of your answer:
> On 10 Mar 2020, at 11:01, Adrian Vogelsgesang <[email protected]> wrote:
>
>> I'm also wondering whether we should always have a pstate structure, even
>> for traditional pull parsers.
>
> I was wondering the same thing.
> I would have gone even one point further and ask: shouldn't we just implement
> the pull-parser on top of the push parser? I.e., always generate a
> push-parser. But if the user requested a pull-parser, just allocate the
> push-parser-state on the stack, add a while loop which feeds the push-parser
> and be done with it.

Absolutely.  It makes perfect sense.

Unfortunately, according to measurements I made a couple of days ago, it's
significantly more expensive.  It's probably not measurable in real-life
parsers, but on a simple calculator there is a very clear difference, at least
on my machine.  See the attached zip to try it on yours, with your favorite
compiler.

I was hoping that the compiler would be able to decompose the pstate structure
back into several independent variables, some of them on the stack, some in
registers, etc.  But given the results, that is clearly not what happens at
-O2.  It might do something like that at -O3, where the results are better,
but even then the regular pull parser at -O2 is still faster than
pull-on-top-of-push at -O3.

Here are my figures:

CXX = clang++-mp-9.0 -O2 -isystem /usr/local/include -L/usr/local/lib
================= pull
BM_parse        13847 ns      13788 ns       205465
================= pull-on-push
BM_parse        21867 ns      21785 ns       118286
================= pull-on-pstate
BM_parse        19045 ns      18985 ns       138226

pull-on-pstate is written by hand: it's an experiment where the pstate is a
local variable of yyparse.  pull-on-push is really api.push-pull=both, using
its yyparse(), which invokes yypush_parse repeatedly with a heap-allocated
pstate.

Here are the results with -O3:

CXX = clang++-mp-9.0 -O3 -isystem /usr/local/include -L/usr/local/lib
================= pull
BM_parse        14659 ns      14607 ns       188840
================= pull-on-push
BM_parse        22548 ns      22314 ns       123411
================= pull-on-pstate
BM_parse        14974 ns      14916 ns       185864

Be sure to have Google Benchmark installed and reachable by your compiler.
Edit the Makefile for your environment, and run "make bench".

BTW, etc/bench.pl does not seem to be precise enough to measure any difference
here:

$ ./_build/g9d/etc/bench.pl -g calc [ %d api.push-pull=both ]
Possible attempt to put comments in qw() list at ./_build/g9d/etc/bench.pl line 938.
Entering directory `benches/38'
Using bison=/Users/akim/src/gnu/bison/_build/g9d/tests/bison.
1. %define api.push-pull both
2.
Benchmark: timing 100 iterations of 1, 2...
         1: 15.992 wallclock secs (15.71 cusr +  0.14 csys = 15.85 CPU) @  6.31/s (n=100)
         2: 15.4657 wallclock secs (15.20 cusr +  0.13 csys = 15.33 CPU) @  6.52/s (n=100)
     Rate    1    2
1  6.31/s   --  -3%
2  6.52/s   3%   --
Sizes (decreasing):
  1: 13.53kB
  2: 13.24kB

That is frightening, since in the past I relied on it a lot to decide whether
changes were OK performance-wise...  I will try to convert it to use Google
Benchmark, to get more precise results.

Cheers!
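
P.S.  To make the pull-on-push variant concrete, here is a minimal sketch of a
pull yyparse written on top of the push-parser API from the Bison manual.  It
assumes the declarations of a Bison-generated push parser (yypstate,
yypush_parse, YYPUSH_MORE) and a traditional yylex that sets the global
yylval; it is not the exact code api.push-pull=both generates, just the
general shape:

  /* Sketch: pull parsing driven by the push API.  Assumes the
     Bison-generated push-parser declarations and a traditional
     yylex/yylval; not the exact generated code.  */
  int
  yyparse (void)
  {
    yypstate *ps = yypstate_new ();   /* parser state on the heap */
    int status;
    if (!ps)
      return 2;                       /* memory exhaustion, as yyparse reports it */
    do
      {
        int token = yylex ();         /* fills the global yylval */
        status = yypush_parse (ps, token, &yylval);
      }
    while (status == YYPUSH_MORE);    /* keep feeding tokens until done */
    yypstate_delete (ps);
    return status;
  }

The heap-allocated pstate, and the indirection through it on every
yypush_parse call, are presumably what the hand-written pull-on-pstate variant
avoids by keeping the state as a local of yyparse.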
<<attachment: push-pull-parser.zip>>
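
The BM_parse figures above come from a Google Benchmark harness; a minimal
harness of that kind looks roughly like this (a sketch only; the attached zip
has the real one, including the per-iteration input handling elided here):

  #include <benchmark/benchmark.h>

  int yyparse ();   // Bison-generated pull entry point (assuming the parser is
                    // compiled as C++, as with the clang++ line above; add
                    // extern "C" if it is built as C).

  static void BM_parse (benchmark::State& state)
  {
    // One full parse per iteration; resetting the lexer input between
    // iterations is elided here.
    for (auto _ : state)
      yyparse ();
  }
  BENCHMARK (BM_parse);

  BENCHMARK_MAIN ();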
