Hi Hans,

On Wed, 2023-08-30 at 10:26 +0200, Hans Hagen wrote:
> so the tl 2023 bin is slower but when you compile fresh the performance
> is the same? if so ... then why waste time on it if generating a new bin
> solves the problem?

TL doesn't like updating binaries mid-year without a good reason, but
this is a pretty significant drop in performance. This issue only
affects a single architecture/OS, so an update shouldn't be too hard.
Hopefully this thread will persuade Karl/Norbert :)

> Factorial does little (not spread all over token space). Some mem access
> for registers, a little amount of macro tokens that likely sit in the
> cpu cache. Plus making a macro that gets larger body every iteration (so
> that is actually the bottleneck as it involved copying tokens). As you
> start ini tokens are not scattered that much.

Rebuilding the LaTeX format is a much broader test (loading expl3 does
lots of stuff) and the results there matched up pretty well with the
factorial test.

> So, do you see the same 50 % drop with the current luatex when you
> compile without O3 ?

Ok, good question.

"factorial.tex" is the ini-mode factorial test file from earlier in the
thread. "O0" is a freshly-built LuaTeX v1.17.0 built with the LuaTeX
build.sh and "CFLAGS=-O0", "O3" is the same with "CFLAGS=-O3",
"tl23-orig" is the LuaTeX v1.16.0 distributed with the initial release
of TL23, and "tl23-current" is the LuaTeX v1.17.0 in TL23 right now.

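
To make the mapping between those four names and the benchmark concrete:
each name is a directory under /tmp/texlive-testing, and hyperfine's -L
parameter list substitutes each value for {ver} in the command in turn.
A rough sketch of the equivalent expansion as a plain shell loop (the
function name "variant_cmds" is made up for illustration, and it only
prints the command lines, it does not run luatex):

```shell
# Hypothetical helper: print the command line that would be timed for
# each binary variant. Nothing is executed here.
variant_cmds() {
    for ver in O0 O3 tl23-orig tl23-current; do
        # Each variant lives in its own tree; only PATH changes.
        printf 'PATH=/tmp/texlive-testing/%s/bin/x86_64-linux:/bin/ luatex -ini factorial.tex\n' "$ver"
    done
}

variant_cmds
```

Since only PATH differs between the runs, the test file and the
invocation are identical for all four binaries.
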
Results:

$ hyperfine --warmup 2 \
    -L ver O0,O3,tl23-orig,tl23-current \
    'PATH=/tmp/texlive-testing/{ver}/bin/x86_64-linux:/bin/ luatex -ini factorial.tex'
Benchmark 1: PATH=/tmp/texlive-testing/O0/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
  Time (mean ± σ):      5.257 s ±  0.056 s    [User: 5.232 s, System: 0.024 s]
  Range (min … max):    5.202 s …  5.332 s    10 runs

Benchmark 2: PATH=/tmp/texlive-testing/O3/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
  Time (mean ± σ):      3.713 s ±  0.047 s    [User: 3.690 s, System: 0.023 s]
  Range (min … max):    3.646 s …  3.770 s    10 runs

Benchmark 3: PATH=/tmp/texlive-testing/tl23-orig/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
  Time (mean ± σ):      3.941 s ±  0.044 s    [User: 3.919 s, System: 0.022 s]
  Range (min … max):    3.853 s …  3.973 s    10 runs

Benchmark 4: PATH=/tmp/texlive-testing/tl23-current/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
  Time (mean ± σ):      5.447 s ±  0.056 s    [User: 5.420 s, System: 0.026 s]
  Range (min … max):    5.386 s …  5.516 s    10 runs

Summary
  PATH=/tmp/texlive-testing/O3/bin/x86_64-linux:/bin/ luatex -ini factorial.tex ran
    1.06 ± 0.02 times faster than PATH=/tmp/texlive-testing/tl23-orig/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
    1.42 ± 0.02 times faster than PATH=/tmp/texlive-testing/O0/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
    1.47 ± 0.02 times faster than PATH=/tmp/texlive-testing/tl23-current/bin/x86_64-linux:/bin/ luatex -ini factorial.tex

So yeah, it looks like the original TL23 binary was -O3, but the current
one is now -O0.

Karl, could you please rebuild the TL LuaTeX binaries with -O3 for
x86_64-linux? I think that that should solve this problem.

Thanks,
-- Max