On Wed, 24 Jan 2024 at 17:59, Marco Pivetta <ocram...@gmail.com> wrote:
>
> Depends on the actual numbers: is there any way to make a comparison that
> is relatively stable across architectures?
>
> Would it be feasible to start with the
> cross-platform-let-the-compiler-do-its-job version (that somebody may
> actually be capable of auditing), and then introduce other versions when
> the jump is significant enough?
>

I don't know about "relatively stable across architectures", but I wrote some benchmarking code; keep reading.

On Wed, 24 Jan 2024 at 17:55, tag Knife <fennic...@gmail.com> wrote:
> Should we even be considering the specific instruction implementations?
> I've always been in the camp
> of you are not smarter than the compiler. As even the best human written
> ASM code can be slower
> than the obscure instructions the compiler might choose to use in a weird
> and wonderful way.

The BLAKE3 team is smarter than GCC 11.4. Even with -march=native -mtune=native, which is *not* commonly used for PHP builds, the compiler didn't stand a chance against the hand-optimized assembly versions. I wrote some benchmarks; the TL;DR, on my AMD Ryzen 9 7950X, is:

portable -O2 (what PHP usually uses): 1126 MB/s
portable -O2 -march=native: 533 MB/s (wtf? gcc obviously got something wrong here)
hand-written -O2 SSE2: 3144 MB/s
hand-written -O2 SSE4.1: 3332 MB/s
hand-written -O2 AVX2: 6554 MB/s
hand-written -O2 AVX-512: 8913 MB/s

benchmarking code:
https://gist.github.com/divinity76/5729472dd5d77e94cd0acb245aac2226

raw output:

array(6) {
  ["O2-portable-march"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(29295)
    ["mb_per_second"]=>
    float(533.3674688513398)
  }
  ["O2-portable"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(13876)
    ["mb_per_second"]=>
    float(1126.0449697319111)
  }
  ["O2-sse2"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(4969)
    ["mb_per_second"]=>
    float(3144.4958744214127)
  }
  ["O2-sse41"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(4688)
    ["mb_per_second"]=>
    float(3332.977815699659)
  }
  ["O2-avx2"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(2384)
    ["mb_per_second"]=>
    float(6554.1107382550335)
  }
  ["O2-avx512"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(1753)
    ["mb_per_second"]=>
    float(8913.291500285226)
  }
}

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php
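
For anyone who wants the relative numbers rather than raw MB/s, here is a minimal Python sketch that turns the figures from the raw output above into speedups against the portable -O2 build. The function names (`speedups`, `implied_megabytes`) are made up for this illustration and are not part of the linked gist. The second helper multiplies each reported throughput by its timing to see how much data each run appears to cover; that is an inference from the numbers only, not something the benchmark script is known to state.

```python
# Figures copied from the raw output above: (microseconds, reported MB/s).
results = {
    "O2-portable-march": (29295, 533.3674688513398),
    "O2-portable": (13876, 1126.0449697319111),
    "O2-sse2": (4969, 3144.4958744214127),
    "O2-sse41": (4688, 3332.977815699659),
    "O2-avx2": (2384, 6554.1107382550335),
    "O2-avx512": (1753, 8913.291500285226),
}

BASELINE = "O2-portable"  # assumption: compare against the usual PHP build flags


def speedups(data, baseline=BASELINE):
    """Return each build's speedup relative to the baseline (higher is faster)."""
    base_us = data[baseline][0]
    return {name: base_us / us for name, (us, _) in data.items()}


def implied_megabytes(data):
    """Return the data volume per timed run implied by MB/s * seconds."""
    return {name: mbps * us / 1_000_000 for name, (us, mbps) in data.items()}


if __name__ == "__main__":
    for name, ratio in speedups(results).items():
        print(f"{name:18s} {ratio:5.2f}x vs {BASELINE}")
    for name, mb in implied_megabytes(results).items():
        print(f"{name:18s} {mb:.3f} MB per timed run")
```

By this measure the AVX-512 build comes out at roughly 7.9x the portable -O2 build, and the -march=native build at roughly 0.47x. Every row also implies the same 15.625 MB per timed run, so the "16_kib" in the key name probably does not describe a single 16 KiB hash; treat that as a guess until checked against the gist.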