> I'm curious about where exactly the regression is coming from. Is it possible
> that your build for the SSE 4.2 tests was using it unconditionally, i.e.,
> optimizing away the function pointer?

I am calling the SSE 4.2 implementation directly; I am not even building the pg_sse42_*_choose.c file with the AVX512 choice. As best I can tell, there is one extra function call and one extra int64 conditional test when the input is shorter than 256 bytes, and of course a JMP instruction to skip the AVX512 implementation.

Paul
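
For illustration, the short-input path described above can be sketched roughly as follows. This is not the code from the patch: the function names, the AVX512_MIN_LEN constant, and the portable bitwise CRC-32C loop standing in for both the SSE 4.2 and AVX512 routines are all assumptions made so the example compiles on its own. It only shows where the extra length test and the extra call sit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold, taken from the "<256 bytes" figure above. */
#define AVX512_MIN_LEN 256

/*
 * Stand-in for the SSE 4.2 routine: a portable bitwise CRC-32C update
 * (reflected polynomial 0x82F63B78). The real routine would use the
 * hardware CRC32 instruction instead.
 */
static uint32_t
crc32c_sse42_standin(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len-- > 0)
    {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/*
 * Stand-in for the AVX512 routine. It produces the same result; a real
 * implementation would process the buffer in wide vector blocks.
 */
static uint32_t
crc32c_avx512_standin(uint32_t crc, const void *data, size_t len)
{
    return crc32c_sse42_standin(crc, data, len);
}

/*
 * Entry point called directly, with no function pointer involved.
 * Short inputs pay for one length comparison and one extra call before
 * reaching the SSE 4.2 path; the branch skips the AVX512 code.
 */
static uint32_t
crc32c_dispatch(uint32_t crc, const void *data, size_t len)
{
    if (len < AVX512_MIN_LEN)
        return crc32c_sse42_standin(crc, data, len);

    return crc32c_avx512_standin(crc, data, len);
}
```

Whether a compiler inlines the inner call or keeps it as a real call depends on the build flags, so the cost in an actual build may differ from what this sketch suggests.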