On 24/12/2024 16:03, Sam Russell wrote:
> I've released a new paper here https://arxiv.org/abs/2412.16398 and this was the easiest algorithm to implement from it. It gets a 5-20% speedup for SSE/AVX1 and diminishing returns for AVX2/AVX512.
Ignoring this as it looks applicable to gnulib rather than coreutils, and I think you've already landed this in gnulib.

cheers,
Pádraig
