On Thu, Nov 20, 2025 at 03:55:43PM +0300, Nazir Bilal Yavuz wrote:
> On Thu, 20 Nov 2025 at 00:01, Nathan Bossart <[email protected]> wrote:
>> + /* Load a chunk of data into a vector register */
>> + vector8_load(&chunk, (const uint8 *)
>> &copy_input_buf[input_buf_ptr]);
>>
>> In other places, processing 2 or 4 vectors of data at a time has proven
>> faster. Have you tried that here?
>
> Sorry, I could not find the related code piece. I only saw the
> vector8_load() inside of hex_decode_safe() function and its comment
> says:
>
> /*
>  * We must process 2 vectors at a time since the output will be half the
>  * length of the input.
>  */
>
> But this does not mention any speedup from using 2 vectors at a time.
> Could you please show the related code?

See pg_lfind32().

-- 
nathan
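
For readers without the tree at hand, here is a rough sketch of the
"several vectors per loop iteration" pattern being referred to. It is not
the actual pg_lfind32() code: vec32x4 and the vec_* helpers are
hypothetical stand-ins for a SIMD register and its operations (plain C, so
it builds anywhere), and lfind32_sketch() is an illustrative name. The
point is only the loop shape: load four registers' worth of data, combine
the comparison results, and test once per iteration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a SIMD register with four 32-bit lanes. */
#define LANES 4
typedef struct { uint32_t lane[LANES]; } vec32x4;

/* Load LANES consecutive elements (stand-in for a vector load). */
static inline vec32x4 vec_load(const uint32_t *p)
{
    vec32x4 v;
    memcpy(v.lane, p, sizeof(v.lane));
    return v;
}

/* Lane-wise compare against key: all-ones where equal, zero otherwise. */
static inline vec32x4 vec_eq(vec32x4 a, uint32_t key)
{
    vec32x4 r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = (a.lane[i] == key) ? UINT32_MAX : 0;
    return r;
}

/* Lane-wise OR, used to merge several comparison results. */
static inline vec32x4 vec_or(vec32x4 a, vec32x4 b)
{
    vec32x4 r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = a.lane[i] | b.lane[i];
    return r;
}

/* True if any lane is nonzero. */
static inline bool vec_any(vec32x4 v)
{
    uint32_t acc = 0;
    for (int i = 0; i < LANES; i++)
        acc |= v.lane[i];
    return acc != 0;
}

/* Linear search that handles four "registers" per loop iteration. */
bool lfind32_sketch(uint32_t key, const uint32_t *base, size_t nelem)
{
    const size_t per_iter = 4 * LANES;
    size_t i = 0;

    for (; i + per_iter <= nelem; i += per_iter)
    {
        vec32x4 r1 = vec_eq(vec_load(&base[i]), key);
        vec32x4 r2 = vec_eq(vec_load(&base[i + LANES]), key);
        vec32x4 r3 = vec_eq(vec_load(&base[i + 2 * LANES]), key);
        vec32x4 r4 = vec_eq(vec_load(&base[i + 3 * LANES]), key);

        /* One combined test per iteration instead of four. */
        if (vec_any(vec_or(vec_or(r1, r2), vec_or(r3, r4))))
            return true;
    }

    /* Scalar loop for whatever does not fill a whole block. */
    for (; i < nelem; i++)
        if (base[i] == key)
            return true;

    return false;
}
```

Whether the same shape helps in the COPY parsing loop would need to be
measured; the sketch shows the structure only, not a performance claim.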
