On Thu, Nov 20, 2025 at 03:55:43PM +0300, Nazir Bilal Yavuz wrote:
> On Thu, 20 Nov 2025 at 00:01, Nathan Bossart <[email protected]> wrote:
>> +            /* Load a chunk of data into a vector register */
>> +            vector8_load(&chunk, (const uint8 *) 
>> &copy_input_buf[input_buf_ptr]);
>>
>> In other places, processing 2 or 4 vectors of data at a time has proven
>> faster.  Have you tried that here?
> 
> Sorry, I could not find the related code piece. I only saw the
> vector8_load() inside of hex_decode_safe() function and its comment
> says:
> 
> /*
>  * We must process 2 vectors at a time since the output will be half the
>  * length of the input.
>  */
> 
> But this does not mention any speedup from using 2 vectors at a time.
> Could you please show the related code?

See pg_lfind32().

-- 
nathan


Reply via email to