Re: More speedups for tuple deformation

Chao Li Mon, 19 Jan 2026 20:33:51 -0800

> On Jan 20, 2026, at 08:11, David Rowley <[email protected]> wrote:
> 
> On Mon, 19 Jan 2026 at 18:48, Chao Li <[email protected]> wrote:
>> I reviewed the patch and traced some basic workflows. But I haven’t done a 
>> load test to compare performance differences with and without this patch, I 
>> will do that if I get some bandwidth later. Here comes some review comments:
>> 
>> 1 - tupmacs.h
>> ```
>> +       /* Create a mask with all bits beyond natts's bit set to off */
>> +       mask = 0xFF & ((((uint8) 1) << (natts & 7)) - 1);
>> +       byte = (~bits[lastByte]) & mask;
>> ```
>> 
>> When I read the code, I got an impression bits[lastByte] might overflow when 
>> natts % 8 == 0, so I traced the code, then I realized that, this function is 
>> only called when a row has null values, so that, when reaching here, natts % 
>> 8 != 0, otherwise it should return earlier within the for loop.
> 
> It certainly is possible to get to that part of the code when natts is
> a multiple of 8 and the tuple contains NULLs after that (we may not be
> deforming the entire tuple). The code you quoted that's setting "mask"
> in that case will produce a zero mask, resulting in not finding any
> NULLs. I don't quite see any risk of overflowing any of the types
> here.  If natts is 16 then effectively the code does 0xFF & ((1 << 0)
> - 1); so no overflow. Just left shift by 0 bits and bitwise AND with
> zero, resulting in the mask becoming zero.
> 
> How about if I write the comment as follows?
> 
> /*
> * Create a mask with all bits beyond natts's bit set to off.  The code
> * below will generate a zero mask when natts & 7 == 0.  When that happens
> * all bytes that need to be checked were done so in the loop above.  The
> * code below will create an empty mask and end up returning natts.  This
> * has been done to avoid having to write a special case to check if we've
> * covered all bytes already.
> */
> 

I’m sorry I didn’t express myself clearly, maybe I should have used “OOB” 
rather than “overflow". My real concern is about out-of-boundary read of 
bits[lastByte] when natts&7==0.

Say, natts is 16, then bits is 2 bytes long; lastByte = 16>>3 = 2, so bits[2] 
is a OOB read.

If first_null_attr() is only called when hasnulls==true, then it will never hit 
the OOB point, because it will return early from the “for” loop. In the current 
patch, which is true, so the OOB should never happen.

However, I don’t see any comment mentions something like “first_null_attr() 
should only be called when hasnulls is true. If in future one calls 
first_null_attr() in a situation where hasnulls == false, then the OOB will be 
triggered.

The comment you added explains that even if OOB happens, no matter what value 
is hold by bits[lastByte], because mask is 0, the final result is still 
correct, which is true, but OOB is still a concern. If the bits array happens 
to end exactly at the edge of a memory page, the OOB read bits[lastByte] may 
trigger a segment fault; and valgrind may detect the OOB and complain about it.

So, my original comment was that, we should at least add something to the 
header comment to mention “first_null_attr() should only be called when 
hasnulls is true. If we can add an Assert to ensure hasnulls is true, that 
would be even better.

But if we want first_null_attr() to be safe no matter hasnulls is true or 
false, I think we should avoid the OOB.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Re: More speedups for tuple deformation

Reply via email to