On Thu, 8 Sept 2022 at 01:22, David Rowley <dgrowle...@gmail.com> wrote:
>
> On Thu, 8 Sept 2022 at 01:05, Julien Rouhaud <rjuju...@gmail.com> wrote:
> > FYI lapwing isn't happy with this patch:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lapwing&dt=2022-09-07%2012%3A40%3A16.
>
> I'll look into it further.

Looks like my analysis wasn't that good in nodeWindowAgg.c.  The
reason it's crashing is due to int2int4_sum() returning
Int64GetDatumFast(transdata->sum).  For 64-bit machines,
Int64GetDatumFast() translates to Int64GetDatum() and that's
byval, so the MemoryContextContains() call is not triggered, but on
32-bit machines that's PointerGetDatum() and a byref type, and we're
returning a pointer to transdata->sum, which is part way into an
allocation.
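
To make the byref case concrete, here is a minimal standalone sketch. It is
not the PostgreSQL source: Datum is a stand-in typedef, int64_t replaces
int64, and sum_as_byref() and byref_offset_in_chunk() are hypothetical
helpers that only mimic what PointerGetDatum(&transdata->sum) hands back on
a 32-bit build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Datum: an integer wide enough to hold a pointer. */
typedef uintptr_t Datum;

typedef struct Int8TransTypeData
{
    int64_t count;
    int64_t sum;
} Int8TransTypeData;

/*
 * Hypothetical helper for the by-reference case: the Datum is the address
 * of the "sum" field, not the address of the allocation that holds it.
 */
Datum
sum_as_byref(Int8TransTypeData *transdata)
{
    return (Datum) &transdata->sum;
}

/*
 * Hypothetical helper: how many bytes past the start of the allocation
 * the returned pointer lands.
 */
size_t
byref_offset_in_chunk(Int8TransTypeData *transdata)
{
    /* "sum" follows "count" directly, so it is never at offset 0. */
    assert(offsetof(Int8TransTypeData, sum) == sizeof(int64_t));
    return (size_t) (sum_as_byref(transdata) - (Datum) transdata);
}
```

Calling byref_offset_in_chunk() on any Int8TransTypeData gives 8, so
anything that expects the pointer to sit right after a chunk header is
looking at the wrong bytes.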

Funnily, the struct looks like:

typedef struct Int8TransTypeData
{
    int64 count;
    int64 sum;
} Int8TransTypeData;

so the previous version of MemoryContextContains() would have
subtracted sizeof(void *) from &transdata->sum which, on this 32-bit
machine, would have pointed halfway up the "count" field.  That count
field seems like a good candidate for the "false positive" that the
previous comment in MemoryContextContains() mentioned. So it looks
like it had about a 1 in 2^32 chance of doing the wrong thing before.
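
A rough simulation of that false positive, for illustration only. It
assumes the old check read the pointer-sized word just before the given
pointer and compared it with the context address. The 4-byte pointer width
is hard-coded so it behaves the same on any host, and word_before_sum() and
old_style_contains() are hypothetical names, not PostgreSQL functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct Int8TransTypeData
{
    int64_t count;
    int64_t sum;
} Int8TransTypeData;

/* Pointer width being simulated: 4 bytes, as on a 32-bit build. */
#define SIM_PTR_SIZE 4

/*
 * Hypothetical helper: read the SIM_PTR_SIZE bytes that sit immediately
 * before the "sum" field.  That is byte offset 4 of the struct, which is
 * one half of "count" rather than a chunk header.
 */
uint32_t
word_before_sum(const Int8TransTypeData *transdata)
{
    uint32_t word;
    size_t   sum_off = offsetof(Int8TransTypeData, sum);

    assert(sum_off >= SIM_PTR_SIZE);
    memcpy(&word, (const char *) transdata + sum_off - SIM_PTR_SIZE,
           sizeof(word));
    return word;
}

/*
 * Assumed shape of the old check: treat that word as the owning context
 * and compare it with the (simulated, 32-bit) context address.
 */
int
old_style_contains(const Int8TransTypeData *transdata, uint32_t context_addr)
{
    return word_before_sum(transdata) == context_addr;
}
```

With count = 0 the check says "no" for any non-zero context address, but if
the relevant half of count happens to hold the same bit pattern as the
context address, old_style_contains() returns true even though the pointer
is nowhere near the start of a chunk.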

Had the fields in that struct happened to be in the opposite order,
then I don't think it would have crashed, but that's certainly no fix.

I'll need to think about how best to fix this. In the meantime, I
think the other 32-bit animals are probably not going to like this
either :-(

David

