Hi hackers,

numeric_poly_deserialize and numeric_deserialize describe themselves
(in their header comments) as the deserializers for aggregate functions
which require sumX2: var_pop, var_samp, stddev_pop, stddev_samp, and
the regr_* family. Both nevertheless construct their result with
calcSumX2 = false:

    /* numeric_poly_deserialize */
    result = makeInt128AggStateCurrentContext(false);
    /* numeric_deserialize */
    result = makeNumericAggStateCurrentContext(false);

Within the in-tree pipeline this is harmless: deserialize is only ever
followed by numeric_(poly_)combine, neither of which reads the flag.
But for callers that follow a deserialize with a transition call
(extensions round-tripping aggregate state through external storage,
in my case), it's wrong: do_int128_accum / do_numeric_accum gate
"sumX2 += value * value" on state->calcSumX2, so a deserialized
var/stddev state silently stops accumulating sumX2 while N and sumX
continue to update.

There's precedent for setting the flag to its semantically correct
value even when the in-tree pipeline doesn't observe it: numeric_combine
creates its fresh state with calcSumX2 = true. Extending that convention
to the deserialize side would align the two sites.

I've attached a two-line patch. Would this be worth fixing, or is
there a reason the current default is intentional that I'm missing?

Thanks,
Maxime Schoemans

Attachment: numeric-calcsumx2-deserialize.patch
Description: Binary data

Reply via email to