Hi hackers, numeric_poly_deserialize and numeric_deserialize describe themselves (in their header comments) as the deserializers for aggregate functions which require sumX2: var_pop, var_samp, stddev_pop, stddev_samp, and the regr_* family. Both nevertheless construct their result with calcSumX2 = false:
/* numeric_poly_deserialize */
result = makeInt128AggStateCurrentContext(false);
/* numeric_deserialize */
result = makeNumericAggStateCurrentContext(false);
Within the in-tree pipeline this is harmless: deserialize is only ever
followed by numeric_(poly_)combine, neither of which reads the flag.
But for callers that follow a deserialize with a transition call
(extensions round-tripping aggregate state through external storage,
in my case), it's wrong: do_int128_accum / do_numeric_accum gate
"sumX2 += value * value" on state->calcSumX2, so a deserialized
var/stddev state silently stops accumulating sumX2 while N and sumX
continue to update.
There's precedent for setting the flag to its semantically correct
value even when the in-tree pipeline doesn't observe it: numeric_combine
creates its fresh state with calcSumX2 = true. Extending that convention
to the deserialize side would align the two sites.
I've attached a two-line patch. Would this be worth fixing, or is
there a reason the current default is intentional that I'm missing?
Thanks,
Maxime Schoemans
numeric-calcsumx2-deserialize.patch
Description: Binary data
