Hi hackers, We found that several functions -- namely numeric_combine, numeric_avg_combine, numeric_poly_combine, and int8_avg_combine -- are returning NULL without signaling the nullity of datum in fcinfo.isnull. This is obscured by the fact that the only functions in core (finalfunc for various aggregates) that those return values feed into happen to tolerate (or rather, not quite distinguish) zero-but-not-NULL trans values.
In Greenplum, this behavior becomes problematic because Greenplum serializes internal trans values before spilling the hash table. The serial functions (numeric_serialize and friends) are strict functions that will blow up when they are given null (either in the C sense or the SQL sense) inputs. In Postgres if we change hash aggregation in the future to spill the hash table (vis-à-vis the input tuples), this issues would manifest itself in the final aggregate because we'll serialize the combined (and likely incorrectly null) trans values. Please find attached a small patch fixing said issue. Originally reported by Denis Smirnov over at https://github.com/greenplum-db/gpdb/pull/9878 Cheers, Jesse and Deep
From edeeaa220fbc3d082d133b29a8c394632588d11a Mon Sep 17 00:00:00 2001 From: Denis Smirnov <s...@arenadata.io> Date: Thu, 9 Apr 2020 16:10:29 -0700 Subject: [PATCH] Properly mark NULL returns in numeric aggregates When a function returns NULL (in the SQL sense), it is expected to signal that in fcinfo.isnull . numeric_combine and friends break this rule by directly returning a NULL datum without signaling the nullity. This is obscured by the fact that the only functions in core (finalfunc for various aggregates) that those return values feed into happen to tolerate (or rather, not quite distinguish) zero-but-not-NULL trans values. In Greenplum, this behavior becomes problematic because Greenplum serializes internal trans values before spilling the hash table. The serial functions (numeric_serialize and friends) are strict functions that will blow up when they are given null (either in the C sense or the SQL sense) inputs. --- src/backend/utils/adt/numeric.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c index 9986132b45e..57896de6edf 100644 --- a/src/backend/utils/adt/numeric.c +++ b/src/backend/utils/adt/numeric.c @@ -3946,7 +3946,11 @@ numeric_combine(PG_FUNCTION_ARGS) state2 = PG_ARGISNULL(1) ? NULL : (NumericAggState *) PG_GETARG_POINTER(1); if (state2 == NULL) + { + if (state1 == NULL) + PG_RETURN_NULL(); PG_RETURN_POINTER(state1); + } /* manually copy all fields from state2 to state1 */ if (state1 == NULL) @@ -4034,7 +4038,11 @@ numeric_avg_combine(PG_FUNCTION_ARGS) state2 = PG_ARGISNULL(1) ? NULL : (NumericAggState *) PG_GETARG_POINTER(1); if (state2 == NULL) + { + if (state1 == NULL) + PG_RETURN_NULL(); PG_RETURN_POINTER(state1); + } /* manually copy all fields from state2 to state1 */ if (state1 == NULL) @@ -4543,7 +4551,11 @@ numeric_poly_combine(PG_FUNCTION_ARGS) state2 = PG_ARGISNULL(1) ? NULL : (PolyNumAggState *) PG_GETARG_POINTER(1); if (state2 == NULL) + { + if (state1 == NULL) + PG_RETURN_NULL(); PG_RETURN_POINTER(state1); + } /* manually copy all fields from state2 to state1 */ if (state1 == NULL) @@ -4774,7 +4786,11 @@ int8_avg_combine(PG_FUNCTION_ARGS) state2 = PG_ARGISNULL(1) ? NULL : (PolyNumAggState *) PG_GETARG_POINTER(1); if (state2 == NULL) + { + if (state1 == NULL) + PG_RETURN_NULL(); PG_RETURN_POINTER(state1); + } /* manually copy all fields from state2 to state1 */ if (state1 == NULL) -- 2.26.0