Re: Missing free_var() at end of accum_sum_final()?

Dean Rasheed Fri, 03 Mar 2023 07:12:00 -0800

> On 20.02.23 23:16, Joel Jacobson wrote:
> > In the new attached patch, Andres fixed buffer idea has been implemented
> > throughout the entire numeric.c code base.
>


I have been going over this patch, and IMO it's far too invasive for
the fairly modest performance gains (and performance regressions in
some cases) that it gives (which seem to be somewhat smaller on my
machine).

One code change that I am particularly opposed to is changing all the
low-level functions like add_var(), mul_var(), etc., so that they no
longer accept the result being the same variable as any of the inputs.
That is a particularly convenient thing to be able to do, and without
it, various functions become more complex and less readable, and have
to resort to using more temporary variables.

I actually find the whole business of attaching a static buffer and
new buf_len fields to NumericVar quite ugly, and the associated extra
complexity in alloc_var(), free_var(), zero_var(), and
set_var_from_var() is all part of that. Now that might be worth it, if
this gave significant performance gains across the board, but the
trouble is it doesn't. AFAICS, it seems to be just as likely to
degrade performance. For example:

SELECT sqrt(6*sum(1/x/x)) FROM generate_series(1::numeric
,10000000::numeric) g(x);

is consistently 1-2% slower for me, with this patch. That's not much,
but then neither are most of the gains. In a lot of cases, it's so
close to the level of noise that I don't think most users will notice
one way or the other.

So IMO the results just don't justify such an extensive set of
changes, and I think we should abandon this fixed buffer approach.

Having said that, I think the results from the COPY test are worth
looking at more closely. Your results seem to suggest around a 5%
improvement. On my machine it was only around 3%, but that still might
be worth having, if it didn't involve such invasive changes throughout
the rest of the code.

As an experiment, I took another look at my earlier patch, making
make_result() construct the result using the same allocated memory as
the variable's digit buffer (if allocated). That eliminates around a
third of all free_var() calls from numeric.c, and for most functions,
it saves both a palloc() and a pfree(). In the case of numeric_in(), I
realised that it's possible to go further, by reusing the decimal
digits buffer for the NumericVar's digits, and then later for the
final Numeric result. Also, by carefully aligning things, it's
possible to arrange it so that the final make_result() doesn't need to
copy/move the digits at all. With that I get something closer to a 15%
improvement in the COPY test, which is definitely worth having.

In the pi series above, it gave a 3-4% performance improvement, and
that seemed to be a common pattern across a number of other tests.
It's also a much less invasive change, since it's only really changing
make_result(), which makes the knock-on effects much more manageable,
and reduces the chances of any performance regressions.

I didn't do all the tests that you did though, so it would be
interesting to see how it fares in those.

Regards,
Dean

diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
new file mode 100644
index a83feea..c024fcf
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -474,8 +474,14 @@ static void dump_var(const char *str, Nu
 #define dump_var(s,v)
 #endif
 
+/*
+ * Macros to allocate/free numeric digit buffers. Whenever we allocate a digit
+ * buffer, we give it an extra NUMERIC_HDRSZ (8 bytes) of space, so that the
+ * same memory block can be used by make_result() to construct a Numeric result
+ * from it, avoiding palloc/pfree overhead.
+ */
 #define digitbuf_alloc(ndigits)  \
-	((NumericDigit *) palloc((ndigits) * sizeof(NumericDigit)))
+	((NumericDigit *) palloc(NUMERIC_HDRSZ + (ndigits) * sizeof(NumericDigit)))
 #define digitbuf_free(buf)	\
 	do { \
 		 if ((buf) != NULL) \
@@ -783,8 +789,6 @@ numeric_in(PG_FUNCTION_ARGS)
 			ereturn(escontext, (Datum) 0,
 					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 					 errmsg("value overflows numeric format")));
-
-		free_var(&value);
 	}
 
 	PG_RETURN_NUMERIC(res);
@@ -1141,8 +1145,6 @@ numeric_recv(PG_FUNCTION_ARGS)
 		(void) apply_typmod_special(res, typmod, NULL);
 	}
 
-	free_var(&value);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -1305,8 +1307,6 @@ numeric		(PG_FUNCTION_ARGS)
 	(void) apply_typmod(&var, typmod, NULL);
 	new = make_result(&var);
 
-	free_var(&var);
-
 	PG_RETURN_NUMERIC(new);
 }
 
@@ -1566,7 +1566,6 @@ numeric_round(PG_FUNCTION_ARGS)
 	 */
 	res = make_result(&arg);
 
-	free_var(&arg);
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -1615,7 +1614,6 @@ numeric_trunc(PG_FUNCTION_ARGS)
 	 */
 	res = make_result(&arg);
 
-	free_var(&arg);
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -1642,7 +1640,6 @@ numeric_ceil(PG_FUNCTION_ARGS)
 	ceil_var(&result, &result);
 
 	res = make_result(&result);
-	free_var(&result);
 
 	PG_RETURN_NUMERIC(res);
 }
@@ -1670,7 +1667,6 @@ numeric_floor(PG_FUNCTION_ARGS)
 	floor_var(&result, &result);
 
 	res = make_result(&result);
-	free_var(&result);
 
 	PG_RETURN_NUMERIC(res);
 }
@@ -1793,7 +1789,13 @@ generate_series_step_numeric(PG_FUNCTION
 		(fctx->step.sign == NUMERIC_NEG &&
 		 cmp_var(&fctx->current, &fctx->stop) >= 0))
 	{
-		Numeric		result = make_result(&fctx->current);
+		NumericVar	result_var;
+		Numeric		result;
+
+		/* copy current and make result from copy */
+		init_var(&result_var);
+		set_var_from_var(&fctx->current, &result_var);
+		result = make_result(&result_var);
 
 		/* switch to memory context appropriate for iteration calculation */
 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
@@ -2898,8 +2900,6 @@ numeric_add_opt_error(Numeric num1, Nume
 
 	res = make_result_opt_error(&result, have_error);
 
-	free_var(&result);
-
 	return res;
 }
 
@@ -2976,8 +2976,6 @@ numeric_sub_opt_error(Numeric num1, Nume
 
 	res = make_result_opt_error(&result, have_error);
 
-	free_var(&result);
-
 	return res;
 }
 
@@ -3097,8 +3095,6 @@ numeric_mul_opt_error(Numeric num1, Nume
 
 	res = make_result_opt_error(&result, have_error);
 
-	free_var(&result);
-
 	return res;
 }
 
@@ -3232,8 +3228,6 @@ numeric_div_opt_error(Numeric num1, Nume
 
 	res = make_result_opt_error(&result, have_error);
 
-	free_var(&result);
-
 	return res;
 }
 
@@ -3321,8 +3315,6 @@ numeric_div_trunc(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -3410,8 +3402,6 @@ numeric_mod_opt_error(Numeric num1, Nume
 
 	res = make_result_opt_error(&result, NULL);
 
-	free_var(&result);
-
 	return res;
 }
 
@@ -3443,8 +3433,6 @@ numeric_inc(PG_FUNCTION_ARGS)
 
 	res = make_result(&arg);
 
-	free_var(&arg);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -3537,8 +3525,6 @@ numeric_gcd(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -3597,8 +3583,6 @@ numeric_lcm(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -3648,9 +3632,6 @@ numeric_fac(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&fact);
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -3721,8 +3702,6 @@ numeric_sqrt(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -3788,8 +3767,6 @@ numeric_exp(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -3837,8 +3814,6 @@ numeric_ln(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -3908,8 +3883,6 @@ numeric_log(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -4096,8 +4069,6 @@ numeric_power(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -4183,7 +4154,6 @@ numeric_min_scale(PG_FUNCTION_ARGS)
 
 	init_var_from_num(num, &arg);
 	min_scale = get_min_scale(&arg);
-	free_var(&arg);
 
 	PG_RETURN_INT32(min_scale);
 }
@@ -4204,7 +4174,6 @@ numeric_trim_scale(PG_FUNCTION_ARGS)
 	init_var_from_num(num, &result);
 	result.dscale = get_min_scale(&result);
 	res = make_result(&result);
-	free_var(&result);
 
 	PG_RETURN_NUMERIC(res);
 }
@@ -4229,8 +4198,6 @@ int64_to_numeric(int64 val)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	return res;
 }
 
@@ -4318,8 +4285,6 @@ int64_div_fast_to_numeric(int64 val1, in
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	return res;
 }
 
@@ -4529,8 +4494,6 @@ float8_numeric(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -4623,8 +4586,6 @@ float4_numeric(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 }
 
@@ -6005,8 +5966,6 @@ numeric_poly_sum(PG_FUNCTION_ARGS)
 
 	res = make_result(&result);
 
-	free_var(&result);
-
 	PG_RETURN_NUMERIC(res);
 #else
 	return numeric_sum(fcinfo);
@@ -6035,8 +5994,6 @@ numeric_poly_avg(PG_FUNCTION_ARGS)
 	countd = NumericGetDatum(int64_to_numeric(state->N));
 	sumd = NumericGetDatum(make_result(&result));
 
-	free_var(&result);
-
 	PG_RETURN_DATUM(DirectFunctionCall2(numeric_div, sumd, countd));
 #else
 	return numeric_avg(fcinfo);
@@ -6073,7 +6030,6 @@ numeric_avg(PG_FUNCTION_ARGS)
 	init_var(&sumX_var);
 	accum_sum_final(&state->sumX, &sumX_var);
 	sumX_datum = NumericGetDatum(make_result(&sumX_var));
-	free_var(&sumX_var);
 
 	PG_RETURN_DATUM(DirectFunctionCall2(numeric_div, sumX_datum, N_datum));
 }
@@ -6105,7 +6061,6 @@ numeric_sum(PG_FUNCTION_ARGS)
 	init_var(&sumX_var);
 	accum_sum_final(&state->sumX, &sumX_var);
 	result = make_result(&sumX_var);
-	free_var(&sumX_var);
 
 	PG_RETURN_NUMERIC(result);
 }
@@ -6198,10 +6153,6 @@ numeric_stddev_internal(NumericAggState
 		res = make_result(&vsumX);
 	}
 
-	free_var(&vNminus1);
-	free_var(&vsumX);
-	free_var(&vsumX2);
-
 	return res;
 }
 
@@ -6934,6 +6885,7 @@ set_var_from_str(const char *str, const
 {
 	bool		have_dp = false;
 	int			i;
+	unsigned char *buf;
 	unsigned char *decdigits;
 	int			sign = NUMERIC_POS;
 	int			dweight = -1;
@@ -6970,7 +6922,14 @@ set_var_from_str(const char *str, const
 	if (!isdigit((unsigned char) *cp))
 		goto invalid_syntax;
 
-	decdigits = (unsigned char *) palloc(strlen(cp) + DEC_DIGITS * 2);
+	/*
+	 * Allocate a buffer for the decimal digit values, with space for leading
+	 * and trailing padding for digit alignment (DEC_DIGITS * 2).  This same
+	 * buffer will then be reused for the result's digit buffer, which might
+	 * later become a Numeric, so also include space for the numeric header.
+	 */
+	buf = (unsigned char *) palloc(NUMERIC_HDRSZ + strlen(cp) + DEC_DIGITS * 2);
+	decdigits = buf + NUMERIC_HDRSZ;
 
 	/* leading padding for digit alignment later */
 	memset(decdigits, 0, DEC_DIGITS);
@@ -7081,7 +7040,21 @@ set_var_from_str(const char *str, const
 	offset = (weight + 1) * DEC_DIGITS - (dweight + 1);
 	ndigits = (ddigits + offset + DEC_DIGITS - 1) / DEC_DIGITS;
 
-	alloc_var(dest, ndigits);
+	/*
+	 * Reuse the same buffer for the result to save a palloc() and pfree(),
+	 * making the digits start where they would in a Numeric result so that
+	 * make_result() doesn't need to move them if this variable is converted
+	 * to a Numeric.  Note that decdigits starts at an offset of NUMERIC_HDRSZ
+	 * into the buffer, so there's no danger of overwriting the decimal digits
+	 * as we produce the result digits.
+	 */
+	digitbuf_free(dest->buf);
+	dest->buf = (NumericDigit *) buf;
+	dest->digits = NUMERIC_CAN_BE_SHORT(dscale, weight) ?
+				   &((Numeric) (buf))->choice.n_short.n_data[0] :
+				   &((Numeric) (buf))->choice.n_long.n_data[0];
+	dest->digits[-1] = 0;			/* spare digit for rounding */
+	dest->ndigits = ndigits;
 	dest->sign = sign;
 	dest->weight = weight;
 	dest->dscale = dscale;
@@ -7104,8 +7077,6 @@ set_var_from_str(const char *str, const
 		i += DEC_DIGITS;
 	}
 
-	pfree(decdigits);
-
 	/* Strip any leading/trailing zeroes, and normalize weight if zero */
 	strip_var(dest);
 
@@ -7691,11 +7662,19 @@ duplicate_numeric(Numeric num)
 /*
  * make_result_opt_error() -
  *
- *	Create the packed db numeric format in palloc()'d memory from
- *	a variable.  This will handle NaN and Infinity cases.
+ *	Create the packed db numeric format from a variable.  This will handle NaN
+ *	and Infinity cases.
  *
  *	If "have_error" isn't NULL, on overflow *have_error is set to true and
  *	NULL is returned.  This is helpful when caller needs to handle errors.
+ *
+ *	NOTE: This reuses the memory allocated for the variable's digit buffer, if
+ *	possible.  The variable must not be freed or used in any way afterwards (we
+ *	may trample over its digit buffer).  If the variable doesn't have an
+ *	allocated digit buffer, new memory will be palloc'd and returned.  This
+ *	makes it safe to use with constant variables (const_one, const_nan, etc.),
+ *	as well as variables initialized using init_var_from_num(), zeroed using
+ *	zero_var(), etc.
  */
 static Numeric
 make_result_opt_error(const NumericVar *var, bool *have_error)
@@ -7756,7 +7735,9 @@ make_result_opt_error(const NumericVar *
 	if (NUMERIC_CAN_BE_SHORT(var->dscale, weight))
 	{
 		len = NUMERIC_HDRSZ_SHORT + n * sizeof(NumericDigit);
-		result = (Numeric) palloc(len);
+		result = var->buf ? (Numeric) var->buf : (Numeric) palloc(len);
+		if (n > 0 && result->choice.n_short.n_data != digits)
+			memmove(result->choice.n_short.n_data, digits, n * sizeof(NumericDigit));
 		SET_VARSIZE(result, len);
 		result->choice.n_short.n_header =
 			(sign == NUMERIC_NEG ? (NUMERIC_SHORT | NUMERIC_SHORT_SIGN_MASK)
@@ -7768,7 +7749,9 @@ make_result_opt_error(const NumericVar *
 	else
 	{
 		len = NUMERIC_HDRSZ + n * sizeof(NumericDigit);
-		result = (Numeric) palloc(len);
+		result = var->buf ? (Numeric) var->buf : (Numeric) palloc(len);
+		if (n > 0 && result->choice.n_long.n_data != digits)
+			memmove(result->choice.n_long.n_data, digits, n * sizeof(NumericDigit));
 		SET_VARSIZE(result, len);
 		result->choice.n_long.n_sign_dscale =
 			sign | (var->dscale & NUMERIC_DSCALE_MASK);
@@ -7776,8 +7759,6 @@ make_result_opt_error(const NumericVar *
 	}
 
 	Assert(NUMERIC_NDIGITS(result) == n);
-	if (n > 0)
-		memcpy(NUMERIC_DIGITS(result), digits, n * sizeof(NumericDigit));
 
 	/* Check for overflow of int16 fields */
 	if (NUMERIC_WEIGHT(result) != weight ||

Re: Missing free_var() at end of accum_sum_final()?

Reply via email to