Remove fundamentally-redundant processing in jsonb_agg() et al.

The various variants of jsonb_agg() operate as follows,
for each aggregate input value:

1. Build a JsonbValue tree representation of the input value.
2. Flatten the JsonbValue tree into a Jsonb in on-disk format.
3. Iterate through the Jsonb, building a JsonbValue that is part
of the aggregate's state stored in aggcontext, but is otherwise
identical to what phase 1 built.

This is very slightly less silly than it sounds, because phase 1
involves calling non-JSONB code such as datatype output functions,
which are likely to leak memory, and we don't want to leak into the
aggcontext.  Nonetheless, phases 2 and 3 are accomplishing exactly
nothing that is useful if we can make phase 1 put the JsonbValue
tree where we need it.  We could probably do that with a bunch of
MemoryContextSwitchTo's, but what seems more robust is to give
pushJsonbValue the responsibility of building the JsonbValue tree
in a specified non-current memory context.  The previous patch
created the infrastructure for that, and this patch simply makes
the aggregate functions use it and then rips out phases 2 and 3.

For me, this makes jsonb_agg() with a text column as input run
about 2X faster than before.  It's not yet on par with json_agg(),
but this removes a whole lot of the difference.

Author: Tom Lane <[email protected]>
Reviewed-by: jian he <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Discussion: https://postgr.es/m/[email protected]

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/b61aa76e4585b4b71098b0990b208813a85dbbf5

Modified Files
--------------
src/backend/utils/adt/jsonb.c | 303 +++++++-----------------------------------
1 file changed, 45 insertions(+), 258 deletions(-)

Reply via email to