Remove fundamentally-redundant processing in jsonb_agg() et al. The various variants of jsonb_agg() operate as follows, for each aggregate input value:
1. Build a JsonbValue tree representation of the input value. 2. Flatten the JsonbValue tree into a Jsonb in on-disk format. 3. Iterate through the Jsonb, building a JsonbValue that is part of the aggregate's state stored in aggcontext, but is otherwise identical to what phase 1 built. This is very slightly less silly than it sounds, because phase 1 involves calling non-JSONB code such as datatype output functions, which are likely to leak memory, and we don't want to leak into the aggcontext. Nonetheless, phases 2 and 3 are accomplishing exactly nothing that is useful if we can make phase 1 put the JsonbValue tree where we need it. We could probably do that with a bunch of MemoryContextSwitchTo's, but what seems more robust is to give pushJsonbValue the responsibility of building the JsonbValue tree in a specified non-current memory context. The previous patch created the infrastructure for that, and this patch simply makes the aggregate functions use it and then rips out phases 2 and 3. For me, this makes jsonb_agg() with a text column as input run about 2X faster than before. It's not yet on par with json_agg(), but this removes a whole lot of the difference. Author: Tom Lane <[email protected]> Reviewed-by: jian he <[email protected]> Reviewed-by: Chao Li <[email protected]> Discussion: https://postgr.es/m/[email protected] Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/b61aa76e4585b4b71098b0990b208813a85dbbf5 Modified Files -------------- src/backend/utils/adt/jsonb.c | 303 +++++++----------------------------------- 1 file changed, 45 insertions(+), 258 deletions(-)
