Re: [HACKERS] EXPLAIN VERBOSE with parallel Aggregate

David Rowley Sun, 17 Apr 2016 00:19:33 -0700

On 16 April 2016 at 04:27, Tom Lane <[email protected]> wrote:
> Robert Haas <[email protected]> writes:
>> I definitely agree that the current output is messed up, but I'm not
>> sure your proposed output is much better.  I wonder if it shouldn't
>> say something like:
>> Output: serialfn(transfn(args))
>> for the partial aggregate and
>> Output: finalfn(combinefn(deserialfn(args)))
>> for the finalize aggregate step.
>
>> Or maybe just insert the word PARTIAL before each partial aggregate
>> step, like PARTIAL sum(num) for the partial step and then just
>> sum(num) for the final step.
>
> +1 for the latter, if we can do it conveniently.  I think exposing
> the names of the aggregate implementation functions would be very
> user-unfriendly, as nobody but us hackers knows what those are.


It does not really seem all that convenient to do this. It also seems
a bit strange to me to have a parent node report a column which does
not exist in any nodes descending from it. Remember that the combine
Aggref does not have the same ->args as its corresponding partial
Aggref. It's not all that clear to me if there is any nice way to do
have this work the way you'd like. If we were to just call
get_rule_expr() on the first arg of the combine aggregate node, it
would re-print the PARTIAL keyword again.

In the attached I have it displaying as:

postgres=# explain verbose select count(*),max(a),avg(a) filter (where
a = 0) from t;
                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=14739.56..14739.57 rows=1 width=44)
   Output: pg_catalog.count(*), max((max(a))), pg_catalog.avg((PARTIAL
avg(a) FILTER (WHERE (a = 0))))
   ->  Gather  (cost=14739.33..14739.54 rows=2 width=44)
         Output: (count(*)), (max(a)), (PARTIAL avg(a) FILTER (WHERE (a = 0)))
         Workers Planned: 2
         ->  Partial Aggregate  (cost=13739.33..13739.34 rows=1 width=44)
               Output: count(*), max(a), PARTIAL avg(a) FILTER (WHERE (a = 0))
               ->  Parallel Seq Scan on public.t  (cost=0.00..9572.67
rows=416667 width=4)
                     Output: a
(9 rows)

I like this much better, as there's no fudging of any function
arguments to print them as something they're not.

Note that max() does not get tagged with PARTIAL since it has no
finalfn. Obtaining this information does require a syscache lookup in
get_agg_expr(), but I've managed to make that only happen when
aggpartial is true.

>> I think ending up with sum(sum(num)) is
>> right out.  It doesn't look so bad for that case but avg(avg(num))
>> would certainly imply something that's not the actual behavior.
>
> Agreed.

Note that I've done nothing for the weird schema prefixing problem I
mentioned. I think I'd need input on the best way to solve this. If
it's actually a problem.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

parallel_agg_explain_verbose_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN VERBOSE with parallel Aggregate

Reply via email to