Re: [HACKERS] Parallel Aggregate costs don't consider combine/serial/deserial funcs

2016-04-13 Thread Robert Haas
On Tue, Apr 12, 2016 at 5:38 PM, David Rowley
 wrote:
>>> One small point which I was a little unsure of in the attached is,
>>> should the "if (aggref->aggdirectargs)" part of
>>> count_agg_clauses_walker() be within the "if
>>> (!context->combineStates)". I simply couldn't decide. We currently
>>> have no aggregates which this affects anyway, per; select * from
>>> pg_aggregate where aggcombinefn <> 0 and aggkind <> 'n'; so for now
>>> I've left it outwith.
>>
>> The direct arguments would be evaluated in the worker, but not in the
>> leader, right?  Or am I confused?
>
> That seems right, but I just can't think of how its possible to
> parallelise these aggregates anyway.

Well, if you could ensure that each worker would see a whole group,
you could do it, I think.  But it's probably fine to just leave this
for now.  It's not like it can't be changed if somebody figures out
some cool thing to do in this area.

>>> Another thing I thought of is that it's not too nice that I have to
>>> pass 3 bools to count_agg_clauses() in order to tell it what to do. I
>>> was tempted to invent some bitmask flags for this, then modify
>>> create_agg_path() to use the same flags, but I thought I'd better not
>>> cause too much churn with this patch.
>>
>> I'm kinda tempted to say this should be using an enum.  I note that
>> serialStates has a subtly different meaning here than in some other
>> places where you have used the same term.
>
> hmm, I'm not sure how it's subtly different. Do you mean the
> preference towards costing the finalfn when finalizeAggs is true, and
> ignoring the serialfn in this case? nodeAgg.c should do the same,
> although it'll deserialize in such a case. We can never finalize and
> serialize in the same node.

I mean that, IIUC, in some other places where you use serialStates,
true means that (de)serialization is known to be needed.  Here,
however, it only means it might be needed, contingent on whether the
serial/deserial functions are actually present.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate costs don't consider combine/serial/deserial funcs

2016-04-12 Thread David Rowley
On 13 April 2016 at 08:52, Robert Haas  wrote:
> On Sun, Apr 10, 2016 at 8:47 AM, David Rowley
>  wrote:
>> I realised a few days ago that the parallel aggregate code does not
>> cost for the combine, serialisation and deserialisation functions at
>> all.
>
> Oops.
>
>> I've attached a patch which fixes this.
>
> I've committed this patch.  I wonder if it's going to produce compiler
> warnings for some people, complaining about possible use of an
> uninitialized variable.  That would kind of suck.  I don't much mind
> having to insert a dummy assignment to shut the compiler up; a smarter
> compiler will just throw it out anyway.  I'm less enthused about a
> dummy MemSet.  The compiler is less likely to be able to get rid of
> that, and it's more expensive if it doesn't.  But let's see what
> happens.

Thanks for committing.

I wondered that too, so checked a couple of compilers and got no
warnings, but the buildfarm should let us know. The other option would
be to palloc() them, and have them set to NULL initially... that's not
very nice either... Another option would be to protect the final
parallel path generation with if (grouped_rel->partial_pathlist &&
grouped_rel->consider_parallel). I'd imagine any compiler smart enough
to work out that uninitialised is not possible would also be able to
remove the check for grouped_rel->consider_parallel, but *shrug*, I
don't often look at the assembly that compilers generate, so I might
be giving them too much credit.

>> One small point which I was a little unsure of in the attached is,
>> should the "if (aggref->aggdirectargs)" part of
>> count_agg_clauses_walker() be within the "if
>> (!context->combineStates)". I simply couldn't decide. We currently
>> have no aggregates which this affects anyway, per; select * from
>> pg_aggregate where aggcombinefn <> 0 and aggkind <> 'n'; so for now
>> I've left it outwith.
>
> The direct arguments would be evaluated in the worker, but not in the
> leader, right?  Or am I confused?

That seems right, but I just can't think of how its possible to
parallelise these aggregates anyway.

>> Another thing I thought of is that it's not too nice that I have to
>> pass 3 bools to count_agg_clauses() in order to tell it what to do. I
>> was tempted to invent some bitmask flags for this, then modify
>> create_agg_path() to use the same flags, but I thought I'd better not
>> cause too much churn with this patch.
>
> I'm kinda tempted to say this should be using an enum.  I note that
> serialStates has a subtly different meaning here than in some other
> places where you have used the same term.

hmm, I'm not sure how it's subtly different. Do you mean the
preference towards costing the finalfn when finalizeAggs is true, and
ignoring the serialfn in this case? nodeAgg.c should do the same,
although it'll deserialize in such a case. We can never finalize and
serialize in the same node.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate costs don't consider combine/serial/deserial funcs

2016-04-12 Thread Robert Haas
On Sun, Apr 10, 2016 at 8:47 AM, David Rowley
 wrote:
> I realised a few days ago that the parallel aggregate code does not
> cost for the combine, serialisation and deserialisation functions at
> all.

Oops.

> I've attached a patch which fixes this.

I've committed this patch.  I wonder if it's going to produce compiler
warnings for some people, complaining about possible use of an
uninitialized variable.  That would kind of suck.  I don't much mind
having to insert a dummy assignment to shut the compiler up; a smarter
compiler will just throw it out anyway.  I'm less enthused about a
dummy MemSet.  The compiler is less likely to be able to get rid of
that, and it's more expensive if it doesn't.  But let's see what
happens.

> One small point which I was a little unsure of in the attached is,
> should the "if (aggref->aggdirectargs)" part of
> count_agg_clauses_walker() be within the "if
> (!context->combineStates)". I simply couldn't decide. We currently
> have no aggregates which this affects anyway, per; select * from
> pg_aggregate where aggcombinefn <> 0 and aggkind <> 'n'; so for now
> I've left it outwith.

The direct arguments would be evaluated in the worker, but not in the
leader, right?  Or am I confused?

> Another thing I thought of is that it's not too nice that I have to
> pass 3 bools to count_agg_clauses() in order to tell it what to do. I
> was tempted to invent some bitmask flags for this, then modify
> create_agg_path() to use the same flags, but I thought I'd better not
> cause too much churn with this patch.

I'm kinda tempted to say this should be using an enum.  I note that
serialStates has a subtly different meaning here than in some other
places where you have used the same term.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers