Re: [HACKERS] oversight in parallel aggregate
On Mon, Apr 4, 2016 at 9:09 PM, David Rowleywrote: > On 5 April 2016 at 11:59, Robert Haas wrote: >> One of my EDB colleagues, while in the process of refactoring some >> unrelated Advanced Server code, discovered that (1) there's no way to >> mark an aggregate as anything other than parallel-unsafe but (2) it >> doesn't matter because has_parallel_hazard ignores Aggrefs anyway. >> These mistakes cancel each other out (sorta) if all of your aggregates >> happen to be parallel-safe, but otherwise not so much. Barring >> objections, I intend to speedily apply the attached patch to fix this. > > Thanks for working on this. I should have noticed this myself... > > I had a quick look at this and I manged to make this happen; > > david=# create aggregate mysum(int) (sfunc=int4pl, combinefunc=int4pl, > stype=int, parallel); > server closed the connection unexpectedly > > I've attached a fix, which makes the code a bit more simple, and also > inline with the other code in DefineAggregate(). Thanks. > I think there was also a couple of missing syntax synopsis in the docs > too. I've added those. The first one was indeed needed, but the second syntax doesn't actually work, so I took that back out. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] oversight in parallel aggregate
On Mon, Apr 4, 2016 at 10:26 PM, David Rowleywrote: > Does this need to check the parallel flags on the transfn or serialfn? > these'll be executed on the worker process. Possibly we also need the > combinefn/deserialfn/finalfn to be checked too as I see that we do > generate_gather_paths() from set_append_rel_pathlist(). That's basically the same as Tom's question, I think. For right now, I'd like to regard the aggregate function's pg_proc marking as certifying that the entire aggregate can be trusted to be parallel-safe. That's cheap and easy to check. If, in the future, we want to allow more complicated things where some but not all of aggregate's functions are parallel-safe, we can add logic for that then - i.e. if the aggregate is labeled as parallel-restricted, then inquire within. But to be honest, I hope we won't get there. As it is, the list of things that you might want to do in an aggregate that are parallel-unsafe is pretty short, and I hope we're going to go in the direction of making even more things parallel-safe in the future. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] oversight in parallel aggregate
On Mon, Apr 4, 2016 at 10:35 PM, Tom Lanewrote: > Robert Haas writes: >> One of my EDB colleagues, while in the process of refactoring some >> unrelated Advanced Server code, discovered that (1) there's no way to >> mark an aggregate as anything other than parallel-unsafe but (2) it >> doesn't matter because has_parallel_hazard ignores Aggrefs anyway. >> These mistakes cancel each other out (sorta) if all of your aggregates >> happen to be parallel-safe, but otherwise not so much. Barring >> objections, I intend to speedily apply the attached patch to fix this. > > Um ... why is it a good idea to attach a parallel-safe annotation to an > aggregate as such, rather than relying on the parallel-safe annotations > of its implementation functions? > > This seems not entirely academic, since perhaps the functions are not > all marked the same; which might be sensible. Perhaps the transition > function can be pushed down but not the final function. We could do it that way, and then just ignore the marking of the aggregate function itself. However, that would require has_parallel_hazard to do more syscache lookups, since it would have to examine all of the functions bound into the aggregate instead of just looking at the aggregate itself. I think that's probably not worth it, because I struggle to think of why an aggregate function would be ever be parallel-restricted or parallel-unsafe. I suppose somebody could create a user-defined aggregate that has side effects, but that seems like a corner case for which we shouldn't be optimizing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] oversight in parallel aggregate
Robert Haaswrites: > One of my EDB colleagues, while in the process of refactoring some > unrelated Advanced Server code, discovered that (1) there's no way to > mark an aggregate as anything other than parallel-unsafe but (2) it > doesn't matter because has_parallel_hazard ignores Aggrefs anyway. > These mistakes cancel each other out (sorta) if all of your aggregates > happen to be parallel-safe, but otherwise not so much. Barring > objections, I intend to speedily apply the attached patch to fix this. Um ... why is it a good idea to attach a parallel-safe annotation to an aggregate as such, rather than relying on the parallel-safe annotations of its implementation functions? This seems not entirely academic, since perhaps the functions are not all marked the same; which might be sensible. Perhaps the transition function can be pushed down but not the final function. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] oversight in parallel aggregate
On 5 April 2016 at 13:09, David Rowleywrote: > On 5 April 2016 at 11:59, Robert Haas wrote: >> One of my EDB colleagues, while in the process of refactoring some >> unrelated Advanced Server code, discovered that (1) there's no way to >> mark an aggregate as anything other than parallel-unsafe but (2) it >> doesn't matter because has_parallel_hazard ignores Aggrefs anyway. >> These mistakes cancel each other out (sorta) if all of your aggregates >> happen to be parallel-safe, but otherwise not so much. Barring >> objections, I intend to speedily apply the attached patch to fix this. > > Thanks for working on this. I should have noticed this myself... > > I had a quick look at this and I manged to make this happen; > > david=# create aggregate mysum(int) (sfunc=int4pl, combinefunc=int4pl, > stype=int, parallel); > server closed the connection unexpectedly > > I've attached a fix, which makes the code a bit more simple, and also > inline with the other code in DefineAggregate(). > > I think there was also a couple of missing syntax synopsis in the docs > too. I've added those. Another thought; + else if (IsA(node, Aggref)) + { + Aggref *aggref = (Aggref *) node; + + if (parallel_too_dangerous(func_parallel(aggref->aggfnoid), context)) + return true; + } Does this need to check the parallel flags on the transfn or serialfn? these'll be executed on the worker process. Possibly we also need the combinefn/deserialfn/finalfn to be checked too as I see that we do generate_gather_paths() from set_append_rel_pathlist(). -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] oversight in parallel aggregate
On 5 April 2016 at 11:59, Robert Haaswrote: > One of my EDB colleagues, while in the process of refactoring some > unrelated Advanced Server code, discovered that (1) there's no way to > mark an aggregate as anything other than parallel-unsafe but (2) it > doesn't matter because has_parallel_hazard ignores Aggrefs anyway. > These mistakes cancel each other out (sorta) if all of your aggregates > happen to be parallel-safe, but otherwise not so much. Barring > objections, I intend to speedily apply the attached patch to fix this. Thanks for working on this. I should have noticed this myself... I had a quick look at this and I manged to make this happen; david=# create aggregate mysum(int) (sfunc=int4pl, combinefunc=int4pl, stype=int, parallel); server closed the connection unexpectedly I've attached a fix, which makes the code a bit more simple, and also inline with the other code in DefineAggregate(). I think there was also a couple of missing syntax synopsis in the docs too. I've added those. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services create-aggregate-parallel_david.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] oversight in parallel aggregate
One of my EDB colleagues, while in the process of refactoring some unrelated Advanced Server code, discovered that (1) there's no way to mark an aggregate as anything other than parallel-unsafe but (2) it doesn't matter because has_parallel_hazard ignores Aggrefs anyway. These mistakes cancel each other out (sorta) if all of your aggregates happen to be parallel-safe, but otherwise not so much. Barring objections, I intend to speedily apply the attached patch to fix this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company create-aggregate-parallel.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers