On Mon, Jul 31, 2017 at 9:11 PM, Andres Freund <and...@anarazel.de> wrote:
> - Echoing concerns from other threads (Robert: ping): I'm doubtful that
>   it makes sense to size the number of parallel workers solely based on
>   the parallel scan node's size. I don't think it's this patch's job to
>   change that, but to me it seriously amplifies that - I'd bet there's a
>   lot of cases with nontrivial joins where the benefit from parallelism
>   on the join level is bigger than on the scan level itself. And the
>   number of rows in the upper nodes might also be bigger than on the
>   scan node level, making it more important to have higher number of
>   nodes.

Well, I feel like a broken record here but ... yeah, I agree we need to improve that. It's probably generally true that the more parallel operators we add, the more potential benefit there is in doing something about that problem. But, like you say, not in this patch.

http://postgr.es/m/CA+TgmoYL-SQZ2gRL2DpenAzOBd5+SW30QB=a4csewtogejz...@mail.gmail.com

I think we could improve things significantly by generating multiple partial paths with different number of parallel workers, instead of just picking a number of workers based on the table size and going with it. For that to work, though, you'd need something built into the costing to discourage picking paths with too many workers. And you'd need to be OK with planning taking a lot longer when parallelism is involved, because you'd be carrying around more paths for longer. There are other problems to solve, too.

I still think, though, that it's highly worthwhile to get at least a few more parallel operators - and this one in particular - done before we attack that problem in earnest. Even with a dumb calculation of the number of workers, this helps a lot.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
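
Illustrative note on the "multiple partial paths" idea above: the following is only a toy sketch in Python, not PostgreSQL planner code. It assumes a made-up cost model in which the serial cost is divided evenly among the workers and each worker adds a fixed charge. The names `PartialPath`, `cost_partial_path`, `candidate_paths`, `cheapest` and the parameter `per_worker_penalty` are all hypothetical. The point is just to show how one candidate path per worker count, plus a costing term that discourages large worker counts, lets the cheapest path decide the degree of parallelism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialPath:
    """One candidate partial path, tagged with its worker count (hypothetical type)."""
    workers: int
    total_cost: float


def cost_partial_path(serial_cost: float, workers: int,
                      per_worker_penalty: float) -> float:
    """Toy cost model (an assumption, not the real planner's formula):
    the work is split evenly across workers, and every worker adds a
    fixed charge that discourages picking too many of them."""
    return serial_cost / workers + per_worker_penalty * workers


def candidate_paths(serial_cost: float, max_workers: int,
                    per_worker_penalty: float) -> list[PartialPath]:
    """Generate one partial path per worker count from 1 to max_workers.
    Keeping all of these around is what makes planning more expensive."""
    return [
        PartialPath(w, cost_partial_path(serial_cost, w, per_worker_penalty))
        for w in range(1, max_workers + 1)
    ]


def cheapest(paths: list[PartialPath]) -> PartialPath:
    """Pick the path with the lowest total cost."""
    return min(paths, key=lambda p: p.total_cost)


if __name__ == "__main__":
    with_penalty = candidate_paths(1000.0, 8, per_worker_penalty=40.0)
    without_penalty = candidate_paths(1000.0, 8, per_worker_penalty=0.0)
    print("with penalty:   ", cheapest(with_penalty))
    print("without penalty:", cheapest(without_penalty))
```

In this toy model, a penalty of zero always favors the largest worker count, which is the behavior the costing term is meant to prevent. The list returned by `candidate_paths` also grows linearly with `max_workers`, which hints at the extra planning effort mentioned above.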