On 2017-04-05 14:52:38 +0530, Amit Khandekar wrote:
> This is what the earlier versions of my patch had done : just add up
> per-subplan parallel_workers (1 for non-partial subplan and
> subpath->parallel_workers for partial subplans) and set this total as
> the Append parallel_workers.

I don't think that's great. Consider e.g. the case where you have one
very expensive query and a bunch of cheaper ones: most of those workers
wouldn't do much while waiting for the expensive query.  What I'm
basically thinking we should do is something like the following
pythonesque pseudocode:

best_nonpartial_cost = -1
best_nonpartial_nworkers = -1

for numworkers in 1...#max workers:
   worker_work = [0 for x in range(0, numworkers)]

   # start each estimate from the cost of launching this many workers
   nonpartial_cost = startup_cost * numworkers

   # distribute all nonpartial tasks over workers.  Assign tasks to the
   # worker with the least amount of work already performed.
   for task in all_nonpartial_subqueries:
       least_busy_worker = worker_work.index(min(worker_work))
       worker_work[least_busy_worker] += task.total_nonpartial_cost

   # the nonpartial cost here is the largest amount any single worker
   # has to perform.
   nonpartial_cost += worker_work.largest()

   total_partial_cost = 0
   for task in all_partial_subqueries:
       total_partial_cost += task.total_partial_cost

   # Compute resources needed by partial tasks. First compute how much
   # cost we can distribute to workers that take shorter than the
   # "busiest" worker doing non-partial tasks.
   remaining_avail_work = 0
   for i in range(0, numworkers):
       remaining_avail_work += worker_work.largest() - worker_work[i]

   # Equally divide up remaining work over all workers
   if remaining_avail_work < total_partial_cost:
      nonpartial_cost += ((total_partial_cost - remaining_avail_work)
                          / numworkers)

   # check if this is the best number of workers
   if best_nonpartial_cost == -1 or best_nonpartial_cost > nonpartial_cost:
      best_nonpartial_cost = nonpartial_cost
      best_nonpartial_nworkers = numworkers
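
For concreteness, here is a minimal runnable Python sketch of the
pseudocode above.  It is an illustration under assumptions, not planner
code: the function name choose_append_workers, the plain-float cost
lists, and the startup_cost and max_workers parameters are all
hypothetical.  It also assumes that only the partial cost that does not
fit into the other workers' idle time is divided across workers, and
that the cost remembered as "best" is the full per-worker-count estimate.

```python
def choose_append_workers(nonpartial_costs, partial_costs,
                          startup_cost, max_workers):
    """Return (best_nworkers, best_cost) under the toy cost model.

    Hypothetical helper: costs are plain numbers, one per subquery.
    """
    best_cost = None
    best_nworkers = None

    for numworkers in range(1, max_workers + 1):
        worker_work = [0.0] * numworkers

        # Launching workers is not free.
        cost = startup_cost * numworkers

        # Greedily hand each non-partial task to the least busy worker.
        for task_cost in nonpartial_costs:
            least_busy = worker_work.index(min(worker_work))
            worker_work[least_busy] += task_cost

        # The busiest worker bounds the non-partial phase.
        busiest = max(worker_work)
        cost += busiest

        # Idle time the other workers have until the busiest one is done.
        remaining_avail_work = sum(busiest - w for w in worker_work)

        # Assumption: partial work that does not fit into that idle time
        # is spread evenly over all workers.
        total_partial_cost = sum(partial_costs)
        if remaining_avail_work < total_partial_cost:
            cost += (total_partial_cost - remaining_avail_work) / numworkers

        # Keep the cheapest worker count seen so far.
        if best_cost is None or cost < best_cost:
            best_cost = cost
            best_nworkers = numworkers

    return best_nworkers, best_cost


# One expensive non-partial subquery and three cheap ones.
print(choose_append_workers([100, 10, 10, 10], [], startup_cost=5,
                            max_workers=4))
```

With these made-up numbers the sketch settles on 2 workers at a cost of
110: one worker runs the expensive subquery while the other handles the
three cheap ones, and any further worker would only add startup cost.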

Does that make sense?


> BTW all of the above points apply only for non-partial plans.

Indeed. But I think that's going to be a pretty common type of plan,
especially if we get partitionwise joins.


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
