On Fri, Mar 17, 2017 at 1:12 PM, Amit Khandekar <amitdkhan...@gmail.com> wrote: >> - The substantive changes in add_paths_to_append_rel don't look right >> either. It's not clear why accumulate_partialappend_subpath is >> getting called even in the non-enable_parallelappend case. I don't >> think the logic for the case where we're not generating a parallel >> append path needs to change at all. > > When accumulate_partialappend_subpath() is called for a childrel with > a partial path, it works just like accumulate_append_subpath() when > enable_parallelappend is false. That's why, for partial child path, > the same function is called irrespective of parallel-append or > non-parallel-append case. May be mentioning this in comments should > suffice here ?

I don't get it. If you can get the same effect by changing something or not changing it, presumably it'd be better to not change it. We try not to change things just because we can; the change should be an improvement in some way. >> - When parallel append is enabled, I think add_paths_to_append_rel >> should still consider all the same paths that it does today, plus one >> extra. The new path is a parallel append path where each subpath is >> the cheapest subpath for that childrel, whether partial or >> non-partial. If !enable_parallelappend, or if all of the cheapest >> subpaths are partial, then skip this. (If all the cheapest subpaths >> are non-partial, it's still potentially useful.) > > In case of all-partial childrels, the paths are *exactly* same as > those that would have been created for enable_parallelappend=off. The > extra path is there for enable_parallelappend=on only when one or more > of the child rels do not have partial paths. Does this make sense ? No, I don't think so. Imagine that we have three children, A, B, and C. The cheapest partial paths have costs of 10,000 each. A, however, has a non-partial path with a cost of 1,000. Even though A has a partial path, we still want to consider a parallel append using the non-partial path because it figures to be hugely faster. > The > Path->total_cost for a partial path is *always* per-worker cost, right > ? Just want to confirm this assumption of mine. Yes. >> Also, it >> could be smarter about what happens with the costs of non-partial >> paths. I suggest the following algorithm instead. >> >> 1. Add up all the costs of the partial paths. Those contribute >> directly to the final cost of the Append. This ignores the fact that >> the Append may escalate the parallel degree, but I think we should >> just ignore that problem for now, because we have no real way of >> knowing what the impact of that is going to be. > > I wanted to take into account per-subpath parallel_workers for total > cost of Append. Suppose the partial subpaths have per worker total > costs (3, 3, 3) and their parallel_workers are (2, 8, 4), with 2 > Append workers available. So according to what you say, the total cost > is 9. With per-subplan parallel_workers taken into account, total cost > = (3*2 + 3*8 * 3*4)/2 = 21. But that case never happens, because the parallel workers for the append is always at least as large as the number of workers for any single child. > May be I didn't follow exactly what you suggested. Your logic is not > taking into account number of workers ? I am assuming you are > calculating per-worker total cost here. >> >> 2. Next, estimate the cost of the non-partial paths. To do this, make >> an array of Cost of that length and initialize all the elements to >> zero, then add the total cost of each non-partial plan in turn to the >> element of the array with the smallest cost, and then take the maximum >> of the array elements as the total cost of the non-partial plans. Add >> this to the result from step 1 to get the total cost. > > So with costs (8, 5, 2), add 8 and 5 to 2 so that it becomes (8, 5, > 15) , and so the max is 15 ? I surely am misinterpreting this. No. If you have costs 8, 5, and 2 and only one process, cost is 15. If you have two processes then for costing purposes you assume worker 1 will execute the first path (cost 8) and worker 2 will execute the other two (cost 5 + 2 = 7), so the total cost is 8. If you have three workers, the cost will still be 8, because there's no way to finish the cost-8 path in less than 8 units of work. >> - In get_append_num_workers, instead of the complicated formula with >> log() and 0.693, just add the list lengths and call fls() on the >> result. Integer arithmetic FTW! > > Yeah fls() could be used. BTW I just found that costsize.c already has > this defined in the same way I did: > #define LOG2(x) (log(x) / 0.693147180559945) > May be we need to shift this to some common header file. LOG2() would make sense if you're working with a value represented as a double, but if you have an integer input, I think fls() is better. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers