On Wed, Aug 2, 2017 at 11:12 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Wed, Jul 12, 2017 at 7:08 PM, Amit Kapila <amit.kapil...@gmail.com>
> wrote:
>> On Wed, Jul 12, 2017 at 11:20 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>> > On Tue, Jul 11, 2017 at 10:25 PM, Amit Kapila <amit.kapil...@gmail.com>
>> > wrote:
>> >> On Wed, Jul 12, 2017 at 1:50 AM, Jeff Janes <jeff.ja...@gmail.com>
>> >> wrote:
>> >> > On Mon, Jul 10, 2017 at 9:51 PM, Dilip Kumar <dilipbal...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> So because of this high projection cost the seqpath and the parallel
>> >> >> path both have fuzzily the same cost, but the seqpath is winning
>> >> >> because it's parallel safe.
>> >> >
>> >> >
>> >> > I think you are correct. However, unless parallel_tuple_cost is set
>> >> > very low, apply_projection_to_path never gets called with the Gather
>> >> > path as an argument. It gets ruled out at some earlier stage,
>> >> > presumably because it assumes the projection step cannot make it win
>> >> > if it is already behind by enough.
>> >> >
>> >> I think that is genuine because tuple communication cost is very high.
>> > Sorry, I don't know which you think is genuine, the early pruning or my
>> > complaint about the early pruning.
>> Early pruning. See, currently, we don't have a way to maintain both
>> parallel and non-parallel paths until a later stage and then decide
>> which one is better. If we want to maintain both parallel and
>> non-parallel paths, it can increase planning cost substantially in the
>> case of joins. Now, surely it can be beneficial in many cases, so it is
>> a worthwhile direction to pursue.
> If I understand it correctly, we have a way, it just can lead to an
> exponential explosion problem, so we are afraid to use it, correct? If I
> just lobotomize the path domination code (make pathnode.c line 466 always
> test false):
> if (JJ_all_paths==0 && costcmp != COSTS_DIFFERENT)
> then it keeps the parallel plan and later chooses to use it (after applying
> your other patch in this thread) as the overall best plan. It doesn't even
> slow down "make installcheck-parallel" by very much, which I guess just
> means the regression tests don't have a lot of complex joins.
> But what is an acceptable solution? Is there a heuristic for when retaining
> a parallel path could be helpful, the same way there is for fast-start
> paths? It seems like the best thing would be to include the evaluation
> costs in the first place at this step.
> Why is the path-cost domination code run before the cost of the function
> evaluation is included?
Because the function evaluation is part of the target list, and we create
the path target after the creation of base paths (see the call to
create_pathtarget @ planner.c:1696).
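To make the ordering problem concrete, here is a toy model with made-up
numbers. It is not planner code: ToyPath, toy_pruned_early and
toy_final_cost are hypothetical names, the early check is a plain
comparison rather than the fuzzy multi-criteria test in add_path, and it
assumes the target list cost divides evenly among the workers.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a planner path. */
typedef struct ToyPath
{
    double base_cost;   /* cost known when base paths are built */
    int    workers;     /* 0 means a serial path */
} ToyPath;

/*
 * Early pruning: only base costs are compared, because the target list
 * cost has not been attached to either path yet.
 */
bool
toy_pruned_early(const ToyPath *candidate, const ToyPath *incumbent)
{
    return candidate->base_cost > incumbent->base_cost;
}

/*
 * Final cost once the target list is known. Assumption: a parallel path
 * evaluates the target list in the workers, so the cost is split evenly.
 */
double
toy_final_cost(const ToyPath *path, double target_cost)
{
    int divisor = (path->workers > 0) ? path->workers : 1;

    return path->base_cost + target_cost / divisor;
}
```

With a serial path at base cost 100 and a four-worker path at base cost
115, the parallel path loses the early comparison. If the target list then
adds 1000, the serial path ends at 1100 while the parallel one would have
ended at 365, but it is no longer around to be considered.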
> Is that because the information needed to compute
> it is not available at that point,
I see two ways to include the cost of the target list for parallel
paths before rejecting them: (a) Don't reject parallel paths
(Gather/GatherMerge) during add_path. This has the danger of path
explosion. (b) In the case of parallel paths, somehow try to identify
that the path has a costly target list (maybe just check if the target
list has anything other than vars) and use it as a heuristic to decide
whether a parallel path can be retained.
I think the preference will be to do something along the lines of
approach (b), but I am not sure whether we can easily do that.
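
A very rough sketch of what the check in (b) might look like, using toy
types rather than the real planner structures. ToyNodeTag, ToyTargetEntry,
toy_target_has_non_var and toy_keep_parallel_path are all hypothetical
names made up for illustration, and where such a check would actually live
is exactly the part I am unsure about.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds; the real planner has many more. */
typedef enum ToyNodeTag
{
    TOY_VAR,
    TOY_CONST,
    TOY_FUNC_EXPR
} ToyNodeTag;

/* Hypothetical target list entry holding only the expression kind. */
typedef struct ToyTargetEntry
{
    ToyNodeTag tag;
} ToyTargetEntry;

/* True if any entry is something other than a plain Var. */
bool
toy_target_has_non_var(const ToyTargetEntry *tlist, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (tlist[i].tag != TOY_VAR)
            return true;
    }
    return false;
}

/*
 * Heuristic from (b): keep a parallel path that is currently losing on
 * cost only when the target list might be expensive to evaluate.
 */
bool
toy_keep_parallel_path(double parallel_cost, double best_cost,
                       const ToyTargetEntry *tlist, size_t n)
{
    if (parallel_cost <= best_cost)
        return true;            /* already competitive, keep it */
    return toy_target_has_non_var(tlist, n);
}
```

Compared with (a), this retains the extra path only for queries whose
target list contains something besides Vars, so the path explosion is
limited to the cases where it could plausibly pay off.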