On Mon, Mar 27, 2017 at 10:59 AM, Rushabh Lathia <rushabh.lat...@gmail.com> wrote:
>
> On Mon, Mar 27, 2017 at 3:43 AM, Tomas Vondra
> <tomas.von...@2ndquadrant.com> wrote:
>
>> On 03/25/2017 05:18 PM, Rushabh Lathia wrote:
>>
>>> On Sat, Mar 25, 2017 at 7:01 PM, Peter Eisentraut
>>> <peter.eisentr...@2ndquadrant.com> wrote:
>>>
>>> On 3/25/17 09:01, David Rowley wrote:
>>> > On 25 March 2017 at 23:09, Rushabh Lathia <rushabh.lat...@gmail.com> wrote:
>>> >> Also, another point which I think we should fix: when someone sets
>>> >> max_parallel_workers = 0, we should also set
>>> >> max_parallel_workers_per_gather to zero. That way we can avoid
>>> >> generating a gather path with max_parallel_workers = 0.
>>> > I see that it was actually quite useful that it works the way it does.
>>> > If it had worked the same as max_parallel_workers_per_gather, then
>>> > likely Tomas would never have found this bug.
>>>
>>> Another problem is that the GUC system doesn't really support cases
>>> where the validity of one setting depends on the current value of
>>> another setting. So each individual setting needs to be robust against
>>> cases of related settings being nonsensical.
>>>
>>> Okay.
>>>
>>> About the original issue reported by Tomas, I did more debugging and
>>> found that the problem was gather_merge_clear_slots() not returning
>>> a cleared slot when nreaders is zero (meaning nworkers_launched = 0).
>>> Because of that, the scan continued even after all the tuples were
>>> exhausted, and ended up with a server crash in gather_merge_getnext().
>>> In the patch I also added an Assert in gather_merge_getnext() that the
>>> index must be less than nreaders + 1 (the leader).
>>>
>>> PFA simple patch to fix the problem.
>>>
>> I think there are two issues at play here - the first one is that we
>> still produce parallel plans even with max_parallel_workers=0, and the
>> second one is the crash in GatherMerge when nworkers=0.
>>
>> Your patch fixes the latter (thanks for looking into it), which is
>> obviously a good thing - getting 0 workers on a busy system is quite
>> possible, because all the parallel workers may already be chewing on
>> some other query.
>>
> Thanks.
>

I was doing more testing with the patch and found one more server crash
in the same area, when Gather Merge is forced for a scan that has zero
rows:

    create table dept (deptno numeric, dname varchar(20));
    set parallel_tuple_cost = 0;
    set parallel_setup_cost = 0;
    set min_parallel_table_scan_size = 0;
    set min_parallel_index_scan_size = 0;
    set force_parallel_mode = regress;
    explain analyze select * from dept order by deptno;

This is because we don't initialize the leader's slot in gm_slots. So
when zero workers are launched and the table has zero rows, we end up
with a NULL slot in the gm_slots array.

Currently gather_merge_clear_slots() clears out the tuple table slots
for each gather merge input and returns a cleared slot. In the attached
patch I modified gather_merge_clear_slots() to just clear out the tuple
table slots, and made gather_merge_getnext() return NULL once all the
queues and the heap are exhausted.

>> But it seems a bit futile to produce the parallel plan in the first
>> place, because with max_parallel_workers=0 we can't possibly get any
>> parallel workers ever. I wonder why compute_parallel_worker() only
>> looks at max_parallel_workers_per_gather, i.e. why shouldn't it do:
>>
>>    parallel_workers = Min(parallel_workers, max_parallel_workers);
>>
> I agree with you here. Producing the parallel plan when
> max_parallel_workers = 0 is wrong. But rather than your suggested fix,
> I think that we should do something like:
>
>    /*
>     * In no case use more than max_parallel_workers_per_gather or
>     * max_parallel_workers.
>     */
>    parallel_workers = Min(parallel_workers,
>                           Min(max_parallel_workers,
>                               max_parallel_workers_per_gather));
>
>> Perhaps this was discussed and is actually intentional, though.
>>
> Yes, I am not quite sure about this.
>
>> Regarding handling this at the GUC level - I agree with Peter that
>> that's not a good idea. I suppose we could deal with checking the
>> values in the GUC check/assign hooks, but what we don't have is a way
>> to undo the changes in all the GUCs. That is, if I do
>>
>>    SET max_parallel_workers = 0;
>>    SET max_parallel_workers = 16;
>>
>> I expect to end up with just the max_parallel_workers GUC changed and
>> nothing else.
>>
>> regards
>>
>> --
>> Tomas Vondra                  http://www.2ndQuadrant.com
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
> --
> Rushabh Lathia

Regards,
Rushabh Lathia
www.EnterpriseDB.com
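PS: To make the proposed clamp a bit more concrete, below is a tiny
standalone sketch of just the Min() logic - stand-in variables only,
not the actual planner code or GUC machinery; the Min() macro matches
the definition in c.h:

    #include <stdio.h>

    #define Min(x, y) ((x) < (y) ? (x) : (y))

    /* stand-ins for the GUCs; the real ones live in the GUC machinery */
    static int max_parallel_workers = 0;
    static int max_parallel_workers_per_gather = 2;

    /*
     * The clamp proposed above: never plan for more workers than either
     * max_parallel_workers_per_gather or max_parallel_workers allows.
     */
    static int
    clamp_parallel_workers(int parallel_workers)
    {
        return Min(parallel_workers,
                   Min(max_parallel_workers, max_parallel_workers_per_gather));
    }

    int
    main(void)
    {
        /* with max_parallel_workers = 0 this clamps to 0 - no parallel plan */
        printf("clamped workers = %d\n", clamp_parallel_workers(4));
        return 0;
    }

With max_parallel_workers = 0 this yields 0, so compute_parallel_worker()
would no longer plan for any workers, which is the behaviour we want here.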
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index 3f0c3ee..62c399e 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -419,10 +419,9 @@ reread:
 }
 
 /*
- * Clear out the tuple table slots for each gather merge input,
- * and return a cleared slot.
+ * Clear out the tuple table slots for each gather merge input.
  */
-static TupleTableSlot *
+static void
 gather_merge_clear_slots(GatherMergeState *gm_state)
 {
 	int			i;
@@ -437,9 +436,6 @@ gather_merge_clear_slots(GatherMergeState *gm_state)
 	pfree(gm_state->gm_tuple_buffers);
 	/* Free the binaryheap, which was created for sort */
 	binaryheap_free(gm_state->gm_heap);
-
-	/* return any clear slot */
-	return gm_state->gm_slots[0];
 }
 
 /*
@@ -479,7 +475,8 @@ gather_merge_getnext(GatherMergeState *gm_state)
 	if (binaryheap_empty(gm_state->gm_heap))
 	{
 		/* All the queues are exhausted, and so is the heap */
-		return gather_merge_clear_slots(gm_state);
+		gather_merge_clear_slots(gm_state);
+		return NULL;
 	}
 	else
 	{
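To verify the fix, rerun the zero-row reproduction from the message
above with the patch applied; it should now produce a Gather Merge plan
over zero rows instead of a crash:

    create table dept (deptno numeric, dname varchar(20));
    set parallel_tuple_cost = 0;
    set parallel_setup_cost = 0;
    set min_parallel_table_scan_size = 0;
    set min_parallel_index_scan_size = 0;
    set force_parallel_mode = regress;
    explain analyze select * from dept order by deptno;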