Thanks, Amit, for reviewing this patch.

On Mon, Oct 17, 2016 at 2:26 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Wed, Oct 5, 2016 at 11:35 AM, Rushabh Lathia
> <rushabh.lat...@gmail.com> wrote:
> > Hi hackers,
> >
> > Attached is the patch to implement Gather Merge.
>
> Couple of review comments:
>
> 1.
> ExecGatherMerge()
> {
> ..
> +   /* No workers?  Then never mind. */
> +   if (!got_any_worker ||
> +       node->nreaders < 2)
> +   {
> +       ExecShutdownGatherMergeWorkers(node);
> +       node->nreaders = 0;
> +   }
>
> Are you planning to restrict the use of gather merge based on the number
> of workers?  If there is a valid reason, then I think the comments should
> be updated for the same.

Thanks for catching this.  This is left over from an earlier design of the
patch.  With the current design we don't have any limitation on the number
of workers.  I did performance testing with max_parallel_workers_per_gather
set to 1 and didn't notice any performance degradation, so I removed this
limitation in the attached patch.

> 2.
> +ExecGatherMerge(GatherMergeState * node)
> {
> ..
> +   if (!node->initialized)
> +   {
> +       EState     *estate = node->ps.state;
> +       GatherMerge *gm = (GatherMerge *) node->ps.plan;
> +
> +       /*
> +        * Sometimes we might have to run without parallelism; but if parallel
> +        * mode is active then we can try to fire up some workers.
> +        */
> +       if (gm->num_workers > 0 && IsInParallelMode())
> +       {
> +           ParallelContext *pcxt;
> +           bool        got_any_worker = false;
> +
> +           /* Initialize the workers required to execute Gather node. */
> +           if (!node->pei)
> +               node->pei = ExecInitParallelPlan(node->ps.lefttree,
> +                                                estate,
> +                                                gm->num_workers);
> ..
> }
>
> There is a lot of common code between ExecGatherMerge and ExecGather.
> Do you think it makes sense to have a common function to avoid the
> duplicity?
>
> I see there are small discrepancies between the two, e.g. I don't see
> the use of the single_copy flag, as it is present in the gather node.

Yes, I too thought about centralizing some parts of ExecGather and
ExecGatherMerge, but that code is not really settled yet.
And I thought it might change, particularly for Gather Merge.  And as
explained by Robert, single_copy doesn't make sense for Gather Merge.  I
will still look into this to see if something can be centralized.

> 3.
> +gather_merge_readnext(GatherMergeState * gm_state, int reader, bool force)
> {
> ..
> +   tup = gm_readnext_tuple(gm_state, reader, force, NULL);
> +
> +   /*
> +    * try to read more tuples in nowait mode and store them into the tuple
> +    * array.
> +    */
> +   if (HeapTupleIsValid(tup))
> +       fill_tuple_array(gm_state, reader);
>
> How is the above read tuple stored in the array?  In any case, the above
> interface seems slightly awkward to me.  Basically, I think what you
> are trying to do here is, after reading the first tuple in wait mode, you
> are trying to fill the array by reading more tuples.  So, can't we
> push reading of this first tuple into that function and name it
> form_tuple_array()?

Yes, you are right.  First it tries to read a tuple in wait mode, and once
it finds a tuple it tries to fill the tuple array (which reads tuples in
nowait mode).  The reason I kept them separate is that while initializing
gather merge, if we are unable to read a tuple from every worker, then on
the re-read we again try to fill the tuple array only for the workers that
have already produced at least one tuple (see gather_merge_init() for
details).  Also, I thought filling the tuple array (which reads tuples in
nowait mode) and reading a tuple (in wait mode) are two separate tasks, and
the code looks clearer with them in separate functions.  If you have any
suggestion for the function name (fill_tuple_array), I am open to changing
it.

> 4.
> +create_gather_merge_path(..)
> {
> ..
> +   0 /* FIXME: pathnode->limit_tuples*/);
>
> What exactly do you want to fix in the above code?

Fixed.

> 5.
> +/* Tuple array size */
> +#define MAX_TUPLE_STORE 10
>
> Have you tried with other values of MAX_TUPLE_STORE?  If yes, then
> what are the results?
> I think it is better to add a comment explaining why this array size is
> best for performance.

Actually I was thinking about that, but I didn't want to add it there
because it's just a performance number from my machine.  Anyway, I have
added the comment.

> 6.
> +/* INTERFACE ROUTINES
> + *     ExecInitGatherMerge - initialize the MergeAppend node
> + *     ExecGatherMerge - retrieve the next tuple from the node
> + *     ExecEndGatherMerge - shut down the MergeAppend node
> + *     ExecReScanGatherMerge - rescan the MergeAppend node
>
> typo. /MergeAppend/GatherMerge

Fixed.

> 7.
> +static TupleTableSlot *gather_merge_getnext(GatherMergeState * gm_state);
> +static HeapTuple gm_readnext_tuple(GatherMergeState * gm_state, int nreader,
> +                 bool force, bool *done);
>
> Formatting near GatherMergeState doesn't seem to be appropriate.  I
> think you need to add GatherMergeState to typedefs.list and then run
> pgindent again.

Fixed.

> 8.
> +   /*
> +    * Initialize funnel slot to same tuple descriptor as outer plan.
> +    */
> +   if (!ExecContextForcesOids(&gm_state->ps, &hasoid))
>
> I think in the above comment, you mean "Initialize GatherMerge slot".

No, it has to be the funnel slot only; it's just a placeholder.  Gather
Merge initializes one slot per worker, and that is done in
gather_merge_init().  I will look into this point to see if I can get rid
of the funnel slot completely.

> 9.
> +   /* Does tuple array have any avaiable tuples? */
>
> /avaiable/available

Fixed.

> > Open Issue:
> >
> > - Commit af33039317ddc4a0e38a02e2255c2bf453115fd2 fixed the leak in
> > tqueue.c by calling gather_readnext() in a per-tuple context.  Now for
> > gather merge that is not possible, as we store the tuple into the tuple
> > array and we want the tuple to be freed only once it has passed through
> > the merge sort algorithm.  One idea is, we can also call
> > gm_readnext_tuple() under the per-tuple context (which will fix the
> > leak in tqueue.c) and then store a copy of the tuple into the tuple
> > array.
> Won't the extra copy per tuple impact performance?  Was the fix in the
> mentioned commit for record or composite types?  If so, does GatherMerge
> support such types, and if it does, does it provide any benefit over
> Gather?

I don't think it was specifically for record or composite types, but I
might be wrong.  As per my understanding, the commit fixed a leak in
tqueue.c; the fix was to establish the convention of calling
TupleQueueReaderNext() in a shorter-lived memory context, so that tqueue.c
doesn't leak memory.  My idea for fixing this is to call
TupleQueueReaderNext() in the per-tuple context, then copy the tuple and
store it into the tuple array; later, the next run of ExecStoreTuple()
will free the earlier tuple.  I will work on that.

> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com

--
Rushabh Lathia
gather_merge_v2.patch
Description: application/download
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers