On 16 March 2017 at 18:18, Ashutosh Bapat
<ashutosh.ba...@enterprisedb.com> wrote:
> +         * Check if we are already finished plans from parallel append. This
> +         * can happen if all the subplans are finished when this worker
> +         * has not even started returning tuples.
> +         */
> +        if (node->as_padesc && node->as_whichplan == PA_INVALID_PLAN)
> +            return ExecClearTuple(node->ps.ps_ResultTupleSlot);
> From the comment, it looks like this condition will be encountered before the
> backend returns any tuple. But this code is part of the loop which returns the
> tuples. Shouldn't this be outside the loop? Why do we want to check a 
> condition
> for every row returned when the condition can happen only once and that too
> before returning any tuple?

The way ExecProcNode() gets called, there is no separate special code path
that runs instead of ExecProcNode() when a tuple is fetched for the
first time. That is, we cannot prevent ExecProcNode() from being called
when as_whichplan is invalid right from the beginning.

One thing we could do is have a special slot in AppendState->as_plan[]
that holds a dummy execution node which just returns a NULL tuple, and
initially make as_whichplan point to this slot. But I don't think that
is worth doing.

We can instead reduce the if condition to:
if (node->as_whichplan == PA_INVALID_PLAN)
{
    Assert(node->as_padesc != NULL);
    return ExecClearTuple(node->ps.ps_ResultTupleSlot);
}

BTW, about the loop which you said returns tuples: the loop is not for
returning tuples, it is for iterating to the next subplan. Even if we
take the condition out of the loop and put it at the beginning of
ExecAppend(), the issue will remain.
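
To be concrete, here is roughly the flow I have in mind. This is only a
sketch, not the patch code verbatim; in particular I am pretending that
parallel_append_next() returns false once no unfinished subplan is left,
and I am ignoring the non-parallel-aware case:

static TupleTableSlot *
ExecAppend_sketch(AppendState *node)
{
    /* Nothing was ever assigned to us; can happen right at the start. */
    if (node->as_whichplan == PA_INVALID_PLAN)
    {
        Assert(node->as_padesc != NULL);
        return ExecClearTuple(node->ps.ps_ResultTupleSlot);
    }

    for (;;)
    {
        PlanState  *subnode = node->appendplans[node->as_whichplan];
        TupleTableSlot *result = ExecProcNode(subnode);

        /* ExecAppend() is re-entered for every tuple we return. */
        if (!TupIsNull(result))
            return result;

        /* Current subplan is exhausted; move to the next one, if any. */
        if (!parallel_append_next(node))
            return ExecClearTuple(node->ps.ps_ResultTupleSlot);
    }
}

So wherever the PA_INVALID_PLAN test lives, it gets evaluated once per
ExecAppend() call, i.e. once per returned tuple.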

>
> Why do we need following code in both ExecAppendInitializeWorker() and
> ExecAppendInitializeDSM()? Both of those things happen before starting the
> actual execution, so one of those should suffice?
> +    /* Choose the optimal subplan to be executed. */
> +    (void) parallel_append_next(node);

ExecAppendInitializeWorker() is where the worker attaches to (and then
initializes its own local data for) the DSM area created and shared by
ExecAppendInitializeDSM() in the backend. But both the worker and the
backend need to initialize their own as_whichplan to the next subplan.
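
Schematically (bodies abridged; the signatures just follow the pattern of
the other parallel-aware nodes' InitializeDSM/InitializeWorker callbacks):

/* Leader, while populating the DSM segment: */
void
ExecAppendInitializeDSM(AppendState *node, ParallelContext *pcxt)
{
    /* ... allocate node->as_padesc in the DSM area and initialize it ... */

    /* Choose the optimal subplan to be executed by the leader itself. */
    (void) parallel_append_next(node);
}

/* Each worker, during its own executor startup: */
void
ExecAppendInitializeWorker(AppendState *node, shm_toc *toc)
{
    /* ... attach node->as_padesc to the area the leader created ... */

    /* Choose the optimal subplan to be executed by this worker. */
    (void) parallel_append_next(node);
}

The leader's as_whichplan lives in its own AppendState and is not visible
to the workers, which is why each participant has to pick its initial
subplan separately.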

>
> There is no pa_num_worker now, so probably this should get updated. Per 
> comment
> we should also get rid of SpinLockAcquire() and SpinLockRelease()?
> + *        purpose. The spinlock is used so that it does not change the
> + *        pa_num_workers field while workers are choosing the next node.
Will do this.

>
> BTW, sa_finished seems to be a misnomer. The plan is not finished yet, but it
> wants no more workers. So, should it be renamed as sa_no_new_workers or
> something like that?

Actually, in this context, "finished" means "we are done with this subplan".
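
Just so we are talking about the same thing: conceptually there is one
flag per subplan in the shared area, something like this (names and
layout abridged; this is not the exact struct from the patch):

typedef struct ParallelAppendStateSketch
{
    int         pa_next_plan;   /* subplan the next participant should try */
    struct
    {
        bool    sa_finished;    /* "we are done with this subplan": no
                                 * participant should pick it up anymore */
    }           pa_subplans[FLEXIBLE_ARRAY_MEMBER];
} ParallelAppendStateSketch;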

>
> In parallel_append_next() we shouldn't need to call goto_next_plan() twice. If
> the plan indicated by pa_next_plan is finished, all the plans must have
> finished. This should be true if we set pa_next_plan to 0 at the time of
> initialization. Any worker picking up pa_next_plan will set it to the next
> valid plan. So the next worker asking for plan should pick pa_next_plan and
> set it to the next one and so on.

The current patch does not call it twice, but I might have overlooked
something. Let me know if I have.

>
> I am wondering whether goto_next_plan() can be simplified with some modulo
> arithmetic, e.g. (whichplan - first_plan)++ % (last_plan - first_plan)
> + first_plan.

Hmm. IMHO that seems like too much calculation just for shifting to the
next array element.
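
Just to spell out the comparison (standalone illustration only; first_plan
and last_plan stand for whatever inclusive bounds goto_next_plan() actually
uses, and I have adjusted the off-by-one in the quoted expression):

/* Plain increment with wraparound: */
static int
next_plan_wrap(int whichplan, int first_plan, int last_plan)
{
    if (++whichplan > last_plan)
        whichplan = first_plan;
    return whichplan;
}

/* The modulo form (same result): */
static int
next_plan_mod(int whichplan, int first_plan, int last_plan)
{
    return (whichplan - first_plan + 1) % (last_plan - first_plan + 1)
           + first_plan;
}

The second one trades a branch for a division, and to me the first reads
more clearly as "go to the next element, wrapping around".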

