On Sun, Feb 26, 2017 at 9:26 PM, Robert Haas <robertmh...@gmail.com> wrote:
Thanks for the review; I will work on these. There are a few comments
where I need your suggestions.

> tbm_free_shared_area because the two iterators share one copy of the
> underlying iteration arrays, and the TBM code isn't smart enough to
> avoid freeing them twice. You're going to have to come up with a
> better solution to that problem; nodeBitmapHeapScan.c shouldn't know
> about the way the underlying storage details are managed. (Maybe you
> need to reference-count the iterator arrays?)

Yeah, I also think the current way doesn't look clean. Right now these
arrays are plain integer arrays, so maybe we could use the first slot of
the array for the reference count, or else convert them into a structure
that holds a reference count plus the integer array (a rough sketch of
the second idea is at the bottom of this mail). What do you suggest?

>
> + if (node->inited)
> + goto start_iterate;
>
> My first programming teacher told me not to use goto. I've
> occasionally violated that rule, but I need a better reason than you
> have here. It looks very easy to avoid.

Yes, this can be avoided. I was just trying to get rid of multi-level if
nesting and ended up with the "goto"; a sketch of the restructured flow
is also at the bottom of this mail.

>
> + pbms_set_parallel(outerPlanState(node));
>
> I think this should be a flag in the plan, and the planner should set
> it correctly, instead of having it be a flag in the executor that the
> executor sets. Also, the flag itself should really be called
> something that involves the word "shared" rather than "parallel",
> because the bitmap will not be created in parallel, but it will be
> shared.

Earlier, I thought it would not be a good idea to set that flag in the
BitmapIndexScan path, because the same path can be used either under a
ParallelBitmapHeapPath or under a normal BitmapHeapPath. But after
putting some more thought into it, I realised that we can do this in
create_plan, so I will change it.

>
> Have you checked whether this patch causes any regression in the
> non-parallel case? It adds a bunch of "if" statements, so it might.
> Hopefully not, but it needs to be carefully tested.

During the initial patch development I did test this aspect, but I never
published any results. I will do that along with my next patch and post
the results.

>
> pbms_is_leader() looks horrifically inefficient. Every worker will
> reacquire the spinlock for every tuple. You should only have to enter
> this spinlock-acquiring loop for the first tuple. After that, either
> you became the leader, did the initialization, and set the state to
> PBM_FINISHED, or you waited until the pre-existing leader did the
> same. You should have a backend-local flag that keeps you from
> touching the spinlock for every tuple. I wouldn't be surprised if
> fixing this nets a noticeable performance increase for this patch at
> high worker counts.

I think there is some confusion here. If you notice, the code below
avoids calling pbms_is_leader() for every tuple; it is called only the
first time through. After that, node->inited will be true and the code
jumps directly to start_iterate on subsequent calls. Am I missing
something?

> + if (node->inited)
> + goto start_iterate;

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com