On Fri, Feb 10, 2017 at 11:27 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Feb 1, 2017 at 8:20 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>>> The hunk in indexam.c looks like a leftover that can be omitted.
>> It is not a leftover hunk. Earlier, the patch has the same check
>> btparallelrescan, but based on your comment up thread [1] (Why does
>> btparallelrescan cater to the case where scan->parallel_scan== NULL?),
>> this has been moved to indexam.c.
> That's not my point.  The point is, if it's not a parallel scan,
> index_parallelrescan() should never get called in the first place.  So
> therefore it shouldn't need to worry about scan->parallel_scan being
> NULL.  It seems possibly reasonable to put something like
> Assert(scan->parallel_scan != NULL) in there, but arranging to return
> without doing anything in that case seems like it just masks possible
> bugs in the calling code.  And really I'm not sure we even need the
> Assert.

This is just a safety check, so probably we can change it to Assert.

>>> That
>>> is a little odd.  Another, possibly related thing that is odd is that
>>> when _bt_steppage() finds no tuples and decides to advance to a new
>>> page again, there's a very short comment in the forward case and a
>>> very long comment in the backward case:
>>>             /* nope, keep going */
>>> vs.
>>>             /*
>>>              * For parallel scans, get the last page scanned as it is quite
>>>              * possible that by the time we try to fetch previous page, 
>>> other
>>>              * worker has also decided to scan that previous page.  We could
>>>              * avoid that by doing _bt_parallel_release once we have read 
>>> the
>>>              * current page, but it is bad to make other workers wait till 
>>> we
>>>              * read the page.
>>>              */
>>> Now it seems to me that these cases are symmetric and the issues are
>>> identical.  It's basically that, while we were judging whether the
>>> current page has useful contents, some other process could have
>>> advanced the scan (since _bt_readpage un-seized it).
>> Yeah, but the reason of difference in comments is that for
>> non-parallel backwards scans there is no code at that place to move to
>> previous page and it basically relies on next call to _bt_walk_left()
>> whereas for parallel-scans, we can't simply rely on _bt_walk_left().
>> I have slightly modified the  comments for backward scan case, see if
>> that looks better and if not, then suggest what you think is better.
> Why can't we rely on _bt_walk_left?

The reason is mentioned in comments, but let me try to explain with
some example.  When you reach that point of code, it means that either
the current page (assume page number is 10) doesn't contain any
matching items or it is a half-dead page, both of which indicates that
we have to move to the previous page.   Now, before checking if the
current page contains matching items, we signal parallel machinery
(via _bt_parallel_release) to allow workers to read the previous page
(assume previous page number is 9).  So it is quite possible that
after deciding that current page (page number 10) doesn't contain any
matching tuples if we directly move to the previous page (in this case
it will be 9) by using _bt_walk_left, some other worker would have
read page 9.  In short, if we directly use _bt_walk_left(), then we
are prone to returning some of the values twice as multiple workers
can read the same page.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to