Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD

Kuntal Ghosh Tue, 12 Dec 2017 21:04:12 -0800

On Tue, Dec 12, 2017 at 5:20 PM, Amit Kapila <[email protected]> wrote:
> On Tue, Dec 12, 2017 at 4:00 PM, Kuntal Ghosh
> <[email protected]> wrote:
>> On Mon, Dec 11, 2017 at 2:26 PM, Thomas Munro
>> <[email protected]> wrote:
>>> On Mon, Dec 11, 2017 at 8:14 PM, Amit Kapila <[email protected]> 
>>> wrote:
>>>
>>>> Thanks for looking into it.  I will see if we can write some test.  In
>>>> the meantime if possible, can you please request Patrick Hemmer to
>>>> verify the attached patch?
>>>
>>> Our discussion was on the #postgresql Freenode channel.  I pointed him
>>> at this thread, but I'm not sure if he'll see my message or be able to
>>> test.
>> After discussing with Amit, I'm able to reproduce the scenario in a
>> master-standby setup. The issue occurs when we perform parallel
>> index(-only) scan on a BTP_HALF_DEAD -marked page. (If a page is
>> marked as BTP_DELETED, it's already unlinked from the index).
>>
>> When a btree page is deleted during vacuum, it's first marked as
>> BTP_HALF_DEAD in _bt_mark_page_halfdead and then marked as BTP_DELETED
>> in _bt_unlink_halfdead_page without releasing cleanup lock on the
>> buffer. Hence, any scan node cannot lock the same buffer. So, the
>> issue can't be reproduced on master.
>>
>> However, after replaying XLOG_BTREE_MARK_PAGE_HALFDEAD record, standby
>> releases the lock on the same buffer. If we force parallelism, an
>> index scan on the same page will cause hang the standby server.
>> Following is a (unpleasant)way to reproduce the issue:
>>
>> In master (with autovacuum = off):
>> 1. create table t1(a int primary key);
>> 2. insert into t1 select * from generate_series(1,1000); --generates 3
>> leaf nodes (block no 1,2,4) and 1 root node (block no 3)
>> 3. delete from t1 where a>=367 and a<=735; --delete all tuples pointed by 
>> leaf 2
>> 4. analyze t1; --update the stats
>> 5. explain analyze select * from t1 where a>=1 and a<=1000; --ensures
>> that the next vacuum will consider leaf 2 for page deletion
>
> What do you mean by next vacuum, here autovacuum is off?  Are you
> missing any step which manually performs the vacuum?
>
Yeah, you've to manually vacuum the table.
6. vacuum t1.


>> Now, put a break point at _bt_unlink_halfdead_page, so that vacuum
>> can't unlink the page.
>>
>> In standby,
>> 1. force parallelism.
>> 2. explain analyze select * from t1 where a>=1 and a<=1000; and the
>> parallel workers hang at the above-discussed place!
>>



-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD

Reply via email to