GIN page deletion fails to take into account that a search might be in flight to the page being deleted. If the page is then reused for something else, the search can get very confused.

That's pretty difficult to reproduce in a real system, as the window between releasing the lock on a page and following its right-link is very tight, but it's easy if you set a breakpoint with a debugger. Here's how I reproduced it:

-----------
1. Put a breakpoint or sleep in the entryGetNextItem() function, at the point where it has released the lock on one page and is about to read the next one. I used this patch:

--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -574,6 +574,9 @@ entryGetNextItem(GinState *ginstate, GinScanEntry entry)
                                return;
                        }

+                       elog(NOTICE, "about to move right to page %u", blkno);
+                       sleep(5);
+
                        entry->buffer = ReleaseAndReadBuffer(entry->buffer,
                                                             ginstate->index,
                                                             blkno);

2. Initialize a table with a GIN index in a suitable state:

create extension btree_gin;
create table foo (i int4);
create index i_gin_foo on foo using gin (i) with (fastupdate = off);

insert into foo select 1 from generate_series(1, 5000);
insert into foo select 2 from generate_series(1, 5000);
set enable_bitmapscan=off; set enable_seqscan=on;
delete from foo where i = 1;

3. Start a query. It will sleep between every page:

set enable_bitmapscan=on; set enable_seqscan=off;
select * from foo where i = 1;
postgres=# select * from foo where i = 1;
NOTICE:  about to move right to page 3
NOTICE:  about to move right to page 5
...

4. In another session, delete and reuse the pages:

vacuum foo;
insert into foo select 2 from generate_series(1, 10000) g;

5. Let the query run to completion. It will return a lot of tuples with i=2, which should not have matched:

...
NOTICE:  about to move right to page 24
NOTICE:  about to move right to page 25
 i
---
 2
 2
 2
...

-----------

The regular b-tree code solves this by stamping deleted pages with the current XID, and only allowing them to be reused once that XID becomes old enough (< RecentGlobalXmin). Another approach might be to grab a cleanup-strength lock on the left and parent pages when deleting a page, and to require a search to keep its pin on the page it's coming from until it has locked the next page.
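
To make the XID-stamping idea a bit more concrete, here is a rough, standalone sketch of the reuse check. It is a toy model, not PostgreSQL code: ToyPage and the toy_* functions are made-up names for illustration, the page header is reduced to two fields, and the oldest_xmin argument stands in for the role RecentGlobalXmin plays above. The comparison is modelled on the modulo-2^32 arithmetic used for normal XIDs and ignores the special XID values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the PostgreSQL type; not the real definition. */
typedef uint32_t TransactionId;

/* Hypothetical, simplified page header: only what the sketch needs. */
typedef struct ToyPage
{
    bool          deleted;    /* page has been unlinked from the tree */
    TransactionId delete_xid; /* XID that was current at deletion time */
} ToyPage;

/*
 * Wraparound-aware "a is older than b", using modulo-2^32 arithmetic.
 * Special XIDs are deliberately ignored in this sketch.
 */
static bool
toy_xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Mark a page deleted and remember which XID was current at that moment. */
static void
toy_mark_deleted(ToyPage *page, TransactionId current_xid)
{
    page->deleted = true;
    page->delete_xid = current_xid;
}

/*
 * A deleted page may be recycled only once its stamp is older than the
 * oldest XID that any still-running transaction could care about.
 */
static bool
toy_page_recyclable(const ToyPage *page, TransactionId oldest_xmin)
{
    return page->deleted && toy_xid_precedes(page->delete_xid, oldest_xmin);
}
```

The point of the stamp is that any search that could still be holding a link to the page must have started before the page was deleted, so once every transaction from that era is gone, handing the page out again should be safe. In this sketch a page stamped with XID 100 is not recyclable while oldest_xmin is 100 or less, and becomes recyclable once oldest_xmin reaches 101.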

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)