Gregory Stark wrote:
> How do we handle this situation?
>
> We go to insert a record in the heap, find no free space, so we extend the
> table and insert it into a new page. Then we insert an index entry pointing
> to the new tuple. Then some other backend (or bgwriter) comes along and
> decides the index page is a good candidate for eviction and forces an xlog
> buffer flush for that buffer. Then the system crashes.
>
> Let me reiterate:
>
> 1. extend table
> 2. insert heap tuple
> 3. insert index tuple
> 4. flush index page
> 5. crash
>
> Now when the system comes back up the index will have a pointer to a page
> beyond the end of the heap. Even if we have a WAL log entry for the extension
> the index pointer would be pointing to a zeroed block so vacuum would never
> get the chance to note the tuple is dead and remove the index pointer.
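There's a hole in your logic. The xlog flush in step 4 also flushes the
xlog records for steps 1-3, because WAL is flushed sequentially up to the
index page's LSN before the page itself can be written out. By the time
step 3 is replayed, the heap page has already been reconstructed.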
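
To make the ordering concrete, here is a rough toy model of the
WAL-before-data rule in Python. It is only a sketch, not PostgreSQL code:
ToyStore, its methods, the page names "heap:1" and "index:0", and the idea
that each of steps 1-3 produces exactly one WAL record are all assumptions
made up for illustration.

```python
# Toy model of the WAL-before-data rule. Every name here is hypothetical
# and illustrative; none of this is PostgreSQL's actual code or API.

class ToyStore:
    def __init__(self):
        self.wal = []          # WAL records in memory: (lsn, page, change)
        self.disk_wal = []     # WAL records that have reached disk
        self.flushed_lsn = 0   # WAL is durable up to this LSN
        self.buffers = {}      # page -> (page_lsn, changes) in memory
        self.disk_pages = {}   # page -> (page_lsn, changes) on disk

    def log_and_apply(self, page, change):
        """Write a WAL record, then apply the change to the buffered page."""
        lsn = len(self.wal) + 1
        self.wal.append((lsn, page, change))
        _, changes = self.buffers.get(page, (0, []))
        self.buffers[page] = (lsn, changes + [change])
        return lsn

    def flush_wal(self, upto_lsn):
        """Flush WAL sequentially: everything up to upto_lsn becomes durable."""
        if upto_lsn > self.flushed_lsn:
            self.disk_wal = self.wal[:upto_lsn]
            self.flushed_lsn = upto_lsn

    def evict(self, page):
        """Write one page to disk, flushing WAL up to its LSN first."""
        page_lsn, changes = self.buffers[page]
        self.flush_wal(page_lsn)  # WAL goes first, including earlier records
        self.disk_pages[page] = (page_lsn, list(changes))

    def crash_and_recover(self):
        """Lose all buffers, then replay durable WAL over the on-disk pages."""
        self.buffers = {}
        pages = {p: (lsn, list(c)) for p, (lsn, c) in self.disk_pages.items()}
        for lsn, page, change in self.disk_wal:
            page_lsn, changes = pages.get(page, (0, []))
            if lsn > page_lsn:  # skip changes the page already contains
                pages[page] = (lsn, changes + [change])
        return pages


def run_scenario():
    store = ToyStore()
    store.log_and_apply("heap:1", "extend")                        # step 1
    store.log_and_apply("heap:1", "insert tuple (1,1)")           # step 2
    store.log_and_apply("index:0", "insert entry -> heap (1,1)")  # step 3
    store.evict("index:0")                                        # step 4
    return store.crash_and_recover()                              # step 5


if __name__ == "__main__":
    for page, (lsn, changes) in sorted(run_scenario().items()):
        print(page, lsn, changes)
```

In this model the heap page never reaches disk before the crash, yet
recovery rebuilds it from the records that were flushed along with the
index page's record, so the index entry does not point at a missing page.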
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com