Gregory Stark wrote:
> How do we handle this situation?
>
> We go to insert a record in the heap, find no free space, so we extend the
> table and insert it into a new page. Then we insert an index entry pointing
> to the new tuple. Then some other backend (or bgwriter) comes along and
> decides the index page is a good candidate for eviction and forces an xlog
> buffer flush for that buffer. Then the system crashes.
>
> Let me reiterate:
>
> 1. extend table
> 2. insert heap tuple
> 3. insert index tuple
> 4. flush index page
> 5. crash
>
> Now when the system comes back up the index will have a pointer to a page
> beyond the end of the heap. Even if we have a WAL log entry for the extension
> the index pointer would be pointing to a zeroed block so vacuum would never
> get the chance to note the tuple is dead and remove the index pointer.

There's a hole in your logic. The xlog flush in step 4 is also going to flush
the xlog records for steps 1-3. By the time the record for step 3 is replayed,
the heap page has already been reconstructed.
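
To make that ordering concrete, here is a toy Python model of the
WAL-before-data rule. It is only a sketch, not PostgreSQL code: the class
ToyCluster, its methods, and the page names are all made up for illustration,
and it assumes the table extension gets its own log record.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy model of WAL-before-data. All names here are hypothetical."""
    wal: list = field(default_factory=list)      # WAL records in LSN order
    flushed_upto: int = 0                        # count of WAL records durable on disk
    buffers: dict = field(default_factory=dict)  # page -> (page_lsn, contents), volatile
    disk: dict = field(default_factory=dict)     # page -> (page_lsn, contents), durable

    def log_and_apply(self, page, change):
        """Append a WAL record, then apply the change to the in-memory page."""
        self.wal.append((page, change))
        lsn = len(self.wal)
        _, contents = self.buffers.get(page, (0, []))
        self.buffers[page] = (lsn, contents + [change])
        return lsn

    def flush_page(self, page):
        """Write a page out, first forcing WAL up to that page's LSN.

        Forcing WAL up to LSN n makes every earlier record durable too.
        """
        page_lsn, contents = self.buffers[page]
        self.flushed_upto = max(self.flushed_upto, page_lsn)
        self.disk[page] = (page_lsn, list(contents))

    def crash_and_recover(self):
        """Drop volatile state, then replay the durable WAL in LSN order."""
        durable = self.wal[: self.flushed_upto]
        visited = []
        for lsn, (page, change) in enumerate(durable, start=1):
            page_lsn, contents = self.disk.get(page, (0, []))
            if page_lsn < lsn:  # on-disk page predates this record: redo it
                self.disk[page] = (lsn, contents + [change])
            visited.append((page, change))
        self.wal, self.buffers = durable, dict(self.disk)
        return visited


db = ToyCluster()
db.log_and_apply("heap:1", "extend")                      # step 1
db.log_and_apply("heap:1", "insert tuple")                # step 2
db.log_and_apply("index:7", "insert pointer to heap:1")   # step 3
db.flush_page("index:7")                                  # step 4
visited = db.crash_and_recover()                          # step 5

for page, change in visited:
    print(page, change)
print(db.disk["heap:1"])
```

In this model the heap page was never written out before the crash, yet it is
rebuilt from the log during recovery, and its records come before the index
record in replay order.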

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
