On 3/23/17 1:54 AM, Masahiko Sawada wrote:
On Wed, Mar 15, 2017 at 7:51 AM, Peter Geoghegan <p...@bowt.ie> wrote:
On Tue, Mar 14, 2017 at 3:10 PM, Peter Geoghegan <p...@bowt.ie> wrote:
We already have BTPageOpaqueData.btpo, a union whose contained type
varies based on the page being dead. We could just do the same with
some other field in that struct, and then store epoch there. Clearly
nobody really cares about most data that remains on the page. Index
scans just need to be able to land on it to determine that it's dead,
and VACUUM needs to be able to determine whether or not there could
possibly be such an index scan at the time it considers recycling..

ISTM that we need all of the fields within BTPageOpaqueData even for
dead pages, actually. The left links and right links still need to be
sane, and the flag bits are needed. Plus, the field that stores an XID
already is clearly necessary. Even if they weren't needed, it would
probably still be a good idea to keep them around for forensic
purposes. However, the page header field pd_prune_xid is currently
unused for indexes, and is the same width as CheckPoint.nextXidEpoch
(the extra thing we might want to store -- the epoch).

Maybe you could store the epoch within that field when B-Tree VACUUM
deletes a page, and then compare that within _bt_page_recyclable(). It
would come before the existing XID comparison in that function. One
nice thing about this idea is that pd_prune_xid will be all-zero for
index pages from the current format, so there is no need to take
special care to make sure that databases that have undergone
pg_upgrade don't break.

Thank you for the suggestion!
If we store the poch within union field, I think we will not be able
to use BTPageOpaqueData.btpo.xact at the same time. Since comparing
btpo.xact is still necessary to determine if that page is recyclable
we cannot store the epoch into the same union field. And if we store
it into BTPageOpaqueData, it would break disk compatibility.

I have marked this patch "Waiting for Author".

This thread has been idle for five days. Please respond with a new patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be marked "Returned with Feedback".


Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to