On Wed, Mar 29, 2017 at 12:23 AM, David Steele <da...@pgmasters.net> wrote:
> On 3/23/17 1:54 AM, Masahiko Sawada wrote:
>> On Wed, Mar 15, 2017 at 7:51 AM, Peter Geoghegan <p...@bowt.ie> wrote:
>>> On Tue, Mar 14, 2017 at 3:10 PM, Peter Geoghegan <p...@bowt.ie> wrote:
>>>> We already have BTPageOpaqueData.btpo, a union whose contained type
>>>> varies based on the page being dead. We could just do the same with
>>>> some other field in that struct, and then store epoch there. Clearly
>>>> nobody really cares about most data that remains on the page. Index
>>>> scans just need to be able to land on it to determine that it's dead,
>>>> and VACUUM needs to be able to determine whether or not there could
>>>> possibly be such an index scan at the time it considers recycling..
>>> ISTM that we need all of the fields within BTPageOpaqueData even for
>>> dead pages, actually. The left links and right links still need to be
>>> sane, and the flag bits are needed. Plus, the field that stores an XID
>>> already is clearly necessary. Even if they weren't needed, it would
>>> probably still be a good idea to keep them around for forensic
>>> purposes. However, the page header field pd_prune_xid is currently
>>> unused for indexes, and is the same width as CheckPoint.nextXidEpoch
>>> (the extra thing we might want to store -- the epoch).
>>> Maybe you could store the epoch within that field when B-Tree VACUUM
>>> deletes a page, and then compare that within _bt_page_recyclable(). It
>>> would come before the existing XID comparison in that function. One
>>> nice thing about this idea is that pd_prune_xid will be all-zero for
>>> index pages from the current format, so there is no need to take
>>> special care to make sure that databases that have undergone
>>> pg_upgrade don't break.
>> Thank you for the suggestion!
>> If we store the poch within union field, I think we will not be able
>> to use BTPageOpaqueData.btpo.xact at the same time. Since comparing
>> btpo.xact is still necessary to determine if that page is recyclable
>> we cannot store the epoch into the same union field. And if we store
>> it into BTPageOpaqueData, it would break disk compatibility.
> I have marked this patch "Waiting for Author".
> This thread has been idle for five days. Please respond with a new patch by
> 2017-03-30 00:00 AoE (UTC-12) or this submission will be marked "Returned
> with Feedback".
I was thinking that the status of this patch is still "Needs review"
because I sent latest version patch. We're still under discussion
how safely we can skip to the cleanup vacuum on btree index. That
patch has some restrictions but it would be good for first step.
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription: