> On Jul 13, 2015, at 3:50 PM, Jim Nasby <jim.na...@bluetreble.com> wrote:
> 
> On 7/13/15 3:26 PM, David Christensen wrote:
>> * Incremental Checksums
>> 
>> PostgreSQL users should have a way of upgrading their cluster to use data 
>> checksums without having to do a costly pg_dump/pg_restore; in particular, 
>> checksums should be able to be enabled/disabled at will, with the database 
>> enforcing the logic of whether the pages considered for a given database are 
>> valid.
>> 
>> Considered approaches for this are having additional flags to pg_upgrade to 
>> set up the new cluster to use checksums where they did not before (or 
>> optionally turning these off).  This approach is a nice tool to have, but in 
>> order to be able to support this process in a manner which has the database 
>> online while the database is going through the initial checksum process.
> 
> It would be really nice if this could be extended to handle different page 
> formats as well, something that keeps rearing its head. Perhaps that could 
> be done with the cycle idea you've described.

I had had this thought too, but the main issues I saw were that new page 
formats were not guaranteed to take up the same space/storage, so there was an 
inherent limitation on the ability to restructure things *arbitrarily*; 
that being said, there may be a use-case for the types of modifications that 
this approach *would* be able to handle.

> Another possibility is some kind of a page-level indicator of what binary 
> format is in use on a given page. For checksums maybe a single bit would 
> suffice (indicating that you should verify the page checksum). Another use 
> case is using this to finally ditch all the old VACUUM FULL code in 
> HeapTupleSatisfies*().

There’s already a page version field, no?  I assume that would be sufficient 
for the page format indicator.  I don’t know about using a flag for verifying 
the checksum, as setting it already modifies the page that is to be 
checksummed, and we want to avoid rewriting a bunch of pages unnecessarily, 
no?  And you’d presumably need to clear that state again, which would be an 
additional write.  This was the issue that the checksum cycle was meant to 
handle, since we store this information in the system catalogs and the types 
of modifications here would be idempotent.
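
As a rough illustration of the page version field mentioned above, here is a 
minimal Python sketch that reads the layout version out of a raw page image.  
It assumes the standard 24-byte page header on a little-endian host, with the 
version kept in the low byte of pd_pagesize_version.  The helper names 
(make_empty_page, page_layout_version, page_size) are hypothetical and exist 
only for this example; they are not PostgreSQL APIs.

```python
import struct

# Assumed layout of the 24-byte page header (little-endian host):
#   pd_lsn              2 x uint32
#   pd_checksum, pd_flags, pd_lower, pd_upper,
#   pd_special, pd_pagesize_version   6 x uint16
#   pd_prune_xid        uint32
PAGE_HEADER = struct.Struct("<IIHHHHHHI")
PAGESIZE_VERSION_FIELD = 7  # index of pd_pagesize_version in the unpacked tuple


def make_empty_page(size: int = 8192, version: int = 4) -> bytes:
    """Build a zero-filled page image with only the header populated (hypothetical helper)."""
    header = PAGE_HEADER.pack(
        0, 0,                     # pd_lsn
        0,                        # pd_checksum (left unset here)
        0,                        # pd_flags
        PAGE_HEADER.size,         # pd_lower: free space starts right after the header
        size,                     # pd_upper
        size,                     # pd_special
        size | version,           # page size in the high byte, version in the low byte
        0,                        # pd_prune_xid
    )
    return header + bytes(size - PAGE_HEADER.size)


def page_layout_version(page: bytes) -> int:
    """Return the layout version stored in the low byte of pd_pagesize_version."""
    return PAGE_HEADER.unpack_from(page, 0)[PAGESIZE_VERSION_FIELD] & 0x00FF


def page_size(page: bytes) -> int:
    """Return the page size stored in the high byte of pd_pagesize_version."""
    return PAGE_HEADER.unpack_from(page, 0)[PAGESIZE_VERSION_FIELD] & 0xFF00


if __name__ == "__main__":
    page = make_empty_page()
    print(page_layout_version(page), page_size(page))
```

Reading the version this way touches no bits on the page, whereas a per-page 
"verify checksum" flag would have to be written to (and later cleared from) 
every page, which is the extra write traffic described above.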

David
--
David Christensen
PostgreSQL Team Manager
End Point Corporation
da...@endpoint.com
785-727-1171

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers