> On Jul 2, 2015, at 3:43 PM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
>
> On 07/02/2015 11:28 PM, Andres Freund wrote:
>> On 2015-07-02 22:53:40 +0300, Heikki Linnakangas wrote:
>>> Add a "enabling-checksums" mode to the server where it calculates checksums
>>> for anything it writes, but doesn't check or complain about incorrect
>>> checksums on reads. Put the server into that mode, and then have a
>>> background process that reads through all data in the cluster, calculates
>>> the checksum for every page, and writes all the data back. Once that's
>>> completed, checksums can be fully enabled.
>>
>> You'd need, afaics, a bgworker that connects to every database to read
>> pg_class, to figure out what type of page a relfilenode has. And this
>> probably should call back into the relevant AM or such.
>
> Nah, we already assume that every relation data file follows the standard
> page format, at least enough to have the checksum field at the right
> location. See FlushBuffer() - it just unconditionally calculates the checksum
> before writing out the page. (I'm not totally happy about that, but that ship
> has sailed)
>
> - Heikki

So, thinking some more about the design needed to support enabling checksums post-initdb, what about the following:

Introduce a new field in pg_control, data_checksum_state (0 - disabled, 1 - enabling in progress, 2 - enabled). This could be set via (say) a pg_upgrade flag when creating a new cluster with --enable-checksums, or via a standalone program that adjusts that field in pg_control. Checksum-enforcing behavior will depend on that setting: 0 enforces nothing on read or write, 1 calculates checksums on buffer write but ignores them on read, and 2 is the normal enforcing read/write mode. Disabling checksums could be done with this tool as well, and would trivially just cause the server to ignore the checksums (or alternately set them to 0 on page write, depending on whether we think it matters).

Add new catalog fields pg_database.dathaschecksum and pg_class.relhaschecksum; initially set to 0, or to 1 if checksums were enabled at initdb time.

Augment autovacuum to check whether we are currently enabling checksums, based on the value in pg_control; if so, loop over any database with !pg_database.dathaschecksum. Within each such database, look for relations with !pg_class.relhaschecksum; for each one found, read/dirty/write (however) each block to force the checksum to be written out for each page. As each relation is verified to be completely checksummed, update relhaschecksum = t. When no such relations remain, set pg_database.dathaschecksum = t. (There may need to be some additional considerations for storing the checksum state of global relations, or of anything else that uses the standard page format and lives outside a specific database; i.e., all shared catalogs, and quite possibly some things I haven't considered yet.)

If data_checksum_state is "enabling" and there are no databases left needing to be enabled, then we can set data_checksum_state to "enabled"; everything then works as expected for the normal enforcing state.
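
To make the intended state transitions concrete, here is a rough Python model of the proposal. It is only a sketch: the classes, the function names, and the npages/pages_rewritten fields are made up for illustration and are not PostgreSQL code, and the shared-catalog question raised above is ignored entirely.

```python
from dataclasses import dataclass
from enum import IntEnum


class ChecksumState(IntEnum):
    """Proposed values of pg_control.data_checksum_state."""
    DISABLED = 0   # checksums neither written nor verified
    ENABLING = 1   # checksums written, but not verified on read
    ENABLED = 2    # checksums written and verified


def writes_checksum(state):
    """True if a buffer write should calculate the page checksum."""
    return state in (ChecksumState.ENABLING, ChecksumState.ENABLED)


def verifies_checksum(state):
    """True if a page read should check (and complain about) the checksum."""
    return state == ChecksumState.ENABLED


# Toy stand-ins for pg_class / pg_database rows (hypothetical, illustration only).
@dataclass
class Relation:
    name: str
    relhaschecksum: bool = False
    npages: int = 0
    pages_rewritten: int = 0


@dataclass
class Database:
    name: str
    relations: list
    dathaschecksum: bool = False


@dataclass
class Cluster:
    data_checksum_state: ChecksumState
    databases: list


def checksum_sweep(cluster):
    """One full pass of the proposed autovacuum-driven sweep."""
    if cluster.data_checksum_state != ChecksumState.ENABLING:
        return  # nothing to do unless we are in the "enabling" state
    for db in cluster.databases:
        if db.dathaschecksum:
            continue
        for rel in db.relations:
            if rel.relhaschecksum:
                continue
            # Stand-in for read/dirty/write of every block in the relation.
            rel.pages_rewritten = rel.npages
            rel.relhaschecksum = True
        db.dathaschecksum = True
    # No database left to convert: switch to the normal enforcing state.
    if all(db.dathaschecksum for db in cluster.databases):
        cluster.data_checksum_state = ChecksumState.ENABLED
```

In the real thing the sweep would be incremental and would have to survive restarts, which is exactly what the per-relation and per-database flags are for; the model above just runs the whole pass in one go.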

External programs needing to be adjusted:

- pg_resetxlog -- add the persistence of the data_checksum_state
- pg_controldata -- add the display of the data_checksum_state
- pg_upgrade -- add an --enable-checksums flag to transition a new cluster with the data pages. May need some adjustments for the data_checksum_state field

Possible new tool:

- pg_enablechecksums -- basic tool to set the data_checksum_state flag of pg_control

Other thoughts:

Do we need a periodic CRC-scanning background worker just to check buffers periodically?
- If so, does this cause any interference with frozen relations?

What additional changes would be required, or what wrinkles would we have to work out?

David
--
David Christensen
PostgreSQL Team Manager
End Point Corporation
da...@endpoint.com
785-727-1171

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)