On Tue, Dec 1, 2009 at 6:41 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Bruce Momjian <br...@momjian.us> writes:
>> OK, here is another idea, maybe crazy:
>
>> When we read in a page that has an invalid CRC, we check the page to see
>> which hint bits are _not_ set, and we try setting them to see if can get
>> a matching CRC.

Unfortunately you would also have to try *unsetting* every hint bit as well, since the updated hint bits might have made it to disk but not the CRC, leaving the old CRC for the block with the unset bits.

I actually independently had the same thought today that Simon had of moving the hint bits to the line pointer. We can obtain more free bits in the line pointers by dividing the item offsets and sizes by maxalign if we need them. That should give at least 4 spare bits, which is all we need for the four VALID/INVALID hint bits.

It should be relatively cheap to skip the hint bits in the line pointers since they'll be the same bits of every 16-bit value for a whole range. Alternatively we could just CRC the tuples and assume a corrupted line pointer will show itself quickly. That would actually make it faster than a straight CRC of the whole block -- making lemonade out of lemons, as it were.

There's still the all-tuples-in-page-are-visible hint bit and the hint bits in btree pages. I'm not sure if those are easier or harder to solve. We might be able to assume the all-visible flag will not be torn from the CRC as long as they're within the same 512-byte sector. And iirc the btree hint bits are in the line pointers themselves as well?

Another thought is that we could use the MSSQL-style torn page detection of including a counter (or even a bit?) in every 512-byte chunk which gets incremented every time the page is written. If they don't all match when read in then the page was torn and we can't check the CRC. That gets us the advantage that we can inform the user that a torn page was detected, so they know that they must absolutely use full_page_writes on their system. Currently users are in the dark about whether their system is susceptible to torn pages, and have no idea how frequently they occur. Even here there are quite divergent opinions about their frequency and about which systems are susceptible to them or immune.
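
As a rough illustration of the per-sector counter idea, here is a minimal Python sketch. Everything about the layout is an assumption made for the example: an 8192-byte page, 512-byte sectors, and a one-byte write counter stored in the last byte of each sector. The function names are hypothetical, and this is neither PostgreSQL code nor a description of how MSSQL lays out its pages.

```python
SECTOR_SIZE = 512                      # assumed atomic write unit of the disk
PAGE_SIZE = 8192                       # assumed page size
SECTORS = PAGE_SIZE // SECTOR_SIZE


def stamp_page(page: bytearray) -> None:
    """Bump the write counter and store it in the last byte of every sector.

    Hypothetical layout: a real page format would have to reserve or
    relocate those bytes; this sketch simply overwrites them.
    """
    counter = (page[SECTOR_SIZE - 1] + 1) % 256
    for s in range(SECTORS):
        page[(s + 1) * SECTOR_SIZE - 1] = counter


def is_torn(page: bytes) -> bool:
    """A page is torn if its sectors do not all carry the same counter."""
    stamps = {page[(s + 1) * SECTOR_SIZE - 1] for s in range(SECTORS)}
    return len(stamps) > 1


def simulate_torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Model a crash after only the first `sectors_written` sectors hit disk."""
    cut = sectors_written * SECTOR_SIZE
    return bytes(new[:cut]) + bytes(old[cut:])
```

On read, a page for which `is_torn()` returns true would be reported as torn instead of being checked against its CRC. A one-byte counter wraps after 256 writes, which is harmless here because only sectors from two consecutive writes of the same page need to be told apart.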

--
greg