Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Greg Stark
On Mon, Nov 8, 2010 at 5:59 PM, Aidan Van Dyk ai...@highrise.ca wrote: The problem that putting checksums in a different place solves is the page layout (binary upgrade) problem.  You're still going to need to buffer the page as you calculate the checksum and write it out. buffering that page

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Aidan Van Dyk
On Tue, Nov 9, 2010 at 8:45 AM, Greg Stark gsst...@mit.edu wrote: But buffering the page only means you've got some consistent view of the page. It doesn't mean the checksum will actually match the data in the page that gets written out. So when you read it back in the checksum may be

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Greg Stark
On Tue, Nov 9, 2010 at 2:28 PM, Aidan Van Dyk ai...@highrise.ca wrote: On Tue, Nov 9, 2010 at 8:45 AM, Greg Stark gsst...@mit.edu wrote: But buffering the page only means you've got some consistent view of the page. It doesn't mean the checksum will actually match the data in the page that

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Greg Stark
On Tue, Nov 9, 2010 at 3:25 PM, Greg Stark gsst...@mit.edu wrote: Oh, I'm mistaken. The problem was that buffering the writes was insufficient to deal with torn pages. Even if you buffer the writes if the machine crashes while only having written half the buffer out then the checksum won't
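The torn-page point above can be illustrated with a small sketch. This is not PostgreSQL code: the 8192-byte page, the 4-byte checksum slot at the start of the page, the use of CRC-32, and the helper names `seal`, `verify`, and `torn_write` are all assumptions made for illustration. It shows that a checksum computed over a consistent buffered copy still fails to match if a crash leaves only some sectors of the new page on disk.

```python
import zlib

PAGE_SIZE = 8192      # assumed page size, for illustration only
SECTOR_SIZE = 512     # sector size mentioned elsewhere in the thread
CHECKSUM_BYTES = 4    # hypothetical checksum slot at the start of the page


def seal(body: bytes) -> bytes:
    """Build a full page: CRC-32 of the body, followed by the body."""
    assert len(body) == PAGE_SIZE - CHECKSUM_BYTES
    return zlib.crc32(body).to_bytes(CHECKSUM_BYTES, "big") + body


def verify(page: bytes) -> bool:
    """Recompute the CRC-32 of the body and compare it with the stored value."""
    stored = int.from_bytes(page[:CHECKSUM_BYTES], "big")
    return stored == zlib.crc32(page[CHECKSUM_BYTES:])


def torn_write(old_page: bytes, new_page: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first `sectors_written` sectors hit disk."""
    cut = sectors_written * SECTOR_SIZE
    return new_page[:cut] + old_page[cut:]


# Both versions of the page are internally consistent (buffered copies).
old_page = seal(b"A" * (PAGE_SIZE - CHECKSUM_BYTES))
new_page = seal(b"B" * (PAGE_SIZE - CHECKSUM_BYTES))

# The crash happens halfway through writing the new page.
torn_page = torn_write(old_page, new_page, sectors_written=8)

print(verify(old_page), verify(new_page), verify(torn_page))
```

In this sketch the buffered copy guarantees that `new_page` is self-consistent, but the on-disk result mixes new and old sectors, so the stored checksum describes data that was never fully written.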

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Jim Nasby
On Nov 9, 2010, at 9:27 AM, Greg Stark wrote: On Tue, Nov 9, 2010 at 3:25 PM, Greg Stark gsst...@mit.edu wrote: Oh, I'm mistaken. The problem was that buffering the writes was insufficient to deal with torn pages. Even if you buffer the writes if the machine crashes while only having written

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Gurjeet Singh
On Tue, Nov 9, 2010 at 12:32 AM, Tom Lane t...@sss.pgh.pa.us wrote: There are also crosschecks that you can apply: if it's a heap page, are there any index pages with pointers to it? If it's an index page, are there downlink or sibling links to it from elsewhere in the index? A page that

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Greg Stark
On Tue, Nov 9, 2010 at 4:26 PM, Jim Nasby j...@nasby.net wrote: On Tue, Nov 9, 2010 at 3:25 PM, Greg Stark gsst...@mit.edu wrote: Oh, I'm mistaken. The problem was that buffering the writes was insufficient to deal with torn pages. Even if you buffer the writes if the machine crashes while

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Aidan Van Dyk
On Tue, Nov 9, 2010 at 11:26 AM, Jim Nasby j...@nasby.net wrote: Huh, this implies that if we did go through all the work of segregating the hint bits and could arrange that they all appear on the same 512-byte sector and if we buffered them so that we were writing the same bits we

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Tom Lane
Gurjeet Singh singh.gurj...@gmail.com writes: On Tue, Nov 9, 2010 at 12:32 AM, Tom Lane t...@sss.pgh.pa.us wrote: IMO there are a lot of methods that can separate filesystem misfeasance from Postgres errors, probably with greater reliability than this hack. Doing this postmortem on a regular

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Greg Stark
On Tue, Nov 9, 2010 at 5:06 PM, Aidan Van Dyk ai...@highrise.ca wrote: So, for getting checksums, we have to offer up a few things: 1) zero-copy writes, we need to buffer the write to get a consistent checksum (or lock the buffer tight) 2) saving hint-bits on an otherwise unchanged page.  We

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Robert Haas
On Tue, Nov 9, 2010 at 12:31 PM, Greg Stark gsst...@mit.edu wrote: On Tue, Nov 9, 2010 at 5:06 PM, Aidan Van Dyk ai...@highrise.ca wrote: So, for getting checksums, we have to offer up a few things: 1) zero-copy writes, we need to buffer the write to get a consistent checksum (or lock the

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Kenneth Marshall
On Tue, Nov 09, 2010 at 02:05:57PM -0500, Robert Haas wrote: On Tue, Nov 9, 2010 at 12:31 PM, Greg Stark gsst...@mit.edu wrote: On Tue, Nov 9, 2010 at 5:06 PM, Aidan Van Dyk ai...@highrise.ca wrote: So, for getting checksums, we have to offer up a few things: 1) zero-copy writes, we need to

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Alvaro Herrera
Excerpts from Robert Haas's message of mar nov 09 16:05:57 -0300 2010: And it still allows silent data corruption, because bogusly clearing a hint bit is, at the moment, harmless, but bogusly setting one is not. I really have to wonder how other products handle this. PostgreSQL isn't the

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Josh Berkus
PostgreSQL isn't the only database product that uses MVCC - not by a long shot - and the problem of detecting whether an XID is visible to the current snapshot can't be ours alone. So what do other people do about this? They either don't cache the information about whether the XID is

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Greg Stark
On Tue, Nov 9, 2010 at 7:37 PM, Josh Berkus j...@agliodbs.com wrote: Well, most of the other MVCC-in-table DBMSes simply don't deal with large, on-disk databases.  In fact, I can't think of one which does, currently; while MVCC has been popular for the New Databases, they're all focused on

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Josh Berkus
The whole point of the hint bits is that it's in the same place as the data. Yes, but the hint bits are currently causing us trouble on several features or potential features: * page-level CRC checks * eliminating vacuum freeze for cold data * index-only access * replication * this patch *

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Greg Stark
On Tue, Nov 9, 2010 at 8:12 PM, Josh Berkus j...@agliodbs.com wrote: The whole point of the hint bits is that it's in the same place as the data. Yes, but the hint bits are currently causing us trouble on several features or potential features: Then we might have to get rid of hint bits. But

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Aidan Van Dyk
On Tue, Nov 9, 2010 at 3:25 PM, Greg Stark gsst...@mit.edu wrote: Then we might have to get rid of hint bits. But they're hint bits for a metadata file that already exists, creating another metadata file doesn't solve anything. Is there any way to instrument the writes of dirty buffers from

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Josh Berkus
Though incidentally all of the other items you mentioned are generic problems caused by MVCC, not hint bits. Yes, but the hint bits prevent us from implementing workarounds. -- -- Josh Berkus PostgreSQL Experts Inc.

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Robert Haas
On Tue, Nov 9, 2010 at 2:05 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 9, 2010 at 12:31 PM, Greg Stark gsst...@mit.edu wrote: On Tue, Nov 9, 2010 at 5:06 PM, Aidan Van Dyk ai...@highrise.ca wrote: So, for getting checksums, we have to offer up a few things: 1) zero-copy writes,

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Josh Berkus
On 11/9/10 1:50 PM, Robert Haas wrote: 5. It would be pretty much impossible to run with autovacuum turned off, and in fact you would likely need to make it a good deal more aggressive in the specific case of aborted transactions, to mitigate problems #1, #3, and #4. 6. This would require us

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote: 6. This would require us to be more aggressive about VACUUMing old-cold relations/page, e.g. VACUUM FREEZE. This would make one of our worst issues for data warehousing even worse. I continue to feel that it is insane that when a table is populated

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Robert Haas
On Tue, Nov 9, 2010 at 5:03 PM, Josh Berkus j...@agliodbs.com wrote: On 11/9/10 1:50 PM, Robert Haas wrote: 5. It would be pretty much impossible to run with autovacuum turned off, and in fact you would likely need to make it a good deal more aggressive in the specific case of aborted

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Robert Haas
On Tue, Nov 9, 2010 at 5:15 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Josh Berkus j...@agliodbs.com wrote: 6. This would require us to be more aggressive about VACUUMing old-cold relations/page, e.g. VACUUM FREEZE.  This would make one of our worst issues for data warehousing

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Robert Haas
On Tue, Nov 9, 2010 at 3:05 PM, Greg Stark gsst...@mit.edu wrote: On Tue, Nov 9, 2010 at 7:37 PM, Josh Berkus j...@agliodbs.com wrote: Well, most of the other MVCC-in-table DBMSes simply don't deal with large, on-disk databases.  In fact, I can't think of one which does, currently; while MVCC

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Josh Berkus
Robert, Uh, no it doesn't. It only requires you to be more aggressive about vacuuming the transactions that are in the aborted-XIDs array. It doesn't affect transaction wraparound vacuuming at all, either positively or negatively. You still have to freeze xmins before they flip from being

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes: Though incidentally all of the other items you mentioned are generic problems caused by MVCC, not hint bits. Yes, but the hint bits prevent us from implementing workarounds. If we got rid of hint bits, we'd need workarounds for the ensuing massive

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: dons asbestos underpants 4. There would presumably be some finite limit on the size of the shared memory structure for aborted transactions. I don't think there'd be any reason to make it particularly small, but if you sat there and aborted

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Robert Haas
On Tue, Nov 9, 2010 at 5:45 PM, Josh Berkus j...@agliodbs.com wrote: Robert, Uh, no it doesn't.  It only requires you to be more aggressive about vacuuming the transactions that are in the aborted-XIDs array.  It doesn't affect transaction wraparound vacuuming at all, either positively or

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Gurjeet Singh
On Wed, Nov 10, 2010 at 1:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: Once you know that there is, or isn't, a filesystem-level error involved, what are you going to do next? You're going to go try to debug the component you know is at fault, that's what. And that problem is still AI-complete.

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Robert Haas
On Tue, Nov 9, 2010 at 6:42 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: dons asbestos underpants 4. There would presumably be some finite limit on the size of the shared memory structure for aborted transactions.  I don't think there'd be any reason to

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-09 Thread Robert Haas
On Tue, Nov 9, 2010 at 7:04 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 9, 2010 at 5:45 PM, Josh Berkus j...@agliodbs.com wrote: Robert, Uh, no it doesn't.  It only requires you to be more aggressive about vacuuming the transactions that are in the aborted-XIDs array.  It

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-08 Thread Aidan Van Dyk
On Sun, Nov 7, 2010 at 1:04 AM, Greg Stark gsst...@mit.edu wrote: It does seem like this is kind of part and parcel of adding checksums to blocks. It's arguably kind of silly to add checksums to blocks but have a commonly produced bit pattern in corruption cases go undetected. Getting back to

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-08 Thread Tom Lane
Aidan Van Dyk ai...@highrise.ca writes: Getting back to the checksum debate (and this seems like a semi-version of the checksum debate), now that we have forks, could we easily add block checksumming to a fork? It would mean writing to 2 files but that shouldn't be a problem, because until

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-08 Thread Tom Lane
Gurjeet Singh singh.gurj...@gmail.com writes: On Sat, Nov 6, 2010 at 11:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: Um ... and exactly how does that differ from the existing behavior? Right now a zero-filled page is considered valid, and is treated as a new page; PageHeaderIsValid()-/* Check

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-08 Thread Tom Lane
I wrote: Aidan Van Dyk ai...@highrise.ca writes: Getting back to the checksum debate (and this seems like a semi-version of the checksum debate), now that we have forks, could we easily add block checksumming to a fork? More generally, this re-opens the question of whether data in secondary

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-08 Thread Greg Stark
On Mon, Nov 8, 2010 at 5:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: So maybe Aidan's got a good idea here.  It would sure be a lot easier to shoehorn checksum checking in as an optional feature if the checksums were kept someplace else. Would it? I thought the only problem was the hint bits

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-08 Thread Aidan Van Dyk
On Mon, Nov 8, 2010 at 12:53 PM, Greg Stark gsst...@mit.edu wrote: On Mon, Nov 8, 2010 at 5:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: So maybe Aidan's got a good idea here.  It would sure be a lot easier to shoehorn checksum checking in as an optional feature if the checksums were kept

[HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-06 Thread Gurjeet Singh
A customer of ours is quite bothered about finding zero pages in an index after a system crash. The task now is to improve the diagnosability of such an issue and be able to definitively point to the source of zero pages. The proposed solution below has been vetted in-house at EnterpriseDB and am

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-06 Thread Tom Lane
Gurjeet Singh singh.gurj...@gmail.com writes: .) The basic idea is to have a magic number in every PageHeader before it is written to disk, and check for this magic number when performing page validity checks. Um ... and exactly how does that differ from the existing behavior? .) To avoid
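A rough sketch of the magic-number idea follows, under stated assumptions. The excerpt does not give the actual header layout or magic value, so `MAGIC`, `MAGIC_OFFSET`, `stamp`, and `classify` are hypothetical names, and Python stands in for the C that a real patch would use. The premise is that if every page is stamped before being written, a page read back as all zeros cannot be one that was written through the stamping path.

```python
PAGE_SIZE = 8192            # assumed page size, for illustration only
MAGIC = b"\xa5\x5a"         # hypothetical magic value, not from the proposal
MAGIC_OFFSET = 0            # hypothetical position within the page header


def stamp(page: bytes) -> bytes:
    """Return a copy of the page with the magic number set, as done before a write."""
    assert len(page) == PAGE_SIZE
    end = MAGIC_OFFSET + len(MAGIC)
    return page[:MAGIC_OFFSET] + MAGIC + page[end:]


def classify(page: bytes) -> str:
    """Classify a page read from disk under the assumed stamping rule."""
    if page == bytes(PAGE_SIZE):
        # All zeros: not produced by the stamping write path in this sketch.
        return "zero page"
    end = MAGIC_OFFSET + len(MAGIC)
    if page[MAGIC_OFFSET:end] == MAGIC:
        return "stamped"
    return "missing magic"


written = stamp(b"\x00" * 16 + b"tuple data".ljust(PAGE_SIZE - 16, b"\x00"))
zeroed = bytes(PAGE_SIZE)
garbage = b"\xff" * PAGE_SIZE

print(classify(written), classify(zeroed), classify(garbage), sep=" | ")
```

The contrast with the existing behavior discussed in the thread is that an all-zero page is reported as its own category here instead of being silently accepted as a new page.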

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-06 Thread Gurjeet Singh
On Sat, Nov 6, 2010 at 11:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: Gurjeet Singh singh.gurj...@gmail.com writes: .) The basic idea is to have a magic number in every PageHeader before it is written to disk, and check for this magic number when performing page validity checks. Um ...

Re: [HACKERS] Protecting against unexpected zero-pages: proposal

2010-11-06 Thread Greg Stark
On Sun, Nov 7, 2010 at 4:23 AM, Gurjeet Singh singh.gurj...@gmail.com wrote: I understand that it is a pretty low-level change, but IMHO the change is minimal and is being applied in well understood places. All the assumptions listed have been effective for quite a while, and I don't see these