Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-13 Thread k...@rice.edu
On Sun, May 12, 2013 at 03:46:00PM -0500, Jim Nasby wrote: On 5/10/13 1:06 PM, Jeff Janes wrote: Of course the paranoid DBA could turn off restart_after_crash and do a manual investigation on every crash, but in that case the database would refuse to restart even in the case where it

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-13 Thread k...@rice.edu
On Sun, May 12, 2013 at 07:41:26PM -0500, Jon Nelson wrote: On Sun, May 12, 2013 at 3:46 PM, Jim Nasby j...@nasby.net wrote: On 5/10/13 1:06 PM, Jeff Janes wrote: Of course the paranoid DBA could turn off restart_after_crash and do a manual investigation on every crash, but in that case

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-13 Thread Jon Nelson
On Mon, May 13, 2013 at 7:49 AM, k...@rice.edu k...@rice.edu wrote: On Sun, May 12, 2013 at 07:41:26PM -0500, Jon Nelson wrote: On Sun, May 12, 2013 at 3:46 PM, Jim Nasby j...@nasby.net wrote: On 5/10/13 1:06 PM, Jeff Janes wrote: Of course the paranoid DBA could turn off

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-13 Thread Andres Freund
On 2013-05-12 19:41:26 -0500, Jon Nelson wrote: On Sun, May 12, 2013 at 3:46 PM, Jim Nasby j...@nasby.net wrote: On 5/10/13 1:06 PM, Jeff Janes wrote: Of course the paranoid DBA could turn off restart_after_crash and do a manual investigation on every crash, but in that case the database

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-13 Thread Jon Nelson
On Mon, May 13, 2013 at 8:32 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-05-12 19:41:26 -0500, Jon Nelson wrote: On Sun, May 12, 2013 at 3:46 PM, Jim Nasby j...@nasby.net wrote: On 5/10/13 1:06 PM, Jeff Janes wrote: Of course the paranoid DBA could turn off restart_after_crash

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-13 Thread Andres Freund
On 2013-05-13 08:45:41 -0500, Jon Nelson wrote: On Mon, May 13, 2013 at 8:32 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-05-12 19:41:26 -0500, Jon Nelson wrote: On Sun, May 12, 2013 at 3:46 PM, Jim Nasby j...@nasby.net wrote: On 5/10/13 1:06 PM, Jeff Janes wrote: Of

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-13 Thread Greg Stark
On Mon, May 13, 2013 at 2:49 PM, Andres Freund and...@2ndquadrant.com wrote: Sure, the initial file creation will be faster. But are the actual individual wal writes (small, frequently fdatasync()ed) still faster? That's the critical path currently. Whether it is pretty much depends on how the

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-13 Thread Andres Freund
On 2013-05-13 16:03:11 +0100, Greg Stark wrote: On Mon, May 13, 2013 at 2:49 PM, Andres Freund and...@2ndquadrant.com wrote: Sure, the initial file creation will be faster. But are the actual individual wal writes (small, frequently fdatasync()ed) still faster? That's the critical path

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-13 Thread Simon Riggs
On 13 May 2013 14:45, Jon Nelson jnelson+pg...@jamponi.net wrote: I should not derail this thread any further. Perhaps, if interested parties would like to discuss the use of fallocate/posix_fallocate, a new thread might be more appropriate? Sounds like a good idea. Always nice to see a fresh

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-12 Thread Jim Nasby
On 5/9/13 5:18 PM, Jeff Davis wrote: On Thu, 2013-05-09 at 14:28 -0500, Jim Nasby wrote: What about moving some critical data from the beginning of the WAL record to the end? That would make it easier to detect that we don't have a complete record. It wouldn't necessarily replace the CRC

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-12 Thread Jim Nasby
On 5/10/13 1:06 PM, Jeff Janes wrote: Of course the paranoid DBA could turn off restart_after_crash and do a manual investigation on every crash, but in that case the database would refuse to restart even in the case where it is perfectly clear that all the following WAL belongs to the recycled

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-12 Thread Jon Nelson
On Sun, May 12, 2013 at 3:46 PM, Jim Nasby j...@nasby.net wrote: On 5/10/13 1:06 PM, Jeff Janes wrote: Of course the paranoid DBA could turn off restart_after_crash and do a manual investigation on every crash, but in that case the database would refuse to restart even in the case where it

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-11 Thread Simon Riggs
On 10 May 2013 23:41, Jeff Davis pg...@j-davis.com wrote: On Fri, 2013-05-10 at 18:32 +0100, Simon Riggs wrote: We don't write() WAL except with an immediate sync(), so the chances of what you say happening are very low to impossible. Are you sure? An XLogwrtRqst contains a write and a flush

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Simon Riggs
On 9 May 2013 23:13, Greg Stark st...@mit.edu wrote: On Thu, May 9, 2013 at 10:45 PM, Simon Riggs si...@2ndquadrant.com wrote: On 9 May 2013 22:39, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: If the current WAL record is corrupt and the next WAL record is in

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Greg Stark
On Fri, May 10, 2013 at 7:44 AM, Simon Riggs si...@2ndquadrant.com wrote: Having one corrupt record followed by a valid record is not an abnormal situation. It could easily be the correct end of WAL. I disagree, that *is* an abnormal situation and would not be the correct end-of-WAL. Each

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Amit Kapila
On Friday, May 10, 2013 6:09 PM Greg Stark wrote: On Fri, May 10, 2013 at 7:44 AM, Simon Riggs si...@2ndquadrant.com wrote: Having one corrupt record followed by a valid record is not an abnormal situation. It could easily be the correct end of WAL. I disagree, that *is* an abnormal

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Greg Stark
On Fri, May 10, 2013 at 5:31 PM, Amit Kapila amit.kap...@huawei.com wrote: In the case where one block is missing, how can it even reach to next record to check prev pointer. I think it can be possible when one of the record is corrupt and following are okay which I think is the case in which

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Tom Lane
Greg Stark st...@mit.edu writes: A single WAL record can be over 24kB. pedantic Actually, WAL records can run to megabytes. Consider for example a commit record for a transaction that dropped thousands of tables --- there'll be info about each such table in the commit record, to cue replay to

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Simon Riggs
On 10 May 2013 13:39, Greg Stark st...@mit.edu wrote: On Fri, May 10, 2013 at 7:44 AM, Simon Riggs si...@2ndquadrant.com wrote: Having one corrupt record followed by a valid record is not an abnormal situation. It could easily be the correct end of WAL. I disagree, that *is* an abnormal

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Simon Riggs
On 10 May 2013 18:23, Tom Lane t...@sss.pgh.pa.us wrote: Greg Stark st...@mit.edu writes: A single WAL record can be over 24kB. pedantic Actually, WAL records can run to megabytes. Consider for example a commit record for a transaction that dropped thousands of tables --- there'll be info

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Jeff Janes
On Fri, May 10, 2013 at 9:54 AM, Greg Stark st...@mit.edu wrote: On Fri, May 10, 2013 at 5:31 PM, Amit Kapila amit.kap...@huawei.com wrote: In the case where one block is missing, how can it even reach to next record to check prev pointer. I think it can be possible when one of the

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Robert Haas
On Fri, May 10, 2013 at 2:06 PM, Jeff Janes jeff.ja...@gmail.com wrote: But based on your description, perhaps refusing to automatically restart and forcing an explicit decision would happen a lot more often, during normal crashes with no corruption, than I was thinking it would. I bet it

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Jeff Davis
On Fri, 2013-05-10 at 18:32 +0100, Simon Riggs wrote: We don't write() WAL except with an immediate sync(), so the chances of what you say happening are very low to impossible. Are you sure? An XLogwrtRqst contains a write and a flush pointer, so I assume they can be different. I agree that it

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Greg Smith
On 5/10/13 1:32 PM, Simon Riggs wrote: The timing window between the write and the sync is negligible and yet I/O would need to occur in that window and also be out of order from the order of the write, which is unlikely because an I/O elevator would either not touch the order of writes at all,

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-10 Thread Amit kapila
On Friday, May 10, 2013 10:24 PM Greg Stark wrote: On Fri, May 10, 2013 at 5:31 PM, Amit Kapila amit.kap...@huawei.com wrote: In the case where one block is missing, how can it even reach to next record to check prev pointer. I think it can be possible when one of the record is corrupt and

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-09 Thread Jim Nasby
On 5/8/13 7:34 PM, Jeff Davis wrote: On Wed, 2013-05-08 at 17:56 -0500, Jim Nasby wrote: Apologies if this is a stupid question, but is this mostly an issue due to torn pages? IOW, if we had a way to ensure we never see torn pages, would that mean an invalid CRC on a WAL page indicated there

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-09 Thread Simon Riggs
On 9 May 2013 20:28, Jim Nasby j...@nasby.net wrote: Unfortunately, it seems that doing any kind of validation to determine that we have a valid end-of-the-WAL inherently requires some kind of separate durable write somewhere. It would be a tiny amount of data (an LSN and maybe some extra

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-09 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: If the current WAL record is corrupt and the next WAL record is in every way valid, we can potentially continue. That seems like a seriously bad idea. regards, tom lane

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-09 Thread Simon Riggs
On 9 May 2013 22:39, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: If the current WAL record is corrupt and the next WAL record is in every way valid, we can potentially continue. That seems like a seriously bad idea. I agree. But if you knew that were true, is

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-09 Thread Greg Stark
On Thu, May 9, 2013 at 10:45 PM, Simon Riggs si...@2ndquadrant.com wrote: On 9 May 2013 22:39, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: If the current WAL record is corrupt and the next WAL record is in every way valid, we can potentially continue. That

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-09 Thread Jeff Davis
On Thu, 2013-05-09 at 14:28 -0500, Jim Nasby wrote: What about moving some critical data from the beginning of the WAL record to the end? That would make it easier to detect that we don't have a complete record. It wouldn't necessarily replace the CRC though, so maybe that's not good enough.

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-09 Thread Jeff Davis
On Thu, 2013-05-09 at 23:13 +0100, Greg Stark wrote: However it is possible to reduce the window... Sounds reasonable. It's fairly limited though -- the window is already a checkpoint (typically 5-30 minutes), and we'd bring that down an order of magnitude (10s). I speculate that, if it got

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-08 Thread Jim Nasby
On 4/5/13 6:39 PM, Jeff Davis wrote: On Fri, 2013-04-05 at 10:34 +0200, Florian Pflug wrote: Maybe we could scan forward to check whether a corrupted WAL record is followed by one or more valid ones with sensible LSNs. If it is, chances are high that we haven't actually hit the end of the WAL.

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-08 Thread Jeff Davis
On Wed, 2013-05-08 at 17:56 -0500, Jim Nasby wrote: Apologies if this is a stupid question, but is this mostly an issue due to torn pages? IOW, if we had a way to ensure we never see torn pages, would that mean an invalid CRC on a WAL page indicated there really was corruption on that page?

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-07 Thread Robert Haas
On Mon, May 6, 2013 at 5:04 PM, Jeff Davis pg...@j-davis.com wrote: On Mon, 2013-05-06 at 15:31 -0400, Robert Haas wrote: On Wed, May 1, 2013 at 3:04 PM, Jeff Davis pg...@j-davis.com wrote: Regardless, you have a reasonable claim that my patch had effects that were not necessary. I have

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-07 Thread Jeff Davis
On Tue, 2013-05-07 at 13:20 -0400, Robert Haas wrote: Hmm. Rereading your last email, I see your point: since we now have HEAP_XLOG_VISIBLE, this is much less of an issue than it would have been before. I'm still not convinced that simplifying that code is a good idea, but maybe it doesn't

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-06 Thread Robert Haas
On Wed, May 1, 2013 at 3:04 PM, Jeff Davis pg...@j-davis.com wrote: Regardless, you have a reasonable claim that my patch had effects that were not necessary. I have attached a draft patch to remedy that. Only rudimentary testing was done. This looks reasonable to me. -- Robert Haas

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-06 Thread Jeff Davis
On Mon, 2013-05-06 at 15:31 -0400, Robert Haas wrote: On Wed, May 1, 2013 at 3:04 PM, Jeff Davis pg...@j-davis.com wrote: Regardless, you have a reasonable claim that my patch had effects that were not necessary. I have attached a draft patch to remedy that. Only rudimentary testing was

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-04 Thread Simon Riggs
On 3 May 2013 21:53, Jeff Davis pg...@j-davis.com wrote: At this point, I don't think more changes are required. After detailed further analysis, I agree, no further changes are required. I think the code in that area needs considerable refactoring to improve things. I've looked for an easy

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-03 Thread Simon Riggs
On 1 May 2013 20:40, Jeff Davis pg...@j-davis.com wrote: Looks easy. There is no additional logic for checksums, so there's no third complexity. So we either have * cleanup info with vismap setting info * cleanup info only which is the same number of WAL records as we have now, just that

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-03 Thread Jeff Davis
On Fri, 2013-05-03 at 19:52 +0100, Simon Riggs wrote: On 1 May 2013 20:40, Jeff Davis pg...@j-davis.com wrote: Looks easy. There is no additional logic for checksums, so there's no third complexity. So we either have * cleanup info with vismap setting info * cleanup info only

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-01 Thread Simon Riggs
On 30 April 2013 22:54, Jeff Davis pg...@j-davis.com wrote: On Tue, 2013-04-30 at 08:34 -0400, Robert Haas wrote: Uh, wait a minute. I think this is completely wrong. The buffer is LOCKED for this entire sequence of operations. For a checkpoint to happen, it's got to write every buffer,

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-01 Thread Robert Haas
On Tue, Apr 30, 2013 at 5:54 PM, Jeff Davis pg...@j-davis.com wrote: On Tue, 2013-04-30 at 08:34 -0400, Robert Haas wrote: Uh, wait a minute. I think this is completely wrong. The buffer is LOCKED for this entire sequence of operations. For a checkpoint to happen, it's got to write every

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-01 Thread Robert Haas
On Wed, May 1, 2013 at 11:29 AM, Robert Haas robertmh...@gmail.com wrote: I was worried because SyncOneBuffer checks whether it needs writing without taking a content lock, so the exclusive lock doesn't help. That makes sense, because you don't want a checkpoint to have to get a content lock

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-01 Thread Simon Riggs
On 1 May 2013 16:33, Robert Haas robertmh...@gmail.com wrote: On Wed, May 1, 2013 at 11:29 AM, Robert Haas robertmh...@gmail.com wrote: I was worried because SyncOneBuffer checks whether it needs writing without taking a content lock, so the exclusive lock doesn't help. That makes sense,

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-01 Thread Robert Haas
On Wed, May 1, 2013 at 1:02 PM, Simon Riggs si...@2ndquadrant.com wrote: I agree, but that was in the original coding wasn't it? I believe the problem was introduced by this commit: commit fdf9e21196a6f58c6021c967dc5776a16190f295 Author: Heikki Linnakangas heikki.linnakan...@iki.fi Date: Wed

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-01 Thread Jeff Davis
On Wed, 2013-05-01 at 11:33 -0400, Robert Haas wrote: The only time the VM and the data page are out of sync during vacuum is after a crash, right? If that's the case, I didn't think it was a big deal to dirty one extra page (should be extremely rare). Am I missing something? The

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-01 Thread Simon Riggs
On 1 May 2013 19:16, Robert Haas robertmh...@gmail.com wrote: On Wed, May 1, 2013 at 1:02 PM, Simon Riggs si...@2ndquadrant.com wrote: I agree, but that was in the original coding wasn't it? I believe the problem was introduced by this commit: commit fdf9e21196a6f58c6021c967dc5776a16190f295

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-01 Thread Jeff Davis
On Wed, 2013-05-01 at 14:16 -0400, Robert Haas wrote: Now that I'm looking at this, I'm a bit confused by the new logic in visibilitymap_set(). When checksums are enabled, we set the page LSN, which is described like this: we need to protect the heap page from being torn. But how does

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-05-01 Thread Jeff Davis
On Wed, 2013-05-01 at 20:06 +0100, Simon Riggs wrote: Why aren't we writing just one WAL record for this action? ... I thought about that, too. It certainly seems like more than we want to try to do for 9.3 at this point. The other complication is that there's a lot of conditional

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-30 Thread Simon Riggs
On 9 April 2013 08:36, Jeff Davis pg...@j-davis.com wrote: 1. I believe that the issue I brought up at the end of this email: http://www.postgresql.org/message-id/1365035537.7580.380.camel@sussancws0025 is a real issue. In lazy_vacuum_page(), the following sequence can happen when checksums

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-30 Thread Robert Haas
On Tue, Apr 30, 2013 at 6:58 AM, Simon Riggs si...@2ndquadrant.com wrote: On 9 April 2013 08:36, Jeff Davis pg...@j-davis.com wrote: 1. I believe that the issue I brought up at the end of this email: http://www.postgresql.org/message-id/1365035537.7580.380.camel@sussancws0025 is a real

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-30 Thread Simon Riggs
On 30 April 2013 13:34, Robert Haas robertmh...@gmail.com wrote: On Tue, Apr 30, 2013 at 6:58 AM, Simon Riggs si...@2ndquadrant.com wrote: On 9 April 2013 08:36, Jeff Davis pg...@j-davis.com wrote: 1. I believe that the issue I brought up at the end of this email:

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-30 Thread Jeff Davis
On Tue, 2013-04-30 at 08:34 -0400, Robert Haas wrote: Uh, wait a minute. I think this is completely wrong. The buffer is LOCKED for this entire sequence of operations. For a checkpoint to happen, it's got to write every buffer, which it will not be able to do for so long as the buffer is

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-11 Thread Simon Riggs
On 11 April 2013 00:37, Robert Haas robertmh...@gmail.com wrote: On Sat, Apr 6, 2013 at 10:44 AM, Andres Freund and...@2ndquadrant.com wrote: I feel pretty strongly that we shouldn't add any such complications to XLogInsert() itself; it's complicated enough already and it should be made

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-10 Thread Robert Haas
On Sat, Apr 6, 2013 at 10:44 AM, Andres Freund and...@2ndquadrant.com wrote: I feel pretty strongly that we shouldn't add any such complications to XLogInsert() itself; it's complicated enough already and it should be made simpler, not more complicated. +1, emphatically. XLogInsert is a really

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-09 Thread Jeff Davis
On Sat, 2013-04-06 at 16:44 +0200, Andres Freund wrote: I think we can just make up the rule that changing full page writes also requires SpinLockAcquire(xlogctl->info_lck);. Then it's easy enough. And it can hardly be a performance bottleneck given how infrequently it's modified. That seems

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-09 Thread Jeff Davis
On Mon, 2013-04-08 at 09:19 +0100, Simon Riggs wrote: Applied, with this as the only code change. Thanks everybody for good research and coding and fast testing. We're in good shape now. Thank you. I have attached two more patches: 1. I believe that the issue I brought up at the end

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-08 Thread Simon Riggs
On 6 April 2013 15:44, Andres Freund and...@2ndquadrant.com wrote: * In xlog_redo, it seemed slightly awkward to call XLogRecGetData twice. Merely a matter of preference but I thought I would mention it. You're absolutely right, memcpy should have gotten passed 'data', not

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-07 Thread Jaime Casanova
On Sat, Apr 6, 2013 at 1:36 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Fri, Apr 5, 2013 at 6:09 AM, Andres Freund and...@2ndquadrant.com wrote: How does the attached version look? I verified that it survives recovery, but not more. Jeff, any chance you can run this for a round with your

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-06 Thread Andres Freund
On 2013-04-05 16:29:47 -0700, Jeff Davis wrote: On Fri, 2013-04-05 at 15:09 +0200, Andres Freund wrote: How does the attached version look? I verified that it survives recovery, but not more. Comments: * Regarding full page writes, we can: - always write full pages (as in your

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-06 Thread Jeff Janes
On Fri, Apr 5, 2013 at 6:09 AM, Andres Freund and...@2ndquadrant.com wrote: How does the attached version look? I verified that it survives recovery, but not more. Jeff, any chance you can run this for a round with your suite? I've run it for a while now and have found no problems.

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-05 Thread Florian Pflug
On Apr 4, 2013, at 23:21, Jeff Janes jeff.ja...@gmail.com wrote: This brings up a pretty frightening possibility to me, unrelated to data checksums. If a bit gets twiddled in the WAL file due to a hardware issue or a cosmic ray, and then a crash happens, automatic recovery will stop early

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-05 Thread Andres Freund
On 2013-04-04 17:39:16 -0700, Jeff Davis wrote: On Thu, 2013-04-04 at 22:39 +0200, Andres Freund wrote: I don't think it's really slower. Earlier the code took WalInsertLock every time, even if we ended up not logging anything. That's far more expensive than a single spinlock. And the copy

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-05 Thread Jeff Davis
On Fri, 2013-04-05 at 15:09 +0200, Andres Freund wrote: How does the attached version look? I verified that it survives recovery, but not more. Comments: * Regarding full page writes, we can: - always write full pages (as in your current patch), regardless of the current settings - take

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-05 Thread Jeff Davis
On Fri, 2013-04-05 at 10:34 +0200, Florian Pflug wrote: Maybe we could scan forward to check whether a corrupted WAL record is followed by one or more valid ones with sensible LSNs. If it is, chances are high that we haven't actually hit the end of the WAL. In that case, we could either log a

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-05 Thread Jaime Casanova
On Fri, Apr 5, 2013 at 8:09 AM, Andres Freund and...@2ndquadrant.com wrote: How does the attached version look? I verified that it survives recovery, but not more. I still got errors when executing make installcheck in a just compiled 9.3devel + this_patch, this is when setting wal_level

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-05 Thread Jeff Davis
On Fri, 2013-04-05 at 19:22 -0500, Jaime Casanova wrote: On Fri, Apr 5, 2013 at 8:09 AM, Andres Freund and...@2ndquadrant.com wrote: How does the attached version look? I verified that it survives recovery, but not more. I still got errors when executing make installcheck in a just

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-05 Thread Jaime Casanova
On Fri, Apr 5, 2013 at 7:39 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2013-04-05 at 19:22 -0500, Jaime Casanova wrote: On Fri, Apr 5, 2013 at 8:09 AM, Andres Freund and...@2ndquadrant.com wrote: How does the attached version look? I verified that it survives recovery, but not more.

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-04 Thread Simon Riggs
On 4 April 2013 02:39, Andres Freund and...@2ndquadrant.com wrote: Ok, I think I see the bug. And I think it's been introduced in the checkpoints patch. Well spotted. (I think you mean checksums patch). If by now the first backend has proceeded to PageSetLSN() we are writing different data

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-04 Thread Andres Freund
On 2013-04-04 13:30:40 +0100, Simon Riggs wrote: On 4 April 2013 02:39, Andres Freund and...@2ndquadrant.com wrote: Ok, I think I see the bug. And I think it's been introduced in the checkpoints patch. Well spotted. (I think you mean checksums patch). Heh, yes. I was slightly tired at that

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-04 Thread Simon Riggs
On 4 April 2013 15:53, Andres Freund and...@2ndquadrant.com wrote: Unfortunately I find that approach unacceptably ugly. Yeah. If we can confirm it's a fix we can discuss a cleaner patch and that is much better. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-04 Thread Jeff Davis
Andres, Thank you for diagnosing this problem! On Thu, 2013-04-04 at 16:53 +0200, Andres Freund wrote: I think the route you quickly sketched is more realistic. That would remove all knowledge about XLOG_HINT from generic code, which is a very good thing, I spent like 15 minutes yesterday

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-04 Thread Andres Freund
On 2013-04-04 12:59:36 -0700, Jeff Davis wrote: Andres, Thank you for diagnosing this problem! On Thu, 2013-04-04 at 16:53 +0200, Andres Freund wrote: I think the route you quickly sketched is more realistic. That would remove all knowledge about XLOG_HINT from generic code, which is a

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-04 Thread Jeff Janes
On Thu, Apr 4, 2013 at 5:30 AM, Simon Riggs si...@2ndquadrant.com wrote: On 4 April 2013 02:39, Andres Freund and...@2ndquadrant.com wrote: If by now the first backend has proceeded to PageSetLSN() we are writing different data to disk than the one we computed the checksum of before.

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-04 Thread Jeff Davis
On Thu, 2013-04-04 at 14:21 -0700, Jeff Janes wrote: This brings up a pretty frightening possibility to me, unrelated to data checksums. If a bit gets twiddled in the WAL file due to a hardware issue or a cosmic ray, and then a crash happens, automatic recovery will stop early with the

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-04 Thread Jeff Davis
On Thu, 2013-04-04 at 22:39 +0200, Andres Freund wrote: I don't think it's really slower. Earlier the code took WalInsertLock every time, even if we ended up not logging anything. That's far more expensive than a single spinlock. And the copy should also only be taken in the case we need to log.

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-04 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes: On Thu, 2013-04-04 at 14:21 -0700, Jeff Janes wrote: This brings up a pretty frightening possibility to me, unrelated to data checksums. If a bit gets twiddled in the WAL file due to a hardware issue or a cosmic ray, and then a crash happens, automatic

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-04 Thread Jeff Davis
On Thu, 2013-04-04 at 21:06 -0400, Tom Lane wrote: I can't escape the feeling that we'd just be reinventing software RAID. There's no reason to think that we can deal with this class of problems better than the storage system can. The goal would be to reliably detect a situation where WAL that

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Andres Freund
On 2013-04-03 15:57:49 -0700, Jeff Janes wrote: I've changed the subject from regression test failed when enabling checksum because I now know they are totally unrelated. My test case didn't need to depend on archiving being on, and so with a simple tweak I rendered the two issues

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Andres Freund
On 2013-04-04 01:52:41 +0200, Andres Freund wrote: On 2013-04-03 15:57:49 -0700, Jeff Janes wrote: I've changed the subject from regression test failed when enabling checksum because I now know they are totally unrelated. My test case didn't need to depend on archiving being on, and so

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Jeff Davis
On Wed, 2013-04-03 at 15:57 -0700, Jeff Janes wrote: You don't know that the cluster is in the bad state until after it goes through recovery because most crashes recover perfectly fine. So it would have to make a side-copy of the cluster after the crash, then recover the original and see

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Andres Freund
On 2013-04-04 02:28:32 +0200, Andres Freund wrote: On 2013-04-04 01:52:41 +0200, Andres Freund wrote: On 2013-04-03 15:57:49 -0700, Jeff Janes wrote: I've changed the subject from regression test failed when enabling checksum because I now know they are totally unrelated. My test

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Tom Lane
and...@anarazel.de (Andres Freund) writes: Looking at the page lsn's with dd I noticed something peculiar: page 0: 01 00 00 00 18 c2 00 31 = 1/3100C218 page 1: 01 00 00 00 80 44 01 31 = 1/31014480 page 10: 01 00 00 00 60 ce 05 31 = 1/3105ce60 page 43: 01 00 00 00 58 7a 16 31 = 1/31167a58
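For readers following the hex dumps quoted above, here is a minimal sketch of how those eight bytes map to the LSN shown after the "=" sign. It assumes the dump came from a little-endian machine and that the first eight bytes of a page hold the LSN as two 32-bit fields (high half first, then low half). The helper name decode_page_lsn is hypothetical and is not part of PostgreSQL.

```python
import struct


def decode_page_lsn(hex_bytes: str) -> str:
    """Turn the first 8 bytes of a page (as a hex dump) into an LSN string.

    Assumption: little-endian host, LSN stored as two uint32 fields
    (high 32 bits first, then low 32 bits).
    """
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    high, low = struct.unpack("<II", raw)
    # LSNs are conventionally printed as two hex numbers separated by a slash.
    return f"{high:X}/{low:X}"


if __name__ == "__main__":
    # Byte strings copied from the message quoted above.
    for page, dump in [
        (0, "01 00 00 00 18 c2 00 31"),
        (1, "01 00 00 00 80 44 01 31"),
        (10, "01 00 00 00 60 ce 05 31"),
        (43, "01 00 00 00 58 7a 16 31"),
    ]:
        print(f"page {page}: {decode_page_lsn(dump)}")
```

The output uses uppercase hex throughout, so page 10 prints as 1/3105CE60 where the quoted message writes 1/3105ce60; the value is the same.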

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Andres Freund
On 2013-04-03 20:45:51 -0400, Tom Lane wrote: and...@anarazel.de (Andres Freund) writes: Looking at the page lsn's with dd I noticed something peculiar: page 0: 01 00 00 00 18 c2 00 31 = 1/3100C218 page 1: 01 00 00 00 80 44 01 31 = 1/31014480 page 10: 01 00 00 00 60 ce 05 31 =

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Andres Freund
On 2013-04-04 02:58:43 +0200, Andres Freund wrote: On 2013-04-03 20:45:51 -0400, Tom Lane wrote: and...@anarazel.de (Andres Freund) writes: Looking at the page lsn's with dd I noticed something peculiar: page 0: 01 00 00 00 18 c2 00 31 = 1/3100C218 page 1: 01 00 00 00 80 44 01