Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Andres Freund
On 2013-03-06 13:34:21 +0200, Heikki Linnakangas wrote: On 06.03.2013 10:41, Simon Riggs wrote: On 5 March 2013 18:02, Jeff Davis pg...@j-davis.com wrote: Fletcher is probably significantly faster than CRC-16, because I'm just doing int32 addition in a tight loop. Simon originally chose
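For readers unfamiliar with the Fletcher family mentioned above, here is a minimal Python sketch of the textbook Fletcher-16 checksum. It is not the variant in the patch under discussion (which, per the message, uses int32 additions), and the name `fletcher16` is only illustrative. It does show why the inner loop is cheap: two running sums and no table lookups.

```python
def fletcher16(data: bytes) -> int:
    """Textbook Fletcher-16: two running sums modulo 255 over the bytes."""
    sum1 = 0  # plain sum of the bytes
    sum2 = 0  # sum of the running sums, which makes the result order-sensitive
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    # Pack the two 8-bit sums into one 16-bit value.
    return (sum2 << 8) | sum1


if __name__ == "__main__":
    print(hex(fletcher16(b"abcde")))  # 0xc8f0
```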

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Garick Hamlin
On Wed, Mar 06, 2013 at 01:34:21PM +0200, Heikki Linnakangas wrote: On 06.03.2013 10:41, Simon Riggs wrote: On 5 March 2013 18:02, Jeff Davis pg...@j-davis.com wrote: Fletcher is probably significantly faster than CRC-16, because I'm just doing int32 addition in a tight loop. Simon

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Andres Freund
On 2013-03-06 11:21:21 -0500, Garick Hamlin wrote: If picking a CRC why not a short optimal one rather than truncate CRC32C? CRC32C is available in hardware since SSE4.2. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Robert Haas
On Mon, Mar 4, 2013 at 3:13 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 04.03.2013 20:58, Greg Smith wrote: There is no such thing as a stable release of btrfs, and no timetable for when there will be one. I could do some benchmarks of that but I didn't think they were very

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Josh Berkus
There may be good reasons to reject this patch. Or there may not. But I completely disagree with the idea that asking them to solve the problem at the filesystem level is sensible. Yes, can we get back to the main issues with the patch? 1) argument over whether the checksum is sufficient to

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Josh Berkus
Robert, We've had a few EnterpriseDB customers who have had fantastically painful experiences with PostgreSQL + ZFS. Supposedly, aligning the ZFS block size to the PostgreSQL block size is supposed to make these problems go away, but in my experience it does not have that effect. So I think

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Robert Haas
On Wed, Mar 6, 2013 at 2:14 PM, Josh Berkus j...@agliodbs.com wrote: Based on Smith's report, I consider (2) to be a deal-killer right now. I was pretty depressed by those numbers, too. The level of overhead reported by him would prevent the users I work with from ever employing checksums on

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Robert Haas
On Wed, Mar 6, 2013 at 6:00 PM, Josh Berkus j...@agliodbs.com wrote: We've had a few EnterpriseDB customers who have had fantastically painful experiences with PostgreSQL + ZFS. Supposedly, aligning the ZFS block size to the PostgreSQL block size is supposed to make these problems go away,

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Joshua D. Drake
On 03/06/2013 03:06 PM, Robert Haas wrote: On Wed, Mar 6, 2013 at 6:00 PM, Josh Berkus j...@agliodbs.com wrote: We've had a few EnterpriseDB customers who have had fantastically painful experiences with PostgreSQL + ZFS. Supposedly, aligning the ZFS block size to the PostgreSQL block size is

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: On 2013-03-06 11:21:21 -0500, Garick Hamlin wrote: If picking a CRC why not a short optimal one rather than truncate CRC32C? CRC32C is available in hardware since SSE4.2. I think that should be at most a fourth-order consideration, since we are not

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Craig Ringer
On 03/06/2013 07:34 PM, Heikki Linnakangas wrote: It'd be difficult to change the algorithm in a future release without breaking on-disk compatibility, On-disk compatibility is broken with major releases anyway, so I don't see this as a huge barrier. -- Craig Ringer

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Andres Freund
On 2013-03-07 08:37:40 +0800, Craig Ringer wrote: On 03/06/2013 07:34 PM, Heikki Linnakangas wrote: It'd be difficult to change the algorithm in a future release without breaking on-disk compatibility, On-disk compatibility is broken with major releases anyway, so I don't see this as a huge

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Greg Smith
On 3/6/13 1:34 PM, Robert Haas wrote: We've had a few EnterpriseDB customers who have had fantastically painful experiences with PostgreSQL + ZFS. Supposedly, aligning the ZFS block size to the PostgreSQL block size is supposed to make these problems go away, but in my experience it does not

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Craig Ringer
On 03/07/2013 08:41 AM, Andres Freund wrote: On 2013-03-07 08:37:40 +0800, Craig Ringer wrote: On 03/06/2013 07:34 PM, Heikki Linnakangas wrote: It'd be difficult to change the algorithm in a future release without breaking on-disk compatibility, On-disk compatibility is broken with major

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Jim Nasby
On 3/4/13 7:04 PM, Daniel Farina wrote: Corruption has easily occupied more than one person-month of time last year for us. Just FYI for anyone that's experienced corruption... we've looked into doing row-level checksums at work. The only challenge we ran into was how to check them when

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Jim Nasby
On 3/6/13 1:14 PM, Josh Berkus wrote: There may be good reasons to reject this patch. Or there may not. But I completely disagree with the idea that asking them to solve the problem at the filesystem level is sensible. Yes, can we get back to the main issues with the patch? 1) argument

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Greg Smith
On 3/6/13 6:34 AM, Heikki Linnakangas wrote: Another thought is that perhaps something like CRC32C would be faster to calculate on modern hardware, and could be safely truncated to 16-bits using the same technique you're using to truncate the Fletcher's Checksum. Greg's tests showed that the
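To make the CRC32C suggestion concrete, here is a small bitwise sketch in Python. The polynomial and the check value for `b"123456789"` are the standard CRC-32C (Castagnoli) parameters. The truncation step is an assumption: the excerpt does not say how the patch truncates, so `crc32c_truncated16` simply keeps the low 16 bits. A hardware-assisted version (SSE4.2) would compute the same 32-bit value much faster.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected form


def crc32c(data: bytes) -> int:
    """Bit-at-a-time CRC-32C; slow, but easy to check against a spec."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; apply the polynomial if that bit was set.
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def crc32c_truncated16(data: bytes) -> int:
    """Assumed truncation: keep only the low 16 bits of the 32-bit CRC."""
    return crc32c(data) & 0xFFFF


if __name__ == "__main__":
    print(hex(crc32c(b"123456789")))  # 0xe3069283
```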

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Greg Smith
On 3/6/13 1:24 PM, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2013-03-06 11:21:21 -0500, Garick Hamlin wrote: If picking a CRC why not a short optimal one rather than truncate CRC32C? CRC32C is available in hardware since SSE4.2. I think that should be at most a

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Greg Stark
On Wed, Mar 6, 2013 at 11:04 PM, Robert Haas robertmh...@gmail.com wrote: When we first talked about this feature for 9.2, we were going to exclude hint bits from checksums, in order to avoid this issue; what happened to that? I don't think anyone ever thought that was a particularly

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Greg Smith
TL;DR summary: on a system I thought was a fair middle of the road server, pgbench tests are averaging about a 2% increase in WAL writes and a 2% slowdown when I turn on checksums. There are a small number of troublesome cases where that overhead rises to closer to 20%, an upper limit that's

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Daniel Farina
On Wed, Mar 6, 2013 at 8:17 PM, Greg Smith g...@2ndquadrant.com wrote: TL;DR summary: on a system I thought was a fair middle of the road server, pgbench tests are averaging about a 2% increase in WAL writes and a 2% slowdown when I turn on checksums. There are a small number of troublesome

Re: [HACKERS] Enabling Checksums

2013-03-06 Thread Greg Smith
On 3/7/13 12:15 AM, Daniel Farina wrote: I have only done some cursory research, but CPU time of 20% seems to be expected for InnoDB's CRC computation[0]. Although a galling number, this comparison with other systems may be a way to see how much of that overhead is avoidable or just the price of

Re: [HACKERS] Enabling Checksums

2013-03-05 Thread Simon Riggs
On 5 March 2013 01:04, Daniel Farina dan...@heroku.com wrote: Corruption has easily occupied more than one person-month of time last year for us. This year to date I've burned two weeks, although admittedly this was probably the result of statistical clustering. Other colleagues of mine have

Re: [HACKERS] Enabling Checksums

2013-03-05 Thread Heikki Linnakangas
On 04.03.2013 09:11, Simon Riggs wrote: On 3 March 2013 18:24, Greg Smith g...@2ndquadrant.com wrote: The 16-bit checksum feature seems functional, with two sources of overhead. There's some CPU time burned to compute checksums when pages enter the system. And there's extra overhead for WAL

Re: [HACKERS] Enabling Checksums

2013-03-05 Thread Jeff Davis
Thank you for the review. On Tue, 2013-03-05 at 11:35 +0200, Heikki Linnakangas wrote: If you enable checksums, the free space map never gets updated in a standby. It will slowly drift to be completely out of sync with reality, which could lead to significant slowdown and bloat after

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Heikki Linnakangas
On 04.03.2013 09:11, Simon Riggs wrote: Are there objectors? FWIW, I still think that checksumming belongs in the filesystem, not PostgreSQL. If you go ahead with this anyway, at the very least I'd like to see some sort of a comparison with e.g. btrfs. How do performance, error-detection

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jeff Davis
On Mon, 2013-03-04 at 10:36 +0200, Heikki Linnakangas wrote: On 04.03.2013 09:11, Simon Riggs wrote: Are there objectors? FWIW, I still think that checksumming belongs in the filesystem, not PostgreSQL. Doing checksums in the filesystem has some downsides. One is that you need to use a

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Greg Smith
On 3/4/13 2:11 AM, Simon Riggs wrote: It's crunch time. Do you and Jeff believe this patch should be committed to Postgres core? I want to see a GUC to allow turning this off, to avoid the problem I saw where a non-critical header corruption problem can cause an entire page to be unreadable.

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jeff Davis
On Mon, 2013-02-25 at 01:30 -0500, Greg Smith wrote: Attached are some bit rot updates to the checksums patches. The replace-tli one still works fine. I fixed a number of conflicts in the larger patch. The one I've attached here isn't 100% to project standards--I don't have all the

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Heikki Linnakangas
On 04.03.2013 20:58, Greg Smith wrote: There is no such thing as a stable release of btrfs, and no timetable for when there will be one. I could do some benchmarks of that but I didn't think they were very relevant. Who cares how fast something might run when it may not work correctly? btrfs

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jeff Davis
On Mon, 2013-03-04 at 11:52 +0800, Craig Ringer wrote: I also suspect that at least in the first release it might be desirable to have an option that essentially says something's gone horribly wrong and we no longer want to check or write checksums, we want a non-checksummed DB that can still

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jeff Davis
On Mon, 2013-03-04 at 22:13 +0200, Heikki Linnakangas wrote: On 04.03.2013 20:58, Greg Smith wrote: There is no such thing as a stable release of btrfs, and no timetable for when there will be one. I could do some benchmarks of that but I didn't think they were very relevant. Who cares

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Heikki Linnakangas
On 04.03.2013 18:00, Jeff Davis wrote: On Mon, 2013-03-04 at 10:36 +0200, Heikki Linnakangas wrote: On 04.03.2013 09:11, Simon Riggs wrote: Are there objectors? FWIW, I still think that checksumming belongs in the filesystem, not PostgreSQL. Doing checksums in the filesystem has some

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jeff Davis
On Sun, 2013-03-03 at 22:18 -0500, Greg Smith wrote: As for a design of a GUC that might be useful here, the option itself strikes me as being like archive_mode in its general use. There is an element of parameters like wal_sync_method or enable_cassert though, where the options available

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jeff Davis
On Mon, 2013-03-04 at 13:58 -0500, Greg Smith wrote: On 3/4/13 2:11 AM, Simon Riggs wrote: It's crunch time. Do you and Jeff believe this patch should be committed to Postgres core? I want to see a GUC to allow turning this off, to avoid the problem I saw where a non-critical header

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jim Nasby
On 3/4/13 10:00 AM, Jeff Davis wrote: On Mon, 2013-03-04 at 10:36 +0200, Heikki Linnakangas wrote: On 04.03.2013 09:11, Simon Riggs wrote: Are there objectors? FWIW, I still think that checksumming belongs in the filesystem, not PostgreSQL. Doing checksums in the filesystem has some

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jim Nasby
On 3/4/13 2:48 PM, Jeff Davis wrote: On Mon, 2013-03-04 at 13:58 -0500, Greg Smith wrote: On 3/4/13 2:11 AM, Simon Riggs wrote: It's crunch time. Do you and Jeff believe this patch should be committed to Postgres core? I want to see a GUC to allow turning this off, to avoid the problem I

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jeff Davis
On Mon, 2013-03-04 at 22:27 +0200, Heikki Linnakangas wrote: Yeah, fragmentation will certainly hurt some workloads. But how badly, and which workloads, and how does that compare with the work that PostgreSQL has to do to maintain the checksums? I'd like to see some data on those things. I

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Heikki Linnakangas
On 04.03.2013 22:51, Jim Nasby wrote: The time to object to the concept of a checksumming feature was a long time ago, before a ton of development effort went into this... :( I did. Development went ahead anyway. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Heikki Linnakangas
On 04.03.2013 22:40, Jeff Davis wrote: Is there any reason why we can't have both postgres and filesystem checksums? Of course not. But if we can get away without checksums in Postgres, that's better, because then we don't need to maintain that feature in Postgres. If the patch gets

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread k...@rice.edu
On Mon, Mar 04, 2013 at 01:00:09PM -0800, Jeff Davis wrote: On Mon, 2013-03-04 at 22:27 +0200, Heikki Linnakangas wrote: If you're serious enough about your data that you want checksums, you should be able to choose your filesystem. I simply disagree. I am targeting my feature at casual

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Heikki Linnakangas
On 04.03.2013 23:00, Jeff Davis wrote: On Mon, 2013-03-04 at 22:27 +0200, Heikki Linnakangas wrote: Yeah, fragmentation will certainly hurt some workloads. But how badly, and which workloads, and how does that compare with the work that PostgreSQL has to do to maintain the checksums? I'd like

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jim Nasby
On 3/4/13 3:00 PM, Heikki Linnakangas wrote: On 04.03.2013 22:51, Jim Nasby wrote: The time to object to the concept of a checksumming feature was a long time ago, before a ton of development effort went into this... :( I did. Development went ahead anyway. Right, because the community felt

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Heikki Linnakangas
On 04.03.2013 22:51, Jim Nasby wrote: Additionally, no filesystem I'm aware of checksums the data in the filesystem cache. A PG checksum would. The patch says: + * IMPORTANT NOTE - + * The checksum is not valid at all times on a data page. We set it before we + * flush page/buffer, and

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Stephen Frost
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote: Perhaps we should just wait a few years? If we suspect that this becomes obsolete in a few years, it's probably better to just wait, than add a feature we'll have to keep maintaining. Assuming it gets committed today, it's going to take a

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Craig Ringer
On 03/05/2013 04:48 AM, Jeff Davis wrote: We would still calculate the checksum and print the warning; and then pass it through the rest of the header checks. If the header checks pass, then it proceeds. If the header checks fail, and if zero_damaged_pages is off, then it would still generate

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Greg Smith
On 3/4/13 3:13 PM, Heikki Linnakangas wrote: This PostgreSQL patch hasn't seen any production use, either. In fact, I'd consider btrfs to be more mature than this patch. Unless you think that there will be some major changes for the worse in performance in btrfs, it's perfectly valid and useful

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jim Nasby
On 3/4/13 5:20 PM, Craig Ringer wrote: On 03/05/2013 04:48 AM, Jeff Davis wrote: We would still calculate the checksum and print the warning; and then pass it through the rest of the header checks. If the header checks pass, then it proceeds. If the header checks fail, and if zero_damaged_pages

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Craig Ringer
On 03/05/2013 08:15 AM, Jim Nasby wrote: Would it be better to do checksum_logging_level = <valid elog levels>? That way someone could set the notification to anything from DEBUG up to PANIC. ISTM the default should be ERROR. That seems nice at first brush, but I don't think it holds up. All

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jim Nasby
On 3/4/13 6:22 PM, Craig Ringer wrote: On 03/05/2013 08:15 AM, Jim Nasby wrote: Would it be better to do checksum_logging_level = <valid elog levels>? That way someone could set the notification to anything from DEBUG up to PANIC. ISTM the default should be ERROR. That seems nice at first

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Josh Berkus
Heikki, Perhaps we should just wait a few years? If we suspect that this becomes obsolete in a few years, it's probably better to just wait, than add a feature we'll have to keep maintaining. Assuming it gets committed today, it's going to take a year or two for 9.3 to get released and all

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Daniel Farina
On Mon, Mar 4, 2013 at 1:22 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 04.03.2013 23:00, Jeff Davis wrote: On Mon, 2013-03-04 at 22:27 +0200, Heikki Linnakangas wrote: Yeah, fragmentation will certainly hurt some workloads. But how badly, and which workloads, and how does that

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jeff Davis
On Mon, 2013-03-04 at 14:57 -0600, Jim Nasby wrote: I suggest we paint that GUC along the lines of checksum_failure_log_level, defaulting to ERROR. That way if someone wanted to completely bury the elogs down at, say, DEBUG they could. The reason I didn't want to do that is because it's essentially a

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jeff Davis
On Mon, 2013-03-04 at 23:22 +0200, Heikki Linnakangas wrote: On 04.03.2013 23:00, Jeff Davis wrote: On Mon, 2013-03-04 at 22:27 +0200, Heikki Linnakangas wrote: Yeah, fragmentation will certainly hurt some workloads. But how badly, and which workloads, and how does that compare with the

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jeff Davis
On Mon, 2013-03-04 at 23:11 +0200, Heikki Linnakangas wrote: Of course not. But if we can get away without checksums in Postgres, that's better, because then we don't need to maintain that feature in Postgres. If the patch gets committed, it's not mission accomplished. There will be

Re: [HACKERS] Enabling Checksums

2013-03-04 Thread Jeff Davis
On Sun, 2013-03-03 at 18:05 -0500, Greg Smith wrote: = Test 1 - find worst-case overhead for the checksum calculation on write = This can hit 25% of runtime when you isolate it out. I'm not sure if how I'm running this multiple times makes sense yet. This one is so much slower on my Mac

Re: [HACKERS] Enabling Checksums

2013-03-03 Thread Craig Ringer
On 03/02/2013 12:48 AM, Daniel Farina wrote: On Sun, Feb 24, 2013 at 10:30 PM, Greg Smith g...@2ndquadrant.com wrote: Attached are some bit rot updates to the checksums patches. The replace-tli one still works fine. I rather badly want this feature, and if the open issues with the patch

Re: [HACKERS] Enabling Checksums

2013-03-03 Thread Greg Smith
And here's an updated version of the checksum corruption testing wrapper script already. This includes an additional safety check that you've set PGDATA to a location that can be erased. Presumably no one else would like to accidentally do this: rm -rf /* Like I just did. -- Greg Smith
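The wrapper script itself is not reproduced in this listing, so the following is only a hypothetical Python sketch of what such a safety check could look like. The function name `check_pgdata_is_erasable` and the marker file name `ERASABLE_TEST_CLUSTER` are invented for illustration and are not taken from the script described above.

```python
import os


def check_pgdata_is_erasable(pgdata, marker="ERASABLE_TEST_CLUSTER"):
    """Return the resolved path if it looks safe to wipe, else raise ValueError."""
    if not pgdata:
        # An unset or empty PGDATA is exactly how "rm -rf $PGDATA/*" goes wrong.
        raise ValueError("PGDATA is not set")
    path = os.path.realpath(pgdata)
    forbidden = {
        os.path.realpath(os.sep),
        os.path.realpath(os.path.expanduser("~")),
    }
    if path in forbidden:
        raise ValueError(f"refusing to erase {path}")
    # Hypothetical convention: the tester drops a marker file into any
    # directory that the test harness is allowed to destroy.
    if not os.path.isfile(os.path.join(path, marker)):
        raise ValueError(f"{path} has no {marker} marker file")
    return path
```

A caller would run this before any destructive step and abort on `ValueError`, rather than trusting the environment variable.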

Re: [HACKERS] Enabling Checksums

2013-03-03 Thread Greg Smith
On 12/19/12 6:30 PM, Jeff Davis wrote: I ran a few tests. Test 1 - find worst-case overhead for the checksum calculation on write: Test 2 - worst-case overhead for calculating checksum while reading data Test 3 - worst-case WAL overhead What I've done is wrap all of these tests into a shell

Re: [HACKERS] Enabling Checksums

2013-03-03 Thread Greg Smith
On 3/3/13 9:22 AM, Craig Ringer wrote: Did you get a chance to see whether you can run it in checksum-validation-and-update-off backward compatible mode? This seems like an important thing to have working (and tested for) in case of bugs, performance issues or other unforeseen circumstances.

Re: [HACKERS] Enabling Checksums

2013-03-03 Thread Craig Ringer
On 03/04/2013 11:18 AM, Greg Smith wrote: On 3/3/13 9:22 AM, Craig Ringer wrote: Did you get a chance to see whether you can run it in checksum-validation-and-update-off backward compatible mode? This seems like an important thing to have working (and tested for) in case of bugs, performance

Re: [HACKERS] Enabling Checksums

2013-03-03 Thread Greg Smith
On 3/3/13 10:52 PM, Craig Ringer wrote: I also suspect that at least in the first release it might be desirable to have an option that essentially says something's gone horribly wrong and we no longer want to check or write checksums, we want a non-checksummed DB that can still read our data

Re: [HACKERS] Enabling Checksums

2013-03-03 Thread Craig Ringer
On 03/04/2013 12:19 PM, Greg Smith wrote: On 3/3/13 10:52 PM, Craig Ringer wrote: I also suspect that at least in the first release it might be desirable to have an option that essentially says something's gone horribly wrong and we no longer want to check or write checksums, we want a

Re: [HACKERS] Enabling Checksums

2013-03-03 Thread Simon Riggs
On 3 March 2013 18:24, Greg Smith g...@2ndquadrant.com wrote: The 16-bit checksum feature seems functional, with two sources of overhead. There's some CPU time burned to compute checksums when pages enter the system. And there's extra overhead for WAL logging hint bits. I'll quantify both

Re: [HACKERS] Enabling Checksums

2013-03-01 Thread Daniel Farina
On Sun, Feb 24, 2013 at 10:30 PM, Greg Smith g...@2ndquadrant.com wrote: Attached are some bit rot updates to the checksums patches. The replace-tli one still works fine. I rather badly want this feature, and if the open issues with the patch have hit zero, I'm thinking about applying it,

Re: [HACKERS] Enabling Checksums

2013-01-28 Thread Robert Haas
On Sun, Jan 27, 2013 at 5:28 PM, Jeff Davis pg...@j-davis.com wrote: There's a maximum of one FPI per page per cycle, and we need the FPI for any modified page in this design regardless. So, deferring the XLOG_HINT WAL record doesn't change the total number of FPIs emitted. The only savings

Re: [HACKERS] Enabling Checksums

2013-01-27 Thread Simon Riggs
On 25 January 2013 20:29, Robert Haas robertmh...@gmail.com wrote: The checksums patch also introduces another behavior into SetBufferCommitInfoNeedsSave, which is to write an XLOG_HINT WAL record if checksums are enabled (to avoid torn page hazards). That's only necessary for changes where

Re: [HACKERS] Enabling Checksums

2013-01-27 Thread Robert Haas
On Sun, Jan 27, 2013 at 3:50 AM, Simon Riggs si...@2ndquadrant.com wrote: If we attempted to defer the FPI last thing before write, we'd need to cope with the case that writes at checkpoint occur after the logical start of the checkpoint, and also with the overhead of additional writes at

Re: [HACKERS] Enabling Checksums

2013-01-27 Thread Jeff Davis
On Sat, 2013-01-26 at 23:23 -0500, Robert Haas wrote: If we were to try to defer writing the WAL until the page was being written, the most it would possibly save is the small XLOG_HINT WAL record; it would not save any FPIs. How is the XLOG_HINT WAL record kept small and why does it not

Re: [HACKERS] Enabling Checksums

2013-01-26 Thread Robert Haas
On Fri, Jan 25, 2013 at 9:35 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2013-01-25 at 15:29 -0500, Robert Haas wrote: I thought Simon had the idea, at some stage, of writing a WAL record to cover hint-bit changes only at the time we *write* the buffer and only if no FPI had already been

Re: [HACKERS] Enabling Checksums

2013-01-25 Thread Robert Haas
On Thu, Jan 10, 2013 at 1:06 AM, Jeff Davis pg...@j-davis.com wrote: On Tue, 2012-12-04 at 01:03 -0800, Jeff Davis wrote: For now, I rebased the patches against master, and did some very minor cleanup. I think there is a problem here when setting PD_ALL_VISIBLE. I thought I had analyzed that

Re: [HACKERS] Enabling Checksums

2013-01-25 Thread Jeff Davis
On Fri, 2013-01-25 at 15:29 -0500, Robert Haas wrote: I thought Simon had the idea, at some stage, of writing a WAL record to cover hint-bit changes only at the time we *write* the buffer and only if no FPI had already been emitted that checkpoint cycle. I'm not sure whether that approach was

Re: [HACKERS] Enabling Checksums

2013-01-24 Thread Jeff Davis
On Wed, 2013-01-16 at 17:38 -0800, Jeff Davis wrote: New version of checksums patch. And another new version of both patches. Changes: * Rebased. * Rename SetBufferCommitInfoNeedsSave to MarkBufferDirtyHint. Now that it's being used more places, it makes sense to give it a more generic name.

Re: [HACKERS] Enabling Checksums

2013-01-16 Thread Jeff Davis
New version of checksums patch. Changes: * rebased * removed two duplicate lines; apparently the result of a bad merge * Added heap page to WAL chain when logging an XLOG_HEAP2_VISIBLE to avoid torn page issues updating PD_ALL_VISIBLE. This is the most significant change. * minor comment

Re: [HACKERS] Enabling Checksums

2013-01-16 Thread Jeff Davis
On Tue, 2013-01-15 at 19:36 -0500, Greg Smith wrote: First rev of a simple corruption program is attached, in very C-ish Python. Great. Did you verify that my patch works as you expect at least in the simple case? The parameters I settled on are to accept a relation name, byte offset,

Re: [HACKERS] Enabling Checksums

2013-01-15 Thread Greg Smith
First rev of a simple corruption program is attached, in very C-ish Python. The parameters I settled on are to accept a relation name, byte offset, byte value, and what sort of operation to do: overwrite, AND, OR, XOR. I like XOR here because you can fix it just by running the program
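The attached program is not part of this listing. As a rough sketch of the interface described above, the Python below applies overwrite/AND/OR/XOR to one byte of a file. It takes a file path rather than a relation name (mapping a relation to its file is left out), and `corrupt_byte` and `OPERATIONS` are illustrative names, not the names used in the actual tool.

```python
import operator

# Each operation combines the old byte with the supplied value.
OPERATIONS = {
    "overwrite": lambda old, value: value,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


def corrupt_byte(path, offset, value, operation):
    """Modify one byte of the file in place; return (old_byte, new_byte)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    combine = OPERATIONS[operation]
    with open(path, "r+b") as f:
        f.seek(offset)
        old = f.read(1)
        if len(old) != 1:
            raise ValueError("offset is past the end of the file")
        new = combine(old[0], value) & 0xFF
        f.seek(offset)
        f.write(bytes([new]))
    return old[0], new
```

Because XOR is its own inverse, calling `corrupt_byte(path, offset, value, "xor")` a second time with the same arguments restores the original byte, which is the property the message points out.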

Re: [HACKERS] Enabling Checksums

2013-01-12 Thread Greg Smith
On 12/19/12 6:30 PM, Jeff Davis wrote: The idea is to prevent interference from the bgwriter or autovacuum. Also, I turn off fsync so that it's measuring the calculation overhead, not the effort of actually writing to disk. With my test server issues sorted, what I did was set up a single

Re: [HACKERS] Enabling Checksums

2013-01-10 Thread Simon Riggs
On 10 January 2013 06:06, Jeff Davis pg...@j-davis.com wrote: The checksums patch also introduces another behavior into SetBufferCommitInfoNeedsSave, which is to write an XLOG_HINT WAL record if checksums are enabled (to avoid torn page hazards). That's only necessary for changes where the

Re: [HACKERS] Enabling Checksums

2013-01-10 Thread Jeff Davis
The checksums patch also introduces another behavior into SetBufferCommitInfoNeedsSave, which is to write an XLOG_HINT WAL record if checksums are enabled (to avoid torn page hazards). That's only necessary for changes where the caller does not write WAL itself and doesn't bump the LSN

Re: [HACKERS] Enabling Checksums

2013-01-09 Thread Jeff Davis
On Tue, 2012-12-04 at 01:03 -0800, Jeff Davis wrote: For now, I rebased the patches against master, and did some very minor cleanup. I think there is a problem here when setting PD_ALL_VISIBLE. I thought I had analyzed that before, but upon review, it doesn't look right. Setting PD_ALL_VISIBLE

Re: [HACKERS] Enabling Checksums

2012-12-20 Thread Martijn van Oosterhout
On Tue, Dec 18, 2012 at 04:06:02AM -0500, Greg Smith wrote: On 12/18/12 3:17 AM, Simon Riggs wrote: Clearly part of the response could involve pg_dump on the damaged structure, at some point. This is the main thing I wanted to try out more, once I have a decent corruption generation tool.

Re: [HACKERS] Enabling Checksums

2012-12-19 Thread Jeff Davis
On Tue, 2012-12-04 at 01:03 -0800, Jeff Davis wrote: 4. We need some general performance testing to show whether this is insane or not. I ran a few tests. Test 1 - find worst-case overhead for the checksum calculation on write: fsync = off bgwriter_lru_maxpages = 0 shared_buffers =

Re: [HACKERS] Enabling Checksums

2012-12-18 Thread Simon Riggs
On 18 December 2012 02:21, Jeff Davis pg...@j-davis.com wrote: On Mon, 2012-12-17 at 19:14 +0000, Simon Riggs wrote: We'll need a way of expressing some form of corruption tolerance. zero_damaged_pages is just insane, The main problem I see with zero_damaged_pages is that it could

Re: [HACKERS] Enabling Checksums

2012-12-18 Thread Greg Smith
On 12/18/12 3:17 AM, Simon Riggs wrote: Clearly part of the response could involve pg_dump on the damaged structure, at some point. This is the main thing I wanted to try out more, once I have a decent corruption generation tool. If you've corrupted a single record but can still pg_dump the

Re: [HACKERS] Enabling Checksums

2012-12-18 Thread Kevin Grittner
Greg Smith wrote: In general, what I hope people will be able to do is switch over to their standby server, and then investigate further. I think it's unlikely that people willing to pay for block checksums will only have one server. Having some way to nail down if the same block is bad on

Re: [HACKERS] Enabling Checksums

2012-12-18 Thread Greg Stark
There is no good way to make the poor soul who has no standby server happy here. You're just choosing between bad alternatives. The first block error is often just that--the first one, to be joined by others soon afterward. My experience at how drives fail says the second error is a lot more

Re: [HACKERS] Enabling Checksums

2012-12-18 Thread Jeff Davis
On Tue, 2012-12-18 at 08:17 +0000, Simon Riggs wrote: I think we should discuss whether we accept my premise? Checksums will actually detect more errors than we see now, and people will want to do something about that. Returning to backup is one way of handling it, but on a busy production

Re: [HACKERS] Enabling Checksums

2012-12-18 Thread Jeff Davis
On Tue, 2012-12-18 at 04:06 -0500, Greg Smith wrote: Having some way to nail down if the same block is bad on a given standby seems like a useful interface we should offer, and it shouldn't take too much work. Ideally you won't find the same corruption there. I'd like a way to check the

Re: [HACKERS] Enabling Checksums

2012-12-17 Thread Dimitri Fontaine
Jeff Davis pg...@j-davis.com writes: -A relation name -Corruption type (an entry from this list) -How many blocks to touch I'll just loop based on the count, randomly selecting a block each time and messing with it in that way. For the messing with it part, did you consider zzuf?

Re: [HACKERS] Enabling Checksums

2012-12-17 Thread Simon Riggs
On 14 December 2012 20:15, Greg Smith g...@2ndquadrant.com wrote: On 12/14/12 3:00 PM, Jeff Davis wrote: After some thought, I don't see much value in introducing multiple instances of corruption at a time. I would think that the smallest unit of corruption would be the hardest to detect, so

Re: [HACKERS] Enabling Checksums

2012-12-17 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: Discussing this makes me realise that we need a more useful response than just "your data is corrupt", so the user can respond "yes, I know, I'm trying to save what's left". We'll need a way of expressing some form of corruption tolerance. zero_damaged_pages

Re: [HACKERS] Enabling Checksums

2012-12-17 Thread Simon Riggs
On 17 December 2012 19:29, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: Discussing this makes me realise that we need a more useful response than just "your data is corrupt", so the user can respond "yes, I know, I'm trying to save what's left". We'll need a way of

Re: [HACKERS] Enabling Checksums

2012-12-17 Thread Jeff Davis
On Mon, 2012-12-17 at 19:14 +0000, Simon Riggs wrote: We'll need a way of expressing some form of corruption tolerance. zero_damaged_pages is just insane, The main problem I see with zero_damaged_pages is that it could potentially write out the zero page, thereby really losing your data if it

Re: [HACKERS] Enabling Checksums

2012-12-14 Thread Jeff Davis
On Wed, 2012-12-12 at 17:52 -0500, Greg Smith wrote: I can take this on, as part of the QA around checksums working as expected. The result would be a Python program; I don't have quite enough time to write this in C or re-learn Perl to do it right now. But this won't be a lot of code.

Re: [HACKERS] Enabling Checksums

2012-12-14 Thread Greg Smith
On 12/14/12 3:00 PM, Jeff Davis wrote: After some thought, I don't see much value in introducing multiple instances of corruption at a time. I would think that the smallest unit of corruption would be the hardest to detect, so introducing many of them in one pass makes it easier to detect.

Re: [HACKERS] Enabling Checksums

2012-12-12 Thread Greg Smith
On 12/5/12 6:49 PM, Simon Riggs wrote: * Zeroing pages, making pages all 1s * Transposing pages * Moving chunks of data sideways in a block * Flipping bits randomly * Flipping data endianness * Destroying particular catalog tables or structures I can take this on, as part of the QA around
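A few of the corruption types listed above can be sketched as pure functions over a page image. This is an illustrative Python sketch only: the function names are made up here, the 4-byte word size for the endianness flip is an assumption, and nothing below is taken from the tool that was eventually posted.

```python
import random


def zero_page(page: bytes) -> bytes:
    """Replace the page with all-zero bytes."""
    return bytes(len(page))


def ones_page(page: bytes) -> bytes:
    """Replace the page with all-one bits."""
    return b"\xff" * len(page)


def flip_random_bits(page: bytes, count: int, rng: random.Random) -> bytes:
    """Flip `count` distinct, randomly chosen bits."""
    damaged = bytearray(page)
    for bit in rng.sample(range(len(page) * 8), count):
        damaged[bit // 8] ^= 1 << (bit % 8)
    return bytes(damaged)


def swap_endianness(page: bytes, word_size: int = 4) -> bytes:
    """Reverse the byte order inside each word (word size is an assumption)."""
    if len(page) % word_size:
        raise ValueError("page length must be a multiple of the word size")
    return b"".join(
        page[i:i + word_size][::-1] for i in range(0, len(page), word_size)
    )
```

Passing an explicitly seeded `random.Random` keeps a failing corruption run reproducible, which matters when the point is to see whether a checksum catches a specific change.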

Re: [HACKERS] Enabling Checksums

2012-12-06 Thread Kevin Grittner
Robert Haas wrote: Jeff Davis pg...@j-davis.com wrote: Or, I could write up a test framework in ruby or python, using the appropriate pg driver, and some not-so-portable shell commands to start and stop the server. Then, I can publish that on this list, and that would at least make it easier

Re: [HACKERS] Enabling Checksums

2012-12-05 Thread Robert Haas
On Tue, Dec 4, 2012 at 6:17 PM, Jeff Davis pg...@j-davis.com wrote: Or, I could write up a test framework in ruby or python, using the appropriate pg driver, and some not-so-portable shell commands to start and stop the server. Then, I can publish that on this list, and that would at least

Re: [HACKERS] Enabling Checksums

2012-12-05 Thread Simon Riggs
On 5 December 2012 23:40, Robert Haas robertmh...@gmail.com wrote: On Tue, Dec 4, 2012 at 6:17 PM, Jeff Davis pg...@j-davis.com wrote: Or, I could write up a test framework in ruby or python, using the appropriate pg driver, and some not-so-portable shell commands to start and stop the server.

Re: [HACKERS] Enabling Checksums

2012-12-04 Thread Jeff Davis
On Mon, 2012-12-03 at 13:16 +0000, Simon Riggs wrote: On 3 December 2012 09:56, Simon Riggs si...@2ndquadrant.com wrote: I think the way forwards for this is... 1. Break out the changes around inCommit flag, since that is just uncontroversial refactoring. I can do that. That reduces the
