Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-12 Thread Martijn van Oosterhout
On Thu, Jan 11, 2007 at 11:10:38PM +0000, Simon Riggs wrote: On Thu, 2007-01-11 at 17:06 +0000, Gregory Stark wrote: Having a CRC in WAL but not in the heap seems kind of pointless. Yes... If your hardware is unreliable the corruption could be anywhere. Agreed. I thought the point

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Simon Riggs
On Wed, 2007-01-10 at 23:32 -0500, Bruce Momjian wrote: Simon Riggs wrote: On Fri, 2007-01-05 at 22:57 -0500, Tom Lane wrote: Jim Nasby writes: On Jan 5, 2007, at 6:30 AM, Zeugswetter Andreas ADI SD wrote: Ok, so when you need CRC's on a replicate (but not on the

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
Simon Riggs writes: COPY XLogInsert() #1 on oprofile results at 17% CPU (full_page_writes = on). But what portion of that is actually CRC-related? XLogInsert does quite a lot. Anyway, I can't see degrading the reliability of the system for a gain in the

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Simon Riggs
On Thu, 2007-01-11 at 09:01 -0500, Tom Lane wrote: Simon Riggs writes: COPY XLogInsert() #1 on oprofile results at 17% CPU (full_page_writes = on). But what portion of that is actually CRC-related? XLogInsert does quite a lot. Anyway, I

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
Gregory Stark writes: What did you think about protecting against torn writes using id numbers every 512 bytes? Pretty much not happening; or are you volunteering to fix every part of the system to tolerate injections of inserted data anywhere in a stored datum?
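
[Illustration, not part of the original message] One possible reading of the "id numbers every 512 bytes" idea can be sketched as follows. Everything here is an assumption made for illustration: the one-byte id, the 512-byte sector size, and the helper names `stamp` and `is_torn` are hypothetical and are not PostgreSQL code. The sketch also shows the objection raised in the reply: the id bytes end up interleaved with the stored data.

```python
SECTOR = 512  # assumed sector size, taken from the "every 512 bytes" remark


def stamp(payload: bytes, write_id: int) -> bytes:
    """Lay payload out in 512-byte sectors, each starting with a 1-byte write id.

    Note that the id byte is injected into the middle of the stored data,
    which is the cost the reply objects to.
    """
    chunk = SECTOR - 1
    out = bytearray()
    for off in range(0, len(payload), chunk):
        out.append(write_id & 0xFF)
        # Pad the final chunk so every sector is exactly SECTOR bytes long.
        out += payload[off:off + chunk].ljust(chunk, b"\0")
    return bytes(out)


def is_torn(block: bytes) -> bool:
    """A block is considered torn if its sectors do not all carry the same id."""
    ids = {block[off] for off in range(0, len(block), SECTOR)}
    return len(ids) > 1


old = stamp(b"y" * 1500, write_id=6)
new = stamp(b"x" * 1500, write_id=7)

# Simulate a crash after only the first sector of the new write reached disk.
torn = new[:SECTOR] + old[SECTOR:]
```

Here `is_torn(new)` is false and `is_torn(torn)` is true. The check catches only this one failure mode; a flipped bit inside a sector goes unnoticed unless a checksum is also present.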

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
Gregory Stark writes: Tom Lane writes: Pretty much not happening; or are you volunteering to fix every part of the system to tolerate injections of inserted data anywhere in a stored datum? I was thinking to do it at a low level as the xlog records are

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Gregory Stark
Tom Lane writes: Gregory Stark writes: Tom Lane writes: Pretty much not happening; or are you volunteering to fix every part of the system to tolerate injections of inserted data anywhere in a stored datum? I was thinking to do it at a

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
Gregory Stark writes: Tom Lane writes: You understand wrong ... a tuple sitting on disk is normally read directly from the shared buffer, and I don't think we want to pay for copying it. xlog records Oh, sorry, had the wrong context in mind. I'm still

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Gregory Stark
Tom Lane writes: Oh, sorry, had the wrong context in mind. I'm still not very impressed with the idea --- a CRC check will catch many kinds of problems, whereas this approach catches exactly one kind of problem. Well in fairness I tossed in a throwaway comment at the end

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Simon Riggs
On Thu, 2007-01-11 at 17:06 +0000, Gregory Stark wrote: Having a CRC in WAL but not in the heap seems kind of pointless. Yes... If your hardware is unreliable the corruption could be anywhere. Agreed. Other DBMS have one setting for the whole server; I've never seen separate settings for WAL

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-10 Thread Bruce Momjian
Simon Riggs wrote: On Fri, 2007-01-05 at 22:57 -0500, Tom Lane wrote: Jim Nasby writes: On Jan 5, 2007, at 6:30 AM, Zeugswetter Andreas ADI SD wrote: Ok, so when you need CRC's on a replicate (but not on the master) you Which sounds to me like a good reason to

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-06 Thread Simon Riggs
On Fri, 2007-01-05 at 22:57 -0500, Tom Lane wrote: Jim Nasby writes: On Jan 5, 2007, at 6:30 AM, Zeugswetter Andreas ADI SD wrote: Ok, so when you need CRC's on a replicate (but not on the master) you Which sounds to me like a good reason to allow the option in

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-06 Thread Bruce Momjian
Simon Riggs wrote: Somehow, neither of these statements seem likely to be uttered by a sane DBA ... If I take a backup of a server and bring it up on a new system, the blocks in the backup will not have been CRC checked before they go to disk. If I take the same server and now stream

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Zeugswetter Andreas ADI SD
Recovery can occur with/without same setting of wal_checksum, to avoid complications from crashes immediately after turning GUC on. Surely not. Otherwise even the on setting is not really a defense. Only when the CRC is exactly zero, which happens very very rarely. It
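
[Illustration, not part of the original message] The exchange above appears to concern a scheme in which a CRC field of exactly zero means "no checksum was computed", so that recovery can read WAL written under either setting. That reading is an assumption drawn from these snippets; the function names below are hypothetical and `zlib.crc32` merely stands in for whatever CRC the WAL code uses. A minimal sketch of why such a sentinel weakens the "on" setting:

```python
import zlib


def stored_crc(payload: bytes, wal_checksum: bool) -> int:
    """CRC field written with the record; zero is the assumed 'not computed' marker."""
    return zlib.crc32(payload) if wal_checksum else 0


def recovery_accepts(payload: bytes, crc_field: int) -> bool:
    """Recovery-side check that tolerates records written with the setting off."""
    if crc_field == 0:
        # Treated as "checksum was off": the payload is not verified at all.
        return True
    return zlib.crc32(payload) == crc_field


good = b"record"
crc = stored_crc(good, wal_checksum=True)

ok = recovery_accepts(good, crc)              # intact record passes
caught = not recovery_accepts(b"recorD", crc)  # corrupted payload is rejected
slipped = recovery_accepts(b"garbage", 0)      # corruption that also zeroes the CRC field passes
```

The last case is the gap under discussion: any damage that leaves the CRC field at zero is accepted, and so is a valid record whose genuine CRC happens to be zero, even when checksums were enabled at write time.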

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Simon Riggs
On Fri, 2007-01-05 at 11:01 +0100, Zeugswetter Andreas ADI SD wrote: What's the use-case for changing the variable on the fly anyway? Seems a better solution is just to lock down the setting at postmaster start. I guess that the use case is more for a WAL based replicate, that

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Zeugswetter Andreas ADI SD
What's the use-case for changing the variable on the fly anyway? Seems a better solution is just to lock down the setting at postmaster start. I guess that the use case is more for a WAL based replicate, that has/wants a different setting. Maybe we want a WAL entry for the

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Tom Lane
Zeugswetter Andreas ADI SD writes: Ok, so when you need CRC's on a replicate (but not on the master) you turn it off during standby replay, but turn it on when you start the replicate for normal operation. Thought: even when it's off, the CRC had better be computed for

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Zeugswetter Andreas ADI SD
Ok, so when you need CRC's on a replicate (but not on the master) you turn it off during standby replay, but turn it on when you start the replicate for normal operation. Thought: even when it's off, the CRC had better be computed for shutdown-checkpoint records. Else there's no way

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Jim Nasby
On Jan 5, 2007, at 6:30 AM, Zeugswetter Andreas ADI SD wrote: Ok, so when you need CRC's on a replicate (but not on the master) you turn it off during standby replay, but turn it on when you start the replicate for normal operation. Which sounds to me like a good reason to allow the option in

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Tom Lane
Jim Nasby writes: On Jan 5, 2007, at 6:30 AM, Zeugswetter Andreas ADI SD wrote: Ok, so when you need CRC's on a replicate (but not on the master) you Which sounds to me like a good reason to allow the option in recovery.conf as well... Actually, I'm not seeing the

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Joshua D. Drake
Actually, I'm not seeing the use-case for a slave having a different setting from the master at all? My backup server is less reliable than the primary. My backup server is more reliable than the primary. Somehow, neither of these statements seem likely to be uttered by a

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
Simon Riggs writes: In this thread, I outlined an idea for reducing cost of WAL CRC checking http://archives.postgresql.org/pgsql-hackers/2006-10/msg01299.php wal_checksum = on (default) | off This still seems awfully dangerous to me. Recovery can occur with/without same

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Simon Riggs
On Thu, 2007-01-04 at 10:00 -0500, Tom Lane wrote: Simon Riggs writes: In this thread, I outlined an idea for reducing cost of WAL CRC checking http://archives.postgresql.org/pgsql-hackers/2006-10/msg01299.php wal_checksum = on (default) | off This still seems awfully

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
Simon Riggs writes: On Thu, 2007-01-04 at 10:00 -0500, Tom Lane wrote: Simon Riggs writes: Recovery can occur with/without same setting of wal_checksum, to avoid complications from crashes immediately after turning GUC on. Surely not. Otherwise even the

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Simon Riggs
On Thu, 2007-01-04 at 11:09 -0500, Tom Lane wrote: Simon Riggs writes: On Thu, 2007-01-04 at 10:00 -0500, Tom Lane wrote: Simon Riggs writes: Recovery can occur with/without same setting of wal_checksum, to avoid complications from crashes immediately

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Florian Weimer
* Simon Riggs: Surely not. Otherwise even the on setting is not really a defense. Only when the CRC is exactly zero, which happens very very rarely. Have you tried switching to Adler32 instead of CRC32? -- Florian Weimer, BFK edv-consulting GmbH

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Simon Riggs
On Thu, 2007-01-04 at 17:58 +0100, Florian Weimer wrote: * Simon Riggs: Surely not. Otherwise even the on setting is not really a defense. Only when the CRC is exactly zero, which happens very very rarely. Have you tried switching to Adler32 instead of CRC32? No. Please explain

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
Florian Weimer writes: Have you tried switching to Adler32 instead of CRC32? Is anything known about the error detection capabilities of Adler32? There's a lot of math behind CRCs but AFAIR Adler's method is pretty much ad-hoc. regards, tom lane
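
[Illustration, not part of the original message] For readers who want to compare the two checksums named here, a small sketch using Python's `zlib` module, which provides both `crc32` and `adler32`. The sample record and the `flip_bit` helper are made up for illustration; this is not the WAL code, and the CRC-32 variant in `zlib` is not necessarily the one PostgreSQL uses.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


record = b"hypothetical WAL record payload"
crc = zlib.crc32(record)
adler = zlib.adler32(record)

# Count single-bit corruptions that each checksum fails to notice.
nbits = len(record) * 8
crc_missed = sum(zlib.crc32(flip_bit(record, i)) == crc for i in range(nbits))
adler_missed = sum(zlib.adler32(flip_bit(record, i)) == adler for i in range(nbits))

# Adler-32 is two 16-bit running sums packed into 32 bits. On a very short
# input neither sum gets anywhere near the modulus 65521, so the value
# occupies only a small part of the 32-bit range.
short = zlib.adler32(b"abc")  # low half is 1 + 97 + 98 + 99 = 295, high half is 589
```

Both checksums catch every single-bit flip in this record, so the comparison does not separate them on that test alone. The last lines show one structural property of Adler-32 on short inputs, which may matter if each WAL record carries its own checksum.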

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
Simon Riggs writes: On Thu, 2007-01-04 at 11:09 -0500, Tom Lane wrote: "It works most of the time" doesn't exactly satisfy me. It seemed safer to allow a very rare error through to the next level of error checking rather than to close the door so tight that recovery would not

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Simon Riggs
On Thu, 2007-01-04 at 12:13 -0500, Tom Lane wrote: Simon Riggs writes: On Thu, 2007-01-04 at 11:09 -0500, Tom Lane wrote: "It works most of the time" doesn't exactly satisfy me. It seemed safer to allow a very rare error through to the next level of error checking rather

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Florian Weimer
* Tom Lane: Florian Weimer writes: Have you tried switching to Adler32 instead of CRC32? Is anything known about the error detection capabilities of Adler32? There's a lot of math behind CRCs but AFAIR Adler's method is pretty much ad-hoc. Correct me if I'm wrong, but the

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
Florian Weimer writes: * Tom Lane: There's a lot of math behind CRCs but AFAIR Adler's method is pretty much ad-hoc. Correct me if I'm wrong, but the main reason for the WAL CRC is to detect partial WAL writes (due to improper caching, for instance). Well, that's *a*

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Florian Weimer
* Tom Lane: I think short burst errors are fairly likely: the kind of scenario I'm worried about is a wild store corrupting a word of a WAL entry while it's waiting around to be written in the WAL buffers. Ah, does this mean that each WAL entry gets its own checksum? In this case, Adler32 is

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
Florian Weimer writes: Ah, does this mean that each WAL entry gets its own checksum? Right. (I had assumed that PostgreSQL's WAL checksumming was justified by the partial write issue. The wild store could easily occur with a heap page, too, and AFAIK, tuples, aren't