Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-12 Thread Martijn van Oosterhout
On Thu, Jan 11, 2007 at 11:10:38PM +, Simon Riggs wrote: > On Thu, 2007-01-11 at 17:06 +, Gregory Stark wrote: > > Having a CRC in WAL but not in the heap seems kind of pointless. > > Yes... > > > If your > > hardware is unreliable the corruption could be anywhere. > > Agreed. I thought

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Simon Riggs
On Thu, 2007-01-11 at 17:06 +, Gregory Stark wrote: > Having a CRC in WAL but not in the heap seems kind of pointless. Yes... > If your > hardware is unreliable the corruption could be anywhere. Agreed. Other DBMS have one setting for the whole server; I've never seen separate settings for W

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes: > Oh, sorry, had the wrong context in mind. I'm still not very impressed > with the idea --- a CRC check will catch many kinds of problems, whereas > this approach catches exactly one kind of problem. Well in fairness I tossed in a throwaway comment at the

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes: > "Tom Lane" <[EMAIL PROTECTED]> writes: >> You understand wrong ... a tuple sitting on disk is normally read >> directly from the shared buffer, and I don't think we want to pay for >> copying it. > "xlog records" Oh, sorry, had the wrong context in mind

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes: > Gregory Stark <[EMAIL PROTECTED]> writes: >> "Tom Lane" <[EMAIL PROTECTED]> writes: >>> Pretty much not happening; or are you volunteering to fix every part of >>> the system to tolerate injections of inserted data anywhere in a stored >>> datum? > >> I was

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes: > "Tom Lane" <[EMAIL PROTECTED]> writes: >> Pretty much not happening; or are you volunteering to fix every part of >> the system to tolerate injections of inserted data anywhere in a stored >> datum? > I was thinking to do it at a low level as the xlog re

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes: > Gregory Stark <[EMAIL PROTECTED]> writes: >> What did you think about protecting against torn writes using id numbers >> every >> 512 bytes. > > Pretty much not happening; or are you volunteering to fix every part of > the system to tolerate injections of

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes: > What did you think about protecting against torn writes using id numbers every > 512 bytes. Pretty much not happening; or are you volunteering to fix every part of the system to tolerate injections of inserted data anywhere in a stored datum?
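For readers unfamiliar with the idea being rejected here, the following is a minimal sketch of torn-write detection by stamping an id into every 512-byte sector. It is an illustration only, not code from the patch or from PostgreSQL: the names `stamp` and `is_torn`, the one-byte id, and its placement at the end of each sector are all assumptions made for the example.

```python
SECTOR = 512   # assumed sector size, as in the message above
ID_BYTES = 1   # hypothetical: one id byte at the end of every sector


def stamp(payload: bytes, write_id: int) -> bytes:
    """Lay payload out in sectors, ending each sector with the write id."""
    chunk = SECTOR - ID_BYTES
    out = bytearray()
    for i in range(0, len(payload), chunk):
        # Pad the last chunk so every sector is exactly SECTOR bytes long.
        part = payload[i:i + chunk].ljust(chunk, b"\0")
        out += part + bytes([write_id & 0xFF])
    return bytes(out)


def is_torn(block: bytes) -> bool:
    """A block is torn if its sectors carry ids from different writes."""
    ids = {block[i + SECTOR - 1] for i in range(0, len(block), SECTOR)}
    return len(ids) > 1


old = stamp(b"y" * 2000, 6)
new = stamp(b"x" * 2000, 7)
torn = new[:2 * SECTOR] + old[2 * SECTOR:]  # only the first two sectors landed
print(is_torn(new), is_torn(torn))
```

Note that the id bytes end up interleaved with the payload, so any reader of the stored bytes has to skip them. That is the cost Tom Lane's reply points at.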

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes: > "Simon Riggs" <[EMAIL PROTECTED]> writes: >> COPY XLogInsert() #1 on oprofile results at 17% CPU >> (full_page_writes = on) > > But what portion of that is actually CRC-related? XLogInsert does quite > a lot. > > Anyway, I can't see de

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Simon Riggs
On Thu, 2007-01-11 at 09:01 -0500, Tom Lane wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > COPY XLogInsert() #1 on oprofile results at 17% CPU > > (full_page_writes = on) > > But what portion of that is actually CRC-related? XLogInsert does quite > a lot. > > A

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes: > COPY XLogInsert() #1 on oprofile results at 17% CPU > (full_page_writes = on) But what portion of that is actually CRC-related? XLogInsert does quite a lot. Anyway, I can't see degrading the reliability of the system for a gain i

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Simon Riggs
On Wed, 2007-01-10 at 23:32 -0500, Bruce Momjian wrote: > Simon Riggs wrote: > > On Fri, 2007-01-05 at 22:57 -0500, Tom Lane wrote: > > > Jim Nasby <[EMAIL PROTECTED]> writes: > > > > On Jan 5, 2007, at 6:30 AM, Zeugswetter Andreas ADI SD wrote: > > > >> Ok, so when you need CRC's on a replicate (b

Re: [pgsql-patches] [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-10 Thread Bruce Momjian
Simon Riggs wrote: > On Fri, 2007-01-05 at 22:57 -0500, Tom Lane wrote: > > Jim Nasby <[EMAIL PROTECTED]> writes: > > > On Jan 5, 2007, at 6:30 AM, Zeugswetter Andreas ADI SD wrote: > > >> Ok, so when you need CRC's on a replicate (but not on the master) you > > > > > Which sounds to me like a goo

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-06 Thread Bruce Momjian
Simon Riggs wrote: > > Somehow, neither of these statements seem likely to be uttered by > > a sane DBA ... > > If I take a backup of a server and bring it up on a new system, the > blocks in the backup will not have been CRC checked before they go to > disk. > > If I take the same server and now

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-06 Thread Simon Riggs
On Fri, 2007-01-05 at 22:57 -0500, Tom Lane wrote: > Jim Nasby <[EMAIL PROTECTED]> writes: > > On Jan 5, 2007, at 6:30 AM, Zeugswetter Andreas ADI SD wrote: > >> Ok, so when you need CRC's on a replicate (but not on the master) you > > > Which sounds to me like a good reason to allow the option in

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Joshua D. Drake
> Actually, I'm not seeing the use-case for a slave having a different > setting from the master at all? > > "My backup server is less reliable than the primary." > > "My backup server is more reliable than the primary." > > Somehow, neither of these statements seem likely to be utt

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Tom Lane
Jim Nasby <[EMAIL PROTECTED]> writes: > On Jan 5, 2007, at 6:30 AM, Zeugswetter Andreas ADI SD wrote: >> Ok, so when you need CRC's on a replicate (but not on the master) you > Which sounds to me like a good reason to allow the option in > recovery.conf as well... Actually, I'm not seeing the u

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Jim Nasby
On Jan 5, 2007, at 6:30 AM, Zeugswetter Andreas ADI SD wrote: Ok, so when you need CRC's on a replicate (but not on the master) you turn it off during standby replay, but turn it on when you start the replicate for normal operation. Which sounds to me like a good reason to allow the option in

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Zeugswetter Andreas ADI SD
> > Ok, so when you need CRC's on a replicate (but not on the master) you > > turn it > > off during standby replay, but turn it on when you start the replicate > > for normal operation. > > Thought: even when it's off, the CRC had better be computed for > shutdown-checkpoint records. Else there

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Tom Lane
"Zeugswetter Andreas ADI SD" <[EMAIL PROTECTED]> writes: > Ok, so when you need CRC's on a replicate (but not on the master) you > turn it > off during standby replay, but turn it on when you start the replicate > for normal operation. Thought: even when it's off, the CRC had better be computed fo

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Zeugswetter Andreas ADI SD
> > > > What's the use-case for changing the variable on the fly anyway? Seems a > > better > > > > solution is just to lock down the setting at postmaster start. > > > > I guess that the use case is more for a WAL based replicate, that > > has/wants a different setting. Maybe we want a WAL entr

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Simon Riggs
On Fri, 2007-01-05 at 11:01 +0100, Zeugswetter Andreas ADI SD wrote: > > > What's the use-case for changing the variable on the fly anyway? Seems a > better > > > solution is just to lock down the setting at postmaster start. > > I guess that the use case is more for a WAL based replicate, that

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-05 Thread Zeugswetter Andreas ADI SD
> > >>> Recovery can occur with/without same setting of wal_checksum, to avoid > > >>> complications from crashes immediately after turning GUC on. > > >> > > >> Surely not. Otherwise even the "on" setting is not really a defense. > > > > > Only when the CRC is exactly zero, which happens very

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
Florian Weimer <[EMAIL PROTECTED]> writes: > Ah, does this mean that each WAL entry gets its own checksum? Right. > (I had assumed that PostgreSQLs WAL checksumming was justified by the > partial write issue. The wild store could easily occur with a heap > page, too, and AFAIK, tuples, aren't ch

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Florian Weimer
* Tom Lane: > I think short burst errors are fairly likely: the kind of scenario I'm > worried about is a wild store corrupting a word of a WAL entry while > it's waiting around to be written in the WAL buffers. Ah, does this mean that each WAL entry gets its own checksum? In this case, Adler32

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
Florian Weimer <[EMAIL PROTECTED]> writes: > * Tom Lane: >> There's a lot of math behind CRCs but AFAIR Adler's method is pretty >> much ad-hoc. > Correct me if I'm wrong, but the main reason for the WAL CRC is to > detect partial WAL writes (due to improper caching, for instance). Well, that's *

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Florian Weimer
* Tom Lane: > Florian Weimer <[EMAIL PROTECTED]> writes: >> Have you tried switching to Adler32 instead of CRC32? > > Is anything known about the error detection capabilities of Adler32? > There's a lot of math behind CRCs but AFAIR Adler's method is pretty > much ad-hoc. Correct me if I'm wrong,

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Simon Riggs
On Thu, 2007-01-04 at 12:13 -0500, Tom Lane wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > On Thu, 2007-01-04 at 11:09 -0500, Tom Lane wrote: > >> "It works most of the time" doesn't exactly satisfy me. > > > It seemed safer to allow a very rare error through to the next level of > > error

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes: > On Thu, 2007-01-04 at 11:09 -0500, Tom Lane wrote: >> "It works most of the time" doesn't exactly satisfy me. > It seemed safer to allow a very rare error through to the next level of > error checking rather than to close the door so tight that recovery

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
Florian Weimer <[EMAIL PROTECTED]> writes: > Have you tried switching to Adler32 instead of CRC32? Is anything known about the error detection capabilities of Adler32? There's a lot of math behind CRCs but AFAIR Adler's method is pretty much ad-hoc. regards, tom lane
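As background to the question above, here is a small sketch using Python's standard `zlib` module, which provides both `adler32` and `crc32`. It is not taken from the thread and makes no claim about what PostgreSQL does. It shows one well-known property of Adler-32 on short inputs: the low 16-bit word is just one plus the byte sum modulo 65521, so for a 64-byte record it can never exceed 16321 and much of the 32-bit space goes unused. The 64-byte record size is an arbitrary assumption for the example.

```python
import zlib

# Hypothetical 64-byte record with every byte at its maximum value.
record = bytes([0xFF] * 64)

# Adler-32 low word: A = (1 + sum of bytes) mod 65521.
low_word = zlib.adler32(record) & 0xFFFF
max_low_word = 1 + 255 * len(record)   # 16321, well below 65521

# Both checksums notice a single flipped bit in this record.
flipped = bytearray(record)
flipped[10] ^= 0x04
adler_detects = zlib.adler32(bytes(flipped)) != zlib.adler32(record)
crc_detects = zlib.crc32(bytes(flipped)) != zlib.crc32(record)

print(low_word, max_low_word, adler_detects, crc_detects)
```

Single-bit flips are the easy case for either checksum; the difference in the thread is about how much is known, mathematically, about the harder cases.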

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Simon Riggs
On Thu, 2007-01-04 at 17:58 +0100, Florian Weimer wrote: > * Simon Riggs: > > >> Surely not. Otherwise even the "on" setting is not really a defense. > > > > Only when the CRC is exactly zero, which happens very very rarely. > > Have you tried switching to Adler32 instead of CRC32? No. Please e

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Florian Weimer
* Simon Riggs: >> Surely not. Otherwise even the "on" setting is not really a defense. > > Only when the CRC is exactly zero, which happens very very rarely. Have you tried switching to Adler32 instead of CRC32? -- Florian Weimer <[EMAIL PROTECTED]> BFK edv-consulting GmbH
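The "CRC is exactly zero" remark can be illustrated with a rough sketch. It assumes, as the exchange suggests but the truncated snippets do not confirm, that a stored CRC of zero is read as "no checksum was written" so that recovery works with either setting. The record layout and the names `write_record` and `record_ok` are hypothetical and are not the patch's code.

```python
import zlib


def write_record(payload: bytes, checksum_on: bool) -> bytes:
    """Prefix payload with a 4-byte CRC, or with zero when checksums are off."""
    crc = zlib.crc32(payload) if checksum_on else 0
    return crc.to_bytes(4, "big") + payload


def record_ok(record: bytes) -> bool:
    """Accept a record if its CRC matches, or if the stored CRC is zero."""
    stored = int.from_bytes(record[:4], "big")
    if stored == 0:
        return True  # assumed convention: zero means "not checksummed"
    return stored == zlib.crc32(record[4:])


good = write_record(b"insert tuple", checksum_on=True)
corrupted = good[:-1] + bytes([good[-1] ^ 0x01])   # flip one payload bit
zeroed = bytes(4) + corrupted[4:]                  # CRC field also wiped

print(record_ok(good), record_ok(corrupted), record_ok(zeroed))
```

Under that assumption, damage that also zeroes the CRC field passes the check even with the setting on, which appears to be the case both sides are arguing about.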

Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Simon Riggs
On Thu, 2007-01-04 at 11:09 -0500, Tom Lane wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > On Thu, 2007-01-04 at 10:00 -0500, Tom Lane wrote: > >> "Simon Riggs" <[EMAIL PROTECTED]> writes: > >>> Recovery can occur with/without same setting of wal_checksum, to avoid > >>> complications from

Re: [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes: > On Thu, 2007-01-04 at 10:00 -0500, Tom Lane wrote: >> "Simon Riggs" <[EMAIL PROTECTED]> writes: >>> Recovery can occur with/without same setting of wal_checksum, to avoid >>> complications from crashes immediately after turning GUC on. >> >> Surely not.

Re: [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Simon Riggs
On Thu, 2007-01-04 at 10:00 -0500, Tom Lane wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > In this thread, I outlined an idea for reducing cost of WAL CRC checking > > http://archives.postgresql.org/pgsql-hackers/2006-10/msg01299.php > > wal_checksum = on (default) | off > > This still see

Re: [PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes: > In this thread, I outlined an idea for reducing cost of WAL CRC checking > http://archives.postgresql.org/pgsql-hackers/2006-10/msg01299.php > wal_checksum = on (default) | off This still seems awfully dangerous to me. > Recovery can occur with/without

[PATCHES] wal_checksum = on (default) | off

2007-01-04 Thread Simon Riggs
In this thread, I outlined an idea for reducing cost of WAL CRC checking http://archives.postgresql.org/pgsql-hackers/2006-10/msg01299.php wal_checksum = on (default) | off Recovery can occur with/without same setting of wal_checksum, to avoid complications from crashes immediately after turning