Re: [HACKERS] Detecting corrupted pages earlier

2003-04-04 Thread Kevin Brown
Andrew Sullivan wrote:
 On Thu, Apr 03, 2003 at 02:39:17PM -0500, Tom Lane wrote:
  just not listing zero_damaged_pages in postgresql.conf.sample?  We
  already have several variables deliberately not listed there ...
 
 Hey, that might be a good solution.  Of course, it doesn't solve the
 doomsday device problem, but nobody who uses it can complain that
 they didn't know what the thing would do.

Shouldn't each variable listed in postgresql.conf.sample have comments
right above it explaining what it does anyway?  A self-documenting
configuration file is a really handy thing to have.

If it's documented that way in postgresql.conf.sample (and adjacent to
the variable itself, even) then nobody who changed it would have
grounds to complain about not knowing what the variable did.

I'm much more in favor of being lucid and upfront about everything
than hiding things just because they might be dangerous.


That said, all sorts of warnings and such should be in that bit of
documentation in postgresql.conf.sample, so that it's made abundantly
clear that this particular option is not one to be messing with except
when you know exactly what you're doing...


-- 
Kevin Brown   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Detecting corrupted pages earlier

2003-04-04 Thread Tom Lane
Kevin Brown [EMAIL PROTECTED] writes:
 Shouldn't each variable listed in postgresql.conf.sample have comments
 right above it explaining what it does anyway?

Not really --- if you can't be bothered to consult the Admin Guide when
in doubt, you have no business editing the config file.  A word or two
of hints is one thing, but circles and arrows and a paragraph on the
back of each one just ain't gonna fly.

We went down that path previously with pg_hba.conf, and finally realized
that it was a waste of time to maintain what amounted to two separate
sets of documentation.

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-03 Thread Peter Eisentraut
Tom Lane writes:

 Andrew Sullivan expressed concern about this, too.  The thing could
 be made a little more failsafe if we made it impossible to set
 ZERO_DAMAGED_PAGES to true in postgresql.conf, or by any means other
 than an actual SET command --- whose impact would then be limited to
 the current session.  This is kind of an ugly wart on the GUC mechanism,
 but I think not difficult to do with an assign_hook (it just has to
 refuse non-interactive settings).

Fighting against people who randomly change settings without being
informed about what they do is pointless.  It's like trying to prevent
'rm -rf /*'.  And it's not like you can easily set anything in
postgresql.conf by accident.

-- 
Peter Eisentraut   [EMAIL PROTECTED]




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-03 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Tom Lane writes:
 Andrew Sullivan expressed concern about this, too.  The thing could
 be made a little more failsafe if we made it impossible to set
 ZERO_DAMAGED_PAGES to true in postgresql.conf, or by any means other
 than an actual SET command --- whose impact would then be limited to
 the current session.  This is kind of an ugly wart on the GUC mechanism,
 but I think not difficult to do with an assign_hook (it just has to
 refuse non-interactive settings).

 Fighting against people who randomly change settings without being
 informed about what they do is pointless.

If you don't want an active defense, how about a passive one --- like
just not listing zero_damaged_pages in postgresql.conf.sample?  We
already have several variables deliberately not listed there ...

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-03 Thread Andrew Sullivan
On Thu, Apr 03, 2003 at 02:39:17PM -0500, Tom Lane wrote:
 just not listing zero_damaged_pages in postgresql.conf.sample?  We
 already have several variables deliberately not listed there ...

Hey, that might be a good solution.  Of course, it doesn't solve the
doomsday device problem, but nobody who uses it can complain that
they didn't know what the thing would do.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-03 Thread Vincent van Leeuwen
On 2003-04-02 16:18:33 -0500, Tom Lane wrote:
 Kevin Brown [EMAIL PROTECTED] writes:
  Hmm...I don't know that I'd want to go that far -- setting this
  variable could be regarded as a policy decision.  Some shops may have
  very good reason for running with ZERO_DAMAGED_PAGES enabled all the
  time, but I don't know what those reasons might be.
 
 I would buy this argument if I could imagine even a faintly plausible
 reason for doing that ... but I can't.
 
   regards, tom lane
 

I've been following this discussion with great interest, because I actually
have a situation where running with ZERO_DAMAGED_PAGES on all the time would
be somewhat plausible.

We use a PostgreSQL database purely for caching pages for a very busy website.
A user changes some stuff which causes a page on the site to change, the HTML
for the new page gets generated using the data from another database
(containing all the actual data), and the generated HTML is inserted into this
PG database. When a page is requested that isn't cached yet, it'll be
generated and inserted too. This makes it possible to invalidate the cached
version of a large number of pages by simply deleting the relevant rows
and not spending the time to regenerate all that data immediately (and it
makes crash recovery more robust).

We can afford to lose all the data in the cache DB, because it's all generated
from other data anyway. But losing all the data would be bad from a
performance/uptime perspective, as all the cached data would need to be
regenerated (which takes a few days). Making backups once a day and
restoring such a backup when something goes wrong is also impractical, because
in our situation old data is much worse than no data at all. I'm working on a
script to detect old data and delete it so a new page will be generated, but
that isn't finished yet.

Two weeks ago the server running this database screwed up (it crashed pretty
badly) and made some data unreadable. Although I was running with fsync on, on
an ext3 partition (with data=writeback, linux 2.4.20, PG 7.2), some of the PG
datafiles got damaged anyway (I blame IDE disks). The damage seemed light
enough to keep running with this dataset (it occasionally borked with
'heap_delete: (am)invalid tid', but since our application attempts a delete
followed by an insert of newly generated data in case of a db error it would
repair itself most of the time). Two crashes later (weirdly patched kernels
hooray) the errors got progressively worse ('missing chunk number 0 for toast
value 79960605' and stuff like that) so we were forced to shut the website
down, dump all the data we could dump (not everything), initdb and restore
that dump. This cost us about 10 hours downtime. If I'd had the option I just
would've set ZERO_DAMAGED_PAGES to true and let it run for a few days to sort
itself out. Alternatively an ALTER TABLE foo ZERO DAMAGED PAGES; would've
worked as well, although that would create a small downtime too.

I know I'm doing a lot of weird things, and that I could avoid a lot of the
problems listed here were I to do things differently, but the fact remains
that I actually have a real-life situation where running (for a while at
least) with ZERO_DAMAGED_PAGES on makes some kind of sense.

Vincent van Leeuwen
Media Design - http://www.mediadesign.nl/




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-03 Thread Tom Lane
Vincent van Leeuwen [EMAIL PROTECTED] writes:
 ... This cost us about 10 hours downtime. If I'd had the option I just
 would've set ZERO_DAMAGED_PAGES to true and let it run for a few days to sort
 itself out.

Yikes.  If I understand this correctly, you had both critical data and
cache data in the same database.  As for the cache stuff, a few quick
TRUNCATE TABLE commands would have gotten you out of the woods.  As for
the critical data (the stuff you actually needed to dump and restore),
do you really want ZERO_DAMAGED_PAGES on for that?  It's a heck of a
blunt tool.

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-03 Thread Vincent van Leeuwen
On 2003-04-03 18:40:54 -0500, Tom Lane wrote:
 Vincent van Leeuwen [EMAIL PROTECTED] writes:
  ... This cost us about 10 hours downtime. If I'd had the option I just
  would've set ZERO_DAMAGED_PAGES to true and let it run for a few days to sort
  itself out.
 
 Yikes.  If I understand this correctly, you had both critical data and
 cache data in the same database.  As for the cache stuff, a few quick
 TRUNCATE TABLE commands would have gotten you out of the woods.  As for
 the critical data (the stuff you actually needed to dump and restore),
 do you really want ZERO_DAMAGED_PAGES on for that?  It's a heck of a
 blunt tool.
 
   regards, tom lane

No, it wasn't that bad :) The REAL data is on a different server which hasn't
let us down so far (and has reliable hardware and software, and backups :)).
Only the cache database was hurt. The problem with truncating everything was
that rebuilding the cache would cost about 48 hours downtime, as there is A
LOT of data to rebuild. This really is an interim solution; things will be
constructed much better and more reliably in the future, but for now it's
there.

Another reason we went for the dump/restore is that we upgraded to 7.3.2 at
the same time, which we were postponing because we weren't looking forward to
that downtime :)

Vincent van Leeuwen
Media Design - http://www.mediadesign.nl/




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Tom Lane
Kevin Brown [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Basically, one should only turn this variable on after giving up on the
 possibility of getting any data out of the broken page itself.  It would
 be folly to run with it turned on as a normal setting.

 This statement should *definitely* go into the documentation for the
 option, then...

Andrew Sullivan expressed concern about this, too.  The thing could
be made a little more failsafe if we made it impossible to set
ZERO_DAMAGED_PAGES to true in postgresql.conf, or by any means other
than an actual SET command --- whose impact would then be limited to
the current session.  This is kind of an ugly wart on the GUC mechanism,
but I think not difficult to do with an assign_hook (it just has to
refuse non-interactive settings).
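
Such a hook might look roughly like the following standalone sketch.  The
type names, enum values, and hook signature here are invented stand-ins for
illustration only; the real GUC API is different:

```c
/* Sketch of an assign_hook that refuses non-interactive settings.
 * GucSourceSketch and the hook signature are simplified stand-ins,
 * not the actual PostgreSQL GUC definitions. */
#include <stdbool.h>

typedef enum
{
    SRC_DEFAULT,        /* built-in default */
    SRC_CONFIG_FILE,    /* postgresql.conf */
    SRC_SESSION         /* interactive SET command */
} GucSourceSketch;

static bool zero_damaged_pages = false;

/* Accept "on" only when it comes from an interactive SET; enabling it
 * from the config file (or any other non-interactive source) is refused
 * and the current value is kept.  Returns true if the assignment was
 * accepted. */
bool
assign_zero_damaged_pages(bool newval, GucSourceSketch source)
{
    if (newval && source != SRC_SESSION)
        return false;           /* refuse non-interactive enabling */
    zero_damaged_pages = newval;
    return true;
}
```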

Comments?

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Andrew Sullivan
On Wed, Apr 02, 2003 at 03:25:58PM -0500, Tom Lane wrote:

 the current session.  This is kind of an ugly wart on the GUC mechanism,
 but I think not difficult to do with an assign_hook (it just has to
 refuse non-interactive settings).

It may be an ugly wart, but I think it's only prudent.  I'd be
willing to bet a fair amount that there is a significant overlap
between the population which uses Bob's Discount Hardware for 10
million row, ultra-critical databases and the population which likes
to twiddle postgresql.conf settings without reading the fine
manual.  Those folks are going to get burned by this setting unless
it is very hard to turn on.  (I think the setting is an excellent
idea, though, for emergency cases.)

I don't usually like making servers totally idiot-proof, but I can
just imagine the Slashdot conversations after 7.4 comes out (and for
at least 60 or 70 years thereafter): "PostgreSQL just randomly zeroes
a page of data!  It sucks!  Use MySQL instead.  It can recover from
bad pages on your disk by randomly making up data for you, instead of
just writing zeros."

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Kevin Brown
Tom Lane wrote:
 Kevin Brown [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  Basically, one should only turn this variable on after giving up on the
  possibility of getting any data out of the broken page itself.  It would
  be folly to run with it turned on as a normal setting.
 
  This statement should *definitely* go into the documentation for the
  option, then...
 
 Andrew Sullivan expressed concern about this, too.  The thing could
 be made a little more failsafe if we made it impossible to set
 ZERO_DAMAGED_PAGES to true in postgresql.conf, or by any means other
 than an actual SET command --- whose impact would then be limited to
 the current session.  This is kind of an ugly wart on the GUC mechanism,
 but I think not difficult to do with an assign_hook (it just has to
 refuse non-interactive settings).

Hmm...I don't know that I'd want to go that far -- setting this
variable could be regarded as a policy decision.  Some shops may have
very good reason for running with ZERO_DAMAGED_PAGES enabled all the
time, but I don't know what those reasons might be.  But the fact that
I can't think of a good reason isn't sufficient cause to remove that
as an option.

I would definitely be in favor of issuing a warning ("Running with
ZERO_DAMAGED_PAGES enabled will cause you to lose possibly recoverable
data whenever a damaged page is encountered.  Be sure you know what
you're doing") whenever the variable is set, whether it be at startup
or during a session.


-- 
Kevin Brown   [EMAIL PROTECTED]




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Kevin Brown
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
 Perhaps better would be to throw a message any time it is turned
 on, reminding them it should not be left on.  Is that cleaner?
 
 Where are you going to throw a message to, if it's in postgresql.conf?
 
 Bleating in the postmaster log probably won't draw the attention of a
 user clueless enough to be trying this ;-)

Perhaps.  I'd bank more on the user checking the postmaster log than
being clueful enough not to use this, though, since the postmaster log
is pretty much a universal thing that has been around a long time
while this option hasn't.  Cluelessness about the use of a rather
esoteric option doesn't necessarily imply total cluelessness.  :-)



-- 
Kevin Brown   [EMAIL PROTECTED]




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Jason Earl
Andrew Sullivan [EMAIL PROTECTED] writes:

 On Wed, Apr 02, 2003 at 03:25:58PM -0500, Tom Lane wrote:
 
  the current session.  This is kind of an ugly wart on the GUC mechanism,
  but I think not difficult to do with an assign_hook (it just has to
  refuse non-interactive settings).
 
 It may be an ugly wart, but I think it's only prudent.  I'd be
 willing to bet a fair amount that there is a significant overlap
 between the population which uses Bob's Discount Hardware for 10
 million row, ultra-critical databases and the population which likes
 to twiddle postgresql.conf settings without reading the fine
 manual.  Those folks are going to get burned by this setting unless
 it is very hard to turn on.  (I think the setting is an excellent
 idea, though, for emergency cases.)
 
 I don't usually like making servers totally idiot-proof, but I can
 just imagine the Slashdot conversations after 7.4 comes out (and for
 at least 60 or 70 years thereafter): PostgreSQL just randomly zeroes
 a page of data!  It sucks!  Use MySQL instead.  It can recover from
 bad pages on your disk by randomly making up data for you, instead of
 just writing zeros.

Perhaps the real answer is to use a more *descriptive* name for the
variable than ZERO_DAMAGED_PAGES.  Something along the lines of
CLUELESS_DBA_ESCHEWS_BACKUPS, or ONLY_AN_IDIOT_SETS_THIS_VARIABLE.
You don't have to read the manual to see that these variables are not
to be trifled with.

There are some cases where this particular feature would be useful.
What needs to be done is to make the feature less dangerous to the
newbie without making it less useful to the person who actually needs
the functionality.

Jason




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Tom Lane
Kevin Brown [EMAIL PROTECTED] writes:
 Hmm...I don't know that I'd want to go that far -- setting this
 variable could be regarded as a policy decision.  Some shops may have
 very good reason for running with ZERO_DAMAGED_PAGES enabled all the
 time, but I don't know what those reasons might be.

I would buy this argument if I could imagine even a faintly plausible
reason for doing that ... but I can't.

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Andrew Sullivan
On Wed, Apr 02, 2003 at 02:07:02PM -0700, Jason Earl wrote:

 There are some cases where this particular feature would be useful.
 What needs to be done is to make the feature less dangerous to the
 newbie without making it less useful to the person who actually needs
 the functionality.

I'll repeat what I said before (I think to Tom): That's a pretty big
foot-gun you've got there.  I cannot possibly imagine what sort of
data-recovery situation would warrant running with the option turned
on all the time.  You know you have big-trouble, oh-no, "ISP ran over
the tapes while they were busy pitching magnets through your cage"
data corruption problems, and this is your best hope for recovery? 
Great.  Log in, turn on this option, and start working.  But across
every back end?  It's the doomsday device for databases.  Emergency
recovery options are _good_: everyone has his favourite UNIX file
undeletion story.  But it's sure not a good idea to give everyone
root just in case she or he deletes some file.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Bruce Momjian
Tom Lane wrote:
 Andrew Sullivan [EMAIL PROTECTED] writes:
  You know you have big-trouble, oh-no, ISP ran over
  the tapes while they were busy pitching magnets through your cage,
  data corruption problems, and this is your best hope for recovery? 
  Great.  Log in, turn on this option, and start working.  But across
  every back end?  It's the doomsday device for databases.
 
 Yeah, it is.  Actually, the big problem with it in my mind is this
 scenario: you get a page-header-corrupted error on page X, you
 investigate and decide there's no hope for page X, so you turn on
 zero_damaged_pages and try to dump the table.  It comes to page X,
 complains, zeroes it, proceeds, ... and then comes to damaged page Y,
 complains, zeroes it, proceeds.  Maybe you didn't know page Y had
 problems.  Maybe you could have gotten something useful out of page Y
 if you'd looked first.  Too late now.
 
 What I'd really prefer to see is not a ZERO_DAMAGED_PAGES setting,
 but an explicit command to DESTROY PAGE n OF TABLE foo.  That would
 make you manually admit defeat for each individual page before it'd
 drop data.  But I don't presently have time to implement such a command
 (any volunteers out there?).  Also, I could see where try-to-dump, fail,
 DESTROY, try again, lather, rinse, repeat, could get pretty tedious on a
 badly damaged table.

Can we make the GUC setting a numeric, so you can set it to 1 and it
clears just one page, or 0 and it clears all pages?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Christopher Kings-Lynne
 What I'd really prefer to see is not a ZERO_DAMAGED_PAGES setting,
 but an explicit command to DESTROY PAGE n OF TABLE foo.  That would
 make you manually admit defeat for each individual page before it'd
 drop data.  But I don't presently have time to implement such a command
 (any volunteers out there?).  Also, I could see where try-to-dump, fail,
 DESTROY, try again, lather, rinse, repeat, could get pretty tedious on a
 badly damaged table.

I'm not volunteering, but this would be better:

ALTER TABLE foo ZERO [ PAGE n | BAD PAGES ];

Chris



Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Alvaro Herrera
On Wed, Apr 02, 2003 at 11:10:13PM -0500, Tom Lane wrote:

 What I'd really prefer to see is not a ZERO_DAMAGED_PAGES setting,
 but an explicit command to DESTROY PAGE n OF TABLE foo.  That would
 make you manually admit defeat for each individual page before it'd
 drop data.  But I don't presently have time to implement such a command
 (any volunteers out there?).  Also, I could see where try-to-dump, fail,
 DESTROY, try again, lather, rinse, repeat, could get pretty tedious on a
 badly damaged table.

Huh, and what if I accidentally mistype the number and destroy a valid
page?  Maybe the command should only succeed if it confirms that the
page is corrupted.

There could also be DESTROY ALL PAGES and/or DESTROY n PAGES
commands, where the latter zeroes the first n corrupted pages in the
table, for the case with lots of corrupted pages.  Maybe a command for
getting a list of the corrupted pages is useful?

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
"Truth is not always pretty, but the hunger for it is"




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Tom Lane
Andrew Sullivan [EMAIL PROTECTED] writes:
 You know you have big-trouble, oh-no, ISP ran over
 the tapes while they were busy pitching magnets through your cage,
 data corruption problems, and this is your best hope for recovery? 
 Great.  Log in, turn on this option, and start working.  But across
 every back end?  It's the doomsday device for databases.

Yeah, it is.  Actually, the big problem with it in my mind is this
scenario: you get a page-header-corrupted error on page X, you
investigate and decide there's no hope for page X, so you turn on
zero_damaged_pages and try to dump the table.  It comes to page X,
complains, zeroes it, proceeds, ... and then comes to damaged page Y,
complains, zeroes it, proceeds.  Maybe you didn't know page Y had
problems.  Maybe you could have gotten something useful out of page Y
if you'd looked first.  Too late now.

What I'd really prefer to see is not a ZERO_DAMAGED_PAGES setting,
but an explicit command to DESTROY PAGE n OF TABLE foo.  That would
make you manually admit defeat for each individual page before it'd
drop data.  But I don't presently have time to implement such a command
(any volunteers out there?).  Also, I could see where try-to-dump, fail,
DESTROY, try again, lather, rinse, repeat, could get pretty tedious on a
badly damaged table.

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-04-02 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Huh, and what if I accidentaly mistype the number and destroy a valid
 page?  Maybe the command should only succeed if it confirms that the
 page is corrupted.

Good point ... but what if the corruption is subtle enough that the
automatic tests don't notice it?  It could be that only SELECT * FROM ...
actually dumps core.  This is an acute problem for tests that are
generic enough to be implemented in just one or a few places --- for
example, the tests I just added have no idea whether they are looking
at a heap or index page, much less what kind of index page.

 There could also be DESTROY ALL PAGES and/or DESTROY n PAGES
 commands, where the latter zeroes the first n corrupted pages in the
 table, for the case with lots of corrupted pages.  Maybe a command for
 getting a list of the corrupted pages is useful?

We have DESTROY ALL PAGES already; it's called DROP TABLE.  The other doesn't
appeal to me either, because it doesn't say exactly which N pages you're
willing to lose.

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-03-31 Thread Kevin Brown
Tom Lane wrote:
 Here's what I put in --- feel free to suggest better wording.
 
 ZERO_DAMAGED_PAGES (boolean)
 
  Detection of a damaged page header normally causes PostgreSQL to
  report an error, aborting the current transaction. Setting
  zero_damaged_pages to true causes the system to instead report a
  warning, zero out the damaged page, and continue processing. This
  behavior <emphasis>will lose data</>, namely all the rows on the
  damaged page. But it allows you to get past the error and retrieve
  rows from any undamaged pages that may be present in the table. So
  it is useful for recovering data if corruption has occurred due to
  hardware or software error. The default setting is off, and it can
  only be changed by a superuser.

I would change "will lose data" to "will destroy data" -- if only to
emphasize the fact that with the option enabled you have no chance of
recovering the data.  Other than that, it looks good to me.

I was going to suggest that perhaps a statement like running with
this option enabled for normal operation is probably not a good idea
should be put in there, but whether such a thing belongs in the
documentation or not depends on whether or not you believe the
documentation should be neutral regarding administrative decisions.
The part of the documentation your modification applies to should
probably remain neutral, IMHO, but that implies that perhaps a
paragraph that talks about dealing with damaged pages (and which
references this option) should be placed into the administrator's
guide.



-- 
Kevin Brown   [EMAIL PROTECTED]




Re: [HACKERS] Detecting corrupted pages earlier

2003-03-30 Thread Tom Lane
Kevin Brown [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Basically, one should only turn this variable on after giving up on the
 possibility of getting any data out of the broken page itself.  It would
 be folly to run with it turned on as a normal setting.

 This statement should *definitely* go into the documentation for the
 option, then...

Here's what I put in --- feel free to suggest better wording.

ZERO_DAMAGED_PAGES (boolean)

 Detection of a damaged page header normally causes PostgreSQL to
 report an error, aborting the current transaction. Setting
 zero_damaged_pages to true causes the system to instead report a
 warning, zero out the damaged page, and continue processing. This
 behavior <emphasis>will lose data</>, namely all the rows on the
 damaged page. But it allows you to get past the error and retrieve
 rows from any undamaged pages that may be present in the table. So
 it is useful for recovering data if corruption has occurred due to
 hardware or software error. The default setting is off, and it can
 only be changed by a superuser.
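
For the recovery scenario this describes, a session might look something
like the sketch below (illustration only; "broken_table" and the output
path are invented names):

```sql
-- Emergency-recovery sketch: enable only in this superuser session,
-- salvage what survives, then turn the setting back off.
SET zero_damaged_pages = on;
-- Reading the table now zeroes any damaged pages it hits (their rows
-- are destroyed) and returns the rows from intact pages.
COPY broken_table TO '/tmp/salvaged.copy';
SET zero_damaged_pages = off;
```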


regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-03-29 Thread Kevin Brown
Tom Lane wrote:
 Kris Jurka [EMAIL PROTECTED] writes:
  Is zeroing the pages the only / best option?
 
 It's the only way to avoid a core dump when the system tries to process
 the page.  And no, I don't want to propagate the notion that this page
 is broken beyond the buffer manager, so testing elsewhere isn't an
 acceptable answer.
 
 Basically, one should only turn this variable on after giving up on the
 possibility of getting any data out of the broken page itself.  It would
 be folly to run with it turned on as a normal setting.

This statement should *definitely* go into the documentation for the
option, then...


-- 
Kevin Brown   [EMAIL PROTECTED]




Re: [HACKERS] Detecting corrupted pages earlier

2003-03-28 Thread Tom Lane
Hiroshi Inoue [EMAIL PROTECTED] writes:
 How about adding a new option to skip corrupted pages?

I have committed changes to implement checking for damaged page headers,
along the lines of last month's discussion.  It includes a GUC variable
to control the response as suggested by Hiroshi.

Given the number of data-corruption reports we've seen recently, I am
more than half tempted to commit the change into the 7.3.* branch too.
However the GUC variable makes it seem a little like a new feature.

I could backpatch the whole change, or backpatch it without the GUC
variable (so the only possible response is elog(ERROR)), or leave
well enough alone.  Any votes?

The patch is pretty small; I include the non-boilerplate parts below.

regards, tom lane


*** /home/tgl/pgsql/src/backend/storage/buffer/bufmgr.c.origTue Mar 25 09:06:11 
2003
--- /home/tgl/pgsql/src/backend/storage/buffer/bufmgr.c Fri Mar 28 11:54:48 2003
***
*** 59,64 
--- 60,69 
(*((XLogRecPtr*) MAKE_PTR((bufHdr)-data)))
  
  
+ /* GUC variable */
+ bool  zero_damaged_pages = false;
+ 
+ 
  static void WaitIO(BufferDesc *buf);
  static void StartBufferIO(BufferDesc *buf, bool forInput);
  static void TerminateBufferIO(BufferDesc *buf);
***
*** 217,222 
--- 222,241 
{
  		status = smgrread(DEFAULT_SMGR, reln, blockNum,
  						  (char *) MAKE_PTR(bufHdr->data));
+ 		/* check for garbage data */
+ 		if (status == SM_SUCCESS &&
+ 			!PageHeaderIsValid((PageHeader) MAKE_PTR(bufHdr->data)))
+ 		{
+ 			if (zero_damaged_pages)
+ 			{
+ 				elog(WARNING, "Invalid page header in block %u of %s; zeroing out page",
+ 					 blockNum, RelationGetRelationName(reln));
+ 				MemSet((char *) MAKE_PTR(bufHdr->data), 0, BLCKSZ);
+ 			}
+ 			else
+ 				elog(ERROR, "Invalid page header in block %u of %s",
+ 					 blockNum, RelationGetRelationName(reln));
+ 		}
}
  
if (isLocalBuf)
*** /home/tgl/pgsql/src/backend/storage/page/bufpage.c.orig	Tue Mar 25 09:06:12 2003
--- /home/tgl/pgsql/src/backend/storage/page/bufpage.c	Fri Mar 28 11:38:43 2003
***************
*** 48,53 ****
--- 46,96 ----
  }
  
  
+ /*
+  * PageHeaderIsValid
+  *Check that the header fields of a page appear valid.
+  *
+  * This is called when a page has just been read in from disk.  The idea is
+  * to cheaply detect trashed pages before we go nuts following bogus item
+  * pointers, testing invalid transaction identifiers, etc.
+  *
+  * It turns out to be necessary to allow zeroed pages here too.  Even though
+  * this routine is *not* called when deliberately adding a page to a relation,
+  * there are scenarios in which a zeroed page might be found in a table.
+  * (Example: a backend extends a relation, then crashes before it can write
+  * any WAL entry about the new page.  The kernel will already have the
+  * zeroed page in the file, and it will stay that way after restart.)  So we
+  * allow zeroed pages here, and are careful that the page access macros
+  * treat such a page as empty and without free space.  Eventually, VACUUM
+  * will clean up such a page and make it usable.
+  */
+ bool
+ PageHeaderIsValid(PageHeader page)
+ {
+   char   *pagebytes;
+   int i;
+ 
+   /* Check normal case */
+ 	if (PageGetPageSize(page) == BLCKSZ &&
+ 		PageGetPageLayoutVersion(page) == PG_PAGE_LAYOUT_VERSION &&
+ 		page->pd_lower >= SizeOfPageHeaderData &&
+ 		page->pd_lower <= page->pd_upper &&
+ 		page->pd_upper <= page->pd_special &&
+ 		page->pd_special <= BLCKSZ &&
+ 		page->pd_special == MAXALIGN(page->pd_special))
+   return true;
+ 
+   /* Check all-zeroes case */
+   pagebytes = (char *) page;
+ 	for (i = 0; i < BLCKSZ; i++)
+   {
+   if (pagebytes[i] != 0)
+   return false;
+   }
+   return true;
+ }
+ 
+ 
  /* ----------------
   *	PageAddItem
   *




Re: [HACKERS] Detecting corrupted pages earlier

2003-03-28 Thread Kris Jurka


On Fri, 28 Mar 2003, Tom Lane wrote:

 Hiroshi Inoue [EMAIL PROTECTED] writes:
  How about adding a new option to skip corrupted pages ?

 I have committed changes to implement checking for damaged page headers,
 along the lines of last month's discussion.  It includes a GUC variable
 to control the response as suggested by Hiroshi.

Is zeroing the pages the only / best option?  Hiroshi suggested skipping
the pages as I recall.  Is there any chance of recovering data from a
trashed page manually?  If so perhaps the GUC variable should allow three
options: error, zero, and skip.
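
The three-valued setting proposed above could be modeled roughly as follows. This is only an illustrative sketch, not the actual GUC implementation; the enum and function names are made up:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical three-way policy for a page with a damaged header,
 * mirroring the suggestion: error out, zero the page, or skip it. */
typedef enum
{
	DAMAGED_PAGES_ERROR,		/* elog(ERROR): default, safest */
	DAMAGED_PAGES_ZERO,			/* zero the page and keep going */
	DAMAGED_PAGES_SKIP			/* leave it on disk, skip it in scans */
} DamagedPageAction;

/* Map the policy to the action the buffer manager would report. */
static const char *
damaged_page_response(DamagedPageAction action)
{
	switch (action)
	{
		case DAMAGED_PAGES_ZERO:
			return "zeroing out page";
		case DAMAGED_PAGES_SKIP:
			return "skipping page";
		default:
			return "raising error";
	}
}
```

The "skip" case is the interesting one: unlike zeroing, it would preserve the corrupted bytes on disk for manual recovery, at the cost of the buffer manager having to remember which pages to ignore.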

Kris Jurka




Re: [HACKERS] Detecting corrupted pages earlier

2003-03-28 Thread Tom Lane
Kris Jurka [EMAIL PROTECTED] writes:
 Is zeroing the pages the only / best option?

It's the only way to avoid a core dump when the system tries to process
the page.  And no, I don't want to propagate the notion that this page
is broken beyond the buffer manager, so testing elsewhere isn't an
acceptable answer.

Basically, one should only turn this variable on after giving up on the
possibility of getting any data out of the broken page itself.  It would
be folly to run with it turned on as a normal setting.

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-02-19 Thread Hiroshi Inoue
Tom Lane wrote:
 
 Hiroshi Inoue [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  Hiroshi Inoue [EMAIL PROTECTED] writes:
  Is there a way to make our way around the pages ?
 
  If the header is corrupt, I don't think so.
 
  What I asked is how to read all other sane pages.
 
 Oh, I see.  You can do "SELECT ... LIMIT n" to get the rows before the
 broken page, but there's no way to get the ones after it.  My proposal
 won't make this worse, but it won't make it any better either.  Do you
 have an idea how to get the rows after the broken page?

How about adding a new option to skip corrupted pages ?

regards,
Hiroshi Inoue
http://www.geocities.jp/inocchichichi/psqlodbc/


Re: [HACKERS] Detecting corrupted pages earlier

2003-02-18 Thread Kevin Brown
Tom Lane wrote:
 The cases I've been able to study look like the header and a lot of the
 following page data have been overwritten with garbage --- when it made
 any sense at all, it looked like the contents of non-Postgres files (eg,
 plain text), which is why I mentioned the possibility of disks writing
 data to the wrong sector.

That also sounds suspiciously like the behavior of certain filesystems
(Reiserfs, for one) after a crash when the filesystem prior to the
crash was highly active with writes.  Had the sites that reported this
experienced OS crashes or power interruptions?


-- 
Kevin Brown   [EMAIL PROTECTED]




Re: [HACKERS] Detecting corrupted pages earlier

2003-02-18 Thread Tom Lane
Kevin Brown [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 The cases I've been able to study look like the header and a lot of the
 following page data have been overwritten with garbage --- when it made
 any sense at all, it looked like the contents of non-Postgres files (eg,
 plain text), which is why I mentioned the possibility of disks writing
 data to the wrong sector.

 That also sounds suspiciously like the behavior of certain filesystems
 (Reiserfs, for one) after a crash when the filesystem prior to the
 crash was highly active with writes.

Isn't reiserfs supposed to be more crash-resistant than ext2, rather
than less so?

 Had the sites that reported this
 experienced OS crashes or power interruptions?

Can't recall whether they admitted to such or not.

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-02-18 Thread Greg Copeland
On Mon, 2003-02-17 at 22:04, Tom Lane wrote:
 Curt Sampson [EMAIL PROTECTED] writes:
  On Mon, 17 Feb 2003, Tom Lane wrote:
  Postgres has a bad habit of becoming very confused if the page header of
  a page on disk has become corrupted.
 
  What typically causes this corruption?
 
 Well, I'd like to know that too.  I have seen some cases that were
 identified as hardware problems (disk wrote data to wrong sector, RAM
 dropped some bits, etc).  I'm not convinced that that's the whole story,
 but I have nothing to chew on that could lead to identifying a software
 bug.
 
  If it's any kind of a serious problem, maybe it would be worth keeping
  a CRC of the header at the end of the page somewhere.
 
 See past discussions about keeping CRCs of page contents.  Ultimately
 I think it's a significant expenditure of CPU for very marginal returns
 --- the layers underneath us are supposed to keep their own CRCs or
 other cross-checks, and a very substantial chunk of the problem seems
 to be bad RAM, against which occasional software CRC checks aren't 
 especially useful.

This is exactly why magic numbers or simple algorithmic bit patterns
are commonly used.  If the magic number or bit pattern doesn't match
its page number accordingly, you know something is wrong.  Storage cost
tends to be slight and CPU overhead low.

I agree with you that a CRC seems overkill for little return.

Regards,

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting





Re: [HACKERS] Detecting corrupted pages earlier

2003-02-18 Thread Tom Lane
Hiroshi Inoue [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'm thinking of modifying ReadBuffer() so that it errors out if the

 What does the *error out* mean ?

Mark the buffer as having an I/O error and then elog(ERROR).

 Is there a way to make our way around the pages ?

If the header is corrupt, I don't think so.  You'd need at the very
least to fix the bad header fields (particularly pd_lower) before you
could safely try to examine tuples.  (In the cases that I've seen,
some or all of the line pointers are clobbered too, making it even less
likely that any useful data can be extracted automatically.)

Basically I'd rather have accesses to the clobbered page fail with
elog(ERROR) than with more drastic errors.  Right now, the least
dangerous result you are likely to get is elog(FATAL) out of the clog
code, and you can easily get a PANIC or backend coredump instead.

 IMHO CRC isn't sufficient because CRC could be calculated
 even for (silently) corrupted pages.

Yeah, it seems a great expense for only marginal additional protection.

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-02-18 Thread Hannu Krosing
Tom Lane kirjutas T, 18.02.2003 kell 17:21:
 Kevin Brown [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  The cases I've been able to study look like the header and a lot of the
  following page data have been overwritten with garbage --- when it made
  any sense at all, it looked like the contents of non-Postgres files (eg,
  plain text), which is why I mentioned the possibility of disks writing
  data to the wrong sector.
 
  That also sounds suspiciously like the behavior of certain filesystems
  (Reiserfs, for one) after a crash when the filesystem prior to the
  crash was highly active with writes.

I was bitten by it about a year ago as well.

 Isn't reiserfs supposed to be more crash-resistant than ext2, rather
 than less so?

It's supposed to be, but when it is run in (default?)
metadata-only-logging mode, you can easily get perfectly good
metadata with unallocated (zero-filled) data pages. There have been some
more severe errors as well.

-
Hannu





Re: [HACKERS] Detecting corrupted pages earlier

2003-02-18 Thread Hiroshi Inoue
Tom Lane wrote:
 
 Hiroshi Inoue [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  I'm thinking of modifying ReadBuffer() so that it errors out if the
 
  What does the *error out* mean ?
 
 Mark the buffer as having an I/O error and then elog(ERROR).
 
  Is there a way to make our way around the pages ?
 
 If the header is corrupt, I don't think so.

What I asked is how to read all other sane pages.
Once pages are corrupted users would copy the sane data ASAP.

regards,
Hiroshi Inoue
http://www.geocities.jp/inocchichichi/psqlodbc/


Re: [HACKERS] Detecting corrupted pages earlier

2003-02-18 Thread Hiroshi Inoue
Tom Lane wrote:
 
 Postgres has a bad habit of becoming very confused if the page header of
 a page on disk has become corrupted.  In particular, bogus values in the
 pd_lower field tend to make it look like there are many more tuples than
 there really are, and of course these "tuples" contain garbage.  That
 leads to core dumps, weird complaints about out-of-range transaction
 numbers (the latter generally in the form of an abort referencing a
 nonexistent pg_clog file), and other un-fun stuff.
 
 I'm thinking of modifying ReadBuffer() so that it errors out if the

What does the *error out* mean ?
Is there a way to make our way around the pages ?
 
 page read in does not contain either zeroes or a valid-looking header.
 (The exception for zeroes seems to be needed for hash indexes, which
 tend to initialize pages out-of-order.)  This would make it much easier
 for people to recognize situations where a page header has become
 corrupted on disk.
 
 Comments?  Can anyone think of a scenario where this would be a bad
 idea?

IIRC there was a similar thread long ago.
IMHO CRC isn't sufficient because CRC could be calculated
even for (silently) corrupted pages.

regards, 
Hiroshi Inoue
http://www.geocities.jp/inocchichichi/psqlodbc/


Re: [HACKERS] Detecting corrupted pages earlier

2003-02-17 Thread Sailesh Krishnamurthy
>>>>> "Tom" == Tom Lane [EMAIL PROTECTED] writes:

    Tom> Postgres has a bad habit of becoming very confused if the
    Tom> page header of a page on disk has become corrupted.  In
    Tom> particular, bogus values in the pd_lower field tend to make

I haven't read this piece of pgsql code very carefully so I apologize
if what I suggest is already present.

One standard solution to handle disk page corruption is the use of
consistency bits.

The idea is that the bit that starts every 256th byte of a page is a
consistency bit. In an 8K page, you'd have 32 consistency bits.  If the
page is in a consistent state, then all 32 bits will be either 0 or
1. When a page is written to disk, the actual bit in each c-bit
position is copied out and placed in the header (end/beginning) of the
page. With an 8K page, there will be one word that contains the
actual bits. Then the c-bits are all either set or reset depending on
the state when the page was last read: if at read time the c-bits were
set, then at write time they are reset. So when you read a page, if
some of the consistency bits are set and some others are reset then
you know that there was a corruption.

This is of course based on the assumption that most disk arms manage
to atomically write 256 bytes at a time. 
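
A minimal sketch of the scheme described above, assuming an 8K page and a 256-byte sector size (function names and the choice of the high bit as the c-bit are illustrative, not from any actual implementation):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE	8192
#define CBIT_STRIDE	256
#define NUM_CBITS	(PAGE_SIZE / CBIT_STRIDE)	/* 32 consistency bits */

/* On write: save the real bit at the start of each 256-byte sector into
 * one word (to be stored in the page header), then force every c-bit to
 * 'state', which flips on each write cycle. */
static uint32_t
cbits_stamp(uint8_t *page, int state)
{
	uint32_t	saved = 0;
	int			i;

	for (i = 0; i < NUM_CBITS; i++)
	{
		uint8_t	   *b = &page[i * CBIT_STRIDE];

		if (*b & 0x80)
			saved |= (1u << i);		/* remember the displaced bit */
		if (state)
			*b |= 0x80;				/* set the c-bit */
		else
			*b &= 0x7f;				/* clear the c-bit */
	}
	return saved;
}

/* On read: the page is consistent only if every c-bit agrees; a mix of
 * set and clear c-bits means some sectors came from an older write. */
static int
cbits_consistent(const uint8_t *page)
{
	int			first = (page[0] & 0x80) != 0;
	int			i;

	for (i = 1; i < NUM_CBITS; i++)
		if (((page[i * CBIT_STRIDE] & 0x80) != 0) != first)
			return 0;				/* torn write detected */
	return 1;
}
```

Note this detects torn writes (some sectors old, some new) but not garbage overwrites that happen to leave all c-bits equal, so it addresses a different failure mode than the header sanity checks in the patch.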

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh




Re: [HACKERS] Detecting corrupted pages earlier

2003-02-17 Thread Curt Sampson
On Mon, 17 Feb 2003, Tom Lane wrote:

 Postgres has a bad habit of becoming very confused if the page header of
 a page on disk has become corrupted.

What typically causes this corruption?

If it's any kind of a serious problem, maybe it would be worth keeping
a CRC of the header at the end of the page somewhere.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC




Re: [HACKERS] Detecting corrupted pages earlier

2003-02-17 Thread Tom Lane
Curt Sampson [EMAIL PROTECTED] writes:
 On Mon, 17 Feb 2003, Tom Lane wrote:
 Postgres has a bad habit of becoming very confused if the page header of
 a page on disk has become corrupted.

 What typically causes this corruption?

Well, I'd like to know that too.  I have seen some cases that were
identified as hardware problems (disk wrote data to wrong sector, RAM
dropped some bits, etc).  I'm not convinced that that's the whole story,
but I have nothing to chew on that could lead to identifying a software
bug.

 If it's any kind of a serious problem, maybe it would be worth keeping
 a CRC of the header at the end of the page somewhere.

See past discussions about keeping CRCs of page contents.  Ultimately
I think it's a significant expenditure of CPU for very marginal returns
--- the layers underneath us are supposed to keep their own CRCs or
other cross-checks, and a very substantial chunk of the problem seems
to be bad RAM, against which occasional software CRC checks aren't 
especially useful.

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-02-17 Thread Curt Sampson
On Mon, 17 Feb 2003, Tom Lane wrote:

 Curt Sampson [EMAIL PROTECTED] writes:

  If it's any kind of a serious problem, maybe it would be worth keeping
  a CRC of the header at the end of the page somewhere.

 See past discussions about keeping CRCs of page contents.  Ultimately
 I think it's a significant expenditure of CPU for very marginal returns
 --- the layers underneath us are supposed to keep their own CRCs or
 other cross-checks, and a very substantial chunk of the problem seems
 to be bad RAM, against which occasional software CRC checks aren't
 especially useful.

Well, I wasn't proposing the whole page, just the header. That would be
significantly cheaper (in fact, there's no real need even for a CRC;
probably just xoring all of the words in the header into one word would
be fine) and would tell you if the page was torn during the write, which
was what I was imagining the problem might be.
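
The XOR idea above is cheap enough to sketch in a few lines; this is only an illustration of the suggestion, not anything in the Postgres tree:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR all 32-bit words of the header into one check word.  A handful
 * of XORs per page read/write, no CRC table needed; any single-bit (or
 * odd-count) change in the header changes the result. */
static uint32_t
header_xor(const uint32_t *hdr, size_t nwords)
{
	uint32_t	x = 0;
	size_t		i;

	for (i = 0; i < nwords; i++)
		x ^= hdr[i];
	return x;
}
```

The weakness, compared with a CRC, is that paired bit flips in the same column cancel out, but for catching a header replaced wholesale by foreign data it would do about as well.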

But bad memory, well, not much you can do about that beyond saying, buy
ECC, dude.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC




Re: [HACKERS] Detecting corrupted pages earlier

2003-02-17 Thread Tom Lane
Curt Sampson [EMAIL PROTECTED] writes:
 Well, I wasn't proposing the whole page, just the header. That would be
 significantly cheaper (in fact, there's no real need even for a CRC;
 probably just xoring all of the words in the header into one word would
 be fine) and would tell you if the page was torn during the write, which
 was what I was imagining the problem might be.

The header is only a dozen or two bytes long, so torn-page syndrome
won't result in header corruption.

The cases I've been able to study look like the header and a lot of the
following page data have been overwritten with garbage --- when it made
any sense at all, it looked like the contents of non-Postgres files (eg,
plain text), which is why I mentioned the possibility of disks writing
data to the wrong sector.  Another recent report suggested that all
bytes of the header had been replaced with 0x55, which sounds more like
RAM or disk-controller malfeasance.

You're right that we don't need a heck of a powerful check to catch
this sort of thing.  I was envisioning checks comparable to what's now
in PageAddItem: valid pagesize, valid version, pd_lower and pd_upper and
pd_special sane relative to each other and to the pagesize.  I think this
would be nearly as effective as an XOR sum --- and it has the major
advantage of being compatible with the existing page layout.  I'd like
to think we're done munging the page layout for awhile.

regards, tom lane




Re: [HACKERS] Detecting corrupted pages earlier

2003-02-17 Thread Curt Sampson
On Tue, 18 Feb 2003, Tom Lane wrote:

 The header is only a dozen or two bytes long, so torn-page syndrome
 won't result in header corruption.

No. But the checksum would detect both header corruption and torn pages.
Two for the price of one. But I don't think it's worth changing the page
layout for, either. Maybe, if anybody still cares next time the page layout
is changed, pop it in with whatever else is being changed.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC




Re: [HACKERS] Detecting corrupted pages earlier

2003-02-17 Thread Bruce Momjian
Tom Lane wrote:
 Curt Sampson [EMAIL PROTECTED] writes:
  On Mon, 17 Feb 2003, Tom Lane wrote:
  Postgres has a bad habit of becoming very confused if the page header of
  a page on disk has become corrupted.
 
  What typically causes this corruption?
 
 Well, I'd like to know that too.  I have seen some cases that were
 identified as hardware problems (disk wrote data to wrong sector, RAM
 dropped some bits, etc).  I'm not convinced that that's the whole story,
 but I have nothing to chew on that could lead to identifying a software
 bug.
 
  If it's any kind of a serious problem, maybe it would be worth keeping
  a CRC of the header at the end of the page somewhere.
 
 See past discussions about keeping CRCs of page contents.  Ultimately
 I think it's a significant expenditure of CPU for very marginal returns
 --- the layers underneath us are supposed to keep their own CRCs or
 other cross-checks, and a very substantial chunk of the problem seems
 to be bad RAM, against which occasional software CRC checks aren't 
 especially useful.

I believe the farthest we got was the idea of adding a CRC page
check option in case you suspected bad hardware.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
