On 12 February 2016 at 17:56, Oliver Stöneberg wrote:
> A few weeks ago we already had a data corruption when the disk was
> full. There are other services running on the same machine that could
> cause the disk to fill up (e.g. local caching when the network is
> acting up). It happened a few
On Fri, Feb 12, 2016 at 07:46:25AM -0500, Bill Moran wrote:
> Long term, you need to fix your hardware. Postgres doesn't corrupt
> itself just because the disks fill up, so your hardware must be lying
> about what writes completed successfully, otherwise, Postgres would
> be able to recover after a
On Fri, 12 Feb 2016 10:56:04 +0100
"Oliver Stöneberg" wrote:
> We are running a 64-bit PostgreSQL 9.4.5 server on Windows Server
> 2012. The system is a virtual machine on a VMware ESX 6.0 server and
> has 24 GB of memory. The database server is only accessed locally by
> two services and ther
Tom and Kevin-
There were two entries in pg_prepared_xacts. In the test-bed, executing
the 'ROLLBACK PREPARED' on both allowed the system to continue processing.
All locks I saw in 'pg_locks' whose virtualtransaction started with
'-1/' were also gone. That was indeed the issue. More impo
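For reference, a minimal sketch of those recovery steps (the gid value is illustrative; run as superuser):

SELECT gid, prepared, owner, database FROM pg_prepared_xacts;
ROLLBACK PREPARED 'the-gid-from-above';
-- locks held by prepared transactions appear in pg_locks with a
-- virtualtransaction of the form '-1/...':
SELECT locktype, relation::regclass, mode, virtualtransaction
FROM pg_locks WHERE virtualtransaction LIKE '-1/%';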
Ned Wolpert writes:
> Event: Running 9.1.6 with hot-standby, archiving 4 months of wal files,
> and even a nightly pg_dump all. 50G database. Trying to update or delete a
> row in a small (21 row, but heavily used table) would lock up completely.
> Never finish. Removed all clients, restarted t
Ned Wolpert wrote:
> I'm doing a postmortem on a corruption event we had. I have an
> idea on what happened, but not sure. I figure I'd share what
> happened and see if I'm close to right here.
>
> Running 9.1.6 with hot-standby, archiving 4 months of wal files,
> and even a nightly pg_dump all.
Heine Ferreira wrote:
> Are there any best practices for avoiding database corruption?
First and foremost, do not turn off fsync or full_page_writes in your
configuration. After that the most common causes for database
corruption I've seen are bad RAM (ECC RAM is a requirement, not an
option for
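For reference, a quick way to confirm both of those settings on a running server (a minimal sketch; both are plain configuration parameters):

SELECT name, setting FROM pg_settings
WHERE name IN ('fsync', 'full_page_writes');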
On 10/18/2012 01:06 AM, Daniel Serodio wrote:
Craig Ringer wrote:
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then
one a month for the rest of the year.
Craig Ringer wrote:
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then
one a month for the rest of the year.
On Sun, Oct 14, 2012 at 11:26:40AM +0800, Craig Ringer wrote:
> On 10/14/2012 11:00 AM, John R Pierce wrote:
> >On 10/13/12 7:13 PM, Craig Ringer wrote:
> >>
> >>* Use a good quality hardware RAID controller with a battery backup
> >>cache unit if you're using spinning disks in RAID. This is as muc
On 10/14/2012 12:02 PM, Chris Angelico wrote:
Is there an article somewhere about how best to do a plug-pull test?
Or is it as simple as "fire up pgbench, kill the power, bring things
back up, and see if anything isn't working"?
That's what I'd do and what I've always done in the past, but oth
On Sun, Oct 14, 2012 at 1:13 PM, Craig Ringer wrote:
> * Never, ever, ever use cheap SSDs. Use good quality hard drives or (after
> proper testing) high end SSDs. Read the SSD reviews periodically posted on
> this mailing list if considering using SSDs. Make sure the SSD has a
> supercapacitor or
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
I forgot to mention, you should also read:
http://www.postgresql.org/docs/current/static/wal-reliability.html
--
Craig Ringer
On 10/14/2012 11:00 AM, John R Pierce wrote:
On 10/13/12 7:13 PM, Craig Ringer wrote:
* Use a good quality hardware RAID controller with a battery backup
cache unit if you're using spinning disks in RAID. This is as much for
performance as reliability; a BBU will make an immense difference to
d
On 10/13/12 7:13 PM, Craig Ringer wrote:
* Use a good quality hardware RAID controller with a battery backup
cache unit if you're using spinning disks in RAID. This is as much for
performance as reliability; a BBU will make an immense difference to
database performance.
a comment on this o
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then one
a month for the rest of the year, the
On 10/13/12 3:04 PM, Leif Biberg Kristensen wrote:
Saturday, 13 October 2012 23:53:03, Heine Ferreira wrote:
>Hi
>
>Are there any best practices for avoiding database
>corruption?
In my experience, database corruption always comes down to flaky disk drives.
Keep your disks new and shiny, e.g. less than 3 years old, and go for some kind of redundancy.
Saturday, 13 October 2012 23:53:03, Heine Ferreira wrote:
> Hi
>
> Are there any best practices for avoiding database
> corruption?
In my experience, database corruption always comes down to flaky disk drives.
Keep your disks new and shiny, e.g. less than 3 years old, and go for some kind of
redundancy.
Excerpts from George Woodring's message of lun ago 30 08:17:56 -0400 2010:
> I am running 8.3.3 currently on this box.
> Last week we had a database corruption issue that started as:
>
> Aug 24 07:15:19 iprobe028 postgres[20034]: [3-1] ERROR: could not read
> block 0 of relation 1663/16554/746340
The version is 8.3.3, and I use autovacuum for the routine maintenance.
The ctids are distinct:
grande=# select oid, ctid, relname from pg_class where oid IN
(26770910, 26770918, 26770919, 26770920);
   oid    | ctid | relname
----------+------+---------
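If one of those pg_class rows turns out to be a stale duplicate, one risky repair people reach for is deleting it by its physical address; a hedged sketch only, with an illustrative ctid (superuser, after a full file-level backup, and only once you are sure which copy is the live one):

DELETE FROM pg_class WHERE oid = 26770910 AND ctid = '(12,34)';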
George Woodring writes:
> Upon investigation I found that I have a table that is in the database twice
> db=> select oid, relname from pg_class where oid IN (26770910,
> 26770918, 26770919);
>    oid   | relname
> ----------+---------
>  26770910 | av
George Woodring wrote:
> I have found that I have a database problem after receiving the
> following error from pg_dump:
Lack of vacuuming, most likely. What version is this? Did you read
previous threads about this problem on the archives?
--
Alvaro Herrera
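For reference, a quick way to see how close each database is to transaction-ID wraparound, which is what a chronic lack of vacuuming leads to (a minimal sketch):

SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database ORDER BY xid_age DESC;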
On Wed, 8 Apr 2009 22:14:38 -0400
"Jeff Brenton" wrote:
>
> There are no filesystem level content size restrictions that I am
> aware of on this system. The user pgsql should have full access
> to the filesystems indicated except for the root filesystem.
Out of inodes?
A lot of small files
This thread is a top posting mess. I'll try to rearrange:
Jeff Brenton wrote:
> REINDEX INDEX testrun_log_pkey;
>
> ERROR: could not write block 1832079 of temporary file: No space left
> on device
> HINT: Perhaps out of disk space?
>
> There is currently 14GB free on
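Before retrying, it may help to compare that free space against the sizes involved; a minimal sketch, where 'testrun_log' as the table behind that pkey is an assumption:

SELECT pg_size_pretty(pg_relation_size('testrun_log_pkey')) AS index_size,
       pg_size_pretty(pg_total_relation_size('testrun_log')) AS table_size;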
Jeff Brenton wrote:
> I've attempted to re-index the pkey listed but after an hour it fails
> with
>
> REINDEX INDEX testrun_log_pkey;
>
> ERROR: could not write block 1832079 of temporary file: No space left
> on device
>
> HINT: Perhaps out of disk space?
>
> There is currently 14GB free
> From: Adrian Klaver [mailto:akla...@comcast.net]
> Sent: Wednesday, April 08, 2009 10:10 PM
> To: pgsql-general@postgresql.org
> Cc: Jeff Brenton
> Subject: Re: [GENERAL] database corruption
>
> On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> > I've encounte
> From: Adrian Klaver [mailto:akla...@comcast.net]
> Sent: Wednesday, April 08, 2009 10:10 PM
> To: pgsql-general@postgresql.org
> Cc: Jeff Brenton
> Subject: Re: [GENERAL] database corruption
>
> On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> > I've encountered some db corruption after re
Sent: Wednesday, April 08, 2009 10:08 PM
To: Jeff Brenton
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] database corruption
I would imagine you would have better luck dropping the index and
recreating. But considering you're 98% full on that drive, it looks
like you're about to have other problem
Subject: Re: [GENERAL] database corruption
On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> I've encountered some db corruption after restarting postgres on my
> database server running 8.2.4. I think that postgres did not shut
down
> cleanly. Postgres started appropriately but crash
On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> I've encountered some db corruption after restarting postgres on my
> database server running 8.2.4. I think that postgres did not shut down
> cleanly. Postgres started appropriately but crashed 45 minutes later.
> I used pg_resetxlog af
I would imagine you would have better luck dropping the index and
recreating. But considering you're 98% full on that drive, it looks like
you're about to have other problems...
On Wed, Apr 8, 2009 at 8:32 PM, Jeff Brenton wrote:
> I’ve encountered some db corruption after restarting postgres
On Jul 12, 2007, at 8:09 AM, Csaba Nagy wrote:
Hi all,
I just had the following error on one of our data bases:
ERROR: could not access status of transaction 1038286848
DETAIL: could not open file "pg_clog/03DE": No such file or directory
I researched on the mailing list and it looks like t
On Thu, 2007-07-12 at 16:18, Simon Riggs wrote:
> The corruption could only migrate if the WAL records themselves caused
> the damage, which is much less likely than corruption of the data blocks
> at hardware level. ISTM that both Slony and Log shipping replication
> protect fairly well against bl
On Thu, 2007-07-12 at 15:09 +0200, Csaba Nagy wrote:
> Luckily I remembered I have a WAL logging based replica, so I
> recovered
> the rest of the truncated file from the replica's same file... this
> being an insert only table I was lucky I guess that this was an
> option.
> To my surprise, the sa
Shane wrote:
> Hello all,
>
> Whilst running a regular pg_dumpall, I received the
> following error from our spamassassin DB.
>
> pg_dump: ERROR: could not access status of transaction
> 4521992
> DETAIL: could not open file "pg_clog/0004": No such file
> or directory
> pg_dump: SQL command to
>> Zeroing out the whole block containing it is the usual recipe.
Something like this worked for me in the past:
% dd bs=8k count=X < /dev/zero >> clog-file
I had to calculate X, because I usually had a situation with a truncated
clog file and a failed attempt to read it from offset XYZ. (Each 8 kB
clog page holds the 2-bit commit status of 32768 transactions, so X is
however many 8 kB pages it takes to extend the file past the failing
offset.)
And I
Michael Guerin <[EMAIL PROTECTED]> writes:
> You're suggesting to zero out the block in the underlying table files,
> or creating the missing pg_clog file and start filling with zero's?
The former. Making up clog data is unlikely to help --- the bad xmin is
just the first symptom of what's proba
Zeroing out the whole block containing it is the usual recipe. I forget
the exact command but if you trawl the archives for mention of "dd" and
"/dev/zero" you'll probably find it. Keep in mind you want to stop the
postmaster first, to ensure it doesn't have a copy of the bad block
cached in memory.
Michael Guerin <[EMAIL PROTECTED]> writes:
> Ok, so I'm trying to track down the rows now (big table slow queries :(
> ) How does one zero out a corrupt row, plain delete? I see references
> for creating the missing pg_clog file but I don't believe that's what
> you're suggesting..
Zeroing ou
Tom Lane wrote:
Michael Guerin <[EMAIL PROTECTED]> writes:
Hmm, that makes it sound like a plain old data-corruption problem, ie,
trashed xmin or xmax in some tuple header. Can you do a "select
count(*)" from this table without getting the error?
no, select count(*) fails around 25 million rows.
Michael Guerin <[EMAIL PROTECTED]> writes:
>> Hmm, that makes it sound like a plain old data-corruption problem, ie,
>> trashed xmin or xmax in some tuple header. Can you do a "select
>> count(*)" from this table without getting the error?
>>
> no, select count(*) fails around 25 million rows.
Hmm, that makes it sound like a plain old data-corruption problem, ie,
trashed xmin or xmax in some tuple header. Can you do a "select
count(*)" from this table without getting the error?
no, select count(*) fails around 25 million rows.
PostgreSQL 8.1RC1 on x86_64-unknown-linux-gnu, com
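One way to bracket where the scan dies, without tid-range tricks, is to bisect on a LIMIT, since count(*) only fails once the damaged tuple is actually read; a hedged sketch ('bigtable' is illustrative, and it assumes sequential-scan order stays stable between runs):

SELECT count(*) FROM (SELECT 1 FROM bigtable LIMIT 24000000) s;
-- raise or lower the LIMIT until the failure point is bracketed, then
-- inspect the tuple headers just before the boundary:
SELECT ctid, xmin, xmax FROM bigtable OFFSET 23999990 LIMIT 20;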
Michael Guerin <[EMAIL PROTECTED]> writes:
> Also, all files in pg_clog are sequential with the last file being 0135.
Hmm, that makes it sound like a plain old data-corruption problem, ie,
trashed xmin or xmax in some tuple header. Can you do a "select
count(*)" from this table without getting th
Also, all files in pg_clog are sequential with the last file being 0135.
Michael Guerin wrote:
Hi,
Our database filled up and now I'm getting this error on one of the
tables. Is there any way to recover from this? Please let me know if
more information is needed.
pg_version
"Thomas F. O'Connell" <[EMAIL PROTECTED]> writes:
>> Michael Best <[EMAIL PROTECTED]> writes:
>>> Set your memory requirement too high in postgresql.conf, reload
>>> instead of restarting the database, it silently fails sometime later?
> Wait, now I'm curious. If a change in postgresql.conf that
On Jan 5, 2007, at 10:01 PM, Tom Lane wrote:
Michael Best <[EMAIL PROTECTED]> writes:
Set your memory requirement too high in postgresql.conf, reload instead
of restarting the database, it silently fails sometime later?
Yeah, wouldn't surprise me, since the reload is going to ignore any
ch
Michael Best <[EMAIL PROTECTED]> writes:
> Set your memory requirement too high in postgresql.conf, reload instead
> of restarting the database, it silently fails sometime later?
Yeah, wouldn't surprise me, since the reload is going to ignore any
changes related to resizing shared memory. I thin
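On later releases you can see directly which changed settings a reload could not apply; a sketch, noting that pg_settings.pending_restart only appeared in 9.5, long after this thread:

SELECT name, setting, context, pending_restart
FROM pg_settings WHERE pending_restart;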
Thomas F. O'Connell wrote:
On Jan 4, 2007, at 11:24 PM, Michael Best wrote:
When I finally got the error report in the morning the database was in
this state:
$ psql dbname
dbname=# \dt
ERROR: cache lookup failed for relation 20884
Do you have your error logs, and were there any relevant
On Jan 4, 2007, at 11:24 PM, Michael Best wrote:
When I finally got the error report in the morning the database was
in this state:
$ psql dbname
dbname=# \dt
ERROR: cache lookup failed for relation 20884
Do you have your error logs, and were there any relevant errors in
them preceding
[EMAIL PROTECTED] writes:
> In the document "Transaction Processing in PostgreSQL"
> ( http://www.postgresql.org/files/developer/transactions.pdf )
That's very, very old information.
> I read :
> "Postgres transactions are only guaranteed atomic if a disk page write
> is an atomic action.
Not true
It shouldn't run into these problems from time to time; that kind of scenario has only happened to me once, so I don't know exactly how often it can occur. But a recommendation from my end would be to upgrade to a newer PostgreSQL version, as you are using an old release. Also try running some disk checks
On 7/26/06, aurora <[EMAIL PROTECTED]> wrote:
From your experience do you expect the database would run into this from
time to time that requires DBA's interventions? Is so it would become a
problem for our customers because our product is a standalone system. We
don't intend to expose the Postg
From your experience do you expect the database would run into this from
time to time that requires DBA's interventions? Is so it would become a
problem for our customers because our product is a standalone system. We
don't intend to expose the Postgre database underneath.
wy
Try doing a REINDEX and see if you can recover all data blocks, as it appears to me you have some data blocks messed up. If possible, take a backup of your database as well.
Thanks,
-- Shoaib Mir
EnterpriseDB (www.enterprisedb.com)
On 7/27/06, aurora <[EMAIL PROTECTED]> wrote:
Hello, We are stre
On Fri, Jun 18, 2004 at 02:32:16PM -0400, Tom Lane wrote:
> > Since that 7.4.2 release-note only talked about crashing queries due to the
> > 7.4.1 bug, but not about data-corruption occuring, I wondered if the
> > symptoms I have seen are related to the alignment bug in 7.4.1 or not.
>
> No, I d
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> ... I upgraded to 7.4.2, and fixed the system-tables
> according to the 7.4.2 release-note. But this didn't really help - the
> "analyze table" issued after fixing the system-tables exited with an
> error about an invalid page header in one of our ta
"Chris Stokes" <[EMAIL PROTECTED]> writes:
> PANIC: XLogWrite: write request 1/812D is past end of log 1/812D
This sure looks like the symptom of the 7.3.3 failure-to-restart bug.
If you are on 7.3.3 then an update to 7.3.4 will fix it.
regards, tom lane
"Chris Stokes" <[EMAIL PROTECTED]> writes:
> We use the RPM installation, if I do and rpm -Uvh for the packages to upgrade to the
> new 7.3.4 will that be sufficient or does it require some sort of database upgrade
> or unload/reload?
Not for an update within the 7.3.* series. Just stop postmas
"Chris Stokes" <[EMAIL PROTECTED]> writes:
> Just one more question, Where can I read up on this bug, I would like to inform
> myself better before I promise a fix to our customer.
See the list archives from just before the 7.3.4 release. The failure
occurs when the old WAL ends exactly on a pag
Corey Minter <[EMAIL PROTECTED]> writes:
> I don't understand how I
> wouldn't be able to run initdb.
How much free disk space have you got?
regards, tom lane
Drop index and recreate.
> Hi, all.
>
> I'm relatively new to PostgreSQL, but I've been quite impressed with
> it so far. This may be due to too much experience with MySQL. :)
>
> I'm currently getting this error on my nightly vacuum. These two
> indices (as you may have guessed already) ar
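A sketch of that fix using the names from the report (the column list is inferred from the index names later in the thread, so treat the definitions as assumptions):

DROP INDEX error_interface_idx;
CREATE INDEX error_interface_idx ON error (interface);
DROP INDEX error_ewhen_idx;
CREATE INDEX error_ewhen_idx ON error (ewhen);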
Bryan Henderson wrote:
>
> > NOTICE: Index error_interface_idx: NUMBER OF INDEX' TUPLES (226766)
> > IS NOT THE SAME AS HEAP' (226765)
> ...
> > NOTICE: Index pg_class_relname_index: NUMBER OF INDEX' TUPLES (74)
> > IS NOT THE SAME AS HEAP' (75)
> ...
> >IIRC, I think the prob
> NOTICE: Index error_interface_idx: NUMBER OF INDEX' TUPLES (226766)
> IS NOT THE SAME AS HEAP' (226765)
...
> NOTICE: Index pg_class_relname_index: NUMBER OF INDEX' TUPLES (74)
> IS NOT THE SAME AS HEAP' (75)
...
>IIRC, I think the problem and solution is basically the same:
Chris Jones wrote:
> I'm currently getting this error on my nightly vacuum. These two
> indices (as you may have guessed already) are on columns named
> interface and ewhen, on a table named error. The error table is
> constantly being updated. (No comments about the implications of
> that,
Chris Jones wrote:
>
> NOTICE: Index error_interface_idx: NUMBER OF INDEX' TUPLES (226766)
> IS NOT THE SAME AS HEAP' (226765)
> NOTICE: Index error_ewhen_idx: NUMBER OF INDEX' TUPLES (226766)
> IS NOT THE SAME AS HEAP' (226765)
Hope this was not already answered...
I believe it means that th