On 12 February 2016 at 17:56, Oliver Stöneberg wrote:
> A few weeks ago we already had a data corruption when the disk was
> full. There are other services running on the same machine that could
> cause the disk to fill up (e.g. local caching when the network is
> acting up). It happened a few
On Fri, Feb 12, 2016 at 07:46:25AM -0500, Bill Moran wrote:
> Long term, you need to fix your hardware. Postgres doesn't corrupt
> itself just because the disks fill up, so your hardware must be lying
> about what writes completed successfully, otherwise, Postgres would
> be able to recover after a
On Fri, 12 Feb 2016 10:56:04 +0100
"Oliver Stöneberg" wrote:
> We are running a 64-bit PostgreSQL 9.4.5 server on Windows Server
> 2012. The system is a virtual machine on a VMware ESX 6.0 server and
> has 24 GB of memory. The database server is only accessed locally by
> two services and ther
Tom and Kevin-
There were two entries in pg_prepared_xacts. In the test-bed, executing
the 'ROLLBACK PREPARED' on both allowed the system to continue processing.
All locks I saw in 'pg_locks' whose virtualtransaction started with
'-1/' were also gone. That was indeed the issue. More impo
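For reference, a minimal sketch of those recovery steps (the gid value is illustrative; run as superuser):

SELECT gid, prepared, owner, database FROM pg_prepared_xacts;
ROLLBACK PREPARED 'the-gid-from-above';
-- locks held by prepared transactions appear in pg_locks with a
-- virtualtransaction of the form '-1/...':
SELECT locktype, relation::regclass, mode, virtualtransaction
FROM pg_locks WHERE virtualtransaction LIKE '-1/%';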
Ned Wolpert writes:
> Event: Running 9.1.6 with hot-standby, archiving 4 months of wal files,
> and even a nightly pg_dump all. 50G database. Trying to update or delete a
> row in a small (21 row, but heavily used table) would lock up completely.
> Never finish. Removed all clients, restarted t
Ned Wolpert wrote:
> I'm doing a postmortem on a corruption event we had. I have an
> idea on what happened, but not sure. I figure I'd share what
> happened and see if I'm close to right here.
>
> Running 9.1.6 with hot-standby, archiving 4 months of wal files,
> and even a nightly pg_dump all.
Heine Ferreira wrote:
> Are there any best practices for avoiding database corruption?
First and foremost, do not turn off fsync or full_page_writes in your
configuration. After that the most common causes for database
corruption I've seen are bad RAM (ECC RAM is a requirement, not an
option for
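For reference, a quick way to confirm both of those settings on a running server (a minimal sketch; both are plain configuration parameters):

SELECT name, setting FROM pg_settings
WHERE name IN ('fsync', 'full_page_writes');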
On 10/18/2012 01:06 AM, Daniel Serodio wrote:
Craig Ringer wrote:
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then
one a month for the rest of the year.
Craig Ringer wrote:
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then
one a month for the rest of the year.
On Sun, Oct 14, 2012 at 11:26:40AM +0800, Craig Ringer wrote:
> On 10/14/2012 11:00 AM, John R Pierce wrote:
> >On 10/13/12 7:13 PM, Craig Ringer wrote:
> >>
> >>* Use a good quality hardware RAID controller with a battery backup
> >>cache unit if you're using spinning disks in RAID. This is as muc
On 10/14/2012 12:02 PM, Chris Angelico wrote:
Is there an article somewhere about how best to do a plug-pull test?
Or is it as simple as "fire up pgbench, kill the power, bring things
back up, and see if anything isn't working"?
That's what I'd do and what I've always done in the past, but oth
On Sun, Oct 14, 2012 at 1:13 PM, Craig Ringer wrote:
> * Never, ever, ever use cheap SSDs. Use good quality hard drives or (after
> proper testing) high end SSDs. Read the SSD reviews periodically posted on
> this mailing list if considering using SSDs. Make sure the SSD has a
> supercapacitor or
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
I forgot to mention, you should also read:
http://www.postgresql.org/docs/current/static/wal-reliability.html
--
Craig Ringer
On 10/14/2012 11:00 AM, John R Pierce wrote:
On 10/13/12 7:13 PM, Craig Ringer wrote:
* Use a good quality hardware RAID controller with a battery backup
cache unit if you're using spinning disks in RAID. This is as much for
performance as reliability; a BBU will make an immense difference to
d
On 10/13/12 7:13 PM, Craig Ringer wrote:
* Use a good quality hardware RAID controller with a battery backup
cache unit if you're using spinning disks in RAID. This is as much for
performance as reliability; a BBU will make an immense difference to
database performance.
a comment on this o
On 10/14/2012 05:53 AM, Heine Ferreira wrote:
Hi
Are there any best practices for avoiding database
corruption?
* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then one
a month for the rest of the year, the
On 10/13/12 3:04 PM, Leif Biberg Kristensen wrote:
Saturday, 13 October 2012 23:53:03, Heine Ferreira wrote:
>Hi
>
>Are there any best practices for avoiding database
>corruption?
In my experience, database corruption always comes down to flaky disk drives.
Keep your disks new and shiny, e.g. less than 3 years old, and go for some kind of redundancy.
Saturday, 13 October 2012 23:53:03, Heine Ferreira wrote:
> Hi
>
> Are there any best practices for avoiding database
> corruption?
In my experience, database corruption always comes down to flaky disk drives.
Keep your disks new and shiny, e.g. less than 3 years old, and go for some kind of
redundancy.
Excerpts from George Woodring's message of lun ago 30 08:17:56 -0400 2010:
> I am running 8.3.3 currently on this box.
> Last week we had a database corruption issue that started as:
>
> Aug 24 07:15:19 iprobe028 postgres[20034]: [3-1] ERROR: could not read
> block 0 of relation 1663/16554/746340
The version is 8.3.3, and I use autovacuum for the routine maintenance.
The ctids are distinct:
grande=# select oid, ctid, relname from pg_class where oid IN
(26770910, 26770918, 26770919, 26770920);
   oid    | ctid | relname
----------+------+---------
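If one of those pg_class rows turns out to be a stale duplicate, one risky repair people reach for is deleting it by its physical address; a hedged sketch only, with an illustrative ctid (superuser, after a full file-level backup, and only once you are sure which copy is the live one):

DELETE FROM pg_class WHERE oid = 26770910 AND ctid = '(12,34)';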
George Woodring writes:
> Upon investigation I found that I have a table that is in the database twice
> db=> select oid, relname from pg_class where oid IN (26770910,
> 26770918, 26770919);
>    oid   | relname
> ----------+---------
>  26770910 | av
George Woodring wrote:
> I have found that I have a database problem after receiving the
> following error from pg_dump:
Lack of vacuuming, most likely. What version is this? Did you read
previous threads about this problem on the archives?
--
Alvaro Herrera
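For reference, a quick way to see how close each database is to transaction-ID wraparound, which is what a chronic lack of vacuuming leads to (a minimal sketch):

SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database ORDER BY xid_age DESC;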
On Wed, 8 Apr 2009 22:14:38 -0400
"Jeff Brenton" wrote:
>
> There are no filesystem level content size restrictions that I am
> aware of on this system. The user pgsql should have full access
> to the filesystems indicated except for the root filesystem.
Out of inodes?
A lot of small files
This thread is a top posting mess. I'll try to rearrange:
Jeff Brenton wrote:
> REINDEX INDEX testrun_log_pkey;
>
> ERROR: could not write block 1832079 of temporary file: No space left
> on device
> HINT: Perhaps out of disk space?
>
> There is currently 14GB free on
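Before retrying, it may help to compare that free space against the sizes involved; a minimal sketch, where 'testrun_log' as the table behind that pkey is an assumption:

SELECT pg_size_pretty(pg_relation_size('testrun_log_pkey')) AS index_size,
       pg_size_pretty(pg_total_relation_size('testrun_log')) AS table_size;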
Jeff Brenton wrote:
> I've attempted to re-index the pkey listed but after an hour it fails
> with
>
> REINDEX INDEX testrun_log_pkey;
>
> ERROR: could not write block 1832079 of temporary file: No space left
> on device
>
> HINT: Perhaps out of disk space?
>
> There is currently 14GB free
> From: Adrian Klaver [mailto:akla...@comcast.net]
> Sent: Wednesday, April 08, 2009 10:10 PM
> To: pgsql-general@postgresql.org
> Cc: Jeff Brenton
> Subject: Re: [GENERAL] database corruption
>
> On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> > I've encounte
> From: Adrian Klaver [mailto:akla...@comcast.net]
> Sent: Wednesday, April 08, 2009 10:10 PM
> To: pgsql-general@postgresql.org
> Cc: Jeff Brenton
> Subject: Re: [GENERAL] database corruption
>
> On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> > I've encountered some db corruption after re
Sent: Wednesday, April 08, 2009 10:08 PM
To: Jeff Brenton
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] database corruption
I would imagine you would have better luck dropping the index and
recreating. But considering you're 98% full on that drive, it looks
like you're about to have other problem
Subject: Re: [GENERAL] database corruption
On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> I've encountered some db corruption after restarting postgres on my
> database server running 8.2.4. I think that postgres did not shut
down
> cleanly. Postgres started appropriately but crash
On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> I've encountered some db corruption after restarting postgres on my
> database server running 8.2.4. I think that postgres did not shut down
> cleanly. Postgres started appropriately but crashed 45 minutes later.
> I used pg_resetxlog af
I would imagine you would have better luck dropping the index and
recreating. But considering you're 98% full on that drive, it looks like
you're about to have other problems...
On Wed, Apr 8, 2009 at 8:32 PM, Jeff Brenton wrote:
> I’ve encountered some db corruption after restarting postgres
On Jul 12, 2007, at 8:09 AM, Csaba Nagy wrote:
Hi all,
I just had the following error on one of our data bases:
ERROR: could not access status of transaction 1038286848
DETAIL: could not open file "pg_clog/03DE": No such file or directory
I researched on the mailing list and it looks like t
On Thu, 2007-07-12 at 16:18, Simon Riggs wrote:
> The corruption could only migrate if the WAL records themselves caused
> the damage, which is much less likely than corruption of the data blocks
> at hardware level. ISTM that both Slony and Log shipping replication
> protect fairly well against bl
On Thu, 2007-07-12 at 15:09 +0200, Csaba Nagy wrote:
> Luckily I remembered I have a WAL logging based replica, so I
> recovered
> the rest of the truncated file from the replica's same file... this
> being an insert only table I was lucky I guess that this was an
> option.
> To my surprise, the sa
Shane wrote:
> Hello all,
>
> Whilst running a regular pg_dumpall, I received the
> following error from our spamassassin DB.
>
> pg_dump: ERROR: could not access status of transaction
> 4521992
> DETAIL: could not open file "pg_clog/0004": No such file
> or directory
> pg_dump: SQL command to
>> Zeroing out the whole block containing it is the usual recipe.
Something like this worked for me in the past:
% dd bs=8k count=X < /dev/zero >> clog-file
I had to calculate X, because I usually had a situation with a truncated
clog file and a failed attempt to read it from offset XYZ. (Each 8 kB
clog page holds the 2-bit commit status of 32768 transactions, so X is
however many 8 kB pages it takes to extend the file past the failing
offset.)
And I
Michael Guerin <[EMAIL PROTECTED]> writes:
> You're suggesting to zero out the block in the underlying table files,
> or creating the missing pg_clog file and start filling with zero's?
The former. Making up clog data is unlikely to help --- the bad xmin is
just the first symptom of what's proba
Zeroing out the whole block containing it is the usual recipe. I forget
the exact command but if you trawl the archives for mention of "dd" and
"/dev/zero" you'll probably find it. Keep in mind you want to stop the
postmaster first, to ensure it doesn't have a copy of the bad block
cached in memory.
Michael Guerin <[EMAIL PROTECTED]> writes:
> Ok, so I'm trying to track down the rows now (big table slow queries :(
> ) How does one zero out a corrupt row, plain delete? I see references
> for creating the missing pg_clog file but I don't believe that's what
> you're suggesting..
Zeroing ou
Tom Lane wrote:
Michael Guerin <[EMAIL PROTECTED]> writes:
Hmm, that makes it sound like a plain old data-corruption problem, ie,
trashed xmin or xmax in some tuple header. Can you do a "select
count(*)" from this table without getting the error?
no, select count(*) fails around 25 million rows.
Michael Guerin <[EMAIL PROTECTED]> writes:
>> Hmm, that makes it sound like a plain old data-corruption problem, ie,
>> trashed xmin or xmax in some tuple header. Can you do a "select
>> count(*)" from this table without getting the error?
>>
> no, select count(*) fails around 25 million rows.
Hmm, that makes it sound like a plain old data-corruption problem, ie,
trashed xmin or xmax in some tuple header. Can you do a "select
count(*)" from this table without getting the error?
no, select count(*) fails around 25 million rows.
PostgreSQL 8.1RC1 on x86_64-unknown-linux-gnu, com
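One way to bracket where the scan dies, without tid-range tricks, is to bisect on a LIMIT, since count(*) only fails once the damaged tuple is actually read; a hedged sketch ('bigtable' is illustrative, and it assumes sequential-scan order stays stable between runs):

SELECT count(*) FROM (SELECT 1 FROM bigtable LIMIT 24000000) s;
-- raise or lower the LIMIT until the failure point is bracketed, then
-- inspect the tuple headers just before the boundary:
SELECT ctid, xmin, xmax FROM bigtable OFFSET 23999990 LIMIT 20;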
Michael Guerin <[EMAIL PROTECTED]> writes:
> Also, all files in pg_clog are sequential with the last file being 0135.
Hmm, that makes it sound like a plain old data-corruption problem, ie,
trashed xmin or xmax in some tuple header. Can you do a "select
count(*)" from this table without getting th
Also, all files in pg_clog are sequential with the last file being 0135.
Michael Guerin wrote:
Hi,
Our database filled up and now I'm getting this error on one of the
tables. Is there any way to recover from this? Please let me know if
more information is needed.
pg_version
"Thomas F. O'Connell" <[EMAIL PROTECTED]> writes:
>> Michael Best <[EMAIL PROTECTED]> writes:
>>> Set your memory requirement too high in postgresql.conf, reload
>>> instead of restarting the database, it silently fails sometime later?
> Wait, now I'm curious. If a change in postgresql.conf that
On Jan 5, 2007, at 10:01 PM, Tom Lane wrote:
Michael Best <[EMAIL PROTECTED]> writes:
Set your memory requirement too high in postgresql.conf, reload instead
of restarting the database, it silently fails sometime later?
Yeah, wouldn't surprise me, since the reload is going to ignore any
ch
Michael Best <[EMAIL PROTECTED]> writes:
> Set your memory requirement too high in postgresql.conf, reload instead
> of restarting the database, it silently fails sometime later?
Yeah, wouldn't surprise me, since the reload is going to ignore any
changes related to resizing shared memory. I thin
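On later releases you can see directly which changed settings a reload could not apply; a sketch, noting that pg_settings.pending_restart only appeared in 9.5, long after this thread:

SELECT name, setting, context, pending_restart
FROM pg_settings WHERE pending_restart;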
Thomas F. O'Connell wrote:
On Jan 4, 2007, at 11:24 PM, Michael Best wrote:
When I finally got the error report in the morning the database was in
this state:
$ psql dbname
dbname=# \dt
ERROR: cache lookup failed for relation 20884
Do you have your error logs, and were there any relevant
On Jan 4, 2007, at 11:24 PM, Michael Best wrote:
When I finally got the error report in the morning the database was
in this state:
$ psql dbname
dbname=# \dt
ERROR: cache lookup failed for relation 20884
Do you have your error logs, and were there any relevant errors in
them preceding
[EMAIL PROTECTED] writes:
> In the document "Transaction Processing in PostgreSQL"
> ( http://www.postgresql.org/files/developer/transactions.pdf )
That's very, very old information.
> I read :
> "Postgres transactions are only guaranteed atomic if a disk page write
> is an atomic action.
Not true
It shouldn't run into these problems from time to time; that kind of scenario has only happened to me once, so I don't know exactly how often it can occur. But a recommendation from my end would be to upgrade to a newer PostgreSQL version, as you are using an old release. Also try running some disk checks
On 7/26/06, aurora <[EMAIL PROTECTED]> wrote:
From your experience do you expect the database would run into this from
time to time that requires DBA's interventions? Is so it would become a
problem for our customers because our product is a standalone system. We
don't intend to expose the Postg
From your experience do you expect the database would run into this from
time to time that requires DBA's interventions? Is so it would become a
problem for our customers because our product is a standalone system. We
don't intend to expose the Postgre database underneath.
wy
Try doing a REINDEX and see if you can recover all data blocks, as it appears to me you have some data blocks messed up. If possible, take a backup of your database as well.
Thanks,
-- Shoaib Mir
EnterpriseDB (www.enterprisedb.com)
On 7/27/06, aurora <[EMAIL PROTECTED]> wrote:
Hello, We are stre
On Fri, Jun 18, 2004 at 02:32:16PM -0400, Tom Lane wrote:
> > Since that 7.4.2 release-note only talked about crashing queries due to the
> > 7.4.1 bug, but not about data-corruption occuring, I wondered if the
> > symptoms I have seen are related to the alignment bug in 7.4.1 or not.
>
> No, I d
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> ... I upgraded to 7.4.2, and fixed the system-tables
> according to the 7.4.2 release-note. But this didn't really help - the
> "analyze table" issued after fixing the system-tables exited with an
> error about an invalid page header in one of our ta
"Chris Stokes" <[EMAIL PROTECTED]> writes:
> PANIC: XLogWrite: write request 1/812D is past end of log 1/812D
This sure looks like the symptom of the 7.3.3 failure-to-restart bug.
If you are on 7.3.3 then an update to 7.3.4 will fix it.
regards, tom lane
"Chris Stokes" <[EMAIL PROTECTED]> writes:
> We use the RPM installation, if I do and rpm -Uvh for the packages to upgrade to the
> new 7.3.4 will that be sufficient or does it require some sort of database upgrade
> or unload/reload?
Not for an update within the 7.3.* series. Just stop postmas
"Chris Stokes" <[EMAIL PROTECTED]> writes:
> Just one more question, Where can I read up on this bug, I would like to inform
> myself better before I promise a fix to our customer.
See the list archives from just before the 7.3.4 release. The failure
occurs when the old WAL ends exactly on a pag
Corey Minter <[EMAIL PROTECTED]> writes:
> I don't understand how I
> wouldn't be able to run initdb.
How much free disk space have you got?
regards, tom lane
Drop index and recreate.
> Hi, all.
>
> I'm relatively new to PostgreSQL, but I've been quite impressed with
> it so far. This may be due to too much experience with MySQL. :)
>
> I'm currently getting this error on my nightly vacuum. These two
> indices (as you may have guessed already) ar
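A sketch of that fix using the names from the report (the column list is inferred from the index names later in the thread, so treat the definitions as assumptions):

DROP INDEX error_interface_idx;
CREATE INDEX error_interface_idx ON error (interface);
DROP INDEX error_ewhen_idx;
CREATE INDEX error_ewhen_idx ON error (ewhen);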
Bryan Henderson wrote:
>
> > NOTICE: Index error_interface_idx: NUMBER OF INDEX' TUPLES (226766)
> > IS NOT THE SAME AS HEAP' (226765)
> ...
> > NOTICE: Index pg_class_relname_index: NUMBER OF INDEX' TUPLES (74)
> > IS NOT THE SAME AS HEAP' (75)
> ...
> >IIRC, I think the prob
> NOTICE: Index error_interface_idx: NUMBER OF INDEX' TUPLES (226766)
> IS NOT THE SAME AS HEAP' (226765)
...
> NOTICE: Index pg_class_relname_index: NUMBER OF INDEX' TUPLES (74)
> IS NOT THE SAME AS HEAP' (75)
...
>IIRC, I think the problem and solution is basically the same:
Chris Jones wrote:
> I'm currently getting this error on my nightly vacuum. These two
> indices (as you may have guessed already) are on columns named
> interface and ewhen, on a table named error. The error table is
> constantly being updated. (No comments about the implications of
> that,
Chris Jones wrote:
>
> NOTICE: Index error_interface_idx: NUMBER OF INDEX' TUPLES (226766)
> IS NOT THE SAME AS HEAP' (226765)
> NOTICE: Index error_ewhen_idx: NUMBER OF INDEX' TUPLES (226766)
> IS NOT THE SAME AS HEAP' (226765)
Hope this was not already answered...
I believe it means that th