On 04.05.2011 21:50, Scott Marlowe wrote:

> Then another pg_clog file disappeared.
> 
> Is it possible there's some rogue process deleting files in pg_clog
> somehow?

I don't think so.


> Have you run an fsck on this drive to make sure it's not got
> any file system errors?

I also don't think there is any corruption here. AFAIR, this system has never 
crashed.
There could be some silent corruption, though. But if there really were, 
we would likely see the kernel complaining, stale files elsewhere, and so on.

Without such signs of filesystem corruption, I can't afford the downtime for an fsck.


I didn't mention it earlier, but the application first talks to pgpool, which 
talks to two database servers (i.e. it inserts into both).

The real fun begins here - this is from two different servers:

db10:/var/log/postgresql# zgrep "No such" *
postgresql_log:May  4 18:24:28 db10 postgres[15751]: [23-2] 2011-05-04 18:24:28 SGT DETAIL:  Could not open file "pg_clog/0601": No such file or directory.
postgresql_log:May  4 22:43:44 db10 postgres[15773]: [555-2] 2011-05-04 22:43:44 SGT DETAIL:  Could not open file "pg_clog/0601": No such file or directory.
postgresql_log:May  4 22:44:30 db10 postgres[15791]: [1841-2] 2011-05-04 22:44:30 SGT DETAIL:  Could not open file "pg_clog/0601": No such file or directory.
postgresql_log:May  4 22:55:53 db10 postgres[15741]: [4114-2] 2011-05-04 22:55:53 SGT DETAIL:  Could not open file "pg_clog/0601": No such file or directory.


db20:/var/log/postgresql# zgrep "No such" *
postgresql_log:May  4 18:24:28 db20 postgres[27114]: [2-2] 2011-05-04 18:24:28 SGT DETAIL:  Could not open file "pg_clog/0601": No such file or directory.
postgresql_log:May  4 22:43:44 db20 postgres[27116]: [2-2] 2011-05-04 22:43:44 SGT DETAIL:  Could not open file "pg_clog/0601": No such file or directory.
postgresql_log:May  4 22:44:30 db20 postgres[27138]: [2-2] 2011-05-04 22:44:30 SGT DETAIL:  Could not open file "pg_clog/0601": No such file or directory.
postgresql_log:May  4 22:55:53 db20 postgres[27104]: [2-2] 2011-05-04 22:55:53 SGT DETAIL:  Could not open file "pg_clog/0601": No such file or directory.
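
The two listings above show the same four timestamps and the same missing 
file on both servers. A minimal sketch for checking this mechanically follows. 
It assumes each log entry is on one line in the syslog-style format shown 
above; the function names are hypothetical and not part of any PostgreSQL or 
pgpool tooling.

```python
import re

# Matches the "Could not open file" DETAIL lines in the format shown above.
# The timestamp captured is the one PostgreSQL itself logs (YYYY-MM-DD HH:MM:SS).
PATTERN = re.compile(
    r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \w+ DETAIL:\s+'
    r'Could not open file "([^"]+)": No such file or directory'
)


def missing_file_events(lines):
    """Return the set of (timestamp, path) pairs for missing-file errors."""
    events = set()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            events.add(match.groups())
    return events


def compare_logs(lines_a, lines_b):
    """Split missing-file events into those seen on both servers or only one."""
    a = missing_file_events(lines_a)
    b = missing_file_events(lines_b)
    return {"both": a & b, "only_a": a - b, "only_b": b - a}
```

If every event lands in "both", the errors are most likely triggered by the 
same statements arriving at both servers, rather than by independent events 
on each machine.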


I can't rule out that some corruption happened much earlier on db10; the whole 
database (as binary files) was copied to db20 almost 2 months ago.

Why would it start showing pg_clog files missing just 2 days ago, and not 
earlier? Hmm.
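
One way to approach the timing question is to work out which transaction IDs 
the missing segment would cover. A rough sketch, assuming the default 8 kB 
block size and the usual clog layout (2 status bits per transaction, 32 pages 
per segment file); the function name is only illustrative.

```python
BLOCK_SIZE = 8192        # assumed default BLCKSZ; differs on custom builds
XACTS_PER_BYTE = 4       # 2 status bits per transaction
PAGES_PER_SEGMENT = 32   # pages per pg_clog segment file
XACTS_PER_SEGMENT = BLOCK_SIZE * XACTS_PER_BYTE * PAGES_PER_SEGMENT


def clog_segment_xid_range(segment_name):
    """Return (first_xid, last_xid) covered by a pg_clog segment such as '0601'."""
    segment = int(segment_name, 16)  # segment file names are hexadecimal
    first = segment * XACTS_PER_SEGMENT
    last = first + XACTS_PER_SEGMENT - 1
    return first, last


print(clog_segment_xid_range("0601"))
```

Under these assumptions, segment 0601 covers transaction IDs 1611661312 
through 1612709887. Comparing that range with the server's current 
transaction ID may show whether the segment belongs to recent transactions or 
to ones far away from the current counter.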


-- 
Tomasz Chmielewski
http://wpkg.org
