Actually, this looks like a very strange filesystem/hardware problem. The WAL segments keep "changing" even after I stopped the database, when supposedly nothing is accessing them:

root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
6fd36722641dc2857bb950437c052fa3  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
26e9c82d123513528824bdf9815dbd2b  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
649111a77ac7ec26f4ddeed18e039faa  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# lsof 000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
ac9ba79e672bc5df2c126044e9054ff7  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
8956e59a4542599e8ded7450b7cab5a6  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
514dccfe7f5df4c55747e14e6c13268f  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
f2c53795afcbc7c150443a3cdd3550bb  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
79687effd43c0e51a127a677e14a815c  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
51b66cd72ed3fb11aa57fab244696e0f  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
bf1a2ec5847c40a0b9200769cff601e4  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# lsof 000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog#

Maybe this is off-topic, but has anyone seen something like this? I'm on Ubuntu 12.04. This is the mount line for the hard drive (the drive is used exclusively for the pg_xlog directory):

/dev/sdb1 on /storage/sdb1 type ext4 (rw,noatime,errors=remount-ro)

Thanks!

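P.S. To make the check above repeatable, here is a minimal sketch that hashes the same file several times and reports how many distinct checksums it saw. The function name count_distinct_md5 and the default of 10 runs are my own choices for illustration; it assumes GNU md5sum and a POSIX shell.

```shell
#!/bin/sh
# count_distinct_md5 FILE [RUNS]
# Hypothetical helper: hash FILE RUNS times (default 10) and print the
# number of distinct MD5 sums observed. A file that nobody is writing to
# should always yield 1.
count_distinct_md5() {
    file=$1
    runs=${2:-10}
    i=0
    sums=""
    while [ "$i" -lt "$runs" ]; do
        # Fail with status 2 if the file cannot be read at all.
        sum=$(md5sum "$file") || return 2
        # Keep only the hash, dropping the trailing file name.
        sums="$sums${sum%% *}
"
        i=$((i + 1))
    done
    # One line per run; count the unique ones.
    printf '%s' "$sums" | sort -u | wc -l | tr -d ' '
}
```

For example, "count_distinct_md5 000000010000001000000049 20" run in pg_xlog with the database stopped. A result above 1 while lsof lists no process holding the file would match what I am seeing by hand.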
On Fri, Apr 26, 2013 at 4:25 PM, German Becker <german.bec...@gmail.com> wrote:

> Hi, I have reverted to cp as the archive command, but now under heavy load
> (> 150 WAL segments in a minute) some WAL segments get corrupted:
>
> postgres@lemur:~/9.1/main/pg_xlog$ md5sum 000000010000001000000049
> f1906d2745224430f811496df466203f  000000010000001000000049
> postgres@lemur:~/9.1/main/pg_xlog$ md5sum ~/backups/wal/000000010000001000000049
> 7e73fe759e41e427497360a815f9d3e1  /var/lib/postgresql/backups/wal/000000010000001000000049
>
> On Fri, Apr 26, 2013 at 10:55 AM, Albe Laurenz <laurenz.a...@wien.gv.at> wrote:
>
>> German Becker wrote:
>> > Here is the archive part of the config:
>> >
>> > archive_mode = on            # allows archiving to be done
>> >                              # (change requires restart)
>> > archive_command = '/var/lib/postgresql/scripts/archive_copy.sh %p %f'
>> >                              # command to use to archive a logfile segment
>> > #archive_timeout = 0         # force a logfile segment switch after this
>> >                              # number of seconds; 0 disables
>>
>> So the problem might be in that script.
>>
>> > The archive command makes a local copy and then copies it to the
>> > backup server via ssh. Both copies are md5-checked and retried up
>> > to 3 times in case of failure.
>>
>> archive_command should not retry the operation, but rather
>> return a non-zero return code.
>>
>> > I have seen under heavy load that some WALs are skipped, some are
>> > smaller than expected, and some are corrupted (i.e., the loop fails
>> > 3 times).
>> > I'm not sure about the return value (checking it). What is the
>> > expected behaviour of the archiver? Will it retry the archive if
>> > the archive command returns something other than 0? Will it retain
>> > the WAL segment until it is successfully archived?
>>
>> See
>> http://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-ARCHIVING-WAL
>>
>> archive_command should exit with zero only if the
>> WAL segment was archived successfully.
>> PostgreSQL will retry and retain the WAL segment until
>> archival succeeds.
>>
>> Yours,
>> Laurenz Albe
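
For reference, a minimal sketch of an archive step that follows the advice quoted above: no retry loop inside the script, and a zero exit status only when the copy verifiably matches the source. This is not the archive_copy.sh from my config. The function name archive_wal and the temporary-file naming are assumptions made for illustration, and the ssh copy to the backup server is left out.

```shell
#!/bin/sh
# archive_wal SRC DEST_DIR NAME
# Hypothetical sketch: copy one WAL segment into DEST_DIR under NAME.
# Returns 0 only if the archived copy is byte-identical to SRC; any
# failure returns non-zero so that PostgreSQL itself retries later.
archive_wal() {
    src=$1
    dest_dir=$2
    name=$3
    dest="$dest_dir/$name"
    tmp="$dest.tmp.$$"

    # Refuse to overwrite a segment that is already archived.
    [ ! -e "$dest" ] || return 1

    # Copy to a temporary name first so a partial file never appears
    # under the final name.
    cp "$src" "$tmp" || { rm -f "$tmp"; return 1; }

    # Verify the copy; no retry here, just report failure.
    if ! cmp -s "$src" "$tmp"; then
        rm -f "$tmp"
        return 1
    fi

    # Rename into place only after verification.
    mv "$tmp" "$dest" || { rm -f "$tmp"; return 1; }
}
```

A wrapper script used as archive_command with "%p %f" could then end with: archive_wal "$1" /var/lib/postgresql/backups/wal "$2". Of course, if the source segment itself reads differently each time, as in my md5sum output, the comparison will fail or pass by chance, so this only helps once the underlying storage is trustworthy.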