The new thread on 7.4.5 losing committed transactions popped up just as I discovered something that was at least unexpected to me.
While cleaning up after the pg_resetxlog from today's earlier fun, I found some missing rows and some duplicate row versions showing up in my restore. All of this was within a 90-second period, which makes sense to me.

What doesn't make sense to me is that I'm missing 19 records in one table that were committed 3 hours before my crash. There were no errors before the crash, and there were no errors in the dump after the pg_resetxlog. I have application logs that confirm these records were present; not only do I have logs showing they were saved, but also logs from later processes manipulating these records.

I'm running 7.4.5 on RHAS 3 x86-64 on a 4x244 32GB system. It's NFS attached. Derogatory remarks about NFS welcome, but you're preaching to the choir. :)

The only unusual thing I noticed today was abominable performance for several hours before the crash (Load=30, iowait=95%). This machine has been running for weeks with excellent performance - generally 4 times faster than my dual Xeon 2.4GHz, 12GB RAM, 6x36GB U320 RAID 1+0 systems. Typically in my benchmarking sessions and application runs, I rarely saw any read activity - it appeared that everything was pulled straight out of the disk buffer cache. Today, NFS was choked with reads, despite having 10GB of RAM free (!). Nothing has changed on this machine in at least 4 weeks.

While I'm sure the crash is hardware/config related, the 19 records missing from something committed 3 hours earlier are confusing. :)

As always, any insight is appreciated. We are very committed to PostgreSQL after booting a large Oracle installation out 16 months ago.

thanks!