Re: [HACKERS] streaming replication, "frozen snapshot backup on it" and missing relfile (postgres 9.2.3 on xfs + LVM)

2013-05-29 Thread David Powers
It's another possibility, but I think it's still somewhat remote given how long we've been using this method with this code. It's sadly hard to test because taking the full backup without the hard linking is fairly expensive (the databases comprise multiple terabytes). As a possibly unsatisfying

Re: [HACKERS] streaming replication, "frozen snapshot backup on it" and missing relfile (postgres 9.2.3 on xfs + LVM)

2013-05-23 Thread David Powers
Thanks for the response. I have some evidence against an issue in the backup procedure (though I'm not ruling it out). We moved back to taking the backup from the primary and all errors for all three clusters went away. All of the hardware is the same, OS and postgres versions are largely the

Re: [HACKERS] streaming replication, "frozen snapshot backup on it" and missing relfile (postgres 9.2.3 on xfs + LVM)

2013-05-16 Thread David Powers
I'll try to get the primary upgraded over the weekend when we can afford a restart. In the meantime I have a single test showing that a shutdown, snapshot, restart sequence produces a backup that passes the vacuum analyze test. I'm going to run a full vacuum today. -David On Wed, May 15, 2013 at 3:53 P

Re: [HACKERS] streaming replication, "frozen snapshot backup on it" and missing relfile (postgres 9.2.3 on xfs + LVM)

2013-05-15 Thread David Powers
First, thanks for the replies. This sort of thing is frustrating and hard to diagnose at a distance, and any help is appreciated. Here is some more background: We have three 9.2.4 databases using the following setup: - A primary box - A standby box running as a hot streaming replica from the primar