I can confirm that both the pg_clog and pg_subtrans errors also occur when using pg_basebackup instead of rsync. The data itself seems to be fine: using the exact same data I can start up a warm standby with no problem. It is only the hot standby that will not start up.

On Sat, Oct 15, 2011 at 7:33 PM, Chris Redekop <ch...@replicon.com> wrote:

> > > Linas, could you capture the output of pg_controldata *and* increase the
> > > log level to DEBUG1 on the standby? We should then see nextXid value of
> > > the checkpoint the recovery is starting from.
> >
> > I'll try to do that whenever I'm in that territory again... Incidentally,
> > recently there was a lot of unrelated-to-this-post work to polish things up
> > for a talk being given at PGWest 2011 Today :)
> >
> > > I also checked what rsync does when a file vanishes after rsync computed the
> > > file list, but before it is sent. rsync 3.0.7 on OSX, at least, complains
> > > loudly, and doesn't sync the file. It BTW also exits non-zero, with a special
> > > exit code for precisely that failure case.
> >
> > To be precise, my script has logic to accept the exit code 24, just as
> > stated in PG manual:
> >
> > Docs> For example, some versions of rsync return a separate exit code for
> > Docs> "vanished source files", and you can write a driver script to accept
> > Docs> this exit code as a non-error case.
>
> I also am running into this issue and can reproduce it very reliably. For
> me, however, it happens even when doing the "fast backup" like so:
> pg_start_backup('whatever', true)...my traffic is more write-heavy than
> linas's tho, so that might have something to do with it. Yesterday it
> reliably errored out on pg_clog every time, but today it is
> failing sporadically on pg_subtrans (which seems to be past where the
> pg_clog error was)....the only thing that has changed is that I've changed
> the log level to debug1....I wouldn't think that could be related though.
> I've linked the requested pg_controldata and debug1 logs for both errors.
> Both links contain the output from pg_start_backup, rsync, pg_stop_backup,
> pg_controldata, and then the postgres debug1 log produced from a subsequent
> startup attempt.
>
> pg_clog: http://pastebin.com/mTfdcjwH
> pg_subtrans: http://pastebin.com/qAXEHAQt
>
> Any workarounds would be very appreciated.....would copying clog+subtrans
> before or after the rest of the data directory (or something like that) make
> any difference?
>
> Thanks!
>
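
For anyone following along, the driver-script logic mentioned in the quoted thread (treating rsync's exit code 24, "vanished source files", as a non-error) might look roughly like the sketch below. This is not the script actually used in the thread; the function names are made up for illustration, and the rsync arguments are whatever you pass on the command line.

```python
import subprocess
import sys

# The rsync man page lists exit code 24 as
# "Partial transfer due to vanished source files".
RSYNC_VANISHED_SOURCE_FILES = 24


def is_acceptable_rsync_exit(returncode, accepted=(0, RSYNC_VANISHED_SOURCE_FILES)):
    """Return True if the rsync exit code should be treated as success.

    Hypothetical helper: by default only 0 and 24 are accepted.
    """
    return returncode in accepted


def run_rsync(args):
    """Run rsync with the given argument list and tolerate exit code 24.

    Raises RuntimeError for any other non-zero exit code.
    """
    result = subprocess.run(["rsync", *args])
    if not is_acceptable_rsync_exit(result.returncode):
        raise RuntimeError(f"rsync failed with exit code {result.returncode}")
    return result.returncode


if __name__ == "__main__":
    # Pass the rsync arguments straight through, e.g. (paths are placeholders):
    #   python rsync_driver.py -a /path/to/pgdata/ standby:/path/to/pgdata/
    try:
        run_rsync(sys.argv[1:])
    except RuntimeError as exc:
        print(exc, file=sys.stderr)
        sys.exit(1)
```

Note that this only deals with how the exit code is interpreted between pg_start_backup and pg_stop_backup; it does nothing about the pg_clog/pg_subtrans startup errors themselves.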