Hmm, still basically the same behaviour. I think it might be a *little* better with this patch. Before, under load, the standby would start up quickly maybe 2 or 3 times out of 10 attempts; with this patch it might be up to 4 or 5 times out of 10, or maybe that was just a fluke *shrug*. I'm still only seeing your log statement a single time (I'm running at debug2).

I have discovered something though: when the standby is in this state, if I force a checkpoint on the primary then the standby comes right up.

Is there anything I can check or try to help you figure this out? Or is it actually by design that it could take 10-ish minutes to start up even after all clients have disconnected from the primary?

On Thu, Oct 27, 2011 at 11:27 AM, Simon Riggs <si...@2ndquadrant.com> wrote:

> On Thu, Oct 27, 2011 at 5:26 PM, Chris Redekop <ch...@replicon.com> wrote:
>
> > Thanks for the patch Simon, but unfortunately it does not resolve the issue
> > I am seeing. The standby still refuses to finish starting up until long
> > after all clients have disconnected from the primary (>10 minutes). I do
> > see your new log statement on startup, but only once - it does not repeat.
> > Is there any way for me to see what the oldest xid on the standby is via
> > controldata or something like that? The standby does stream to keep up with
> > the primary while the primary has load, and then it becomes idle when the
> > primary becomes idle (when I kill all the connections)....so it appears to
> > be current...but it just doesn't finish starting up
> > I'm not sure if it's relevant, but after it has sat idle for a couple
> > minutes I start seeing these statements in the log (with the same offset
> > every time):
> > DEBUG: skipping restartpoint, already performed at 9/95000020
>
> OK, so it looks like there are 2 opportunities to improve, not just one.
>
> Try this.
>
> --
> Simon Riggs http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services