On Sun, Oct 23, 2011 at 9:48 PM, Daniel Farina <dan...@heroku.com> wrote:
> On Mon, Oct 17, 2011 at 11:30 PM, Chris Redekop <ch...@replicon.com> wrote:
>> Well, on the other hand maybe there is something wrong with the data.
>>  Here's the test/steps I just did -
>> 1. I do the pg_basebackup when the master is under load, hot slave now will
>> not start up but warm slave will.
>> 2. I start a warm slave and let it catch up to current
>> 3. On the slave I change 'hot_standby=on' and do a 'service postgresql
>> restart'
>> 4. The postgres fails to restart with the same error.
>> 5. I turn hot_standby back off and postgres starts back up fine as a warm
>> slave
>> 6. I then turn off the load, the slave is all caught up, master and slave
>> are both sitting idle
>> 7. I, again, change 'hot_standby=on' and do a service restart
>> 8. Again it fails, with the same error, even though there is no longer any
>> load.
>> 9. I repeat this warmstart/hotstart cycle a couple more times until to my
>> surprise, instead of failing, it successfully starts up as a hot standby
>> (this is after maybe 5 minutes or so of sitting idle)
>> So...given that it continued to fail even after the load had been turned of,
>> that makes me believe that the data which was copied over was invalid in
>> some way.  And when a checkpoint/logrotation/somethingelse occurred when not
>> under load it cleared itself up....I'm shooting in the dark here
>> Anyone have any suggestions/ideas/things to try?
> Having digged at this a little -- but not too much -- the problem
> seems to be that postgres is reading the commit logs way, way too
> early, that is to say, before it has played enough WAL to be
> 'consistent' (the WAL between pg_start and pg_stop backup).  I have
> not been able to reproduce this problem (I think) after the message
> from postgres suggesting it has reached a consistent state; at that
> time I am able to go into hot-standby mode.
> The message is like: "consistent recovery state reached at %X/%X".
> (this is the errmsg)
> It doesn't seem meaningful for StartupCLOG (or, indeed, any of the
> hot-standby path functionality) to be called before that code is
> executed, but it is anyway right now.  I'm not sure if this oversight
> is simply an oversight, or indicative of a misplaced assumption
> somewhere.  Basically, my thoughts for a fix are to suppress
> hot_standby = on (in spirit) before the consistent recovery state is
> reached.

Not sure about that, but I'll look at where this comes from.

 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to