Haribabu Kommi <kommi.harib...@gmail.com> writes:
> I can think of a case where the "launcher_determine_sleep" function
> returns a big sleep value because of system time change.
> Because of that it is possible that the launcher is not generating
> workers to do the vacuum. Maybe I am wrong.

I talked with Alvaro about this and we agreed that's most likely what
happened.  The launcher tracks future times-to-wake-up as absolute times,
so shortly after the system clock went backwards, it could have computed
that the next time to wake up was 20 years in the future, and issued a
sleep() call for 20 years.  Fixing the system clock after that would not
have caused it to wake up again.
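
To make that failure mode concrete, here is a minimal sketch. It is not
the launcher's actual code: sleep_until() and its plain-seconds
timestamps are hypothetical names invented for illustration. It only
shows how a wake-up time stored as an absolute value turns into a huge
relative sleep once the clock reading drops.

```c
#include <stdint.h>

/*
 * Hypothetical helper (an assumption, not PostgreSQL source): given an
 * absolute wake-up time and the current clock reading, both in seconds,
 * return how long to sleep.  Past-due wake-ups yield zero.
 */
int64_t
sleep_until(int64_t wake_at, int64_t now)
{
    int64_t delta = wake_at - now;

    /* A wake-up time already in the past means "don't sleep at all". */
    return (delta > 0) ? delta : 0;
}
```

If wake_at was computed as "now + 60" while the clock was 20 years
ahead, and the clock is then corrected, the same wake_at is compared
against a much smaller "now", so the result is 20 years plus 60 seconds.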

It looks like a SIGHUP (pg_ctl reload) ought to be enough to wake it up,
or of course you could restart the server.

In HEAD this doesn't seem like it could cause an indefinite sleep because
if nothing else, sinval queue overrun would eventually wake the launcher
even without any manual action from the DBA.  But the loop logic is
different in 9.1.

launcher_determine_sleep() does have a minimum sleep time, and it seems
like we could fairly cheaply guard against this kind of scenario by also
enforcing a maximum sleep time (of say 5 or 10 minutes).  Not quite
convinced whether it's worth the trouble though.
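
A rough sketch of what such a guard could look like, under stated
assumptions: clamp_nap(), MIN_NAP_SECS and MAX_NAP_SECS are hypothetical
names, the 1-second floor is a placeholder rather than the launcher's
real minimum, and the 5-minute cap is just the lower of the two figures
floated above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bounds, for illustration only (not taken from the source tree). */
#define MIN_NAP_SECS    1           /* placeholder minimum sleep */
#define MAX_NAP_SECS    (5 * 60)    /* proposed cap: 5 minutes */

/*
 * Hypothetical helper: force a computed sleep time into
 * [MIN_NAP_SECS, MAX_NAP_SECS], so that a bogus value produced by a
 * clock jump costs at most one capped sleep.
 */
int64_t
clamp_nap(int64_t requested_secs)
{
    assert(MIN_NAP_SECS <= MAX_NAP_SECS);

    if (requested_secs < MIN_NAP_SECS)
        return MIN_NAP_SECS;
    if (requested_secs > MAX_NAP_SECS)
        return MAX_NAP_SECS;
    return requested_secs;
}
```

With a cap like this, a 20-year request is cut down to 300 seconds, and
the next pass through the loop recomputes the sleep from the corrected
clock.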

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)