I have an 8.3.1 instance on Linux and since June 29th the autovacuum process
has claimed to be working on the same three tables.  That's OK, I am a very
patient man, and these are very large tables.  Today I started to get
transaction wraparound warnings, so I went and checked it out.  It turns out
the autovacuum processes are all just doing nothing.  When I strace them,
all three are blocked in system calls.
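
For anyone who wants to see how close a database is to wraparound, here is a
rough sketch of the check.  The query uses the standard age(datfrozenxid)
function on pg_database.  The helper name xid_headroom and the three constants
are my own illustration, and the margins (warnings at roughly ten million
transactions left, refusal of new transactions at roughly one million) are
approximations of the documented 8.3 behaviour, not exact server limits.

```python
# Sketch only: constants approximate the documented 8.3 behaviour, and
# xid_headroom is a hypothetical helper, not part of PostgreSQL.

XID_WRAP_LIMIT = 2**31      # roughly 2 billion XIDs are usable before wraparound
WARN_MARGIN = 10_000_000    # assumed: warnings start with about this many left
STOP_MARGIN = 1_000_000     # assumed: server refuses new transactions here

# Run this in psql to get the per-database age, oldest first.
AGE_QUERY = "SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY 2 DESC;"


def xid_headroom(age):
    """Return (remaining, status) for a database whose age(datfrozenxid) is `age`."""
    remaining = XID_WRAP_LIMIT - age
    if remaining <= STOP_MARGIN:
        status = "refusing new transactions"
    elif remaining <= WARN_MARGIN:
        status = "warning"
    else:
        status = "ok"
    return remaining, status
```

Feeding each age value from the query into xid_headroom gives a quick picture
of which databases are in the warning range.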

So I restarted the database and started a manual vacuum.  Of course, once
the wraparound warning is reached, there's no way to disable autovacuum, so
now my vacuum maintenance job is competing with three invulnerable
autovacuum processes.  I am thinking of sending them SIGSTOP.

Anyway, I have some issues.  One, of course, is that the autovacuum should
not have been deadlocked or otherwise stalled like that.  Perhaps it needs a
watchdog of some kind.  Has anyone else experienced an issue like that in
8.3.1?  The only thing I can see in the release notes that indicates this
problem may have been fixed is the following:

"Repair two places where SIGTERM exit of a backend could leave corrupted
state in shared memory (Tom)"

However I don't know who or what would have sent SIGTERM to the autovacuum
children.

Secondly, there really does need to be an "autovacuum=off,really,thanks"
setting so that my maintenance can proceed without competition for I/O
resources.  Is there any way to make that happen?  Is my SIGSTOP idea
dangerous?

-jwb
