I have an 8.3.1 instance on Linux, and since June 29th the autovacuum process has claimed to be working on the same three tables. That's OK; I am a very patient man, and these are very large tables. Today I started to get transaction wraparound warnings, so I went to check it out. It turns out the autovacuum processes are all just doing nothing. When I strace them, all three are blocked on syscalls.
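
To put the wraparound warnings in numbers, here is a minimal Python sketch of the headroom arithmetic. It is an illustration under assumptions, not output from this server: the 10 million and 1 million margins reflect my reading of the 8.3 documentation, the limit is rounded to 2^31, the helper names `xid_headroom` and `classify` are hypothetical, and the sample ages are made up.

```python
# Sketch only: thresholds are assumptions based on the 8.3 documentation,
# not values read from a running server.
WRAP_LIMIT = 2**31          # roughly 2 billion XIDs can be "in the past"
WARN_MARGIN = 10_000_000    # assumed: warnings start this close to the limit
STOP_MARGIN = 1_000_000     # assumed: new transactions are refused this close


def xid_headroom(xid_age):
    """Transactions left before the wraparound point for a given XID age."""
    return WRAP_LIMIT - xid_age


def classify(xid_age):
    """Label a database by how close its oldest XID is to the limit."""
    left = xid_headroom(xid_age)
    if left <= STOP_MARGIN:
        return "refusing"
    if left <= WARN_MARGIN:
        return "warning"
    return "ok"


# Rows shaped like the result of:
#   SELECT datname, age(datfrozenxid) FROM pg_database;
# The database names and ages here are invented for illustration.
sample = [("template1", 150_000_000), ("bigdb", 2_140_000_000)]
for name, age in sample:
    print(name, xid_headroom(age), classify(age))
```

Feeding the real output of that query into `classify` would show which databases are driving the warnings and roughly how many transactions remain before the server stops accepting new ones.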

So I restarted the database and ran a vacuum. Of course, once the wraparound warning is reached, there's no way to disable the autovacuum, so now my vacuum maintenance job is competing with three invulnerable autovacuum processes. I am thinking of sending them SIGSTOP.

Anyway, I have some issues. One, of course, is that the autovacuum should not have been deadlocked or otherwise stalled like that. Perhaps it needs a watchdog of some kind. Has anyone else experienced an issue like that in 8.3.1? The only thing I can see in the release notes that indicates this problem may have been fixed is the following:

    "Repair two places where SIGTERM exit of a backend could leave corrupted state in shared memory (Tom)"

However, I don't know who or what would have sent SIGTERM to the autovacuum children.

Secondly, there really does need to be an autovacuum=off,really,thanks so that my maintenance can proceed without competition for I/O resources. Is there any way to make that happen? Is my SIGSTOP idea dangerous?

-jwb