Package: heartbeat
Version: 1:3.0.3-2
Severity: important

During the upgrade of a couple of systems to squeeze, I encountered a problem that took me a while to debug (although it's nothing specific to heartbeat changes in squeeze).
More specifically, I was using heartbeat in a simple setup with a couple of NFS mounts and LSB resources. One of them was postgresql, which was getting skipped altogether as a resource.

After some head-scratching, debugging and code-reading, the problem was pinpointed to this: before starting a resource, ResourceManager checks whether that service is already running (for LSB scripts, that's “/etc/init.d/foo status”) and, if so, skips it¹ altogether.

So, while ResourceManager tries to behave as per LSB with respect to exit codes, it fails to do so when running status on init scripts. Instead of looking at the exit code, it performs the following horrendous heuristic (ResourceManager:209):

    case `$spath $arg status` in
      *[Nn][Oo][Tt]\ *[Rr]unning*) return 3;;
      *[Rr]unning*|*OK*) return 0;;
      *) return 3;;
    esac

That bit me during the upgrade, because PostgreSQL's init script in squeeze produces the following output:

    ## when running
    $ /etc/init.d/postgresql status
    Running clusters: 8.4/main

    ## when not running
    # /etc/init.d/postgresql status
    Running clusters:

The second one is meant to say that /nothing/ is running, but heartbeat mistakenly treats it as running because the word "Running" matches the second pattern above.

There's no way to work around this without writing your own resource (which I did). But I think this would be better solved in heartbeat itself, e.g. by adhering to LSB and checking the exit code of the init script, instead of pattern matching on its output.

[ I'm setting severity to important on this one, as my feeling is that postgresql+heartbeat installations are common ]

Regards,
Faidon

¹: Without logging /anything/, which probably is a bug on its own.
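
The mismatch described in this report can be reproduced in isolation. The sketch below is illustrative only and is not heartbeat code: classify_by_output copies the quoted pattern match but applies it to a string, classify_by_exit_code is a hypothetical alternative that trusts the exit code, and fake_stopped_script is a hypothetical stand-in for an init script. It assumes the script returns 3 from "status" when the service is stopped, as LSB specifies; whether the squeeze postgresql script actually does so is not established here.

```shell
#!/bin/sh
# Illustrative sketch, not heartbeat code.

# Mirrors the pattern match quoted from ResourceManager, applied to a string.
classify_by_output() {
    case "$1" in
        *[Nn][Oo][Tt]\ *[Rr]unning*) return 3 ;;
        *[Rr]unning*|*OK*)           return 0 ;;
        *)                           return 3 ;;
    esac
}

# Hypothetical alternative: run "<script> status", discard its output and
# pass its exit code through (LSB: 0 = running, 3 = not running).
classify_by_exit_code() {
    "$1" status >/dev/null 2>&1
}

# Hypothetical stand-in for an init script whose service is stopped: it
# prints the squeeze postgresql text and returns 3. The exit code is an
# assumption made for this sketch.
fake_stopped_script() {
    echo "Running clusters: "
    return 3
}

rc=0; classify_by_output "$(fake_stopped_script status)" || rc=$?
echo "output heuristic: $rc"

rc=0; classify_by_exit_code fake_stopped_script || rc=$?
echo "exit code check:  $rc"
```

With the stub above, the output heuristic reports 0 (running) for the stopped service, while the exit code check reports 3 (not running). A real fix would also have to decide how to treat the other LSB status codes (1, 2 and 4), which this sketch simply passes through.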