Hello,

Whenever we fail over ldirectord, we notice that the shutdown process (on the node which has just switched from "primary" to "secondary") leaves behind checking children, which keep hammering the real servers for no benefit.
Here is a simple fix (hack?) to make the children realise that their parent is gone and that they should exit. They simply check their parent process ID, and if it is 1 (init), there is no reason for them to hang around.

My environment is CentOS 5.2 x86_64 running inside a Xen DomU with heartbeat-ldirectord-2.1.3-3.el5.centos, but the patch below is against 2.1.4 from http://hg.linux-ha.org/lha-2.1/archive/STABLE-2.1.4.tar.bz2

--- ldirectord/ldirectord.in.orig	2009-03-02 16:59:46.000000000 +1100
+++ ldirectord/ldirectord.in	2009-03-02 17:03:41.000000000 +1100
@@ -2311,6 +2311,11 @@
 			service_set($v, $r, "down", {force => 1});
 		}
 		while (1) {
+			if (getppid() == 1)
+			{
+				&ld_log("parent of $$ died; exiting\n");
+				exit 1;
+			}
 			foreach my $r (@$real) {
 				$0 = "$virtual_id checking $$r{server}";
 				_check_real($v, $r);

I'd be glad to hear whether you like this patch or think it's an ugly workaround which shouldn't be necessary.

Cheers,

--Amos

_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
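The parent check described above can also be written without hard-coding PID 1. The following is only a sketch under stated assumptions, not part of ldirectord: the helper names `parent_is_gone` and `run_checks` are hypothetical, and it assumes the parent's PID is captured before `fork()` and handed to the child. Comparing against that saved PID would also catch the case where an orphan is re-parented to a process other than init (for example a Linux child subreaper).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in ldirectord): true once the current parent PID
# no longer matches the PID of the process that forked this child.
sub parent_is_gone {
    my ($original_ppid, $current_ppid) = @_;
    return $current_ppid != $original_ppid ? 1 : 0;
}

# Hypothetical stand-in for the checking loop: run $check->() until the
# parent disappears, then return so the caller can exit.
sub run_checks {
    my ($original_ppid, $check, $interval) = @_;
    while (1) {
        if (parent_is_gone($original_ppid, getppid())) {
            print STDERR "parent of $$ died; exiting\n";
            return;
        }
        $check->();          # the real-server checks would go here
        sleep $interval;
    }
}

# The comparison on fixed, made-up PIDs:
print parent_is_gone(4242, 4242) ? "exit\n" : "keep checking\n";  # parent still there
print parent_is_gone(4242, 1)    ? "exit\n" : "keep checking\n";  # re-parented to init
```

In a forking daemon the parent would save `my $parent_pid = $$;` before calling `fork()`, and the child would call `run_checks($parent_pid, \&some_check, $interval)` followed by `exit 1`. Capturing the PID before the fork avoids a race in which the parent dies before the child has had a chance to call `getppid()` for the first time.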
