This is a continuation of trying to get ldirectord working under
pacemaker. I have a working installation of ldirectord. I know this
because if I manually configure the eth0:0 pseudo-interface with the
virtual server address, and manually start ldirectord with:
# /usr/sbin/ldirectord /etc/ha.d/ldirectord.cf start
...then everything works. I can connect to the virtual service address
and port, and I get properly redirected to one of the real servers.
ipvsadm shows normal output. All looks good.
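For reference, the manual test looks roughly like the sketch below. The address and netmask are placeholders rather than my real values, and the RUNNER variable is just a hypothetical convenience so that the script only prints the commands unless told otherwise.

```shell
#!/bin/sh
# Sketch of the manual test described above.
# VIP and NETMASK are placeholder values (assumptions), not the real ones.
VIP="${VIP:-192.0.2.10}"
NETMASK="${NETMASK:-255.255.255.0}"
# RUNNER defaults to "echo", so the commands are only printed.
# Run with RUNNER= (empty) as root to actually execute them.
RUNNER="${RUNNER-echo}"

manual_test() {
    # Bring up the alias interface carrying the virtual server address.
    $RUNNER ifconfig eth0:0 "$VIP" netmask "$NETMASK" up
    # Start ldirectord by hand with the same config file the RA reports using.
    $RUNNER /usr/sbin/ldirectord /etc/ha.d/ldirectord.cf start
    # List the IPVS table numerically to check the virtual and real servers.
    $RUNNER ipvsadm -L -n
}

manual_test
```

With the default settings this only echoes the three commands, which makes it safe to paste and check before running anything as root.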
However, if I try to start the ldirectord resource, it starts, then
fails, then starts, then fails, etc. This will continue until I issue a
"resource stop ldirectord" command in the crm shell.
So it has to be something with how I configured it, but I'm damned if I
can figure it out. Here is what I have that involves this resource:
primitive ldirectord ocf:heartbeat:ldirectord \
        op start interval="20" timeout="15" \
        op stop interval="20" timeout="15" \
        op monitor interval="20" timeout="20"
colocation vdir-ipi-with-ldirectord inf: vdir-ipi ldirectord
order vdir-ipi-before-ldirectord inf: vdir-ipi ldirectord
The vdir-ipi is an IPaddr resource that starts fine and results in
the eth0:0 alias interface being configured and brought up.
When I issue a "resource start ldirectord" command from the crm shell,
what I get from lrmd is repeats of this sequence:
Oct 28 18:12:24 vmx1.ucar.edu lrmd: [4842]: info: rsc:vdir-ipi:5464:
start
Oct 28 18:12:24 vmx1.ucar.edu lrmd: [4842]: info: Managed vdir-ipi:start
process 4923 exited with return code 0.
Oct 28 18:12:25 vmx1.ucar.edu lrmd: [4842]: info: rsc:ldirectord:5466:
start
Oct 28 18:12:25 vmx1.ucar.edu lrmd: [4842]: info: RA output:
(ldirectord:start:stdout) /usr/sbin/ldirectord /etc/ha.d/ldirectord.cf
start
Oct 28 18:12:26 vmx1.ucar.edu lrmd: [4842]: info: Managed
ldirectord:start process 5103 exited with return code 0.
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: rsc:ldirectord:5467:
start
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: perform_op:2906:
operation start[5467] on ocf::ldirectord::ldirectord for client 4845,
its parameters: CRM_meta_interval=[20000] CRM_meta_timeout=[15000]
crm_feature_set=[3.0.1] CRM_meta_name=[start] for rsc is already
running.
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: perform_op:2916:
postponing all ops on resource ldirectord by 1000 ms
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: perform_op:2906:
operation start[5467] on ocf::ldirectord::ldirectord for client 4845,
its parameters: CRM_meta_interval=[20000] CRM_meta_timeout=[15000]
crm_feature_set=[3.0.1] CRM_meta_name=[start] for rsc is already
running.
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: perform_op:2910:
operations on resource ldirectord already delayed
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: Managed
ldirectord:start process 5221 exited with return code 0.
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: rsc:ldirectord:5468:
stop
Oct 28 18:12:27 vmx1.ucar.edu lrmd: [4842]: info: Managed
ldirectord:stop process 5226 exited with return code 0.
Oct 28 18:12:28 vmx1.ucar.edu lrmd: [4842]: WARN: Managed
ldirectord:monitor process 5265 exited with return code 7.
Oct 28 18:12:29 vmx1.ucar.edu lrmd: [4842]: info: cancel_op: operation
monitor[5469] on ocf::ldirectord::ldirectord for client 4845, its
parameters: CRM_meta_interval=[20000] CRM_meta_timeout=[20000]
crm_feature_set=[3.0.1] CRM_meta_name=[monitor] cancelled
Oct 28 18:12:29 vmx1.ucar.edu lrmd: [4842]: info: rsc:ldirectord:5470:
stop
Oct 28 18:12:29 vmx1.ucar.edu lrmd: [4842]: info: Managed
ldirectord:stop process 5296 exited with return code 0.
And then it repeats:
Oct 28 18:12:31 vmx1.ucar.edu lrmd: [4842]: info: rsc:ldirectord:5471:
start
etc.
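To make the cycle easier to follow, the lrmd lines can be boiled down to one line per completed operation. This is only a rough sketch: the function name summarize_lrmd is made up for illustration, and it assumes the syslog format shown above, with each entry on a single unwrapped line as it would be in the log file itself.

```shell
#!/bin/sh
# Condense lrmd "Managed <rsc>:<op> process N exited with return code R."
# lines into "<time> <rsc>:<op> rc=<R>". Reads log lines on stdin.
summarize_lrmd() {
    awk '/lrmd:/ && /exited with return code/ {
        op = ""
        # The resource:operation token follows the word "Managed".
        for (i = 1; i < NF; i++) if ($i == "Managed") op = $(i + 1)
        # The return code is the last field, with a trailing period.
        rc = $NF
        sub(/\.$/, "", rc)
        # Field 3 is the timestamp in the syslog prefix.
        print $3, op, "rc=" rc
    }'
}
```

Something like `grep lrmd /var/log/messages | summarize_lrmd` would then print the start, stop, and monitor results in order; the log file path is an assumption and may differ depending on the syslog setup.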
How can I figure out what I have done wrong here?
Thanks,
--Greg