Lars Marowsky-Bree wrote:
On 2008-04-03T13:59:36, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:

Any crm* program is significantly slower on a non-DC node
regardless of whether something's happening in the cluster. It's
always been like that.

I can confirm that. It's been that way for me ever since I started using heartbeat.

Hm, I've not personally observed that in my test cluster, or at least
not noticed anything out of line.

"Significantly" slower is bad; we mandate that "DC or not DC" is _not_
the question, and that users shouldn't care about this designation.

Could anyone who reproduces this report a few more details? Is it the
local node, the time it takes to process on the DC, or the network
roundtrip? (Should be observable using tcpdump/wireshark)

Just 2 measurements:

dktest2sles10:~# time crmadmin -D
Designated Controller is: dktest2sles10

real    0m0.005s
user    0m0.004s
sys     0m0.000s

dktest1sles10:~/cib# time crmadmin -D
Designated Controller is: dktest2sles10

real    0m1.014s
user    0m0.000s
sys     0m0.004s

dktest2sles10:~# time cibadmin -Q &> /dev/null

real    0m0.009s
user    0m0.004s
sys     0m0.004s

dktest1sles10:~/cib# time cibadmin -Q &> /dev/null

real    0m1.713s
user    0m0.004s
sys     0m0.004s
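
To put the four `real` values above side by side, here is a minimal Python sketch that converts them to seconds and prints the extra wall-clock time on the non-DC node. The helper names `parse_real` and `slowdown` are made up for illustration, and the numbers are just the single runs quoted above, so treat the ratios as rough.

```python
import re


def parse_real(value: str) -> float:
    """Convert a bash `time` value such as '0m1.014s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def slowdown(dc: str, non_dc: str) -> float:
    """Ratio of non-DC wall-clock time to DC wall-clock time."""
    return parse_real(non_dc) / parse_real(dc)


# (command, real on the DC, real on the non-DC node), copied from the runs above
MEASUREMENTS = [
    ("crmadmin -D", "0m0.005s", "0m1.014s"),
    ("cibadmin -Q", "0m0.009s", "0m1.713s"),
]

for command, dc, non_dc in MEASUREMENTS:
    extra = parse_real(non_dc) - parse_real(dc)
    print(f"{command}: +{extra:.3f}s on the non-DC node "
          f"(about {slowdown(dc, non_dc):.0f}x slower)")
```

The user and sys times are near zero on both nodes, so almost all of the difference is time spent waiting rather than computing locally.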

tcpdump:

y.x.z.103 is the DC
y.x.z.102 is the other node

08:22:16.803702 IP 10.200.200.102.32952 > 10.200.200.103.694: UDP, length 217
08:22:16.803626 IP 10.250.250.102.32951 > 10.250.250.103.694: UDP, length 221
08:22:16.803637 IP 10.250.250.102.32951 > 10.250.250.103.694: UDP, length 217
08:22:16.929482 IP 10.250.250.103.32869 > 10.250.250.102.694: UDP, length 221
08:22:16.929528 IP 10.200.200.103.32870 > 10.200.200.102.694: UDP, length 221

Up to here, these are just the normal heartbeat packets, I think. Notice the roughly identical length.

Then I do:

debian dktest1sles10:~/cib# date +%H:%M:%S:%N; time cibadmin -Q &> /dev/null
08:22:16:041111482

real    0m1.189s
user    0m0.008s
sys     0m0.00

08:22:16.929976 IP 10.250.250.103.32869 > 10.250.250.102.694: UDP, length 2263
08:22:16.930026 IP 10.200.200.103.32870 > 10.200.200.102.694: UDP, length 2263
08:22:16.930029 IP 10.200.200.103 > 10.200.200.102: udp
08:22:16.929979 IP 10.250.250.103 > 10.250.250.102: udp

Both servers were synced with ntpdate against the same time source a minute earlier. So to me, it looks like it's the DC that needs some time to process the request. The cluster had one primitive resource at that time and should have been pretty much idle.
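
The arithmetic behind that reading can be sketched in a few lines of Python. This rests on assumptions: that the first 2263-byte packet from the DC is the reply to the `cibadmin -Q` query, that the tcpdump clock and the `date` clock agree (the ntpdate sync makes that plausible but does not guarantee it), and that the short gap between `date` and the start of `time` can be ignored. `to_ns` is a hypothetical helper, not part of any cluster tool.

```python
def to_ns(stamp: str) -> int:
    """Convert 'HH:MM:SS.fraction' to nanoseconds since midnight."""
    clock, _, fraction = stamp.partition(".")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    # Pad or cut the fractional part to exactly nine digits (nanoseconds)
    fraction_ns = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return ((hours * 60 + minutes) * 60 + seconds) * 1_000_000_000 + fraction_ns


# `date +%H:%M:%S:%N` printed 08:22:16:041111482; the last colon precedes the nanoseconds
command_start = to_ns("08:22:16.041111482")
# Timestamp of the first length-2263 packet from the DC (assumed to be the reply)
first_large_reply = to_ns("08:22:16.929976")
# `time` reported real 0m1.189s for the command
command_end = command_start + 1_189_000_000

wait_for_reply_ms = (first_large_reply - command_start) / 1e6
after_reply_ms = (command_end - first_large_reply) / 1e6

print(f"start -> first large reply packet: {wait_for_reply_ms:.1f} ms")
print(f"first large reply packet -> exit:  {after_reply_ms:.1f} ms")
```

Under those assumptions, most of the wall-clock time passes before the large packets from the DC show up, and the rest is spent between their arrival and the command returning.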

Regards
Dominik
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
