2007/6/13, Thomas Åkerblom (HF/EBC) <[EMAIL PROTECTED]>:

Hi All.
One of our testers has a system with four servers and a standby (heartbeat
2.0.7).
servers:
lab14b-ts-lim1
lab14b-ts-lim2
lab14b-ts-lim3
lab14b-ts-lim4
standby:
lab14b-ts-limx
For each server there is a group of three services (two IP addresses and a
telephony service).
group_lim1
group_lim2
group_lim3
group_lim4
If one server goes down the standby should take over the failing group.
At one point the tester powered down lim1 and lim2 and after a short time
powered them up again.
Lim1 09:55:06
Lim2 09:54:20
Group_lim1 starts on lim1 and group_lim2 starts on lim2
At 12:25:05 the standby suddenly begins to start group_lim3 and at the
same time group_lim2.
The services for these two groups go up and down for a while until the IPs
for group_lim3 and the telephony service for group_lim2 are running.


I see this in messages-lim2.txt:
===
Jun 11 12:24:59 lab14b-ts-lim2 heartbeat: [3362]: info: Link lab14b-ts-lim3:eth0 dead.
Jun 11 12:25:02 lab14b-ts-lim2 kernel: tg3: eth1: Link is down.
Jun 11 12:25:03 lab14b-ts-lim2 kernel: tg3: eth0: Link is down.
===
and soon:
===
Jun 11 12:25:09 lab14b-ts-lim2 kernel: tg3: eth1: Link is up at 100 Mbps, full duplex.
Jun 11 12:25:09 lab14b-ts-lim2 kernel: tg3: eth1: Flow control is off for TX and off for RX.
Jun 11 12:25:10 lab14b-ts-lim2 kernel: tg3: eth0: Link is up at 100 Mbps, full duplex.
Jun 11 12:25:10 lab14b-ts-lim2 kernel: tg3: eth0: Flow control is off for TX and off for RX.
===
lab14b-ts-lim2 went off-line for a moment for a reason we don't know yet.
Heartbeat handled this right as far as I understand.
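
To put a number on "for a moment", the kernel link messages can be paired up
per interface. The following is only a rough sketch in Python, not something
heartbeat provides: the function name link_outages, the regular expression,
and the assumed year 2007 (syslog timestamps carry no year) are my own
choices, and it only understands the tg3 "Link is up/down" lines quoted in
this mail. It also assumes an English locale for the month abbreviation.

```python
import re
from datetime import datetime

# Sample input: the tg3 link messages from messages-lim2.txt quoted in this mail.
LOG = """\
Jun 11 12:25:02 lab14b-ts-lim2 kernel: tg3: eth1: Link is down.
Jun 11 12:25:03 lab14b-ts-lim2 kernel: tg3: eth0: Link is down.
Jun 11 12:25:09 lab14b-ts-lim2 kernel: tg3: eth1: Link is up at 100 Mbps, full duplex.
Jun 11 12:25:10 lab14b-ts-lim2 kernel: tg3: eth0: Link is up at 100 Mbps, full duplex.
"""

# Timestamp, host (ignored), interface name, and new link state.
LINK_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: tg3: (\w+): Link is (up|down)"
)


def link_outages(text, year=2007):
    """Return (interface, seconds_down) for each down/up pair, in order of recovery.

    The year is an assumption, because syslog lines do not include one.
    """
    down_since = {}  # interface -> time of the pending "Link is down"
    outages = []
    for line in text.splitlines():
        match = LINK_RE.match(line)
        if not match:
            continue  # skip anything that is not a tg3 link state change
        stamp, iface, state = match.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if state == "down":
            down_since[iface] = when
        elif iface in down_since:
            start = down_since.pop(iface)
            outages.append((iface, (when - start).total_seconds()))
    return outages


for iface, seconds in link_outages(LOG):
    print(f"{iface}: down for {seconds:.0f} s")
```

Comparing the resulting durations with the deadtime configured in ha.cf would
show whether the gap was long enough for the other nodes to declare lim2 dead.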

At this time I can see the message 'Another DC detected' on lim2 and on
limx.
crm_mon shows lim3 as OFFLINE on all nodes, even on lim3 itself.
I'm confused; can someone shed some light on this?
I attach logs that I think may be of interest.

<<nodgp.zip>>
BR.
*** Thomas



_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

