On Monday 01 August 2011 13:58:55 Trevor Hemsley wrote:
> Today they did it again. And then several more times - about every 20
> minutes in fact. The servers are in a remote data centre and I have no
> console access and the iLO's on these two servers are not set up and I'm
> unable to use them so I can see no output on the console. There's no
> information in /var/log about what the problem is, all I see is that one
> of the servers reboots itself and then 5 to 10 seconds later, the 2nd
> one follows it. I've seen from the logs that it's not always the same
> one that reboots first, sometimes it's one and sometimes the other. The
> only way I've managed to get the servers out of their 20 minute reboot
> loop is to stop drbd on one of the pair and migrate all my VMs to run on
> the other with all the DRBD devices in standalone mode. This seems to me
> to indicate that DRBD is most probably involved in the reboot.

Just a shot in the dark (because I was hit by the same thing last Friday): is
there a watchdog active and set to a timeout of 20 minutes? It could be that
the corresponding userspace tool was removed or rendered unusable during the
update...
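
If you have no console, a rough way to check for this over ssh might look like the sketch below. It assumes a kernel that exposes watchdog devices under /sys/class/watchdog (this needs CONFIG_WATCHDOG_SYSFS and may well be missing on older kernels), and the function name read_watchdogs is just something I made up for illustration. A timeout of 1200 seconds together with a state of "active" would match the 20 minute pattern.

```python
import os


def read_watchdogs(sysfs_root="/sys/class/watchdog"):
    """Return one dict per watchdog device found under sysfs_root.

    Assumption: the kernel exposes the 'identity', 'state' and 'timeout'
    attributes via sysfs. Attributes that cannot be read are reported as None.
    """
    found = []
    if not os.path.isdir(sysfs_root):
        # No sysfs watchdog class at all: nothing to report.
        return found
    for name in sorted(os.listdir(sysfs_root)):
        dev_dir = os.path.join(sysfs_root, name)
        info = {"name": name}
        for attr in ("identity", "state", "timeout"):
            try:
                with open(os.path.join(dev_dir, attr)) as fh:
                    info[attr] = fh.read().strip()
            except OSError:
                info[attr] = None
        found.append(info)
    return found


if __name__ == "__main__":
    for wd in read_watchdogs():
        # 'timeout' is in seconds, so 20 minutes would show up as 1200.
        print(wd)
```

An empty result only means that nothing is visible through sysfs; it does not rule out a watchdog driven by the BIOS or the iLO itself.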

Good luck,

Arnold


_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
