Hi Antoine,

There was a pull request to change the default value:
https://github.com/apache/cloudstack/pull/10111

I personally agree with the change, but it is better to discuss it with a wider group of users. You can share your opinion on GitHub.

-Wei

On Fri, Mar 28, 2025 at 5:17 PM Antoine Boucher <antoi...@haltondc.com> wrote:

> Thank you, Wei, as always.
>
> This is a half-empty versus half-full glass issue.
>
> Based on our experience, there is more to lose than gain. I would suggest
> setting the default to
> reboot.host.and.alert.management.on.heartbeat.timeout=false.
>
> Regards,
> Antoine
>
> *Antoine Boucher*
> antoi...@haltondc.com
> [o] +1-226-505-9734
> www.haltondc.com
>
> On Mar 28, 2025, at 3:22 AM, Wei ZHOU <ustcweiz...@gmail.com> wrote:
>
> Hi,
>
> Currently this is the default behavior that the host is rebooted in case of
> NFS failure.
>
> You can add the line to agent.properties and restart cloudstack-agent to
> make it effective.
>
> reboot.host.and.alert.management.on.heartbeat.timeout=false
>
> -Wei
>
> On Fri, Mar 28, 2025 at 5:06 AM Antoine Boucher
> <antoi...@haltondc.com.invalid> wrote:
>
> We experienced unexpected cascading reboots across all hosts, followed by
> HA kicking in and migrating VMs. Amid the chaos, we discovered that a newly
> added zone-wide NFS server, used only by one stopped test VM, had gone
> offline. Once we disabled that NFS server in the UI, everything slowly
> stabilized.
>
> We have a large number of NFS servers online in the zone. Is this expected
> behavior? Can one NFS server going offline with just a single stopped VM
> trigger mass host reboots? This feels like operational madness.
>
> Regards,
> Antoine
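
For anyone applying the agent.properties change described above across many hosts, here is a minimal sketch of how the edit could be scripted. Only the property name and the need to restart cloudstack-agent come from this thread. The helper names `set_property` and `update_file` are hypothetical, and the file path and restart command in the comments are assumptions about a typical KVM host, so check them against your own installation.

```python
from pathlib import Path

# Property name taken from the thread.
KEY = "reboot.host.and.alert.management.on.heartbeat.timeout"


def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with `key` set to `value`.

    Active (non-commented) definitions of the key are replaced by a single
    line; commented-out lines are left untouched. If the key is not present,
    the new line is appended at the end.
    """
    new_line = f"{key}={value}"
    out = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith("#")
        if not is_comment and stripped.split("=", 1)[0].strip() == key:
            if not found:
                out.append(new_line)
                found = True
            continue  # drop any duplicate definitions of the same key
        out.append(line)
    if not found:
        out.append(new_line)
    return "\n".join(out) + "\n"


def update_file(path: str, key: str = KEY, value: str = "false") -> None:
    """Rewrite the properties file at `path` in place (hypothetical helper)."""
    p = Path(path)
    p.write_text(set_property(p.read_text(), key, value))


# Example usage (path is an assumption, not confirmed in this thread):
#   update_file("/etc/cloudstack/agent/agent.properties")
# Then restart the agent so the setting takes effect, for example with
#   systemctl restart cloudstack-agent   (assumes a systemd-based host)
```

The function works on text rather than directly on the file so that the result can be reviewed or diffed before anything is written to a host.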