Damn, I can't force OpenManage to set the timer below 60s :(

# omconfig system recovery timer=10
Error! Recovery reset time must be between 60 and 720 seconds.

I'll try to see if we can disable it.

----- Original Message -----
From: "aderumier" <aderum...@odiso.com>
To: "pve-devel" <pve-devel@pve.proxmox.com>
Sent: Thursday, 3 December 2015 18:24:40
Subject: Re: [pve-devel] Blacklisting HP hardware watchdog timer module ?

I just found a strange bug with ipmi_watchdog, related to Dell OpenManage.

At boot, the timeout is correctly set to 10s:

root@kvmtest1 ~ # ipmitool mc watchdog get
Watchdog Timer Use: SMS/OS (0x44)
Watchdog Timer Is: Started/Running
Watchdog Timer Actions: Hard Reset (0x01)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x10
Initial Countdown: 10 sec
Present Countdown: 9 sec

But after some minutes (5-10 min), I'm seeing it at 480s:

# ipmitool mc watchdog get
Watchdog Timer Use: SMS/OS (0xc4)
Watchdog Timer Is: Started/Running
Watchdog Timer Actions: No action (0x00)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x10
Initial Countdown: 480 sec
Present Countdown: 479 sec

In Dell OpenManage, I see a reset configuration option set to 480s.
(I think it's the OpenManage service that overwrites the value.)

I'll add a note in the wiki about this too.

----- Original Message -----
From: "aderumier" <aderum...@odiso.com>
To: "dietmar" <diet...@proxmox.com>
Cc: "pve-devel" <pve-devel@pve.proxmox.com>
Sent: Thursday, 3 December 2015 17:48:14
Subject: Re: [pve-devel] Blacklisting HP hardware watchdog timer module ?

>>The timeout must be 60 seconds!! Never change that.
>>
>>We set the timeout to 60s when we start watchdog-mux.

Ah, OK. I thought we needed to define it manually.

What is the difference between these two timeouts?

+ int watchdog_timeout = 10;
+ int client_watchdog_timeout = 60;

ipmitool gives me 10s, so it seems to work fine :)

# ipmitool mc watchdog get
Initial Countdown: 10 sec

> Another question: I did some tests two weeks ago with a customer,
> and I think I had a problem when the node rebooted too fast
> (pve-ha-manager sees the node down, but it comes up again before the VM was
> migrated).
> Is it a known bug?

>>What bug exactly?

I don't remember exactly, but the LRM or CRM was stuck because the node (and its VMs) had rebooted too fast.
I don't have access to the customer's logs, sorry.

----- Original Message -----
From: "dietmar" <diet...@proxmox.com>
To: "aderumier" <aderum...@odiso.com>
Cc: "pve-devel" <pve-devel@pve.proxmox.com>
Sent: Thursday, 3 December 2015 17:28:55
Subject: Re: [pve-devel] Blacklisting HP hardware watchdog timer module ?

> BTW, what is the best timeout for the watchdog?
> I think pve-ha-manager waits for around 1 min before migrating a VM?
> If yes, should the watchdog timeout be lower?

The timeout must be 60 seconds!! Never change that.

We set the timeout to 60s when we start watchdog-mux.

> Another question: I did some tests two weeks ago with a customer,
> and I think I had a problem when the node rebooted too fast
> (pve-ha-manager sees the node down, but it comes up again before the VM was
> migrated).
> Is it a known bug?

What bug exactly?

_______________________________________________
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
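
Note: the drift reported in this thread (Initial Countdown going from 10 sec at boot to 480 sec a few minutes later) could be detected with a small script. The following is only a minimal sketch, assuming the `ipmitool mc watchdog get` output keeps the "Key: Value" layout quoted in the messages above. The function names and the default expected value of 10 seconds are hypothetical choices for illustration; they are not part of Proxmox VE or ipmitool.

```python
import re

# Sample outputs copied from the thread (at boot, and 5-10 minutes later).
SAMPLE_AT_BOOT = """\
Watchdog Timer Use: SMS/OS (0x44)
Watchdog Timer Is: Started/Running
Watchdog Timer Actions: Hard Reset (0x01)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x10
Initial Countdown: 10 sec
Present Countdown: 9 sec
"""

SAMPLE_LATER = """\
Watchdog Timer Use: SMS/OS (0xc4)
Watchdog Timer Is: Started/Running
Watchdog Timer Actions: No action (0x00)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x10
Initial Countdown: 480 sec
Present Countdown: 479 sec
"""


def parse_watchdog_status(text):
    """Turn 'Key: Value' lines into a dict; lines without a colon are skipped."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


def initial_countdown_seconds(status):
    """Return the 'Initial Countdown' field as an int, or None if absent."""
    match = re.match(r"(\d+)\s*sec", status.get("Initial Countdown", ""))
    return int(match.group(1)) if match else None


def countdown_was_overwritten(text, expected_seconds=10):
    """True if the reported initial countdown differs from the expected one.

    expected_seconds=10 is an assumption taken from the value seen at boot
    in this thread.
    """
    seconds = initial_countdown_seconds(parse_watchdog_status(text))
    return seconds is not None and seconds != expected_seconds


if __name__ == "__main__":
    print(countdown_was_overwritten(SAMPLE_AT_BOOT))  # False
    print(countdown_was_overwritten(SAMPLE_LATER))    # True
```

On a live node, the same functions could be fed the real command output, for example via `subprocess.run(["ipmitool", "mc", "watchdog", "get"], capture_output=True, text=True).stdout`. The "Watchdog Timer Actions" field is also worth comparing, since it changed from "Hard Reset" to "No action" in the second sample.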