Hi,
I have been running nut successfully for a long time with my Debian 11 server.
I upgraded my server to Debian 12 today, which upgraded nut also from 2.7.4-13
to 2.8.0-7. I noticed that after upgrade there was a upssched process running
and taking 100% cpu time.
I checked if there were any changes to configuration file formats with nut
upgrade and only differences I noticed were a terminology change from
master/slave to primary/secondary in upsd.users and upsmon.conf, and dummy mode
change for the dummy-ups. I didn’t expect these being the culprit, but I made
the changes to my configuration. No change, after five minutes of starting nut,
upssched will be running with 100% cpu usage.
Any ideas what could be causing this? Here is my configuration and in the end
are some log snippets:
---------------------------------------------------
heartbeat.conf
# heartbeat.conf -- 10 minute heartbeat
ups.status: OL
TIMER 300
ups.status: OB
TIMER 300
---------------------------------------------------
hosts.conf
MONITOR eaton5s@localhost "Eaton 5S 700"
---------------------------------------------------
nut.conf
MODE=netserver
---------------------------------------------------
ups.conf
maxretry = 3
[eaton5s]
driver = usbhid-ups
port = auto
desc = "Eaton 5S 700i UPS"
vendorid = "0463"
productid = "FFFF"
pollinterval = 15
[heartbeat]
driver = dummy-ups
mode=dummy-loop
port = heartbeat.conf
desc = "Watch over NUT"
---------------------------------------------------
upsd.conf
LISTEN 0.0.0.0 3493
---------------------------------------------------
upsmon.conf
RUN_AS_USER root
MONITOR eaton5s@localhost 1 upsmaster xxx primary
MONITOR heartbeat@localhost 0 upsmaster xxx primary
MINSUPPLIES 1
NOTIFYCMD /sbin/upssched
SHUTDOWNCMD "/sbin/shutdown -h +0"
POLLFREQ 5
POLLFREQALERT 5
HOSTSYNC 15
DEADTIME 15
POWERDOWNFLAG /etc/killpower
RBWARNTIME 43200
NOCOMMWARNTIME 300
FINALDELAY 5
NOTIFYMSG ONLINE "UPS %s: On line power."
NOTIFYMSG ONBATT "UPS %s: On battery."
NOTIFYMSG LOWBATT "UPS %s: Battery is low."
NOTIFYMSG REPLBATT "UPS %s: Battery needs to be replaced."
NOTIFYMSG FSD "UPS %s: Forced shutdown in progress."
NOTIFYMSG SHUTDOWN "Auto logout and shutdown proceeding."
NOTIFYMSG COMMOK "UPS %s: Communications (re-)established."
NOTIFYMSG COMMBAD "UPS %s: Communications lost."
NOTIFYMSG NOCOMM "UPS %s: Not available."
NOTIFYMSG NOPARENT "upsmon parent dead, shutdown impossible."
NOTIFYFLAG ONLINE EXEC
NOTIFYFLAG ONBATT EXEC
NOTIFYFLAG LOWBATT SYSLOG+WALL
NOTIFYFLAG REPLBATT SYSLOG+WALL
NOTIFYFLAG FSD SYSLOG+WALL
NOTIFYFLAG SHUTDOWN SYSLOG+WALL
NOTIFYFLAG COMMOK SYSLOG+WALL
NOTIFYFLAG COMMBAD SYSLOG+WALL
NOTIFYFLAG NOCOMM SYSLOG+WALL
NOTIFYFLAG NOPARENT SYSLOG+WALL
---------------------------------------------------
upssched.conf
CMDSCRIPT /etc/nut/upssched-cmd
PIPEFN /var/run/nut/upssched.pipe
LOCKFN /var/run/nut/upssched.lock
AT ONBATT eaton5s@localhost EXECUTE onbatt
AT ONBATT eaton5s@localhost START-TIMER beeperoff 15
AT ONLINE eaton5s@localhost EXECUTE online
AT ONLINE eaton5s@localhost CANCEL-TIMER beeperoff
AT ONBATT eaton5s@localhost START-TIMER shutdown-ws2019 300
AT ONLINE eaton5s@localhost CANCEL-TIMER shutdown-ws2019
AT ONBATT heartbeat@localhost CANCEL-TIMER heartbeat-failure-timer
AT ONBATT heartbeat@localhost START-TIMER heartbeat-failure-timer 660
---------------------------------------------------
upsd.users
[upsmaster]
password = xxx
upsmon primary
actions = set
actions = fsd
instcmds = all
[upsslave]
password = xxx
upsmon secondary
---------------------------------------------------
upssched-cmd
#! /bin/bash
UPS="eaton5s"
UPS_USERNAME="upsmaster"
UPS_PASSWORD="xxx"
[email protected]<mailto:[email protected]>
[email protected]<mailto:[email protected]>
STATUS=`upsc $UPS ups.status 2>&1 | grep -v '^Init SSL'`
CHARGE=`upsc $UPS battery.charge 2>&1 | grep -v '^Init SSL'`
CHMSG="[$STATUS]:$CHARGE%"
MSG="dummy message - $1"
case $1 in
online) MSG="$UPS, $CHMSG - On line power - power supply had been
restored." ;;
onbatt) MSG="$UPS, $CHMSG - On battery - power failure - save your
work!" ;;
lowbatt) MSG="$UPS, $CHMSG - Battery is low - shutdown now!" ;;
beeperoff)
logger -i -t upsshed-cmd "Turning off Beeper"
STATUS=`upsc ${UPS} ups.beeper.status 2>&1 | grep -v '^Init
SSL'`
if [[ "${STATUS}" == "enabled" ]] ; then
upscmd -u ${UPS_USERNAME} -p ${UPS_PASSWORD} ${UPS}
beeper.toggle
fi
;;
shutdown-ws2019)
MSG="$UPS, $CHMSG - Stopping WS2019"
logger -i -t upsshed-cmd "Stopping WS2019"
setsid /usr/local/bin/shutdown-ws2019 &
;;
upsgone)
logger -i -t upssched-cmd "The UPS has been gone for awhile"
;;
heartbeat-failure-timer)
MSG="NUT heart beat fails. $CHMSG"
# Email
MSG1="Hello, upssched-cmd reports NUT heartbeat has failed."
MSG2="Current status: $CHMSG \n\n$0 $1"
MSG3="\n\n$( ps -elf | grep -E 'ups[dms]|nut' )"
echo -e "$MSG1 $MSG2 $MSG3" | /usr/bin/mail -r "$EMAIL_FROM" \
-s "NUT heart beat fails. Currently $CHMSG" "$EMAIL_TO" ;;
*)
logger -i -t upssched-cmd "Unrecognized command: $1"
;;
esac
logger -i -t upssched-cmd $MSG
wall "$MSG"
---------------------------------------------------
---------------------------------------------------
This is normal ups activity in daemon log from the previous version:
Jun 10 23:10:50 fricka upssched[16007]: Cancelling timer:
heartbeat-failure-timer
Jun 10 23:10:50 fricka upssched[16007]: New timer: heartbeat-failure-timer (660
seconds)
Jun 10 23:20:50 fricka upssched[16007]: Cancelling timer:
heartbeat-failure-timer
Jun 10 23:20:50 fricka upssched[16007]: New timer: heartbeat-failure-timer (660
seconds)
Jun 10 23:30:55 fricka upssched[16007]: Cancelling timer:
heartbeat-failure-timer
Jun 10 23:30:55 fricka upssched[16007]: New timer: heartbeat-failure-timer (660
seconds)
Jun 10 23:40:55 fricka upssched[16007]: Cancelling timer:
heartbeat-failure-timer
Jun 10 23:40:55 fricka upssched[16007]: New timer: heartbeat-failure-timer (660
seconds)
Jun 10 23:50:55 fricka upssched[16007]: Cancelling timer:
heartbeat-failure-timer
Jun 10 23:50:55 fricka upssched[16007]: New timer: heartbeat-failure-timer (660
seconds)
---------------------------------------------------
This is daemon log from the new version, first two lines are the last lines of
the startup messages. One same upssched process (124835) running with 100% cpu
from 21:07:
Jun 11 21:02:40 fricka dummy-ups[121803]: sock_connect: enabling asynchronous
mode (auto)
Jun 11 21:02:41 fricka usbhid-ups[121820]: sock_connect: enabling asynchronous
mode (auto)
Jun 11 21:07:40 fricka upssched[124835]: Timer daemon started
Jun 11 21:07:40 fricka upssched[124835]: New timer: heartbeat-failure-timer
(660 seconds)
Jun 11 21:07:40 fricka nut-monitor[124832]: Network UPS Tools upsmon 2.8.0
Jun 11 21:12:40 fricka nut-monitor[128005]: Network UPS Tools upsmon 2.8.0
Jun 11 21:17:40 fricka upssched[124835]: Cancelling timer:
heartbeat-failure-timer
Jun 11 21:17:40 fricka upssched[124835]: New timer: heartbeat-failure-timer
(660 seconds)
Jun 11 21:17:40 fricka nut-monitor[131233]: Network UPS Tools upsmon 2.8.0
Jun 11 21:22:45 fricka nut-monitor[134330]: Network UPS Tools upsmon 2.8.0
Jun 11 21:27:45 fricka upssched[124835]: Cancelling timer:
heartbeat-failure-timer
Jun 11 21:27:45 fricka upssched[124835]: New timer: heartbeat-failure-timer
(660 seconds)
Jun 11 21:27:45 fricka nut-monitor[137838]: Network UPS Tools upsmon 2.8.0
---------------------------------------------------
_______________________________________________
Nut-upsuser mailing list
[email protected]
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsuser