On 1/9/12 9:53 AM, Arnaud Quette wrote:
> 2012/1/6 William Seligman <selig...@nevis.columbia.edu>
>
>> I've googled and RTFM'ed, but still can't solve this one. I hope you folks
>> can.
>>
>> This affects my entire computer cluster, but let's start simple: I've got
>> a computer running NUT; OS is Scientific Linux 5.5; kernel
>> 2.6.18-274.12.1.el5xen. It connects to an APC SMART-UPS via an APC
>> SmartCard using the snmp-ups driver. It generally works: upsmon will detect
>> if the battery is low (I get an e-mail message); I can control the UPS,
>> inspect its variables, set variables, issue commands, and so on.
>
> If "On battery" and "Low battery" are both detected, there should be no
> issue.
>
>> There's just one thing that does not happen: when the UPS goes critical,
>> the computer does not shut down. The upsmon daemon does not display any
>> messages, does not write to the syslog, does not send e-mail, etc., even
>> though I've configured it to do so in upsmon.conf.
>>
>> I've tried nut-2.2.2, nut-2.4.3, and nut-2.6.2, and the symptom is the
>> same.
>
> Using the latest version, when possible, is always a good idea.

Installing nut-2.6.2 on a Scientific Linux 5.5 system was a bit difficult, and
played havoc with my regular yum updates. Once I've finished debugging this
problem, I'm going to completely reinstall the OS to make sure I've got a
consistent set of RPMs.

>> I tried issuing a "graceful reboot" command via the APC SmartCard's web and
>> telnet interface. It made no difference; the system still did not shut
>> down.
>>
>> Now let's extend the problem to my cluster: I have a variety of different
>> computers, all running Scientific Linux 5.5, connecting through different
>> switches, connecting to different flavors of APC SMART-UPSes, via
>> SmartCards, each ranging in age from six months to five years. They all
>> exhibit this same symptom, as I painfully discovered during a recent power
>> outage: they all sent me e-mail when the UPSes went to low battery, but
>> none turned off when the UPS went critical. Given the range of hardware
>> involved, this must be a common software problem.
>>
>> The systems will shut down properly if I do "upsmon -c fsd", so it doesn't
>> appear to be a permissions problem.
>>
>> I don't think this is the upsdrv_shutdown() issue described in the snmp-ups
>> man page; I do not care if the UPS shuts down when the computer does, nor
>> do I want it to. I just want upsmon to shut down the system when the UPS
>> goes critical.
>>
>> Here are my config files; the system is tanya, its UPS is tanya-ups. Any
>> advice?
>>
>> ups.conf:
>>
>> [tanya-ups]
>> driver = snmp-ups
>> port = tanya-ups
>> community = private
>> mibs = apcc
>>
>> upsd.conf:
>>
>> # LISTEN 0.0.0.0 3493
>>
>> upsd.users:
>>
>> [admin]
>> password = nowayjose
>> actions = SET
>> instcmds = all
>> upsmon master
>>
>
> it's also a good idea to separate monitoring and administrative users.
> Ie:
> [admin]
> password = XXX
> actions = SET
> instcmds = all
>
> [monuser]
> password = XXX
> upsmon master
>
>> upsmon.conf:
>>
>> MONITOR tanya-ups@localhost 1 admin nowayjose master
>> MINSUPPLIES 1
>> SHUTDOWNCMD "/sbin/shutdown -h +0"
>> NOTIFYCMD /home/bin/notify.sh # sends me e-mail
>> POLLFREQ 5
>> POLLFREQALERT 5
>> HOSTSYNC 15
>> DEADTIME 15
>> POWERDOWNFLAG /etc/killpower
>> NOTIFYFLAG ONLINE SYSLOG
>> NOTIFYFLAG ONBATT SYSLOG+WALL
>> NOTIFYFLAG LOWBATT SYSLOG+WALL
>> NOTIFYFLAG FSD SYSLOG+WALL+EXEC
>> NOTIFYFLAG COMMOK SYSLOG
>> NOTIFYFLAG COMMBAD SYSLOG
>> NOTIFYFLAG SHUTDOWN SYSLOG+WALL+EXEC
>> NOTIFYFLAG REPLBATT SYSLOG+WALL+EXEC
>> NOTIFYFLAG NOCOMM SYSLOG
>> NOTIFYFLAG NOPARENT SYSLOG+WALL
>> RBWARNTIME 43200
>> NOCOMMWARNTIME 300
>> FINALDELAY 5
>
> Your config seems fine.
> An interesting test to do would be to stop upsmon, but keep snmp-ups and
> upsd, then discharge your UPS and ensure that you indeed get an
> ups.status == "OB LB", which triggers the call to upsmon.conf->SHUTDOWNCMD.
> Note that you need both "OB" and "LB", since you may have "low battery" and
> be "online" at the same time!

This is a good idea, and I ran the test. I disconnected the UPS from wall
power, and periodically checked the output of:

upsc tanya-ups@localhost ups.status

Eventually this command returned "OB LB", as you said. But upsmon did nothing.
I waited, and eventually the UPS cut power to the system in a hard crash.

So the UPS is sending the correct signals, and snmp-ups is reporting the
correct status. Is there anything else I can check to trace the cause of the
problem?
-- 
Bill Seligman             | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://selig...@nevis.columbia.edu
PO Box 137                |
Irvington NY 10533 USA    | http://www.nevis.columbia.edu/~seligman/
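
For anyone repeating the test above, the manual polling of upsc can be scripted. What follows is only a minimal sketch: the helper names is_critical and read_status are hypothetical, the upsc invocation is the one quoted in this message, and the check simply mirrors the "OB" plus "LB" condition Arnaud describes. It is not a copy of upsmon's own logic.

```python
import subprocess


def is_critical(status: str) -> bool:
    """Return True only when both OB (on battery) and LB (low battery) are set.

    The status string is assumed to be a space-separated list of flags,
    such as "OB LB" or "OL LB", as printed by upsc for ups.status.
    """
    flags = status.split()
    return "OB" in flags and "LB" in flags


def read_status(ups: str = "tanya-ups@localhost") -> str:
    """Run `upsc <ups> ups.status` and return its output without the newline.

    The default UPS name is the one from this thread; change it for other hosts.
    """
    result = subprocess.run(
        ["upsc", ups, "ups.status"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if upsc exits with an error
    )
    return result.stdout.strip()
```

Calling is_critical(read_status()) every few seconds while the UPS discharges, and logging the time of each status change, shows exactly when upsd starts reporting both flags, which can then be compared against the syslog entries from upsmon.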
_______________________________________________
Nut-upsuser mailing list
Nut-upsuser@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/nut-upsuser