Here is a series of my recent changes to NUT. The first few in the set are primarily little fixes and improvements.
In among those are a few for .gitignore files, which of course you can ignore for SVN, and there's one dealing with a commit to a generated file, which of course should not be tracked in any VCS.

Then there are two or three to do with generating the header files used by nut-scanner. These probably could have been collapsed into one, but I left them separate to show more clearly what some of the problems are with the crazy attempts to use scripts to parse C code instead of using the compiler. The final one in that group is a half-assed attempt to generate one of the headers using a helper function working directly from the compiled data structures it is derived from, thus totally eliminating the need for the broken Python script in the first place. Even this is wrong, though -- the code needing the data structures from the driver should be linking directly with shared .o files to access them instead of inventing new data structures and trying to populate them from the existing ones. The same thing should be done to eliminate the horrid Perl script in there too.

I then made some improvements to the SNMP driver to make it actually work properly with my AP9605 SNMP card, and these should make it work properly now with any SNMP agent implementing APC's POWERNET MIB. I also discovered the blazer driver does work pretty well with my GE Digital Energy GT Series UPS, at least with the 1000-3000 VA models. I added some more info about APC cables that I'd been keeping track of independently.

I had independently made a similar change to the apcsmart driver to keep it from failing when tcgetattr() reported some irrelevant differences in the port settings. What's actually in the patch now is my merge of the change from upstream, which is basically just an "improved" log message.

I've also added some suggested coding improvements which I think will make things easier to maintain down the line, notably using clear syntax that's easy to modify safely for defining bit flag value macros, as well as a strong suggestion to NEVER EVER use comment syntax to comment out code blocks -- always use the pre-processor -- it's much safer!

Finally I introduce the first of my new features: the "EPO" command. This is very similar to "FSD", but fundamentally different in that it goes a bit deeper into the infrastructure and it has a different purpose and ultimate effect on the systems being managed. The basic idea is to provide the moral equivalent, though not in quite such a draconian and dangerous hard-core way, of an Emergency Power Off (big red) switch. The critical difference from FSD is that EPO is intended to require manual human intervention to recover from, and that it is also intended to completely and entirely remove power from everything if at all possible, even if mains power is still fully and smoothly functioning.

I'm really not sure if "FSD" has a true purpose other than as a test command to see if everything will restart after mains power returns, since of course FSD tries to simulate the effect of mains power returning after a full shutdown has been committed to and is in progress. EPO on the other hand is a key requirement of my next feature: the ability of a UPS driver to declare that the UPS is dying of some critical condition and that it must be shut down in such a way that manual human intervention is required to restart it. EPO is also intended to be triggered automatically, whereas FSD (I think) is always intended to be manually introduced by a human systems manager.

I.e. in an ideal configuration everything should restart and reboot and return to operational status after "upsmon -c fsd" once mains power returns, or if power was never actually off; whereas after "upsmon -c epo" everything should power down and stay off even if mains power remains on and steady. For example, I would have used "FSD" to shut down in power blackouts where I knew the power could not return before the batteries ran low. I would thus have conserved battery charge for the inevitable short hiccups that occur after a long blackout, but still been able to enjoy automatic restart after the blackout in case power returned while I was sleeping, etc.

Finally I add some features to the three drivers I was able to test which make use of this new "DYING" state to power things down safely but quickly when they detect operating temperatures above a configurable maximum value. The one driver that already supported use of the ALARM state also sets an alarm when the temperature rises above a configurable warning value. The idea here is that if the HVAC fails in your computer room then you can have everything automatically shut down _AND_ stop pouring BTUs out into the room, and of course hopefully first raise an alarm so that a human can try to intervene before an emergency power off is actually necessary to prevent equipment damage.

Indeed the motivation behind these new features is that HVAC fails far more frequently in my client's server room, and with far more dire consequences, than the power fails. They have only one tiny UPS that can run only the most critical core equipment, but everything has come near to suffering serious physical damage when ambient temperatures have shot up above 45C in an extremely short time after an HVAC failure, which of course is usually on a Saturday night.

These changes are a work in progress to some extent -- I still have not fully tested the EPO of a running network, but I hope to do that very soon.
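
To make the temperature handling a bit more concrete, here is a minimal sketch of the idea in C. It is not the code from the patches: the names ST_ALARM, ST_DYING, struct temp_limits and check_temperature() are all made up for illustration, and I am assuming here that a reading above the maximum also counts as being above the warning level. It also happens to show the two coding suggestions mentioned above, i.e. flag macros written as shifts and a block disabled with "#if 0" rather than with comment syntax.

```c
#include <assert.h>

/*
 * Hypothetical status flags, written as shifts so a new flag can be
 * added without hand-computing a hex value. Not NUT's real names.
 */
#define ST_ALARM  (1u << 0)   /* temperature above the warning level */
#define ST_DYING  (1u << 1)   /* temperature above the maximum */

/* Hypothetical per-driver limits; real option names may differ. */
struct temp_limits {
    double warn_c;  /* raise an alarm above this (degrees C) */
    double max_c;   /* declare the UPS to be dying above this */
};

/* Map one temperature reading onto the status flags to report. */
static unsigned int check_temperature(double temp_c,
                                      const struct temp_limits *lim)
{
    unsigned int status = 0;

    /* Assumption: the warning level never exceeds the maximum. */
    assert(lim->warn_c <= lim->max_c);

    if (temp_c > lim->warn_c)
        status |= ST_ALARM;
    if (temp_c > lim->max_c)
        status |= ST_DYING;

#if 0   /* disabled via the pre-processor, so it is safe even though
         * the disabled code contains a comment of its own */
    status &= ~ST_ALARM;    /* experiment: report DYING without ALARM */
#endif

    return status;
}
```

With illustrative limits of 35C and 45C, a reading of 38.5C would yield only ST_ALARM, while 47C would yield both flags, at which point the monitoring side would be expected to treat the UPS as dying and go down the EPO path rather than the FSD one.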
The drivers do report alarms (where implemented) and they report the "DYING" status when their temperature sensors report above-maximum values.
