Note that I'm not subscribed to the nut-upsdev list at the moment due to issues with the mailer and DNS for the list server.
So, please CC me on any replies. Thanks!

At Thu, 8 Mar 2012 23:01:39 -0500, Charles Lepple <[email protected]> wrote:
Subject: Re: [Nut-upsdev] some fixes, improvements, and new features (EPO and DYING) for NUT
>
> On Mar 8, 2012, at 6:21 PM, Greg A. Woods wrote:
>
> > Here are a series of my recent changes to NUT.
> >
> > The first few in the set are primarily little fixes and improvements.
> >
> > In among those are a few for .gitignore files which of course you can
> > ignore for SVN, and there's one for a commit to a generated file which
> > of course should not be tracked in any VCS.
>
> We are actually in the process of trying to move the NUT source code
> over to Git, but both conversions by git-svn and Eric S. Raymond's
> reposurgeon are not quite there yet. (We are leaning towards
> reposurgeon, which involves a little more tweaking of commits, but
> produces better results for a one-way SVN-to-Git conversion, including
> .gitignore files generated from svn:ignore properties.)

You might want to look at "git svn" in the latest release again. Also, ignore all the half-assed half-brained ideas floating around out there on the internets about how to use it. Most of the people writing about it are only taking into consideration the most basic uses.

I've written up some more-or-less unpublished notes on doing more complete conversions, based on my work using Git to maintain local changes to FreeBSD, yet another project that uses SVN. I've further refined them, and the result I now have in my git-svn cloned copy of the NUT repository seems complete and fully functional, at least compared to what I can see of the SVN repository independently.

Here's a link to my notes:

    http://www.planix.com/~woods/git.html

Section 6 is what you'll be interested in -- don't pay much heed to any of the rest -- most of it is not very well tested in practice. Only the SVN conversion procedures have received much actual use.

I could make a bundle or tar.gz of my working repos available too if you'd like to just have a quick peek to see if my way of converting from SVN had the kinds of results you were looking for.

BTW, Meld is an amazing tool for picking apart changes between files, and just for viewing changes too.

> Agreed in principle, although I haven't looked to see if collapsing
> any of the unused bits will lead to binary incompatibility. Given how
> distributions tend to lag behind the latest code, we often suggest
> that people just drop in a replacement driver to test certain changes
> without disrupting the rest of the install. This could be completely
> unwarranted fears on my part, though.

Entirely unwarranted indeed! :-)

The inter-process communication is, so far as I can tell, entirely free of any magic binary flag values.

> This is an interesting distinction (one that a few drivers make in
> their different shutdown commands, but that is not currently tied to
> FSD).

Quite a few drivers allow for the distinction in terms of the commands they accept, but of course with the confusing, over-loaded, half-specified ideas previously embodied in "FSD", there isn't any way to really make use of those commands separately in the existing infrastructure, thus my addition of a new control word.

Note that I think it's critically important not to over-load the meaning of these state and status report words, especially not between device driver programs and the rest of the control infrastructure and communications protocols. Thus the need for a separate "I'm DYING!" status from the drivers and an "OK, we're initiating Emergency Power Off" control from upsmon.

I think it might make sense for a driver to issue a status of "being shut down administratively out-of-band" so that NUT can learn of a shutdown initiated from, say, SNMP. I think this should be a word unique from "FSD" though, just to keep things clear, and it's definitely got to be different from "DYING" too.
It would represent a situation where the operator has commanded the UPS to power down the load, either from the front panel or from SNMP or SSH or whatever, but the systems would expect to be powered back up again when mains power returns, so though they might use "halt -p" to power themselves off, they should do so in such a way that they can reboot when power returns.

When a driver says "DYING" it is an emergency situation, and both the load and the UPS itself must be powered off permanently and quickly lest physical damage be incurred by continuing to operate. After DYING triggers an EPO, the only way back should be if a human restarts the UPS from its front panel. It really must be a bit like a traditional "big red switch", though of course here the idea is to give everything the same chance to save its files as the "OB+LB" status does.

I really really hate over-loading terms in protocols or state diagrams. It leads to enormous confusion, headaches, and inflexible designs.

(I also hate using different terms for different sides of the same thing. A UPS, for example, is either "OnLine" (i.e. on mains power) or not (i.e. on battery). It might be talking to a control computer but have its load switched off too, but that's a whole different thing. There's absolutely no need for the "OnBattery" state, and it only confuses things. However that's woven into the protocol now and it will be difficult to extract without going through a deprecation phase and a major revision, and perhaps adding a protocol version identifier.)

(i.e. "LB" should be sufficient to trigger shutdown!)

(But I'm not proposing an entire protocol rewrite just yet!)

> The reason why I advocated usurping the "FSD" status was because it is
> the only other status besides "OB LB" that is currently guaranteed to
> trigger a shutdown. I wonder if we could just use FSD with some other
> status option to indicate whether the driver should request a restart
> when the power returns.

FSD as-is is useless for my purposes, as I hope you'll see when you get a chance to look closer at the EPO and DYING patches.

One could rip out FSD and rename my EPO feature to be FSD, but then one would lose the two important unique features of FSD, which are based on its ability to trigger a "normal" restart. (I.e. the ability to test a normal restart without actually removing mains power, and the ability to trigger an early shutdown in order to conserve battery in the event where the operator knows the mains outage will extend long beyond the current battery capacity.)

I definitely wasn't going to rip out or otherwise break operator-initiated FSD in my patch. :-)

> It's definitely a feature I would like to see merged at some
> point. Now that you mention this, I think there are several UPS
> protocols which support a bitmask for alarm conditions which will
> trigger a shutdown (including overtemp). We will want to make sure
> that the procedure for setting that event mask is not terribly
> different depending on whether the shutdown is triggered by the UPS
> hardware, or by NUT monitoring other UPS status (as I believe you are
> proposing with the DYING status).

My idea is that a UPS can, through its driver, identify any condition which would trigger an alarm as a warning, and separately raise a DYING state when a maximum (or minimum) limit is exceeded, with the alarm still set as the explanation for the DYING state. Even that much overloading of the alarm status could be confusing unless it explicitly includes the knowledge of whether the alarm is a warning or a min/max-exceeded kind of alarm, and then we would end up with an invalid protocol state where DYING could be issued without an alarm, but that's what I've got now in some drivers. Maybe it would be better if DYING was like ALARM and included an explanation, but as-is each bit of code which sets DYING writes a log entry, so all is not lost.

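To make the distinction between these status words concrete, here is a minimal sketch of how a monitor might map a status string to an action. It is an illustration only, not code from the patches: it assumes the usual space-separated `ups.status` word list ("OL", "OB", "LB", "FSD"), treats "DYING" as the new status proposed in this thread, and the `Action` names and `decide_action` function are hypothetical.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes a monitor could choose between."""
    NONE = "keep running"
    SHUTDOWN_RESTART = "shut down; restart when mains power returns"
    EMERGENCY_POWER_OFF = "shut down; stay off until a human restarts the UPS"


def decide_action(ups_status: str) -> Action:
    """Map a space-separated status string to exactly one action.

    DYING (the status proposed in this thread) is checked first so that
    it is never confused with an ordinary, restartable shutdown.
    """
    words = set(ups_status.split())

    # Emergency: the driver reports the UPS is about to fail or be damaged.
    if "DYING" in words:
        return Action.EMERGENCY_POWER_OFF

    # Ordinary shutdown: operator-forced (FSD) or on battery with low battery.
    if "FSD" in words or {"OB", "LB"} <= words:
        return Action.SHUTDOWN_RESTART

    return Action.NONE


if __name__ == "__main__":
    for status in ("OL", "OB", "OB LB", "OL FSD", "OL ALARM DYING"):
        print(f"{status!r}: {decide_action(status).value}")
```

Keeping each status word tied to a single meaning is what lets the emergency case take a separate branch instead of being folded into FSD.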
Of course with so few drivers currently implementing anything in the alarm feature, it's hard to get a higher level view of how things could be.

In theory the same DYING feature could have been used for low-battery too, but that'll mean a whole protocol rewrite.

Clearly this kind of lifetime and growth wasn't planned into NUT's design, but it's kind of like in biology where things might be a bit overly complicated and convoluted in places; however, it works!

-- 
Greg A. Woods
Planix, Inc.
<[email protected]>  +1 250 762-7675  http://www.planix.com/
