On Dec 4, 2013, at 3:35 PM, Roger Price wrote:

> I would like nut to become more loquacious, and to log a much more complete 
> report of its activity.  At present nut reports that its components have 
> started operation but does not automatically log their activity when UPSes 
> switch between OB and OL.  I believe that this under-reporting of important 
> facts is too minimalist - it would be better for system administrators and 
> for the nut support team if a much more complete report were available of all 
> OB/OL activity by each component.

In principle, more logging sounds like a good idea. What syslog level 
adjustments would you propose?

> Looking at the source code, it seems that much of what is needed is already 
> in place, but behind "if" conditions that ensure that little or nothing gets 
> through.  Long ago I wrote software, including a compiler, but my C 
> programming is limited to a class exercise many many years ago, and it's based 
> on this "experience" that I'm guessing that in upssched.c function exec_cmd 
> the code
> 
>  snprintf(buf, sizeof(buf), "%s %s", cmdscript, cmd);
>  err = system(buf);
>  if (WIFEXITED(err)) {
>       if (WEXITSTATUS(err)) {
>               upslogx(LOG_INFO, "exec_cmd(%s) returned %d", buf, WEXITSTATUS(err));
>       }
> 
> attempts to send a command to the operating system, possibly to execute a 
> Bash script.  If system(buf) fails, the tests block the error message. Surely 
> the error message is essential.  An unattended box is now in an emergency 
> situation.  After the inevitable IT failure the system should be auditable to 
> discover what went wrong and what should be done to prevent it happening in 
> the future.  Such an audit expects to find "exec_cmd(%s) returned %d" in the 
> log.

Are you looking for:

 * more diagnostics depending on the value of err,
 * logging of all return codes, even success

or both?
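
To make the two options concrete, here is a rough sketch of what "both" might look like. It is not the upssched.c code: describe_status() and run_and_log() are hypothetical names, the message wording and the choice of syslog levels are assumptions, and plain syslog() stands in for NUT's upslogx(). It only illustrates decoding the value returned by system() with the standard <sys/wait.h> macros and logging every outcome, including success.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <syslog.h>

/* Hypothetical helper (not part of NUT): turn the value returned by
 * system() into a one-line message and a suggested syslog level. */
static int describe_status(const char *cmd, int status, char *out, size_t outlen)
{
	assert(cmd != NULL && out != NULL);

	if (status == -1) {
		/* system() itself failed, e.g. the child could not be created */
		snprintf(out, outlen, "exec_cmd(%s) could not be started", cmd);
		return LOG_ERR;
	}
	if (WIFEXITED(status)) {
		/* normal exit: report the code even when it is 0 */
		int code = WEXITSTATUS(status);
		snprintf(out, outlen, "exec_cmd(%s) returned %d", cmd, code);
		return code ? LOG_WARNING : LOG_INFO;
	}
	if (WIFSIGNALED(status)) {
		snprintf(out, outlen, "exec_cmd(%s) terminated with signal %d",
			cmd, WTERMSIG(status));
		return LOG_ERR;
	}
	snprintf(out, outlen, "exec_cmd(%s) ended abnormally (status 0x%x)",
		cmd, (unsigned)status);
	return LOG_ERR;
}

/* Run the command and log the outcome unconditionally.
 * Returns the syslog level that was used. */
static int run_and_log(const char *cmd)
{
	char msg[512];
	int status = system(cmd);
	int level = describe_status(cmd, status, msg, sizeof(msg));

	syslog(level, "%s", msg);
	return level;
}
```

With something along these lines, a successful script run would show up at LOG_INFO and a non-zero exit at LOG_WARNING, so the syslog configuration could decide how much of it to keep.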

> "But these problems should be found by testing!", one might argue. Firstly, 
> the testing would be facilitated by this error message, and secondly, no 
> amount of testing will ever cover every situation met in the real world.
> 
> I believe nut would be improved by
> 
> 1. Logging a summary of the state of the nut system and the UPSes every 24 
> hours.

I would personally prefer that NUT didn't do this by default. (Then again, I 
don't do a lot of sysadmin work for critical systems, so take that with a grain 
of salt.) To me, this seems like a call to 'upsc' should be placed in a nightly 
cron job. If you have multiple UPSes, you can iterate over them. We could add 
an example script to the NUT source tree for that.
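
As a rough illustration of the idea (a shell loop in cron would do just as well; C is used here only to match the code quoted above), the sketch below runs "upsc <name>" for one UPS and copies each output line to syslog. The function names, the set of characters accepted in a UPS name, and the buffer sizes are all assumptions made for the example; none of this exists in the NUT tree.

```c
#define _POSIX_C_SOURCE 200809L	/* for popen()/pclose() */
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <syslog.h>

/* Accept only characters that plausibly appear in "ups@host[:port]" so the
 * name can be handed to the shell safely (the allowed set is an assumption). */
static int safe_ups_name(const char *name)
{
	if (name == NULL || *name == '\0')
		return 0;
	for (; *name; name++) {
		if (!isalnum((unsigned char)*name) && strchr("@.:_-", *name) == NULL)
			return 0;
	}
	return 1;
}

/* Run a command and send each line of its output to syslog.
 * Lines longer than the buffer are logged in pieces.
 * Returns the number of pieces logged, or -1 if popen() failed. */
static int log_command_output(const char *cmd, const char *tag)
{
	char line[512];
	int count = 0;
	FILE *fp;

	assert(cmd != NULL && tag != NULL);

	fp = popen(cmd, "r");
	if (fp == NULL)
		return -1;
	while (fgets(line, sizeof(line), fp) != NULL) {
		line[strcspn(line, "\n")] = '\0';	/* strip trailing newline */
		syslog(LOG_INFO, "%s: %s", tag, line);
		count++;
	}
	pclose(fp);
	return count;
}

/* Log the variables of one UPS; returns -1 if the name is rejected. */
static int log_ups_status(const char *name)
{
	char cmd[256];

	if (!safe_ups_name(name))
		return -1;
	snprintf(cmd, sizeof(cmd), "upsc %s 2>&1", name);
	return log_command_output(cmd, name);
}
```

A small main() that calls log_ups_status() for each command-line argument, started once a night from cron, would cover the multiple-UPS case.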

> 2. Automatically logging a record of driver, upsd, upsmon and upssched 
> activity for each OB/OL change.

Fair point. I don't think logging at every single point is necessary, but if 
it's configurable, that would work.

> 3. Replacing the upsmon NOTIFYFLAG "SYSLOG" by "NOSYSLOG".  All notifications 
> are logged unless the sysadmin explicitly calls for no logging.


I suspect I am missing something here. The stock upsmon.conf logs everything 
to syslog (and wall). Unless that part is broken (and I confess I 
haven't thoroughly tested it recently), wouldn't the defaults work without 
breaking existing installations? I agree that it is better to err on the side 
of logging more information, but I don't think we need to break the existing 
syntax to do that.
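
For reference, the existing syntax looks roughly like the fragment below. This is written from memory rather than copied from a particular release, so treat it as a sketch: as far as I recall, SYSLOG+WALL is what upsmon uses for every notify type when no NOTIFYFLAG line is given, and IGNORE is the existing way to silence a single event.

```
# upsmon.conf (illustrative fragment)
# NOTIFYFLAG <notify type> <flag>[+<flag>]...
NOTIFYFLAG ONLINE   SYSLOG+WALL
NOTIFYFLAG ONBATT   SYSLOG+WALL
NOTIFYFLAG LOWBATT  SYSLOG+WALL
# Suppress one event entirely without changing the syntax:
# NOTIFYFLAG REPLBATT IGNORE
```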

If anything, I would want finer-grained control over the syslog level for some 
of these events.

-- 
Charles Lepple
clepple@gmail



