On Mar 22, 2010, at 8:53 AM, Charlie Garrison wrote:

Good evening,

On 22/03/10 at 7:55 AM -0400, Charles Lepple <[email protected]> wrote:

On Mon, Mar 22, 2010 at 7:12 AM, Charlie Garrison <[email protected] > wrote:

For the portion of the log quoted above, I admit I am not familiar
enough with this driver to say whether it has failed at this point.
Maybe someone from Eaton can comment on that.

Is that a hint that I should be contacting them, or do we know they have devs on this list?

Arnaud works for Eaton, but I don't specifically know who has access to PowerWare devices for testing.

454959.198654   Warning: excessive comm failures, limiting error reporting
454959.198672   Communications with UPS lost: Error executing command

The preceding two lines are generated from nutusb_comm_fail() in
bcmxcp_usb.c. What do you get from 'grep "Communications with UPS
lost" name-of-logfile' ?

$ grep "Communications with UPS lost" /var/log/nut-driver.log
10578.415713   Communications with UPS lost: get_answer: checksum error!
11180.245313   Communications with UPS lost: get_answer: checksum error!
38876.625676   Communications with UPS lost: get_answer: checksum error!
46843.166524   Communications with UPS lost: get_answer: checksum error!
47954.281275   Communications with UPS lost: get_answer: checksum error!
52548.325592   Communications with UPS lost: get_answer: checksum error!
55334.000485   Communications with UPS lost: get_answer: checksum error!
69408.920215   Communications with UPS lost: get_answer: checksum error!
73109.467953   Communications with UPS lost: get_answer: checksum error!
81290.330831   Communications with UPS lost: get_answer: checksum error!
205872.380255   Communications with UPS lost: get_answer: checksum error!
281913.375308   Communications with UPS lost: get_answer: checksum error!
394369.162435   Communications with UPS lost: get_answer: checksum error!
454959.198672   Communications with UPS lost: Error executing command
454979.199631   Communications with UPS lost: Error executing command

Note that the above possibly includes entries from the previous kill/restart, not just the last one. That said, I'm pretty sure I rotated the log file last time, so it should all be from one run of the driver.

So the checksum errors were occurring while other aspects of the driver seemed to be working properly?

Does anyone have suggestions on how I can get the driver working on my
system? IOW, any ideas on how it can recover without me having to
dis/connect the USB cable and kill/restart the driver?

So if I remember from your previous emails, killing and restarting the
driver without reconnecting the USB cable does /not/ solve the
problem?

That is correct. And this time I was testing whether *only* dis/connecting the cable would allow the driver to recover. It did allow the loop processing to continue, but it didn't properly recover. I had to kill the driver daemon as well.

That sounds like an issue with the firmware on the UPS
itself. There is a function to reset the device that we could try, but
I think we may need to add some more debugging to figure out what
error codes should trigger this:

http://libusb.sourceforge.net/doc/function.usbreset.html

Sorry, I'm missing the relevance here. Are you suggesting a one-time reset? Or should I add usb_reset somewhere in bcmxcp_usb.c? (My C skills aren't good enough to add that command and then open the device again.)

I guess that's more-or-less a reminder to me, once we find out the difference between occasional benign errors (maybe including the checksum error mentioned above) and the non-recoverable errors.
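
To make that distinction a little more concrete, here is a minimal sketch of how transfer return codes might eventually be sorted into "retry", "reset" and "reconnect" buckets. It assumes that libusb-0.1 calls such as usb_interrupt_read() return a negated errno value on failure, which is the behaviour on Linux. The enum, the struct, the function name and the threshold of 5 are all hypothetical; none of them exist in bcmxcp_usb.c, and which codes are really benign is exactly what the extra debugging still has to establish. Checksum errors reported by get_answer are detected above the USB layer and are not covered here.

```c
#include <errno.h>

/* Hypothetical recovery actions; these names are not part of NUT. */
enum recovery_action {
    ACTION_NONE,      /* transfer succeeded */
    ACTION_RETRY,     /* presumed benign: just try again */
    ACTION_RESET,     /* presumed stuck: candidate for usb_reset() and reopen */
    ACTION_RECONNECT  /* device is gone: close the handle and re-enumerate */
};

/* Arbitrary threshold, chosen only for illustration. */
#define MAX_CONSECUTIVE_FAILURES 5

struct comm_state {
    int consecutive_failures;
};

/*
 * ret is the value returned by a transfer call: a byte count on
 * success, or (by assumption) a negated errno value on failure.
 */
enum recovery_action classify_usb_result(struct comm_state *st, int ret)
{
    if (ret >= 0) {
        st->consecutive_failures = 0;   /* a good read ends the streak */
        return ACTION_NONE;
    }

    st->consecutive_failures++;

    /* Guess: "no such device" cannot be fixed by retrying. */
    if (ret == -ENODEV)
        return ACTION_RECONNECT;

    /* Guess: a long run of failures means the device needs a reset. */
    if (st->consecutive_failures >= MAX_CONSECUTIVE_FAILURES)
        return ACTION_RESET;

    return ACTION_RETRY;
}
```

With this policy an isolated timeout is retried quietly, while five failures in a row would become the trigger for the reset function mentioned above.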

Did you have a debug statement around lines 150-160 in your code? I
would have thought we would see the error codes from
usb_interrupt_read().

I did, but I removed them to reduce verbosity. I'll uncomment & recompile and run again.
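
For reference, the kind of debug statement meant here could look roughly like the sketch below. The helper name format_usb_result is made up for illustration, and the use of strerror() assumes that the negative return value is a negated errno, which may not hold on every platform.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: render the result of a USB transfer call as text.
 * call is the name of the function that produced ret.
 * Returns whatever snprintf() returns.
 */
int format_usb_result(char *buf, size_t len, const char *call, int ret)
{
    if (ret >= 0)
        return snprintf(buf, len, "%s: %d bytes", call, ret);

    /* Assumption: ret is a negated errno value. */
    return snprintf(buf, len, "%s: error %d (%s)", call, ret, strerror(-ret));
}
```

In the driver, the resulting string could be handed to upsdebugx() at a high debug level right after the usb_interrupt_read() call. libusb-0.1 also provides usb_strerror(), which may be the better source for the message text.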

At some point, I would like to reorganize the debug levels so that it's just a matter of passing fewer "-D" options to the driver. Right now, the debug levels seem a bit haphazard to me (but maybe that's just because I am not as familiar with this driver).


_______________________________________________
Nut-upsdev mailing list
[email protected]
http://lists.alioth.debian.org/mailman/listinfo/nut-upsdev
