On Jan 14, 2011, at 9:31 PM, Alfred Ganz wrote:
Charles,
Here is some more insight into my problem.
* I am now able to get a crash on a virtual machine, so life has
become a bit easier
Is that a kernel oops, a usbhid-ups crash, or simply a failure to
launch usbhid-ups?
* disabling the UPS, then immediately after re-enabling it, the first
libhid-detach-device fails after about 10 sec, with:
hid_force_open failed with return code 7.
i.e. no device has been found.
Following the first libhid-detach-device immediately by several more,
they all fail the same way, but without another 10 sec delay.
Finally, adding a sleep 1 followed by another libhid-detach-device
will succeed.
* disabling the UPS, then waiting 20 sec after re-enabling it, the
first libhid-detach-device will succeed.
Note, I wasn't able to reduce this delay significantly, so it seems
that the total delay can be smaller when going through the failing
operations above.
* The same behavior occurs when using lsusb -d instead of the above
libhid-detach-device.
* usbhid-ups crashes if the last preceding libhid-detach-device fails,
but it will not crash if there is a successful libhid-detach-device
preceding it, or if there is a longer inactive delay.
Unfortunately, the timing is for the virtual machine, and I don't
expect things to be similar on the real machine, not to speak of the
boot context with other devices present.
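
The observations above suggest polling a probe until it succeeds before
launching usbhid-ups. A minimal shell sketch of that idea follows. The
function name retry_probe and its arguments are hypothetical, and it
assumes the probe command (for example lsusb -d with the UPS's
vendor:product ID) exits non-zero while the device is not yet usable.
It has not been tried against the timings reported here.

```shell
#!/bin/sh
# retry_probe MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND until it exits 0, or give up
# after MAX_TRIES attempts. Returns 0 on success, 1 on giving up.
retry_probe() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # Assumption: the probe signals "device not ready" through a
        # non-zero exit status; its output is not needed here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        try=$((try + 1))
        # Only sleep if another attempt follows.
        if [ "$try" -le "$max_tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example use (UPS_ID is a placeholder for the real vendor:product ID):
#   retry_probe 30 1 lsusb -d "$UPS_ID" && upsdrvctl start
```

Since usbhid-ups only seems to crash when the last preceding probe
failed, starting the driver only after a successful probe is the point
of the final line; the try count and delay would have to be tuned per
machine.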
As you suspected, it looks like usbhid-ups crashes if things have not
reached quiescence or some other kind of availability. However, I have
no idea how at boot time adding an active USB device can achieve this
(or maybe achieve it much more quickly).
It would of course be nice to make usbhid-ups have a built-in method
for detecting such a state and at the same time be able to detect the
absence of the device in question. However, I think the appropriate
thing is to determine such a method outside of usbhid-ups first. If at
all possible, I would prefer to do this with some shell script, but if
push comes to shove, I might have to resort to some C code as well.
I don't want to downplay the significance of the problem on your end,
but it is really up to the kernel to protect itself from race
conditions and crashes caused by userspace applications accessing
devices. To that end, I agree that something should be done outside
usbhid-ups.
We've had a few discussions on how the drivers should deal with USB
devices which are not there. My take on this is that we will try to
reconnect if it is a temporary disappearance, but we won't retry for
long at startup. I personally think that if the device node is not
ready by the time NUT starts, either NUT is being started too early,
or the device is not to be trusted with something as critical as
notification of power events.
That said, the HAL-style drivers are started when the device is
plugged in. While that might be nice, I don't think that's a fair
comparison because they tend to provide information about the power
situation, rather than being part of a reliable monitoring and
shutdown system.
Any advice on what might work would of course be much appreciated.
One workaround would be to patch the kernel to blacklist the UPS from
the kernel raw HID driver. Of course, this doesn't play well with
prebuilt binary kernels.
Along these lines, it should be possible to blacklist the kernel HID
module which has attached to the UPS. I haven't followed this portion
of the kernel much lately (and all bets are off in RedHat kernels),
but with any luck, it might be separate from the keyboard/mouse
HID-to-input-layer module.
A less intrusive way might be to watch the /dev space for the node
corresponding to the HID interface, and wait a few seconds after that
appears before detaching.
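
As a rough sketch of that last idea (untested against this UPS), the
wait could look like the following. The function name wait_for_node is
hypothetical, and the device node path is left as an argument because
it depends on the kernel and distribution.

```shell
#!/bin/sh
# wait_for_node NODE TIMEOUT_SECONDS SETTLE_SECONDS
# Hypothetical helper: wait until NODE exists, then sleep for
# SETTLE_SECONDS so the kernel driver can finish attaching.
# Returns 1 if NODE has not appeared within TIMEOUT_SECONDS.
wait_for_node() {
    node=$1
    timeout=$2
    settle=$3
    waited=0
    while [ ! -e "$node" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # The node is present; give things a few seconds to settle.
    sleep "$settle"
    return 0
}

# Example use (the node path is an assumption, adjust to the system):
#   wait_for_node "$HID_NODE" 30 5 && libhid-detach-device ...
```

Polling once per second keeps this in plain sh; a udev rule would avoid
the polling, but it ties the setup to a particular distribution.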
Thanks, AG
P.S. What happened to the mail server at lists.alioth.debian.org?
We haven't heard any updates, and I don't see any tracker items
explaining what happened, but it seems to be back now.
_______________________________________________
Nut-upsdev mailing list
[email protected]
http://lists.alioth.debian.org/mailman/listinfo/nut-upsdev