Re: [Nut-upsuser] nutdrv_qx hangs after send: QS
Hi,

Thanks for all the help. I have a spare pcengines ALIX box lying around and I am going to repurpose that as a dedicated UPS monitoring system.

Regards,
Richard

On Mon, Apr 6, 2015 at 2:27 PM Charles Lepple clep...@gmail.com wrote:

> On Apr 6, 2015, at 8:46 AM, Richard Flint richard.fl...@gmail.com wrote:
>
>> Unfortunately this approach isn't going to work. I've done some further research and it would appear that it is the underlying ugen device and not libusb that is failing to honor the timeout.
>>
>> https://mail-index.netbsd.org/tech-misc/2006/03/17/.html
>>
>> The person above worked around this by having the device opened in non-blocking mode using the O_NONBLOCK flag but this required changing libusb.
>
> Unfortunately, the ugen drivers in Solaris, NetBSD, FreeBSD, etc. are similarly named, but not necessarily the same code. At least with the *BSDs, you can inspect the code to see what has changed (and the BSD USB support has improved significantly across the board since 2006 when that message was posted).
>
> OpenUSB seems to work around the blocking issue by creating a separate timeout thread, so I still think there is some hope there.
>
> On the other hand, if that doesn't work, I would consider setting up another system dedicated to monitoring the UPS, and set up NUT on Solaris to be a client to the other system.
>
> Linux on x86_64 is probably the best supported OS for NUT and a USB UPS, but others have had success with embedded ARM and MIPS Linux systems.
>
> I have a Soekris net5501 running FreeBSD monitoring the USB UPS in my home data center in the basement, and it has been very reliable: http://soekris.com/products/net5501.html (also available in a rack-mount form factor; I have no connection to Soekris other than using their equipment)
>
> --
> Charles Lepple
> clepple@gmail
> ___
> Nut-upsuser mailing list
> Nut-upsuser@lists.alioth.debian.org
> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/nut-upsuser
Re: [Nut-upsuser] nutdrv_qx hangs after send: QS
On Apr 6, 2015, at 8:46 AM, Richard Flint richard.fl...@gmail.com wrote:

> Unfortunately this approach isn't going to work. I've done some further research and it would appear that it is the underlying ugen device and not libusb that is failing to honor the timeout.
>
> https://mail-index.netbsd.org/tech-misc/2006/03/17/.html
>
> The person above worked around this by having the device opened in non-blocking mode using the O_NONBLOCK flag but this required changing libusb.

Unfortunately, the ugen drivers in Solaris, NetBSD, FreeBSD, etc. are similarly named, but not necessarily the same code. At least with the *BSDs, you can inspect the code to see what has changed (and the BSD USB support has improved significantly across the board since 2006 when that message was posted).

OpenUSB seems to work around the blocking issue by creating a separate timeout thread, so I still think there is some hope there.

On the other hand, if that doesn't work, I would consider setting up another system dedicated to monitoring the UPS, and set up NUT on Solaris to be a client to the other system.

Linux on x86_64 is probably the best supported OS for NUT and a USB UPS, but others have had success with embedded ARM and MIPS Linux systems.

I have a Soekris net5501 running FreeBSD monitoring the USB UPS in my home data center in the basement, and it has been very reliable: http://soekris.com/products/net5501.html (also available in a rack-mount form factor; I have no connection to Soekris other than using their equipment)

--
Charles Lepple
clepple@gmail
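For anyone following along, the "separate timeout thread" idea mentioned above can be sketched in a few lines of Python. This is only an illustration of the general technique, not OpenUSB's code and not anything in NUT; the name read_with_timeout and the use of a plain file descriptor are assumptions made for the example.

```python
import os
import queue
import threading


def read_with_timeout(fd, nbytes, timeout_s):
    """Read up to nbytes from fd, giving up after timeout_s seconds.

    The blocking os.read() runs in a daemon worker thread. The caller only
    waits on a queue, so it regains control even if the read never returns.
    Returns the bytes read, or None on timeout.
    """
    result = queue.Queue(maxsize=1)

    def worker():
        try:
            result.put(os.read(fd, nbytes))
        except OSError as exc:
            # Hand the error back to the caller instead of losing it.
            result.put(exc)

    threading.Thread(target=worker, daemon=True).start()
    try:
        outcome = result.get(timeout=timeout_s)
    except queue.Empty:
        return None  # timed out; the worker is still blocked in os.read()
    if isinstance(outcome, Exception):
        raise outcome
    return outcome
```

One caveat with this simple form: after a timeout the worker thread is still parked in the read, and it would swallow whatever data arrives next on that descriptor. A real implementation would need some way to cancel or reuse that pending read.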
Re: [Nut-upsuser] nutdrv_qx hangs after send: QS
On Apr 5, 2015, at 12:26 AM, Richard Flint richard.fl...@gmail.com wrote:

> Any idea how I can get NUT to build against this libopenusb which has been installed by Solaris?

...

>> It might be possible to do the following:
>>
>> • install openusb into an alternate directory (e.g. $HOME/local)
>> • set PKG_CONFIG_PATH to anything that doesn't contain the system libusb.pc
>> • put $HOME/local/bin (or wherever openusb-config gets installed) at the front of the $PATH, and symlink openusb-config to libusb-config
>> • reconfigure NUT

Same sort of thing, just symlink or copy the openusb-config file such that NUT's configure script picks that up first (it's looking for libusb.pc first, then libusb-config). If that works, we can add openusb-config into the search.

--
Charles Lepple
clepple@gmail
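The lookup order described above (libusb.pc first, then libusb-config) can be modelled roughly like this. It is a simplified sketch and not NUT's actual configure logic: pick_usb_config is a hypothetical helper, and real pkg-config also searches built-in default directories, which this ignores.

```python
import os


def pick_usb_config(pkg_config_dirs, path_dirs):
    """Model the search order: libusb.pc first, then a libusb-config script.

    pkg_config_dirs stands in for the PKG_CONFIG_PATH entries and path_dirs
    for the PATH entries. Returns a (method, location) tuple, or None if
    neither is found.
    """
    for directory in pkg_config_dirs:
        candidate = os.path.join(directory, "libusb.pc")
        if os.path.isfile(candidate):
            return ("pkg-config", candidate)
    for directory in path_dirs:
        candidate = os.path.join(directory, "libusb-config")
        # A symlink to openusb-config counts, as long as it is executable.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return ("libusb-config", candidate)
    return None
```

In this model, if the pkg-config directories contain no libusb.pc and the front of the PATH holds a libusb-config symlink pointing at openusb-config, the symlink is what gets picked up.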
Re: [Nut-upsuser] nutdrv_qx hangs after send: QS
Hi,

I have used the truss command as directed. I have attached both the driver output and the last few sections of truss output leading to the hang. Both outputs end at the CTRL+C I pressed when I was forced to end the processes.

Hope this is helpful. Please bear with me if I didn't run it with the right options etc - as I mentioned, I'm a little new to Solaris.

Regards,
Richard

On Sun, Apr 5, 2015 at 1:09 AM Richard Flint richard.fl...@gmail.com wrote:

> Thank you for the rapid response. I will try and investigate getting answers to some of your points but I'm a little new to Solaris so I'll need some time.
>
> Glancing at the configure output, it looks like it built against v0.1.7 of libusb (yes, I think that is derived from the one you mention):
>
> checking for libusb version via pkg-config... 0.1.7 found
> checking for libusb cflags...
> checking for libusb ldflags... -lusb
> checking for usb.h... yes
> checking for usb_init... yes
> checking for usb_detach_kernel_driver_np... no
>
> I will first investigate how to set the debug level.
>
> Regards,
> Richard
>
> On Sun, Apr 5, 2015 at 12:44 AM Charles Lepple clep...@gmail.com wrote:
>
>> On Apr 4, 2015, at 7:19 PM, Richard Flint richard.fl...@gmail.com wrote:
>>
>>> More extensive debugging by running the driver
>>>
>>> sudo ./nutdrv_qx -u root -a MY_UPS -DD
>>>
>>> indicates the driver works normally then will randomly stop working at send: QS. The debug logs show values successfully retrieved repeatedly until something like:
>>>
>>> Quick update...
>>> send: QS
>>> read: (247.9 239.1 248.0 005 50.0 27.5 --.- 1001
>>> update_status: OL
>>> update_status: !LB
>>> update_status: !CAL
>>> update_status: !FSD
>>> upsdrv_updateinfo...
>>> Quick update...
>>> send: QS
>>> (driver hangs here)
>>>
>>> I'm using Generic Q* USB/Serial driver 0.06 (2.7.2) with USB communication driver 0.32. Playing with pollinterval didn't help - Is there anything further I can do to help troubleshoot this problem?
>>
>> Thanks, this narrows it down a good deal. @zykh made some changes to nutdrv_qx since the 2.7.2 release, but at first glance, I don't think those will alter the symptoms you are seeing.
>>
>> Can you provide some detail on the libusb port that you built against? If it is derived from the original sourceforge.net libusb-0.1, does it have a USB_DEBUG environment variable that can be set to log extra information?
>>
>> Also, is it possible to do a system call trace to figure out what libusb and the OS are doing at the time of the hang? It's been a while since I last used Solaris, but if memory serves, you could use something like truss to approximate what strace does on Linux.
>>
>> --
>> Charles Lepple
>> clepple@gmail

Network UPS Tools - Generic Q* USB/Serial driver 0.06 (2.7.2)
USB communication driver 0.32
0.00 debug level is '6'
0.000364 upsdrv_initups...
0.010093 Checking device (0665/5161) (/dev/usb/665.5161/0)
0.022032 - VendorID: 0665
0.022123 - ProductID: 5161
0.022182 - Manufacturer: INNO TECH
0.022239 - Product: USB to Serial
0.022287 - Serial Number: unknown
0.022338 - Bus: /dev/usb
0.022400 Trying to match device
0.022623 Device matches
0.022831 send_to_all: SETINFO ups.vendorid 0665
0.022903 send_to_all: SETINFO ups.productid 5161
0.022968 Skipping protocol Voltronic 0.01
0.025017 send: M
0.151082 read: V
0.151274 send_to_all: SETINFO ups.firmware.aux PMV
0.154074 send: QS
0.479154 read: (245.4 239.1 244.2 005 50.0 27.4 --.- 1001
0.479363 send_to_all: SETINFO input.voltage 245.4
0.479420 Using protocol: Voltronic-QS 0.01
0.479474 send_to_all: SETINFO device.type ups
0.479541 send_to_all: SETINFO driver.version 2.7.2
0.479592 send_to_all: SETINFO driver.version.internal 0.06
0.479646 send_to_all: SETINFO driver.name nutdrv_qx
0.479700 upsdrv_initinfo...
0.479803 send_to_all: SETINFO driver.version.data Voltronic-QS 0.01
0.482082 send: QS
0.807110 read: (245.4 239.1 244.2 005 50.0 27.4 --.- 1001
0.807209 send_to_all: SETINFO input.voltage.fault 239.1
0.807235 send_to_all: SETINFO output.voltage 244.2
0.807256 send_to_all: SETINFO ups.load 5
0.807291 send_to_all: SETINFO input.frequency 50.0
0.807312 send_to_all: SETINFO battery.voltage 27.40
0.807331 ups_infoval_set: non numerical value [ups.temperature: --.-]
0.807347 update_status: OL
0.807375 update_status: !LB
0.807398 send_to_all: SETINFO ups.type offline / line interactive
0.807426 update_status: !CAL
0.807442 update_status: !FSD
0.807459 send_to_all: SETINFO ups.beeper.status enabled
0.810075 send: F
1.023160 read: #230.0 008 24.00 50.0
1.023274 send_to_all: SETINFO input.voltage.nominal 230
1.023304 send_to_all: SETINFO input.current.nominal 8.0
1.023329 send_to_all: SETINFO battery.voltage.nominal 24.0
1.023354
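For readers trying to follow the log, here is a small Python sketch of how a QS reply line appears to map onto the variables the driver sets. The field order is inferred only from the SETINFO lines in this log, not from the nutdrv_qx source, so treat it as an assumption. parse_qs_reply is a made-up name, and the trailing status field is kept as a raw string rather than decoded.

```python
# Field order inferred from the SETINFO lines in the driver log above.
QS_FIELDS = (
    "input.voltage",
    "input.voltage.fault",
    "output.voltage",
    "ups.load",
    "input.frequency",
    "battery.voltage",
    "ups.temperature",
)


def parse_qs_reply(reply):
    """Split a QS reply such as '(245.4 239.1 ... --.- 1001' into a dict.

    Numeric fields become floats. A placeholder such as '--.-' becomes None,
    mirroring the 'non numerical value' message in the log. The last token
    is returned undecoded under the key 'status_bits'.
    """
    reply = reply.strip()
    if not reply.startswith("("):
        raise ValueError("QS reply should start with '('")
    tokens = reply[1:].split()
    if len(tokens) != len(QS_FIELDS) + 1:
        raise ValueError("unexpected number of fields in QS reply")
    values = {}
    for name, token in zip(QS_FIELDS, tokens):
        try:
            values[name] = float(token)
        except ValueError:
            values[name] = None
    values["status_bits"] = tokens[-1]
    return values
```

A reply that never arrives, as in the hang being discussed, does not reach this stage at all: the driver is still waiting inside the read.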
Re: [Nut-upsuser] nutdrv_qx hangs after send: QS
Thank you for the rapid response. I will try and investigate getting answers to some of your points but I'm a little new to Solaris so I'll need some time.

Glancing at the configure output, it looks like it built against v0.1.7 of libusb (yes, I think that is derived from the one you mention):

checking for libusb version via pkg-config... 0.1.7 found
checking for libusb cflags...
checking for libusb ldflags... -lusb
checking for usb.h... yes
checking for usb_init... yes
checking for usb_detach_kernel_driver_np... no

I will first investigate how to set the debug level.

Regards,
Richard

On Sun, Apr 5, 2015 at 12:44 AM Charles Lepple clep...@gmail.com wrote:

> On Apr 4, 2015, at 7:19 PM, Richard Flint richard.fl...@gmail.com wrote:
>
>> More extensive debugging by running the driver
>>
>> sudo ./nutdrv_qx -u root -a MY_UPS -DD
>>
>> indicates the driver works normally then will randomly stop working at send: QS. The debug logs show values successfully retrieved repeatedly until something like:
>>
>> Quick update...
>> send: QS
>> read: (247.9 239.1 248.0 005 50.0 27.5 --.- 1001
>> update_status: OL
>> update_status: !LB
>> update_status: !CAL
>> update_status: !FSD
>> upsdrv_updateinfo...
>> Quick update...
>> send: QS
>> (driver hangs here)
>>
>> I'm using Generic Q* USB/Serial driver 0.06 (2.7.2) with USB communication driver 0.32. Playing with pollinterval didn't help - Is there anything further I can do to help troubleshoot this problem?
>
> Thanks, this narrows it down a good deal. @zykh made some changes to nutdrv_qx since the 2.7.2 release, but at first glance, I don't think those will alter the symptoms you are seeing.
>
> Can you provide some detail on the libusb port that you built against? If it is derived from the original sourceforge.net libusb-0.1, does it have a USB_DEBUG environment variable that can be set to log extra information?
>
> Also, is it possible to do a system call trace to figure out what libusb and the OS are doing at the time of the hang? It's been a while since I last used Solaris, but if memory serves, you could use something like truss to approximate what strace does on Linux.
>
> --
> Charles Lepple
> clepple@gmail
Re: [Nut-upsuser] nutdrv_qx hangs after send: QS
On Apr 4, 2015, at 10:45 PM, Richard Flint richard.fl...@gmail.com wrote:

> Hi,
>
> Apologies for the many replies. I have found this documentation: http://www.lehman.cuny.edu/cgi-bin/man-cgi?ugen+7 (I am using the ugen driver).

Right, AFAIK ugen is the kernel driver that libusb and openusb talk to (/dev/usb/*). Updated diagram:

upsc --- upsd --- nutdrv_qx --- libusb -+- ugen driver --- Solaris kernel --- UPS

> Richard
>
> On Sun, Apr 5, 2015 at 3:34 AM Richard Flint richard.fl...@gmail.com wrote:
>
>> Hi,
>>
>> I have to admit this sounds like it could screw up the system if not done right - particularly because the Solaris packaging system is unlikely to allow me to remove the libusb package if many things are dependent on it. Are there any other options - e.g. doing something with the libusb that ships with Solaris - are we sure it doesn't support timeouts?

Not entirely certain that we can't do more with the system libusb. NUT passes timeouts in milliseconds. If the NUT driver blocks for more than 5 seconds on the read (I see both 1000 and 5000 ms in various places in the code), then libusb isn't honoring that timeout. Without documentation or the source code, I wouldn't know what else is needed to make the timeouts work.

It might be possible to do the following:

• install openusb into an alternate directory (e.g. $HOME/local)
• set PKG_CONFIG_PATH to anything that doesn't contain the system libusb.pc
• put $HOME/local/bin (or wherever openusb-config gets installed) at the front of the $PATH, and symlink openusb-config to libusb-config
• reconfigure NUT

By installing into $HOME/local (not as root), you can be certain you are not overwriting the system libusb. Unfortunately, openusb doesn't have a *.pc file, otherwise the installation process would be a lot simpler.

>> I think I found some code relating to it here: https://java.net/projects/solaris-userland/sources/gate/show/components/libusb/wrapper/src

Unfortunately, that wrapper code is calling into a platform-specific library which doesn't seem to be posted there. (The purpose seems to be abstracting away the differences between Solaris and Sun Ray systems.) Timeouts are passed straight through, but that just moves the question down a layer into that Solaris-specific libusb plugin.

>> Regards,
>> Richard
>>
>> On Sun, Apr 5, 2015 at 3:10 AM Charles Lepple clep...@gmail.com wrote:
>>
>>> On Apr 4, 2015, at 9:48 PM, Richard Flint richard.fl...@gmail.com wrote:
>>>
>>>> Again, apologies for my ignorance - are you suggesting that if the NUT application was built against openusb this would probably be fixed?
>>>
>>> Yes, that is my current theory. It's a little complicated in practice - openusb has a different API than libusb-0.1.x, but it supposedly includes a compatibility layer. If openusb works, I would not expect it to wait for more than 1 or 5 seconds when reading the reply.
>>>
>>>> If so I'm happy to give this a try - any idea how can I tell NUT to build against openusb instead of libusb?
>>>
>>> Not sure exactly, but to be safe, I'd make an extra backup of wherever libusb is installed - my concern is that other things might be using libusb, and openusb could interfere. Ideally, openusb is a strict superset of libusb, but I haven't used it myself.
>>>
>>> openusb does seem to use the same library name as libusb, so if libusb was installed by a package, you might want to uninstall libusb first to avoid conflicts.
>>>
>>> NUT uses either the generic pkg-config tool or a libusb-config binary to find the USB library. openusb seems to install an openusb-config file which could be symlinked to libusb-config in /usr/local/bin (once the original libusb package is out of the way). At that point, you can re-run the NUT ./configure script, and it should list the openusb version number (1.1.11?) instead of 0.1.7.

--
Charles Lepple
clepple@gmail
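The check described above (a read that blocks well past the 1000 or 5000 ms it was given means the timeout is not being honored) can be written down as a small timing wrapper. This is a hypothetical Python helper for illustration only: honors_timeout and the slack_ms margin are invented here, and blocking_call stands in for whatever function performs the read.

```python
import time


def honors_timeout(blocking_call, timeout_ms, slack_ms=500):
    """Time one call and report whether it returned within its timeout.

    blocking_call is any callable that takes a timeout in milliseconds.
    Returns (ok, elapsed_ms), where ok is True if the call came back within
    timeout_ms plus slack_ms.
    """
    start = time.monotonic()
    blocking_call(timeout_ms)
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return elapsed_ms <= timeout_ms + slack_ms, elapsed_ms
```

The obvious limitation is that this only reports on calls that eventually return. A read that sleeps forever, as in this thread, would hang the check as well, so in practice it would need to run under some kind of external watchdog.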
Re: [Nut-upsuser] nutdrv_qx hangs after send: QS
On Apr 4, 2015, at 8:53 PM, Richard Flint richard.fl...@gmail.com wrote:

> I have used the truss command as directed. I have attached both the driver output and the last few sections of truss output leading to the hang. Both outputs end at the CTRL+C i pressed when i was forced to end the processes. Hope this is helpful. Please bear with me if i didn't run it with the right options etc - as I mentioned, I'm a little new to Solaris.

Well, it did provide some information, but I'll be honest that I don't know what else we can get from truss. Stepping back a bit, here is the stack:

upsc --- upsd --- nutdrv_qx --- libusb -+- Solaris kernel --- UPS

truss is tracing at the + sign: the boundary between user space and kernel space.

When the driver gets a valid response, the truss log looks like this:

5834: Q u i c k u p d a t e . . .
5834: write(2, \n, 1) = 1
5834: write(5, 0x082DA320, 16)= 16
5834: !\t\002\0\0\b\0 Q S\r\0\0\0\0\0
5834: write(2,1 6 1, 4) = 4
5834: write(2, ., 1) = 1
5834: write(2, 3 9 5 1 0 8, 6) = 6
5834: write(2, \t, 1) = 1
5834: write(2, s e n d : Q S, 8) = 8
5834: write(2, \n, 1) = 1
5834: open(/dev/usb/665.5161/0/if0in1stat, O_RDWR) = 8
5834: write(8, 01, 1) = 1
5834: open(/dev/usb/665.5161/0/if0in1, O_RDONLY)= 9
5834: read(9, ( 2 4 5 . 4 2, 8) = 8
5834: close(9)= 0
5834: close(8)= 0
...

The failed poll at the end looks like this:

5834: Q u i c k u p d a t e . . .
5834: write(2, \n, 1) = 1
5834: write(5, 0x082DA320, 16)(sleeping...)
5834: write(5, 0x082DA320, 16)= 16
5834: !\t\002\0\0\b\0 Q S\r\0\0\0\0\0
5834: write(2,1 6 9, 4) = 4
5834: write(2, ., 1) = 1
5834: write(2, 8 5 8 9 3 0, 6) = 6
5834: write(2, \t, 1) = 1
5834: write(2, s e n d : Q S, 8) = 8
5834: write(2, \n, 1) = 1
5834: open(/dev/usb/665.5161/0/if0in1stat, O_RDWR) = 8
5834: write(8, 01, 1) = 1
5834: open(/dev/usb/665.5161/0/if0in1, O_RDONLY)= 9
5834: read(9, 0xFC9BC650, 8) (sleeping...)
^C

I did run across a comment in the NUT configure.ac that says

dnl FIXME: Sun's libusb doesn't support timeout (so blocks notification)

This is unfortunate, since other libusb platforms will time out after ~5 seconds, and most of the drivers have retry logic to handle that.

I couldn't find any relevant source code when searching for Solaris libusb 0.1.7 with Google. Do you have any other information about the libusb that is installed there?

There is also this fork of libusb that claims to support Solaris, and the code seems to have timeouts: http://sourceforge.net/projects/openusb/

--
Charles Lepple
clepple@gmail
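The comparison of the two truss excerpts above can also be done mechanically. The Python sketch below is only a rough aid: find_stuck_calls is an invented name, the regular expression matches the line shapes as they appear in this archived message (which may differ from raw truss output), and it only tracks calls whose first argument is a numeric file descriptor.

```python
import re

# pid, syscall name, numeric fd, then either "(sleeping...)" or "= <result>".
CALL_RE = re.compile(
    r"^\s*(\d+):\s+(\w+)\((\d+),.*?\)\s*(\(sleeping\.\.\.\)|=\s*-?\d+)"
)


def find_stuck_calls(truss_lines):
    """Return the lines for calls that went to sleep and never completed.

    A call is treated as completed when a later line for the same pid,
    syscall name and file descriptor shows a return value.
    """
    pending = {}
    for line in truss_lines:
        match = CALL_RE.match(line)
        if not match:
            continue  # data dumps, open()/close() lines, "^C", and so on
        pid, name, fd, outcome = match.groups()
        key = (pid, name, fd)
        if outcome.startswith("("):
            pending[key] = line.strip()
        else:
            pending.pop(key, None)
    return list(pending.values())
```

Fed the failed-poll excerpt, the write to descriptor 5 is cleared by its later "= 16" line, and the only call left pending is the read on descriptor 9.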
Re: [Nut-upsuser] nutdrv_qx hangs after send: QS
Again, apologies for my ignorance - are you suggesting that if the NUT application was built against openusb this would probably be fixed?

If so I'm happy to give this a try - any idea how can I tell NUT to build against openusb instead of libusb?

Regards,
Richard

On Sun, Apr 5, 2015 at 2:28 AM Charles Lepple clep...@gmail.com wrote:

> On Apr 4, 2015, at 8:53 PM, Richard Flint richard.fl...@gmail.com wrote:
>
>> I have used the truss command as directed. I have attached both the driver output and the last few sections of truss output leading to the hang. Both outputs end at the CTRL+C i pressed when i was forced to end the processes. Hope this is helpful. Please bear with me if i didn't run it with the right options etc - as I mentioned, I'm a little new to Solaris.
>
> Well, it did provide some information, but I'll be honest that I don't know what else we can get from truss. Stepping back a bit, here is the stack:
>
> upsc --- upsd --- nutdrv_qx --- libusb -+- Solaris kernel --- UPS
>
> truss is tracing at the + sign: the boundary between user space and kernel space.
>
> When the driver gets a valid response, the truss log looks like this:
>
> 5834: Q u i c k u p d a t e . . .
> 5834: write(2, \n, 1) = 1
> 5834: write(5, 0x082DA320, 16)= 16
> 5834: !\t\002\0\0\b\0 Q S\r\0\0\0\0\0
> 5834: write(2,1 6 1, 4) = 4
> 5834: write(2, ., 1) = 1
> 5834: write(2, 3 9 5 1 0 8, 6) = 6
> 5834: write(2, \t, 1) = 1
> 5834: write(2, s e n d : Q S, 8) = 8
> 5834: write(2, \n, 1) = 1
> 5834: open(/dev/usb/665.5161/0/if0in1stat, O_RDWR) = 8
> 5834: write(8, 01, 1) = 1
> 5834: open(/dev/usb/665.5161/0/if0in1, O_RDONLY)= 9
> *5834: read(9, ( 2 4 5 . 4 2, 8) = 8*
> 5834: close(9)= 0
> 5834: close(8)= 0
> ...
>
> The failed poll at the end looks like this:
>
> 5834: Q u i c k u p d a t e . . .
> 5834: write(2, \n, 1) = 1
> 5834: write(5, 0x082DA320, 16)(sleeping...)
> 5834: write(5, 0x082DA320, 16)= 16
> 5834: !\t\002\0\0\b\0 Q S\r\0\0\0\0\0
> 5834: write(2,1 6 9, 4) = 4
> 5834: write(2, ., 1) = 1
> 5834: write(2, 8 5 8 9 3 0, 6) = 6
> 5834: write(2, \t, 1) = 1
> 5834: write(2, s e n d : Q S, 8) = 8
> 5834: write(2, \n, 1) = 1
> 5834: open(/dev/usb/665.5161/0/if0in1stat, O_RDWR) = 8
> 5834: write(8, 01, 1) = 1
> 5834: open(/dev/usb/665.5161/0/if0in1, O_RDONLY)= 9
> *5834: read(9, 0xFC9BC650, 8) (sleeping...)*
> ^C
>
> I did run across a comment in the NUT configure.ac that says
>
> dnl FIXME: Sun's libusb doesn't support timeout (so blocks notification)
>
> This is unfortunate, since other libusb platforms will time out after ~5 seconds, and most of the drivers have retry logic to handle that.
>
> I couldn't find any relevant source code when searching for Solaris libusb 0.1.7 with Google. Do you have any other information about the libusb that is installed there?
>
> There is also this fork of libusb that claims to support Solaris, and the code seems to have timeouts: http://sourceforge.net/projects/openusb/
>
> --
> Charles Lepple
> clepple@gmail
Re: [Nut-upsuser] nutdrv_qx hangs after send: QS
Hi,

I have to admit this sounds like it could screw up the system if not done right - particularly because the Solaris packaging system is unlikely to allow me to remove the libusb package if many things are dependent on it. Are there any other options - e.g. doing something with the libusb that ships with Solaris - are we sure it doesn't support timeouts?

I think I found some code relating to it here: https://java.net/projects/solaris-userland/sources/gate/show/components/libusb/wrapper/src

Regards,
Richard

On Sun, Apr 5, 2015 at 3:10 AM Charles Lepple clep...@gmail.com wrote:

> On Apr 4, 2015, at 9:48 PM, Richard Flint richard.fl...@gmail.com wrote:
>
>> Again, apologies for my ignorance - are you suggesting that if the NUT application was built against openusb this would probably be fixed?
>
> Yes, that is my current theory. It's a little complicated in practice - openusb has a different API than libusb-0.1.x, but it supposedly includes a compatibility layer. If openusb works, I would not expect it to wait for more than 1 or 5 seconds when reading the reply.
>
>> If so I'm happy to give this a try - any idea how can I tell NUT to build against openusb instead of libusb?
>
> Not sure exactly, but to be safe, I'd make an extra backup of wherever libusb is installed - my concern is that other things might be using libusb, and openusb could interfere. Ideally, openusb is a strict superset of libusb, but I haven't used it myself.
>
> openusb does seem to use the same library name as libusb, so if libusb was installed by a package, you might want to uninstall libusb first to avoid conflicts.
>
> NUT uses either the generic pkg-config tool or a libusb-config binary to find the USB library. openusb seems to install an openusb-config file which could be symlinked to libusb-config in /usr/local/bin (once the original libusb package is out of the way). At that point, you can re-run the NUT ./configure script, and it should list the openusb version number (1.1.11?) instead of 0.1.7.
>
> --
> Charles Lepple
> clepple@gmail