Hi, it seems I made some progress on this issue. Warning: loooong text ahead...
I'm now in possession of a powered 7-port USB 3 hub which I use for
testing devices. This hub happens to have a power button, allowing me
to easily cut USB power and bring it back up again. It just so happens
that I can reproduce the initialization problems 100% of the time by
using the hub's power button instead of hot-plugging the device.

On Wed, 2015-11-25 at 17:49 +0100, Daniel Elstner wrote:
> I still suspect it's partly due to flaky FX2 firmware on the device,
> and partly due to my driver code straddling the boundaries more
> tightly than the original software.

I'm almost sure now that it is *not* the sigrok driver code, but flaky
FX2 firmware. What seems to happen is that the USB device does not
reset properly in some circumstances.

When the device resets properly, in such a way that it works correctly
afterwards, the dmesg output on power-on/hotplug looks like this:

[235815.350314] usb 1-2.1.4: new high-speed USB device number 65 using xhci_hcd
[235817.141662] usb 1-2.1.4: new high-speed USB device number 66 using xhci_hcd
[235817.242819] usb 1-2.1.4: New USB device found, idVendor=2961, idProduct=6689
[235817.242822] usb 1-2.1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[235817.242824] usb 1-2.1.4: Product: LWLA-1034 v1.0
[235817.242825] usb 1-2.1.4: Manufacturer: www.SysClk.com

Note the double enumeration at the beginning. If the device does not
reset properly, the dmesg output looks like this instead:

[235762.633529] usb 1-2.1.4: new high-speed USB device number 64 using xhci_hcd
[235762.734699] usb 1-2.1.4: New USB device found, idVendor=2961, idProduct=6689
[235762.734702] usb 1-2.1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[235762.734704] usb 1-2.1.4: Product: LWLA-1034 v1.0
[235762.734705] usb 1-2.1.4: Manufacturer: www.SysClk.com

Note that the double enumeration is missing this time. Occasionally, I
can even get it stuck in a state where it identifies as:

ID 04b4:8613 Cypress Semiconductor Corp. CY7C68013 EZ-USB FX2 USB 2.0 Development Kit

This hints that even though the FX2 firmware is loaded from an EEPROM,
the device still needs to trigger re-enumeration as part of its
initialization process, so it's not completely transparent.

Now, if the device turns up with the correct VID/PID of the SysClk
LWLA1034, one would think that the FX2 firmware did load correctly. So
why does it not work? Here's the thing: It mostly does, except for one
small detail. Here's a summary of my findings:

1) Bulk transfers of any size to endpoint 4 (FPGA bitstream) and
   endpoint 2 (commands to the FPGA logic) work fine. Not only does
   the transfer itself succeed, the associated action is in fact
   performed as well.

2) Bulk transfers from endpoint 6, which is used for receiving replies
   to commands, are *cut off after 64 bytes*, even though the endpoint
   descriptor says the endpoint size is 512 bytes. However, it appears
   that more data is waiting and can be read in subsequent transfers
   (which are also cut short); and not reading the waiting data throws
   off the device state logic.

3) libusb_reset_device(), libusb_clear_halt(), playing around with the
   configuration descriptor etc. *do not have any effect*. Yep, that's
   right, not even an explicit reset is able to bring the device into
   a proper state.

So, just for kicks, I tried this: When a short transfer comes in but
more data is expected, I re-submit the IN transfer with adjusted buffer
pointer and length, so as to fetch the remaining data that is waiting.

This *almost* works! It seems to work completely for status requests,
but unfortunately it fails for capture memory reads. With memory reads,
this technique results in multiple 64-byte transfers which return more
data, but the final few bytes are missing. So it seems that in the case
of memory reads, the limited transfer size throws off the internal
logic in a way that is not recoverable.

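To illustrate the re-submission approach described above, here is a
minimal sketch. It is not the actual driver code: bulk_in_fn,
read_reply() and the mock device are hypothetical names made up for
this example. In a real driver the callback would presumably wrap a
bulk IN read such as libusb_bulk_transfer(). The mock simply caps every
transfer at 64 bytes to imitate the misbehaving endpoint.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transfer callback (an assumption for this sketch): reads up
 * to `len` bytes into `buf`, stores the number of bytes actually received
 * in `*got`, and returns 0 on success. */
typedef int (*bulk_in_fn)(void *ctx, unsigned char *buf, size_t len,
                          size_t *got);

/* Re-issue the IN transfer with an advanced buffer pointer and a reduced
 * length until `expected` bytes have arrived. Returns the number of bytes
 * collected, which is less than `expected` if a transfer fails or the
 * device stops delivering data. */
static size_t read_reply(bulk_in_fn xfer, void *ctx, unsigned char *buf,
                         size_t expected)
{
    size_t total = 0;

    assert(xfer != NULL && buf != NULL);

    while (total < expected) {
        size_t got = 0;

        if (xfer(ctx, buf + total, expected - total, &got) != 0 || got == 0)
            break;
        total += got;
    }
    return total;
}

/* Mock device for demonstration only: it holds `avail` bytes and never
 * hands out more than 64 of them per transfer. */
struct mock_dev {
    const unsigned char *data;
    size_t avail;
    size_t pos;
    unsigned transfers;
};

static int mock_bulk_in(void *ctx, unsigned char *buf, size_t len,
                        size_t *got)
{
    struct mock_dev *dev = ctx;
    size_t n = dev->avail - dev->pos;

    if (n > len)
        n = len;
    if (n > 64)
        n = 64; /* the short-transfer limit observed on endpoint 6 */

    memcpy(buf, dev->data + dev->pos, n);
    dev->pos += n;
    dev->transfers++;
    *got = n;
    return 0;
}
```

A return value smaller than the expected length corresponds to the
memory read case, where the final few bytes never show up. The loop
cannot repair that; it can only report it to the caller.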
In a desperate last-ditch attempt, I then reduced the chunk size of
memory read requests to the smallest possible amount, so that the
response to each request would fit within 64 bytes. *That* made it
work! However, it also slows down the capture memory read quite
noticeably. So I don't think this is an appropriate "solution".

Phew.

A side issue is that leaving data waiting on the reply endpoint for any
reason will make the device completely unusable afterwards. I'm not
entirely sure, but I believe the incorrect read of the test ID people
have been seeing may be due to that. Since this issue is independent of
the weird 64-byte limit problem, I have added a routine in the sigrok
driver that drains any pending data from the endpoint during device
initialization.

The code can be found here:
https://github.com/danielkitta/libsigrok/tree/lwla-fixes

I've also created a branch for the short transfer hacks, if anyone is
interested in playing around with that:
https://github.com/danielkitta/libsigrok/tree/lwla-hackery

> I just hope it's not due to the peculiarities of the OS' USB stack
> (either causing or more likely hiding the issue). The original
> software is Windows-only.

I now fear that this is indeed the case. Windows must be doing
something that hides the problem, or this issue would have turned up
for many users.

Cheers,
--Daniel

_______________________________________________
sigrok-devel mailing list
sigrok-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sigrok-devel