Hello,

More details below, but TL;DR: the problem only happens on XHCI USB,
everything is fine with the same code going through EHCI; and the
problem is the loss of a short bulk xfer from the device, but it never
happens on the first handle opened for the device file, only on
subsequent uses of the same device.

On Wednesday 12 February 2020 at 22:33, Natasha Kerensikova wrote:
> Being out of ideas, I followed my intuition about some deinitialization
> issue, and tried to reproduce the issue using a single executable run.
> The result is available at
> https://upload.instinctive.eu/t3QWc3v5LW-MzqNJ_twkqVenpvc/test-usb2.c
> 
> Basically, if I use two pairs of libusb_bulk_transfer() in a row,
> everything works fine, but if I release the interface and claim it again
> between them, I reproduce the timeout problem.
> 
> So at this point, I suspect that libusb_claim_interface() does not
> exactly undo what libusb_release_interface() does.

To continue the investigations, I got rid of libusb, and used OpenBSD
API directly, using the following code:
https://upload.instinctive.eu/zHYhUftnlxgEvknA9whSUHZHslc/test-usb3.c

I really have no idea what I'm doing, and maybe I'm badly misusing the
API, but to my untrained eye it seems reasonable to open(), ioctl(),
read() or write(), and close() the device file, and to expect things to
be left the way I found them.
When running this code, the first iteration of the loop works fine, and
the second read() times out.
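
For readers who would rather not fetch the file, here is a minimal sketch
of that sequence. It is not test-usb3.c itself: the helper names, the
use of separate IN and OUT endpoint files, and the 64-byte reply buffer
are assumptions made for illustration. The two ioctls are the ones
documented in ugen(4); they are compiled only on OpenBSD, so the rest can
be exercised elsewhere, for instance over a pipe.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>
#ifdef __OpenBSD__
#include <sys/ioctl.h>
#include <dev/usb/usb.h>
#endif

/*
 * Hypothetical helper: allow short reads and set a timeout on a ugen(4)
 * endpoint fd. On other systems it does nothing and returns 0.
 */
static int
setup_endpoint(int fd, int timeout_ms)
{
#ifdef __OpenBSD__
    int on = 1;

    if (ioctl(fd, USB_SET_SHORT_XFER, &on) == -1)
        return -1;
    if (ioctl(fd, USB_SET_TIMEOUT, &timeout_ms) == -1)
        return -1;
#else
    (void)fd;
    (void)timeout_ms;
#endif
    return 0;
}

/*
 * One write/read pair: send a command on out_fd, then read the (possibly
 * short) reply from in_fd. Returns the number of bytes read, or -1.
 */
static ssize_t
write_read_pair(int out_fd, int in_fd, const void *cmd, size_t cmd_len,
    void *reply, size_t reply_len)
{
    assert(cmd_len > 0 && reply_len > 0);
    if (write(out_fd, cmd, cmd_len) != (ssize_t)cmd_len)
        return -1;
    return read(in_fd, reply, reply_len);
}

/*
 * Open, configure, do one write/read pair and close, `count` times over.
 * Returns how many iterations failed (a read timeout counts as a failure).
 */
static int
run_iterations(const char *out_path, const char *in_path, int count,
    const void *cmd, size_t cmd_len, int timeout_ms)
{
    unsigned char reply[64];    /* assumed reply size */
    int failures = 0;

    for (int i = 0; i < count; i++) {
        int out_fd = open(out_path, O_WRONLY);
        int in_fd = open(in_path, O_RDONLY);

        if (out_fd == -1 || in_fd == -1 ||
            setup_endpoint(in_fd, timeout_ms) == -1 ||
            write_read_pair(out_fd, in_fd, cmd, cmd_len,
            reply, sizeof(reply)) < 0)
            failures++;
        if (out_fd != -1)
            close(out_fd);
        if (in_fd != -1)
            close(in_fd);
    }
    return failures;
}
```

Calling run_iterations() with the two endpoint files of the ugen device
and a count of 2 would correspond to the loop described above; moving the
open() and close() calls out of the for loop gives the same-fds variant.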

To double-check, I tried moving open() and close() outside of the loop,
which means two write/read pairs on the same fds. Both pairs succeed,
and I could run the program several times and it kept succeeding.

So I went on with three write/read pairs on the same fds, and I'm back
to the first run working and the subsequent runs timing out; however,
only the first read times out, the second and third work fine.

So it seems there is some kind of parity thing going on, with odd
requests putting the system in a weird, bogus state, and even requests
bringing it back to normal.

I tried further diagnosing by enabling kernel debugging traces for ugen,
usb, and xhci. Here is a fragment of /var/log/messages showing the
powering of the USB device, two runs of the binary with three write/read
pairs on the same fds, and unpowering of the device, with
ugendebug = 10;
usbdebug = 10;
xhcidebug = 9;
https://upload.instinctive.eu/mI8lANDl7KTetMiuo7EQc3hzRHA/2020-02-13-13-50-messages.txt

As far as I can tell, all read() calls behave similarly and enter the
only while loop in usbd_transfer(), and the difference between a
successful read() and one that times out lies in the external completion
of the xfer, presumably somewhere in xhci.

As said above, this made me try with an older laptop (after updating to
the latest CURRENT), and with its ehci-based USB I cannot reproduce any
part of this problem (and the enumeration of `scanimage -L` is much much
faster).

I guess at this point I'm blocked by my lack of global understanding of
xhci and how it works with usbdi.

Would anyone have any idea on how to further diagnose the issue?


Thanks for your help,
Natasha
