Re: USB-MSD non-functional after merging v5.1 to v6.x (seems to be internal USB stack issue?)

2021-09-02 Thread kra...@redhat.com
  Hi,

> At some point during implementation of the STM USB stack must have run
> into the same problem with the communications choking during MSD init.
> And at the time (which involved a LOT of wireshark comparisons with a
> real USB drive on the host and on the DCW2 Rpi2 stack) I'd added the
> QEMU_PACKED directive to the usb_msd_csw struct.

> Thoughts?

Send a patch adding the QEMU_PACKED,
that should indeed be the correct fix.

thanks,
  Gerd




Re: USB-MSD non-functional after merging v5.1 to v6.x (seems to be internal USB stack issue?)

2021-09-02 Thread VintagePC
Well... the plot thickens. I have found the smoking gun.

At some point during implementation of the STM USB stack must have run into the 
same problem with the communications choking during MSD init. And at the time 
(which involved a LOT of wireshark comparisons with a real USB drive on the 
host and on the DCW2 Rpi2 stack) I'd added the QEMU_PACKED directive to the 
usb_msd_csw struct.

Naturally, when the definitions were relocated to a header, that was lost when 
resolving the merge conflict because I wasn't paying attention. (D'oh!)

So while it's working again... the question still remains why I had to make 
this tweak in the first place - so I dug further:

I just re-compared with the MSD setup of a real drive on the PC. The status 
section of the MSD is only a single byte for the real drive but is 4 bytes in 
the packet received by the guest without the PACKED directive. Wireshark also 
cannot decode this "oversize" packet properly, which seems to further suggest  
that this may indeed be an actual bug in the MSD implementation - naturally it 
would only manifest on systems/compilers where the data alignment precipitates 
this problem.

Further code inspection suggests this originates with the following line in 
usb-storage.c:
> len = MIN(sizeof(s->csw), p->iov.size);
specifically, if the iov.size is > than the MSD CSW. And that's where my 
knowledge of the QEMU USB system ends and I defer to those more knowledgeable. 
I'm guessing this might mean that things are fine if the receiving endpoint 
configures itself to exactly the expected CSW size - but breaks if the endpoint 
is configured with a larger receive buffer. At least for ARM kernel being run 
this isn't custom code and is part of the generic HAL and USB stack STM offers 
- so I have a hard time believing this is an unusual practice.

I even went so far as to eliminate dev-storage from the picture entirely and 
use USB-host to connect a real physical USB drive with the simulated platform - 
and again, the wireshark capture reveals that the CSW packet size is a single 
byte as the "correct" behaviour regardless of how the device sets up its 
receive buffers.

Thoughts?

~ VintagePC




Re: USB-MSD non-functional after merging v5.1 to v6.x (seems to be internal USB stack issue?)

2021-09-02 Thread VintagePC
Hi Gerd,

Thanks for the reply!

> Is this reproducable on master branch somehow?

Interesting thought... I initially hadn't considered it very much because I've 
got a bit of tunnel vision with my fork  - but perhaps having a go with the 
RPi2 which uses the DWC2 subsystem may manifest something similar since I made 
some use of that as a comparison when first implementing the F4xx subsystem. 
I'll give it a try and report back.

> Try run qemu with valgrind to see if there is any memory corruption?

> Oh, that is easy, all usb devices have a pcap= property to write
> out traces which you can then open in wireshark.

Thank you - I'll report any findings of interest. I'm hoping either of these 
may reveal something more specific about the nature of the issue.

~ VintagePC


Sent with ProtonMail Secure Email.





Re: USB-MSD non-functional after merging v5.1 to v6.x (seems to be internal USB stack issue?)

2021-09-02 Thread kra...@redhat.com
  Hi,

> I decided to bisect the merge in order to identify the commit that causes the 
> issue - and much to my surprise, it is this particular commit:
> https://github.com/qemu/qemu/commit/bbd8323d3196c9979385cba1b8b38859836e63c3

Hmm, that is rather strange indeed.

> Given this doesn't seem to be anything more than a relocation of
> declarations (and I don't even use any of these types directly in my
> code), this would seem to suggest an internal issue in linking or
> memory initialization. I'm happy to assist in debugging this where I
> can but I'm hoping someone more knowledgeable about the QEMU USB
> innards might be able to point me to an area to start digging since
> the change seems entirely orthogonal to the actual problem and could
> be just about anywhere.

Try run qemu with valgrind to see if there is any memory corruption?

> I've been told this problem is not unique to my own development setup,
> and a cursory investigation reveals one of the symptoms is a
> divergence in the size of the incoming USB packets.

Is this reproducable on master branch somehow?

> (I'm hoping to set
> up a more detailed packet capture when I have more spare time this
> weekend).

Oh, that is easy, all usb devices have a pcap= property to write
out traces which you can then open in wireshark.

HTH,
  Gerd




USB-MSD non-functional after merging v5.1 to v6.x (seems to be internal USB stack issue?)

2021-09-01 Thread VintagePC
Hello!

Sending to the list because I was directed here after asking on IRC.

Background: I've forked QEMU for a project and am in the process of 
implementing a more complete STM32F4xx stack as a secondary task.

To resolve recent GCC11 build errors, I attempted to merge with upstream QEMU 
v6 (coming from a late 2020 v5.x.x) and to my surprise, USB mass storage is no 
longer functional. A few packets are exchanged but at some point I
At first I suspected the issue was my own code (fair, my implementations are in 
varying states of completeness. The STM32F4 USB controller has many 
similarities to (but is not quite the same as) the DWC2 USB stack).

I decided to bisect the merge in order to identify the commit that causes the 
issue - and much to my surprise, it is this particular commit:
https://github.com/qemu/qemu/commit/bbd8323d3196c9979385cba1b8b38859836e63c3

Given this doesn't seem to be anything more than a relocation of declarations 
(and I don't even use any of these types directly in my code), this would seem 
to suggest an internal issue in linking or memory initialization. I'm happy to 
assist in debugging this where I can but I'm hoping someone more knowledgeable 
about the QEMU USB innards might be able to point me to an area to start 
digging since the change seems entirely orthogonal to the actual problem and 
could be just about anywhere.

The command line I'm using is as follows:
./qemu-system-buddy -machine prusa-mini -kernel firmware.bin -device 
usb-storage,drive=usbst -drive id=usbst,file=SDcard.bin

where SDCard.bin is a bog-standard image of a FAT32 partition. (using VFAT 
folder emulation is similarly non-functional). Unfortunately the device does 
not support anything other than MSD so I cannot check functionality with other 
USB devices.

I've been told this problem is not unique to my own development setup, and a 
cursory investigation reveals one of the symptoms is a divergence in the size 
of the incoming USB packets. (I'm hoping to set up a more detailed packet 
capture when I have more spare time this weekend). For example, the working 
version I will see some initial setup packets be exchanged, then a few packets 
of size 1, 36, 13, 13, 14, etc as the ARM firmware talks to the device. (Back 
when I implemented the USB stack I spent a lot of time debugging and comparing 
wireshark captures so I'm reasonably confident the USB handling code is correct 
since the firmware being run is doing full on FAT filesystem support). After 
merging the offending commit, I see the initial setup and then a series of 
packets of size 1, 36, 16, 512, and then nothing further. (Internally on the 
firmware the USB bus gets stuck because this last packet of size 512 is 
obviously junk and a symptom of the issue.). --enable-sanitizers revealed only 
a minor bug elsewhere that was unrelated to the issue (and did not resolve it 
when fixed)

The project itself is here: https://github.com/vintagepc/MINI404 with all of my 
custom implementations temporarily residing in hw/arm/prusa/stm32f407 (see 
footnote/P.S.)

Thanks in advance for any assistance!
VintagePC.

P.S. - Yes, I recognize it's not organized and formatted entirely in the same 
style as the rest of QEMU and probably violates a few style guide items. As 
this is a spare time project (and because I am also leveraging prior work from 
non-upstream sources) I need to be as efficient as possible in making changes 
and being able to debug things. Longer term... yes, I am absolutely supportive 
of getting STM32F4 support upstream because I know there is significant value 
here. But as it is right now the parts are not functional/polished enough that 
I feel comfortable doing so - and I just don't have the bandwidth to spend time 
on that in addition to the project itself. In the long term once the SOC 
implementation is more complete, I definitely hope to be able to shift focus to 
refactoring and making what I believe to be sufficiently functional 
implementations ready for upstream submissions. (And if someone reading this is 
keen and willing to help, feel free to contact me off-list! to collaborate 
and/or discuss what needs to happen to make it submittable!)