Hi,

On 4/27/20 8:14 PM, Jiang, Guomin wrote:
Add stakeholder and link https://edk2.groups.io/g/devel/message/58193

Add discussion history and add comment.

Hi Jeremy,

You can refer https://edk2.groups.io/g/devel/message/58125 for discussion
about this solution.

Oh fun, odd how a bug can exist in a code base for years and then this
happens.. I will move the discussion there.

Thanks,


Two issue I found:
1. Memory leakage may occur if doing so and I am investigating it.

It seems our solutions differ a bit?

The memory leakage may occur when invoke UsbBuildDescTable(), because you 
allocate new buffer but haven't freed the old allocated buffer ye.

What tool are you using to detect memory leaks?



2. It test pass with OVMF but fail in real platform, and I am figuring out the
flow.

Hmm, I've been seeing this on a RPI with an attached USB3 hub and 5 bay USB
JBOD. (there is another problem but the reset crash is keeps the machine
from booting).

Can you share detail information with us.

Sure, this is all off the shelf hardware, its an rpi4+edk2. What you you like?

The device that causes the ASSERT() is a JMicron Controller (152d:0562) in one of these:

http://www.yottamaster.com/index.php?route=product/product&path=63_65&product_id=91

behind a VIA based usb3 hub (2109:8110).

There is a fpaste of the log here: https://paste.centos.org/view/1860e620
Grab anything you want quick, it expires in 1 day.

The last few lines are:

UsbBotDataTransfer: DataIn Stall
UsbBootExecCmd: Success to Exec 0x88 Cmd (Result = 1)
UsbBootRequestSense: (Invalid Parameter) with error code (70) sense key 5/24/0
XhcCheckUrbResult: STALL_ERROR! Completecode = 6
Recovery Halted Slot = 6,Dci = 3
XhcResetEndpoint: Slot = 0x6, Dci = 0x3
XhcSetTrDequeuePointer: Slot = 0x6, Dci = 0x3, Urb = 0x3597C998
XhcBulkTransfer: error - Device Error, transfer - 2
UsbBotDataTransfer: (Device Error)
UsbBotDataTransfer: DataIn Stall
UsbBootExecCmd: Success to Exec 0x88 Cmd (Result = 1)
UsbBootRequestSense: (Invalid Parameter) with error code (70) sense key 5/24/0
UsbMassReadBlocks: UsbBootReadBlocks (Invalid Parameter) -> Reset
Disable device slot 6!
Enable Slot Successfully, The Slot ID = 0x6
    Address 6 assigned successfully
UsbIoPortReset: device is now ADDRESSED at 6
ASSERT [XhciDxe] /home/jlinton/rpi2/edk2/MdeModulePkg/Bus/Pci/XhciDxe/XhciSched)


BTW: I'm pretty sure I've seen this with various other pieces of hardware in the past as well. Usually in an unreproducible manner. This one was easy because the EP is just 5/2400, following a linux reboot (where the device is working). With the patch I suggested it gets those 5/2400's and continues the boot.

I was a bit unable to decide if I should attempt to manually request the device descriptors in the XHCI driver rather than modifying the USB core. OTOH, that solution ends but being a bit ugly, because presumably one would do it in the initial set address/etc by querying the device descriptors at that point.



-----Original Message-----
From: Wu, Hao A <hao.a...@intel.com>
Sent: Monday, April 27, 2020 9:36 AM
To: Jiang, Guomin <guomin.ji...@intel.com>; devel@edk2.groups.io
Cc: Wang, Jian J <jian.j.w...@intel.com>; Ni, Ray <ray...@intel.com>
Subject: RE: [edk2-devel] [PATCH] MdeModulePkg/UsbBusDxe: Rebuild the
description table after Reset Device

-----Original Message-----
From: Jiang, Guomin
Sent: Sunday, April 26, 2020 9:05 PM
To: Wu, Hao A; devel@edk2.groups.io
Cc: Wang, Jian J; Ni, Ray
Subject: RE: [edk2-devel] [PATCH] MdeModulePkg/UsbBusDxe: Rebuild the
description table after Reset Device

Add comment inline.

-----Original Message-----
From: Wu, Hao A <hao.a...@intel.com>
Sent: Sunday, April 26, 2020 4:04 PM
To: Jiang, Guomin <guomin.ji...@intel.com>; devel@edk2.groups.io
Cc: Wang, Jian J <jian.j.w...@intel.com>; Ni, Ray <ray...@intel.com>
Subject: RE: [edk2-devel] [PATCH] MdeModulePkg/UsbBusDxe: Rebuild
the description table after Reset Device

-----Original Message-----
From: Jiang, Guomin
Sent: Sunday, April 26, 2020 2:44 PM
To: Wu, Hao A; devel@edk2.groups.io
Cc: Wang, Jian J; Ni, Ray
Subject: RE: [edk2-devel] [PATCH] MdeModulePkg/UsbBusDxe: Rebuild
the
description table after Reset Device

Hi Hao,

Add comments inline.

-----Original Message-----
From: Wu, Hao A <hao.a...@intel.com>
Sent: Sunday, April 26, 2020 1:12 PM
To: devel@edk2.groups.io; Wu, Hao A <hao.a...@intel.com>; Jiang,
Guomin
<guomin.ji...@intel.com>
Cc: Wang, Jian J <jian.j.w...@intel.com>; Ni, Ray
<ray...@intel.com>
Subject: RE: [edk2-devel] [PATCH] MdeModulePkg/UsbBusDxe:
Rebuild the description table after Reset Device

-----Original Message-----
From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On
Behalf
Of
Wu, Hao A
Sent: Sunday, April 26, 2020 1:10 PM
To: Jiang, Guomin; devel@edk2.groups.io
Cc: Wang, Jian J; Ni, Ray
Subject: Re: [edk2-devel] [PATCH] MdeModulePkg/UsbBusDxe:
Rebuild
the
description table after Reset Device

-----Original Message-----
From: Jiang, Guomin
Sent: Saturday, April 25, 2020 9:36 AM
To: devel@edk2.groups.io
Cc: Wang, Jian J; Wu, Hao A; Ni, Ray
Subject: [PATCH] MdeModulePkg/UsbBusDxe: Rebuild the
description
table after Reset Device

REF: https://bugzilla.tianocore.org/show_bug.cgi?id=2694

When the USB fail and then Reset Device, it should rebuild
description.

Signed-off-by: Guomin Jiang <guomin.ji...@intel.com>
Cc: Jian J Wang <jian.j.w...@intel.com>
Cc: Hao A Wu <hao.a...@intel.com>
Cc: Ray Ni <ray...@intel.com>
---
  MdeModulePkg/Bus/Usb/UsbBusDxe/UsbBus.c | 5 +++++
  1 file changed, 5 insertions(+)

diff --git a/MdeModulePkg/Bus/Usb/UsbBusDxe/UsbBus.c
b/MdeModulePkg/Bus/Usb/UsbBusDxe/UsbBus.c
index 4b4915c019..9f2d2cc87f 100644
--- a/MdeModulePkg/Bus/Usb/UsbBusDxe/UsbBus.c
+++ b/MdeModulePkg/Bus/Usb/UsbBusDxe/UsbBus.c
@@ -869,6 +869,11 @@ UsbIoPortReset (


    DEBUG (( EFI_D_INFO, "UsbIoPortReset: device is now
ADDRESSED
at %d\n", Dev->Address));



+  //

+  // The description will be invalid after reset, should
+ rebuild it as
well.

+  //

+  UsbBuildDescTable (Dev);


Hello Guomin,

Thanks for the proposed patch.

Could you help to explain in more detail for the above fix
with regard to the transfer ring not being set properly in the XHCI
driver?
Thanks.

When I debug, I dump the below data structure and found that
before UsbMassReset, the corresponding slot transfer ring is
normal and invalid after UsbMassReset.
USB_DEV_CONTEXT = 0x148
  +0x0 Enabled
  +0x1 SlotId
  +0x4 RouteString
  +0x8 ParentRouteString
  +0xC XhciDevAddr
  +0xD BusDevAddr
  +0x10 InputContext
  +0x18 OutputContext
  +0x20 EndpointTransferRing[31]
  +0x118 DevDesc
  +0x130 ConfDesc
  +0x138 ActiveConfiguration
  +0x140 ActiveAlternateSetting

Before UsbMassReset
000002D0:                                            -01 01 00 00 00 00 10 10
000002E0: 00 00 00 00 01 01 00 00-80 02 FA 06 00 00 00 00
000002F0: C0 16 FA 06 00 00 00 00-98 27 FC 06 00 00 00 00
00000300: 00 00 00 00 00 00 00 00-98 26 FC 06 00 00 00 00
00000310: 98 BE F9 06 00 00 00 00-00 00 00 00 00 00 00 00
00000320: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000330: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000340: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000350: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000360: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000370: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000380: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000390: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003A0: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003B0: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003C0: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003D0: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003E0: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003F0: 12 01 00 03 00 00 00 09-F4 46 01 00 00 00 01 02
00000400: 03 01 00 00 00 00 00 00-98 CD F9 06 00 00 00 00
00000410: 01 00 00 00 00 00 00 00-18 24 FC 06 00 00 00 00

After UsbMassReset
000002D0:                                           -01 01 00 00 00 00 10 10
000002E0: 00 00 00 00 01 01 00 00-80 02 FA 06 00 00 00 00
000002F0: C0 16 FA 06 00 00 00 00-00 00 00 00 00 00 00 00
00000300: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000310: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000320: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000330: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000340: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000350: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000360: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000370: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000380: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000390: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003A0: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003B0: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003C0: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003D0: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003E0: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
000003F0: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000400: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00000410: 01 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00

After Reset Finished, it will try to get data again by sending
command to Usb Mass Device and it will use USB_DEV_CONTEXT.
EndpointTransferRing[4].
But it haven't been initialized after reset, and ASSERT will
trigger when access the USB_DEV_CONTEXT. EndpointTransferRing and
will
show
ASSERT



c:\users\guominji\dcg10nm\edk2\MdeModulePkg\Bus\Pci\XhciDxe\XhciSch
e
d.c(1909): TrsRing != ((void *) 0)

The EndpointTranferRing should be initialized by
UsbBuildDescTable() but seem that it is omitted in Reset branch, so I add
it.


Sorry Guomin, I still do not fully understand the link between
calling the
UsbBuildDescTable() in UsbBusDxe and the re-initialization of the
'EndpointTranferRing' in the XhciDxe. Could you help to provide more
detail?

I am sorry that make confusion to you, I will explain it in detail now.
1. The critical call flow as below
UsbMassReset()
--> UsbMass->Transport->Reset() == UsbBotResetDevice()
      --> UsbBot->UsbIo->UsbPortReset() == UsbIoPortReset()
            --> UsbIf->Device->ParentIf->HubApi->ResetPort() ==
UsbRootHubResetPort()
                  --> UsbHcGetRootHubPortStatus()
                        --> UsbBus->Usb2Hc-GetRootHubPortStatus() ==
XhcGetRootHubPortStatus()
                              --> XhcPollPortStatusChange()
                                   --> XhcDisableSlotCmd() or
XhcDisableSlotCmd64() [Very very critical routine 1]
                                   --> XhcInitializeDeviceSlot() or
XhcInitializeDeviceSlot64() [Very very critical routine 2]
            --> UsbSetAddress()
            --> UsbSetConfig()
                  -->  UsbBus->Usb2Hc->ControlTransfer() == XhcControlTransfer()
                         --> XhcSetConfigCmd() or XhcSetConfigCmd64()
[Very very critical routine 3]

There are three very very critical routine, 1. XhcDisableSlotCmd() or
XhcDisableSlotCmd64() will disable the slot context and free the
allocated memory.
2. XhcInitializeDeviceSlot() or XhcInitializeDeviceSlot() will
initialize the basic control and slot endpoint transfer ring.
3. XhcSetConfigCmd() or XhcSetConfigCmd64() will configure the other
transfer ring. It  depend on Xhc-
UsbDevContext[Slot].DevDesc.NumConfigurations, unfortunately, the
DevDesc haven't benn initialized by UsbBuildDescTable and it is 0. So
XhcSetConfigCmd() or XhcSetConfigCmd64() haven't run.

After run below 3 critical routine, the device is initialize but
EndpointTransferRing of Input and Output haven't been initialized and
still keep NULL.

When UsbMassReadBlocks, it will use uninitialized EndpointTransferRing
and will ASSERT.


Thanks a lot Guomin,

Now I understand the reason behind for the proposed fix.

I would suggest to add an abstract of the above analysis in the commit
message, such information will be helpful for others who might work on
USB/Xhci in the future.



Another thing I found (if current proposed fix is a proper solution)
is that in
UsbBuildDescTable() , several memory allocations will be made to
store the device descriptor/config descriptor. Could you help to
double check if their old values are properly freed in order to avoid
memory leak?


I will investigate it and give you feedback.


Thanks for this.

Best Regards,
Hao Wu



Best Regards,
Hao Wu




Also, judging from the function description comments in
UsbBuildDescTable():
|>  /**
|>    Build the whole array of descriptors. This function must
|>    be called after UsbGetMaxPacketSize0 returns the max packet
|>    size correctly for endpoint 0.
|>  ...
|>  **/
|>  EFI_STATUS
|>  UsbBuildDescTable (
|>    IN USB_DEVICE           *UsbDev
|>    )

Does function UsbGetMaxPacketSize0() need to be called before
UsbBuildDescTable() in the proposed fix?

I ignored it and will double check it.



One more thing, could you help to add the information for what
tests have been done for the proposed patch as well?

Thanks in advance.


Have test OVMF and test pass, detail as below I use
qemu-w64-2020020, you should install it first and add it to PATH
environment, the procedure as below 1. Import the test patch from
https://bugzilla.tianocore.org/attachment.cgi?id=508
2. type ```build -p OvmfPkg\OvmfPkgX64.dsc -b DEBUG -a X64``` to
generate OVMF.fd 3. type ```qemu-img create file.img 1G``` to
create test image.
4. type ```qemu-system-x86_64 -bios OVMF.fd -drive
if=none,id=stick,file=file.img -device nec-usb-xhci,id=xhci,
-device
usb- storage,bus=xhci.0,drive=stick -global
isa-debugcon.iobase=0x402
- debugcon file:xhci.log``` 5. you will see hang currently and
will see ASSERT when check xhci.log.
6. import proposal patch, the symptom disappear.



Best Regards,
Hao Wu



+

    //

    // Reset the current active configure, after this device

    // is in CONFIGURED state.

--
2.25.1.windows.1











-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#58203): https://edk2.groups.io/g/devel/message/58203
Mute This Topic: https://groups.io/mt/73255478/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to