RE: [PATCH] usb: xhci: fix event trb handling missed

2020-11-12 Thread Ran Wang
Hi Bin,

On Friday, November 13, 2020 2:25 PM Bin Meng wrote:
> 
> Hi Ran,
> 
> On Wed, Nov 11, 2020 at 4:07 PM Ran Wang  wrote:
> >
> > Hi Bin
> >
> > On Tuesday, November 10, 2020 5:55 PM, Ran Wang wrote:
> > >
> > > Hi Bin,
> > >
> >
> > 
> >
> > > > > record_transfer_result():comp_code: 1 csw org fbb4ec90,
> > > > > adjusted to fbb4fffe
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1 csw org fbb4ec90,
> > > > > adjusted to fbb4fffe
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1 csw org fbb4ec90,
> > > > > adjusted to fbb4fffe
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1
> > > >
> > > > In the earlier log you sent, you have the following:
> > > >
> > > > dev=fbb4f3c0, pipe=c0010283, buffer=fbb4fd80,
> > > > length=2048buffer would cross 64KB
> > > > boundary, so we will send request in more than 1 TRB, this is the
> > > > key issue trigger condition
> > > > xhci_bulk_tx()#0.1.running_total:0x280
> > > > xhci_bulk_tx()#0.2.trb_buff_len:0x280
> > > > xhci_bulk_tx()#0.3.running_total:0x280
> > > > xhci_bulk_tx()#0.4.num_trbs:0x2--2
> > > > Transfer TRB
> > > > xhci_bulk_tx()#0.5.running_total:0x10280
> > > > xhci_bulk_tx()#0.start_trb:0xfbb47140
> > > > --xhci_bulk_tx()#0.>enqueue->generic:0xfbb47
> > > > 140-
> > > > 
> > > > -Assemble
> > > > 1st Transfer TRB
> > > > xhci_bulk_tx()#0.addr:0xfbb4fd80
> > > > xhci_bulk_tx()#0.trb_buff_len:0x280
> > > > xhci_bulk_tx()#0.running_total:0x280
> > > > xhci_bulk_tx()#0.length:0x800
> > > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
> > > > --xhci_bulk_tx()#0.>enqueue->generic:0xfbb47
> > > > 150-
> > > > 
> > > > -Assemble
> > > > 2nd Transfer TRB
> > > > xhci_bulk_tx()#0.addr:0xfbb5
> > > > xhci_bulk_tx()#0.trb_buff_len:0x580
> > > > xhci_bulk_tx()#0.running_total:0x800
> > > > xhci_bulk_tx()#0.length:0x800
> > > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
> > > >
> > > > Could you please specify where this output is for?
> > >
> > > All this prints is within function xhci_bulk_tx(),
> > >
> > > Let me open those verbose output and check if that crossing 64KB
> > > case has been triggered or not.
> > > But I have to get back to you later (tomorrow).
> >
> > With your test commit and my verbose print code, issue cannot be
> > reproduced in mass storage usage (although crossing 64KB boundary
> > condition has met), so looks like the only different to ethernet device is 
> > that if
> xHC would receive short packet:
> >
> 
> Thank you.
> 
> > => pri t
> > t=usb read 0x8200 0 30
> > => run t
> >
> > usb read: device 0 block # 0, count 3145728 ...
> > usb_stor_BBB_transport()csw org fbb4ac90, adjusted to
> > fbb4fffe xhci_bulk_tx()dev=fbb49950, pipe=c0010203,
> > buffer=fbb1a9c0, length=0x1f
> > xhci_bulk_tx()#0.4.num_trbs:0x1
> >
> xhci_bulk_tx()#0.>enqueue->generic:0xfbb4957
> > 0
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb49570
> > --xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb
> > 49570
> > xhci_bulk_tx()#0.event->complete_code:1
> > xhci_bulk_tx()#3.ep_index:0x3
> > xhci_bulk_tx()dev=fbb49950, pipe=c0008283,
> > buffer=fbb1ab00, length=0x200
> > xhci_bulk_tx()#0.4.num_trbs:0x1
> > xhci_bulk_tx()#set TRB_ISP
> >
> xhci_bulk_tx()#0.>enqueue->generic:0xfbb4918
> > 0
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb49180
> > --xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb
> > 49180
> > xhci_bulk_tx()#0.event->complete_code:1
> > xhci_bulk_tx()#3.ep_index:0x2
> > xhci_bulk_tx()dev=fbb49950, pipe=c0008283,
> > buffer=fbb4fffe, length=0xd-buffer cross 64KB
> > boundary
> > xhci_bulk_tx()#0.4.num_trbs:0x2
> > xhci_bulk_tx()#set
> > TRB_ISP---
> > Set ISP
> >
> xhci_bulk_tx()#0.>enqueue->generic:0xfbb4919
> > 01st transfer TRB queued
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb49190
> > xhci_bulk_tx()#set
> > TRB_ISP---
> > Set ISP
> >
> xhci_bulk_tx()#0.>enqueue->generic:0xfbb491a
> > 02nd transfer TRB queued
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb491a0
> > --xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb
> > 491a0-event TRB 

Re: [PATCH] usb: xhci: fix event trb handling missed

2020-11-12 Thread Bin Meng
Hi Ran,

On Wed, Nov 11, 2020 at 4:07 PM Ran Wang  wrote:
>
> Hi Bin
>
> On Tuesday, November 10, 2020 5:55 PM, Ran Wang wrote:
> >
> > Hi Bin,
> >
>
> 
>
> > > > record_transfer_result():comp_code: 1 csw org fbb4ec90,
> > > > adjusted to fbb4fffe
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1 csw org fbb4ec90,
> > > > adjusted to fbb4fffe
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1 csw org fbb4ec90,
> > > > adjusted to fbb4fffe
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1
> > >
> > > In the earlier log you sent, you have the following:
> > >
> > > dev=fbb4f3c0, pipe=c0010283, buffer=fbb4fd80,
> > > length=2048buffer would cross 64KB
> > > boundary, so we will send request in more than 1 TRB, this is the key
> > > issue trigger condition
> > > xhci_bulk_tx()#0.1.running_total:0x280
> > > xhci_bulk_tx()#0.2.trb_buff_len:0x280
> > > xhci_bulk_tx()#0.3.running_total:0x280
> > > xhci_bulk_tx()#0.4.num_trbs:0x2--2 Transfer
> > > TRB
> > > xhci_bulk_tx()#0.5.running_total:0x10280
> > > xhci_bulk_tx()#0.start_trb:0xfbb47140
> > > --xhci_bulk_tx()#0.>enqueue->generic:0xfbb47140-
> > > 
> > > -Assemble
> > > 1st Transfer TRB
> > > xhci_bulk_tx()#0.addr:0xfbb4fd80
> > > xhci_bulk_tx()#0.trb_buff_len:0x280
> > > xhci_bulk_tx()#0.running_total:0x280
> > > xhci_bulk_tx()#0.length:0x800
> > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
> > > --xhci_bulk_tx()#0.>enqueue->generic:0xfbb47150-
> > > 
> > > -Assemble
> > > 2nd Transfer TRB
> > > xhci_bulk_tx()#0.addr:0xfbb5
> > > xhci_bulk_tx()#0.trb_buff_len:0x580
> > > xhci_bulk_tx()#0.running_total:0x800
> > > xhci_bulk_tx()#0.length:0x800
> > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
> > >
> > > Could you please specify where this output is for?
> >
> > All this prints is within function xhci_bulk_tx(),
> >
> > Let me open those verbose output and check if that crossing 64KB case has 
> > been
> > triggered or not.
> > But I have to get back to you later (tomorrow).
>
> With your test commit and my verbose print code, issue cannot be reproduced 
> in mass storage usage
> (although crossing 64KB boundary condition has met), so looks like the only 
> different to ethernet device
> is that if xHC would receive short packet:
>

Thank you.

> => pri t
> t=usb read 0x8200 0 30
> => run t
>
> usb read: device 0 block # 0, count 3145728 ...
> usb_stor_BBB_transport()csw org fbb4ac90, adjusted to fbb4fffe
> xhci_bulk_tx()dev=fbb49950, pipe=c0010203, buffer=fbb1a9c0, 
> length=0x1f
> xhci_bulk_tx()#0.4.num_trbs:0x1
> xhci_bulk_tx()#0.>enqueue->generic:0xfbb49570
> xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb49570
> --xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb49570
> xhci_bulk_tx()#0.event->complete_code:1
> xhci_bulk_tx()#3.ep_index:0x3
> xhci_bulk_tx()dev=fbb49950, pipe=c0008283, buffer=fbb1ab00, 
> length=0x200
> xhci_bulk_tx()#0.4.num_trbs:0x1
> xhci_bulk_tx()#set TRB_ISP
> xhci_bulk_tx()#0.>enqueue->generic:0xfbb49180
> xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb49180
> --xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb49180
> xhci_bulk_tx()#0.event->complete_code:1
> xhci_bulk_tx()#3.ep_index:0x2
> xhci_bulk_tx()dev=fbb49950, pipe=c0008283, buffer=fbb4fffe, 
> length=0xd-buffer cross 64KB boundary
> xhci_bulk_tx()#0.4.num_trbs:0x2
> xhci_bulk_tx()#set 
> TRB_ISP---Set
>  ISP
> xhci_bulk_tx()#0.>enqueue->generic:0xfbb491901st
>  transfer TRB queued
> xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb49190
> xhci_bulk_tx()#set 
> TRB_ISP---Set
>  ISP
> xhci_bulk_tx()#0.>enqueue->generic:0xfbb491a02nd
>  transfer TRB queued
> xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb491a0
> --xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb491a0-event
>  TRB that pointing to 2nd transfer TRB
> xhci_bulk_tx()#0.event->complete_code:1-not
>  short packet
> xhci_bulk_tx()#3.ep_index:0x2
> num_trbs_count > 1!
> ...
> usb_stor_BBB_transport()csw org fbb4aea0, adjusted to fbb4fffe
> xhci_bulk_tx()dev=fbb49950, pipe=c0010203, 

RE: [PATCH] usb: xhci: fix event trb handling missed

2020-11-11 Thread Ran Wang
Hi Bin

On Tuesday, November 10, 2020 5:55 PM, Ran Wang wrote:
> 
> Hi Bin,
> 



> > > record_transfer_result():comp_code: 1 csw org fbb4ec90,
> > > adjusted to fbb4fffe
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1 csw org fbb4ec90,
> > > adjusted to fbb4fffe
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1 csw org fbb4ec90,
> > > adjusted to fbb4fffe
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1
> >
> > In the earlier log you sent, you have the following:
> >
> > dev=fbb4f3c0, pipe=c0010283, buffer=fbb4fd80,
> > length=2048buffer would cross 64KB
> > boundary, so we will send request in more than 1 TRB, this is the key
> > issue trigger condition
> > xhci_bulk_tx()#0.1.running_total:0x280
> > xhci_bulk_tx()#0.2.trb_buff_len:0x280
> > xhci_bulk_tx()#0.3.running_total:0x280
> > xhci_bulk_tx()#0.4.num_trbs:0x2--2 Transfer
> > TRB
> > xhci_bulk_tx()#0.5.running_total:0x10280
> > xhci_bulk_tx()#0.start_trb:0xfbb47140
> > --xhci_bulk_tx()#0.>enqueue->generic:0xfbb47140-
> > 
> > -Assemble
> > 1st Transfer TRB
> > xhci_bulk_tx()#0.addr:0xfbb4fd80
> > xhci_bulk_tx()#0.trb_buff_len:0x280
> > xhci_bulk_tx()#0.running_total:0x280
> > xhci_bulk_tx()#0.length:0x800
> > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
> > --xhci_bulk_tx()#0.>enqueue->generic:0xfbb47150-
> > 
> > -Assemble
> > 2nd Transfer TRB
> > xhci_bulk_tx()#0.addr:0xfbb5
> > xhci_bulk_tx()#0.trb_buff_len:0x580
> > xhci_bulk_tx()#0.running_total:0x800
> > xhci_bulk_tx()#0.length:0x800
> > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
> >
> > Could you please specify where this output is for?
> 
> All this prints is within function xhci_bulk_tx(),
> 
> Let me open those verbose output and check if that crossing 64KB case has been
> triggered or not.
> But I have to get back to you later (tomorrow).

With your test commit and my verbose print code, issue cannot be reproduced in 
mass storage usage
(although crossing 64KB boundary condition has met), so looks like the only 
different to ethernet device
is that if xHC would receive short packet:

=> pri t
t=usb read 0x8200 0 30
=> run t

usb read: device 0 block # 0, count 3145728 ... 
usb_stor_BBB_transport()csw org fbb4ac90, adjusted to fbb4fffe
xhci_bulk_tx()dev=fbb49950, pipe=c0010203, buffer=fbb1a9c0, 
length=0x1f
xhci_bulk_tx()#0.4.num_trbs:0x1
xhci_bulk_tx()#0.>enqueue->generic:0xfbb49570
xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb49570
--xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb49570
xhci_bulk_tx()#0.event->complete_code:1
xhci_bulk_tx()#3.ep_index:0x3
xhci_bulk_tx()dev=fbb49950, pipe=c0008283, buffer=fbb1ab00, 
length=0x200
xhci_bulk_tx()#0.4.num_trbs:0x1
xhci_bulk_tx()#set TRB_ISP 
xhci_bulk_tx()#0.>enqueue->generic:0xfbb49180
xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb49180
--xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb49180
xhci_bulk_tx()#0.event->complete_code:1
xhci_bulk_tx()#3.ep_index:0x2
xhci_bulk_tx()dev=fbb49950, pipe=c0008283, buffer=fbb4fffe, 
length=0xd-buffer cross 64KB boundary
xhci_bulk_tx()#0.4.num_trbs:0x2
xhci_bulk_tx()#set 
TRB_ISP---Set
 ISP
xhci_bulk_tx()#0.>enqueue->generic:0xfbb491901st
 transfer TRB queued
xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb49190
xhci_bulk_tx()#set 
TRB_ISP---Set
 ISP
xhci_bulk_tx()#0.>enqueue->generic:0xfbb491a02nd
 transfer TRB queued
xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb491a0
--xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb491a0-event
 TRB that pointing to 2nd transfer TRB
xhci_bulk_tx()#0.event->complete_code:1-not
 short packet
xhci_bulk_tx()#3.ep_index:0x2
num_trbs_count > 1!
...
usb_stor_BBB_transport()csw org fbb4aea0, adjusted to fbb4fffe
xhci_bulk_tx()dev=fbb49950, pipe=c0010203, buffer=fbb1ad00, 
length=0x1f
xhci_bulk_tx()#0.4.num_trbs:0x1
xhci_bulk_tx()#0.>enqueue->generic:0xfbb49790
xhci_bulk_tx()#0.last_transfer_trb_addr:0xfbb49790
--xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb49790
xhci_bulk_tx()#0.event->complete_code:1

RE: [PATCH] usb: xhci: fix event trb handling missed

2020-11-10 Thread Ran Wang
Hi Bin,

On Tuesday, November 10, 2020 5:47 PM Bin Meng wrote:
> 
> Hi Ran,
> 
> On Tue, Nov 10, 2020 at 5:41 PM Ran Wang  wrote:
> >
> > Hi Bin,
> >
> > On Tuesday, November 10, 2020 5:03 PM Bin Meng wrote:
> > >
> > > Hi Ran,
> > >
> > > On Tue, Nov 10, 2020 at 4:31 PM Ran Wang 
> wrote:
> > > >
> > > > Hi Bin,
> > > >
> > > > On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote:
> > > > > Hi Bin,
> >
> > 
> >
> > > >
> > >
> > > Based on your log, it matches what I suspected and we are getting
> > > close to the root cause.
> > >
> > > Could you please try the following simple test case to see if it can
> > > trigger the issue with a USB disk?
> > > After U-Boot boot on LS1028, with a USB flash disk attached, then
> > > type "usb start" to see if the USB disk can be recognized by U-Boot?
> > >
> > > commit 60e68ed667362870c20b36ae26dacc1af903e81e
> > > Author: Bin Meng 
> > > Date:   Tue Nov 10 16:19:06 2020 +0800
> > >
> > > WIP: usb: A simple test case to trigger the TRB 64K boundary
> > > issue with mass storage device
> > >
> > > This should not be applied as it aims to provide a simple test case to
> > > trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP.
> > >
> > > Signed-off-by: Bin Meng 
> > >
> > > diff --git a/common/usb_storage.c b/common/usb_storage.c index
> > > ff25441..c8aec2e 100644
> > > --- a/common/usb_storage.c
> > > +++ b/common/usb_storage.c
> > > @@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct
> > > scsi_cmd *srb, struct us_data *us)
> > > int dir_in;
> > > int actlen, data_actlen;
> > > unsigned int pipe, pipein, pipeout;
> > > +#if 0
> > > ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1);
> > > +#else
> > > +   struct umass_bbb_csw *csw_org, *csw;
> > > +   csw_org = malloc(0x1 + UMASS_BBB_CSW_SIZE);
> > > +   csw = (struct umass_bbb_csw *)roundup((ulong)csw_org,
> 0x1);
> > > +   csw = (struct umass_bbb_csw *)((ulong)csw - 2);
> > > +   printf("csw org %p, adjusted to %p\n", csw_org, csw); #endif
> > >  #ifdef BBB_XPORT_TRACE
> > > unsigned char *ptr;
> > > int index;
> > > @@ -824,6 +832,7 @@ again:
> > > return USB_STOR_TRANSPORT_FAILED;
> > > }
> > >
> > > +   free(csw_org);
> > > return result;
> > >  }
> > >
> >
> > Issue cannot be reproduced on mass storage device with above hacking:
> 
> Thank you for the testing.
> 
> > =>
> > resetting USB...
> > Bus usb3@310: Register 200017f NbrPorts 2 Starting the controller
> > USB XHCI 1.00 Bus usb3@311: Register 200017f NbrPorts 2 Starting
> > the controller USB XHCI 1.00 scanning bus usb3@310 for devices...
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 13
> > record_transfer_result():comp_code: 13
> > record_transfer_result():comp_code: 13
> > record_transfer_result():comp_code: 13
> > record_transfer_result():comp_code: 1
> > csw org fbb53010, adjusted to fbb5fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org fbb53010, adjusted to fbb5fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org fbb53010, adjusted to fbb5fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > 2 USB Device(s) found
> > scanning bus usb3@311 for devices... 1 USB Device(s) found
> >scanning usb for storage devices... 1 Storage Device(s) found
> > =>
> >
> > One more thing I'd like to tell is that I found this issue would not 
> > happened
> with 'usb start' (even with USB dongle).
> >
> > Below is the log I tested with ' usb read 0x8200 0 30', issue
> > could not be reproduced for at least 10 minutes ...
> > csw org fbb4ec90, adjusted to fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org fbb4ec90, adjusted to fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org fbb4ec90, adjusted to fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org fbb4ec90, adjusted to fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org fbb4ec90, adjusted to fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org fbb4ec90, 

Re: [PATCH] usb: xhci: fix event trb handling missed

2020-11-10 Thread Bin Meng
Hi Ran,

On Tue, Nov 10, 2020 at 5:41 PM Ran Wang  wrote:
>
> Hi Bin,
>
> On Tuesday, November 10, 2020 5:03 PM Bin Meng wrote:
> >
> > Hi Ran,
> >
> > On Tue, Nov 10, 2020 at 4:31 PM Ran Wang  wrote:
> > >
> > > Hi Bin,
> > >
> > > On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote:
> > > > Hi Bin,
>
> 
>
> > >
> >
> > Based on your log, it matches what I suspected and we are getting close to 
> > the
> > root cause.
> >
> > Could you please try the following simple test case to see if it can 
> > trigger the
> > issue with a USB disk?
> > After U-Boot boot on LS1028, with a USB flash disk attached, then type "usb
> > start" to see if the USB disk can be recognized by U-Boot?
> >
> > commit 60e68ed667362870c20b36ae26dacc1af903e81e
> > Author: Bin Meng 
> > Date:   Tue Nov 10 16:19:06 2020 +0800
> >
> > WIP: usb: A simple test case to trigger the TRB 64K boundary issue with
> > mass storage device
> >
> > This should not be applied as it aims to provide a simple test case to
> > trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP.
> >
> > Signed-off-by: Bin Meng 
> >
> > diff --git a/common/usb_storage.c b/common/usb_storage.c index
> > ff25441..c8aec2e 100644
> > --- a/common/usb_storage.c
> > +++ b/common/usb_storage.c
> > @@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct scsi_cmd
> > *srb, struct us_data *us)
> > int dir_in;
> > int actlen, data_actlen;
> > unsigned int pipe, pipein, pipeout;
> > +#if 0
> > ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1);
> > +#else
> > +   struct umass_bbb_csw *csw_org, *csw;
> > +   csw_org = malloc(0x1 + UMASS_BBB_CSW_SIZE);
> > +   csw = (struct umass_bbb_csw *)roundup((ulong)csw_org, 0x1);
> > +   csw = (struct umass_bbb_csw *)((ulong)csw - 2);
> > +   printf("csw org %p, adjusted to %p\n", csw_org, csw); #endif
> >  #ifdef BBB_XPORT_TRACE
> > unsigned char *ptr;
> > int index;
> > @@ -824,6 +832,7 @@ again:
> > return USB_STOR_TRANSPORT_FAILED;
> > }
> >
> > +   free(csw_org);
> > return result;
> >  }
> >
>
> Issue cannot be reproduced on mass storage device with above hacking:

Thank you for the testing.

> =>
> resetting USB...
> Bus usb3@310: Register 200017f NbrPorts 2
> Starting the controller
> USB XHCI 1.00
> Bus usb3@311: Register 200017f NbrPorts 2
> Starting the controller
> USB XHCI 1.00
> scanning bus usb3@310 for devices... record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 13
> record_transfer_result():comp_code: 13
> record_transfer_result():comp_code: 13
> record_transfer_result():comp_code: 13
> record_transfer_result():comp_code: 1
> csw org fbb53010, adjusted to fbb5fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org fbb53010, adjusted to fbb5fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org fbb53010, adjusted to fbb5fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> 2 USB Device(s) found
> scanning bus usb3@311 for devices... 1 USB Device(s) found
>scanning usb for storage devices... 1 Storage Device(s) found
> =>
>
> One more thing I'd like to tell is that I found this issue would not happened 
> with 'usb start' (even with USB dongle).
>
> Below is the log I tested with ' usb read 0x8200 0 30', issue could 
> not be reproduced for at least 10 minutes
> ...
> csw org fbb4ec90, adjusted to fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org fbb4ec90, adjusted to fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org fbb4ec90, adjusted to fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org fbb4ec90, adjusted to fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org fbb4ec90, adjusted to fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org fbb4ec90, adjusted to fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org fbb4ec90, adjusted to fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> 

RE: [PATCH] usb: xhci: fix event trb handling missed

2020-11-10 Thread Ran Wang
Hi Bin,

On Tuesday, November 10, 2020 5:03 PM Bin Meng wrote:
> 
> Hi Ran,
> 
> On Tue, Nov 10, 2020 at 4:31 PM Ran Wang  wrote:
> >
> > Hi Bin,
> >
> > On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote:
> > > Hi Bin,



> >
> 
> Based on your log, it matches what I suspected and we are getting close to the
> root cause.
> 
> Could you please try the following simple test case to see if it can trigger 
> the
> issue with a USB disk?
> After U-Boot boot on LS1028, with a USB flash disk attached, then type "usb
> start" to see if the USB disk can be recognized by U-Boot?
> 
> commit 60e68ed667362870c20b36ae26dacc1af903e81e
> Author: Bin Meng 
> Date:   Tue Nov 10 16:19:06 2020 +0800
> 
> WIP: usb: A simple test case to trigger the TRB 64K boundary issue with
> mass storage device
> 
> This should not be applied as it aims to provide a simple test case to
> trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP.
> 
> Signed-off-by: Bin Meng 
> 
> diff --git a/common/usb_storage.c b/common/usb_storage.c index
> ff25441..c8aec2e 100644
> --- a/common/usb_storage.c
> +++ b/common/usb_storage.c
> @@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct scsi_cmd
> *srb, struct us_data *us)
> int dir_in;
> int actlen, data_actlen;
> unsigned int pipe, pipein, pipeout;
> +#if 0
> ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1);
> +#else
> +   struct umass_bbb_csw *csw_org, *csw;
> +   csw_org = malloc(0x1 + UMASS_BBB_CSW_SIZE);
> +   csw = (struct umass_bbb_csw *)roundup((ulong)csw_org, 0x1);
> +   csw = (struct umass_bbb_csw *)((ulong)csw - 2);
> +   printf("csw org %p, adjusted to %p\n", csw_org, csw); #endif
>  #ifdef BBB_XPORT_TRACE
> unsigned char *ptr;
> int index;
> @@ -824,6 +832,7 @@ again:
> return USB_STOR_TRANSPORT_FAILED;
> }
> 
> +   free(csw_org);
> return result;
>  }
> 

Issue cannot be reproduced on mass storage device with above hacking:
=>
resetting USB...
Bus usb3@310: Register 200017f NbrPorts 2
Starting the controller
USB XHCI 1.00
Bus usb3@311: Register 200017f NbrPorts 2
Starting the controller
USB XHCI 1.00
scanning bus usb3@310 for devices... record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 13
record_transfer_result():comp_code: 13
record_transfer_result():comp_code: 13
record_transfer_result():comp_code: 13
record_transfer_result():comp_code: 1
csw org fbb53010, adjusted to fbb5fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org fbb53010, adjusted to fbb5fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org fbb53010, adjusted to fbb5fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
2 USB Device(s) found
scanning bus usb3@311 for devices... 1 USB Device(s) found
   scanning usb for storage devices... 1 Storage Device(s) found
=>

One more thing I'd like to tell is that I found this issue would not happened 
with 'usb start' (even with USB dongle).

Below is the log I tested with ' usb read 0x8200 0 30', issue could not 
be reproduced for at least 10 minutes
...
csw org fbb4ec90, adjusted to fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org fbb4ec90, adjusted to fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org fbb4ec90, adjusted to fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org fbb4ec90, adjusted to fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org fbb4ec90, adjusted to fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org fbb4ec90, adjusted to fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org fbb4ec90, adjusted to fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org fbb4ec90, adjusted to fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org fbb4ec90, adjusted to fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 

Re: [PATCH] usb: xhci: fix event trb handling missed

2020-11-10 Thread Bin Meng
Hi Ran,

On Tue, Nov 10, 2020 at 4:31 PM Ran Wang  wrote:
>
> Hi Bin,
>
> On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote:
> > Hi Bin,
> >
> > On Tuesday, November 10, 2020 3:47 PM Bin Meng wrote:
> > >
> > > Hi Ran,
> > 
> > > > > > > send request in more than 1 Transfer TRB by chaining them, but
> > > > > > > then handle only 1 event TRB to mark request completed.
> > > > > > >
> > > > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we
> > > > > > > observe xhci controller will generated more than 1 event TRB
> > > > > > > sometimes, this cause that
> > > > > >
> > > > > > I am not sure if it's fair to say this is Layerscape specific
> > > > > > behavior. Based on the xHCI spec, the spec indicates 1 event trb
> > > > > > should be generated when IOC/ISP flag is set to 1 or an error 
> > > > > > occurs.
> > > > >
> > > > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we 
> > > > > have:
> > > > >
> > > > > /* Only set interrupt on short packet for IN endpoints */
> > > > > if (usb_pipein(pipe))
> > > > > field |= TRB_ISP;
> > > > >
> > > > > Can you verify that if removing the above codes, and without your
> > > > > changes in this patch, the original issue can be resolved on LS1028?
> > > >
> > > > Bingo, removing above codes can resolve my issue, too
> > >
> > > Thank you for your testing.
> > >
> > > >
> > > > > > I will see if I can reproduce your issue on an x86 board.
> > > > > >
> > > > >
> > > > > Note this patch does not apply on top of u-boot/master. Please rebase.
> > > >
> > > > Sure, I can rebase my patch, but I am nor sure my solution is still 
> > > > worth:
> > > > xHCI says "The detection of a USB Short Packet (i.e. the actual
> > > > number of bytes received was less than the expected number of bytes
> > > > defined by the Transfer TRB) during a transfer does not necessarily 
> > > > generate
> > an Event.
> > > "
> > > >
> > >
> > > Yes, that's what the spec says. So in your case, can you add some logs
> > > to verify that there is a transfer event trb generated by the xHC and
> > > the completion code is 13 (short packet)? You can add some debug
> > > output in record_transfer_result().
> >
> > Sure, let me try this can get back to you later.
>
> Confirm completion code is 13 when issue happen (please search ' 
> complete_code:13'):

Thank you very much for your testing.

>
> dev=fbb4f3c0, pipe=c0010283, buffer=fbb4fd80, 
> length=2048buffer would cross 64KB boundary, so 
> we will send request in more than 1 TRB, this is the key issue trigger 
> condition
> xhci_bulk_tx()#0.1.running_total:0x280
> xhci_bulk_tx()#0.2.trb_buff_len:0x280
> xhci_bulk_tx()#0.3.running_total:0x280
> xhci_bulk_tx()#0.4.num_trbs:0x2--2 Transfer TRB
> xhci_bulk_tx()#0.5.running_total:0x10280
> xhci_bulk_tx()#0.start_trb:0xfbb47140
> --xhci_bulk_tx()#0.>enqueue->generic:0xfbb47140--Assemble
>  1st Transfer TRB
> xhci_bulk_tx()#0.addr:0xfbb4fd80
> xhci_bulk_tx()#0.trb_buff_len:0x280
> xhci_bulk_tx()#0.running_total:0x280
> xhci_bulk_tx()#0.length:0x800
> xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
> --xhci_bulk_tx()#0.>enqueue->generic:0xfbb47150--Assemble
>  2nd Transfer TRB
> xhci_bulk_tx()#0.addr:0xfbb5
> xhci_bulk_tx()#0.trb_buff_len:0x580
> xhci_bulk_tx()#0.running_total:0x800
> xhci_bulk_tx()#0.length:0x800
> xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
> xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb47140handle
>  event TRB for 1st Transfer TRB
> xhci_bulk_tx()#0.event->trans_event.transfer_len:0xd000180
> xhci_bulk_tx()#0.event->trans_event.flags:0x1058001
> xhci_bulk_tx()#0.event->len:384
> xhci_bulk_tx()#0.event->complete_code:13
> xhci_bulk_tx()#1.field:0x1058001
> xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4
> xhci_bulk_tx()#3.ep_index:0x4--xhci_bulk_tx()
>  return
>
> dev=fbb4f3c0, pipe=c0018203, buffer=fbb2f9c0, 
> length=110--an other call comming, and for different EP (ep_index = 
> 05)
> xhci_bulk_tx()#0.1.running_total:0x640
> xhci_bulk_tx()#0.2.trb_buff_len:0x640
> xhci_bulk_tx()#0.3.running_total:0x640
> xhci_bulk_tx()#0.4.num_trbs:0x1
> xhci_bulk_tx()#0.5.running_total:0x640
> xhci_bulk_tx()#0.start_trb:0xfbb47610
> --xhci_bulk_tx()#0.>enqueue->generic:0xfbb47610-queue
>  1st transfer TRB
> xhci_bulk_tx()#0.addr:0xfbb2f9c0
> xhci_bulk_tx()#0.trb_buff_len:0x6e
> xhci_bulk_tx()#0.running_total:0x6e
> xhci_bulk_tx()#0.length:0x6e
> xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
> xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb47150hand 
> event TRB, note that this buffer is for last Transfer TRB in last call (see 
> above '2nd Transfer TRB')
> xhci_bulk_tx()#0.event->trans_event.transfer_len:0x100
> 

RE: [PATCH] usb: xhci: fix event trb handling missed

2020-11-10 Thread Ran Wang
Hi Bin,

On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote: 
> Hi Bin,
> 
> On Tuesday, November 10, 2020 3:47 PM Bin Meng wrote:
> >
> > Hi Ran,
> 
> > > > > > send request in more than 1 Transfer TRB by chaining them, but
> > > > > > then handle only 1 event TRB to mark request completed.
> > > > > >
> > > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we
> > > > > > observe xhci controller will generated more than 1 event TRB
> > > > > > sometimes, this cause that
> > > > >
> > > > > I am not sure if it's fair to say this is Layerscape specific
> > > > > behavior. Based on the xHCI spec, the spec indicates 1 event trb
> > > > > should be generated when IOC/ISP flag is set to 1 or an error occurs.
> > > >
> > > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we 
> > > > have:
> > > >
> > > > /* Only set interrupt on short packet for IN endpoints */
> > > > if (usb_pipein(pipe))
> > > > field |= TRB_ISP;
> > > >
> > > > Can you verify that if removing the above codes, and without your
> > > > changes in this patch, the original issue can be resolved on LS1028?
> > >
> > > Bingo, removing above codes can resolve my issue, too
> >
> > Thank you for your testing.
> >
> > >
> > > > > I will see if I can reproduce your issue on an x86 board.
> > > > >
> > > >
> > > > Note this patch does not apply on top of u-boot/master. Please rebase.
> > >
> > > Sure, I can rebase my patch, but I am nor sure my solution is still worth:
> > > xHCI says "The detection of a USB Short Packet (i.e. the actual
> > > number of bytes received was less than the expected number of bytes
> > > defined by the Transfer TRB) during a transfer does not necessarily 
> > > generate
> an Event.
> > "
> > >
> >
> > Yes, that's what the spec says. So in your case, can you add some logs
> > to verify that there is a transfer event trb generated by the xHC and
> > the completion code is 13 (short packet)? You can add some debug
> > output in record_transfer_result().
> 
> Sure, let me try this can get back to you later.

Confirm completion code is 13 when issue happen (please search ' 
complete_code:13'):

dev=fbb4f3c0, pipe=c0010283, buffer=fbb4fd80, 
length=2048buffer would cross 64KB boundary, so we 
will send request in more than 1 TRB, this is the key issue trigger condition
xhci_bulk_tx()#0.1.running_total:0x280
xhci_bulk_tx()#0.2.trb_buff_len:0x280
xhci_bulk_tx()#0.3.running_total:0x280
xhci_bulk_tx()#0.4.num_trbs:0x2--2 Transfer TRB
xhci_bulk_tx()#0.5.running_total:0x10280
xhci_bulk_tx()#0.start_trb:0xfbb47140
--xhci_bulk_tx()#0.>enqueue->generic:0xfbb47140--Assemble
 1st Transfer TRB
xhci_bulk_tx()#0.addr:0xfbb4fd80
xhci_bulk_tx()#0.trb_buff_len:0x280
xhci_bulk_tx()#0.running_total:0x280
xhci_bulk_tx()#0.length:0x800
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
--xhci_bulk_tx()#0.>enqueue->generic:0xfbb47150--Assemble
 2nd Transfer TRB
xhci_bulk_tx()#0.addr:0xfbb5
xhci_bulk_tx()#0.trb_buff_len:0x580
xhci_bulk_tx()#0.running_total:0x800
xhci_bulk_tx()#0.length:0x800
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb47140handle
 event TRB for 1st Transfer TRB
xhci_bulk_tx()#0.event->trans_event.transfer_len:0xd000180
xhci_bulk_tx()#0.event->trans_event.flags:0x1058001
xhci_bulk_tx()#0.event->len:384
xhci_bulk_tx()#0.event->complete_code:13
xhci_bulk_tx()#1.field:0x1058001
xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4
xhci_bulk_tx()#3.ep_index:0x4--xhci_bulk_tx()
 return

dev=fbb4f3c0, pipe=c0018203, buffer=fbb2f9c0, 
length=110--an other call comming, and for different EP (ep_index = 05)
xhci_bulk_tx()#0.1.running_total:0x640
xhci_bulk_tx()#0.2.trb_buff_len:0x640
xhci_bulk_tx()#0.3.running_total:0x640
xhci_bulk_tx()#0.4.num_trbs:0x1
xhci_bulk_tx()#0.5.running_total:0x640
xhci_bulk_tx()#0.start_trb:0xfbb47610
--xhci_bulk_tx()#0.>enqueue->generic:0xfbb47610-queue
 1st transfer TRB
xhci_bulk_tx()#0.addr:0xfbb2f9c0
xhci_bulk_tx()#0.trb_buff_len:0x6e
xhci_bulk_tx()#0.running_total:0x6e
xhci_bulk_tx()#0.length:0x6e
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x1
xhci_bulk_tx()#0.event->trans_event.buffer:0xfbb47150hand 
event TRB, note that this buffer is for last Transfer TRB in last call (see 
above '2nd Transfer TRB')
xhci_bulk_tx()#0.event->trans_event.transfer_len:0x100
xhci_bulk_tx()#0.event->trans_event.flags:0x1058000
xhci_bulk_tx()#0.event->len:0
xhci_bulk_tx()#0.event->complete_code:1
xhci_bulk_tx()#1.field:0x1058000
xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4ep_index is 4 rather 
than 5, then cause BUG()
xhci_bulk_tx()#3.ep_index:0x5-
ep_index mis-match wait for event 

RE: [PATCH] usb: xhci: fix event trb handling missed

2020-11-10 Thread Ran Wang
Hi Bin,

On Tuesday, November 10, 2020 3:47 PM Bin Meng wrote:
> 
> Hi Ran,

> > > > > send request in more than 1 Transfer TRB by chaining them, but
> > > > > then handle only 1 event TRB to mark request completed.
> > > > >
> > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we
> > > > > observe xhci controller will generated more than 1 event TRB
> > > > > sometimes, this cause that
> > > >
> > > > I am not sure if it's fair to say this is Layerscape specific
> > > > behavior. Based on the xHCI spec, the spec indicates 1 event trb
> > > > should be generated when IOC/ISP flag is set to 1 or an error occurs.
> > >
> > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have:
> > >
> > > /* Only set interrupt on short packet for IN endpoints */
> > > if (usb_pipein(pipe))
> > > field |= TRB_ISP;
> > >
> > > Can you verify that if removing the above codes, and without your
> > > changes in this patch, the original issue can be resolved on LS1028?
> >
> > Bingo, removing above codes can resolve my issue, too
> 
> Thank you for your testing.
> 
> >
> > > > I will see if I can reproduce your issue on an x86 board.
> > > >
> > >
> > > Note this patch does not apply on top of u-boot/master. Please rebase.
> >
> > Sure, I can rebase my patch, but I am nor sure my solution is still worth:
> > xHCI says "The detection of a USB Short Packet (i.e. the actual number
> > of bytes received was less than the expected number of bytes defined
> > by the Transfer TRB) during a transfer does not necessarily generate an 
> > Event.
> "
> >
> 
> Yes, that's what the spec says. So in your case, can you add some logs to 
> verify
> that there is a transfer event trb generated by the xHC and the completion
> code is 13 (short packet)? You can add some debug output in
> record_transfer_result().

Sure, let me try this can get back to you later.

> > So does it mean above workaround you suggest would not violate spec,
> > either (although current Linux kernel driver still set ISP for this
> > case, but may have a more robust mechanism for event TRB handling)?
> 
> I need to do more testing to see if this is a LS1028 behavior or generic xHCI
> behavior, ie: on an x86 board. Do you happen to know if there is any errata
> document for LS1028 that is related to this?

I don’t see any errata for LS1028A for this case.

Actually this issue only randomly happens in U-Boot + USB dongle
(I have tried 3 different brands, such as TP-LINK UE300, etc).
And U-Boot + USB mass storage devicecannot reproduce this issue.

Under Linux, no such issue observed neither (with those dongles), all of them 
works fine.
That's why I decide to improve event TRB handling logic here.

Below is the U-Boot test commands I used (repeatedly do tftp file transfer), 
issue would happen in less 5 minutes.
for your reference:

setenv tt 'usb reset& ethact asix_eth& a000 tmp/1MB_test_file'
setenv ttt 'run tt& tt& tt& tt& tt& tt& tt& tt& 
tt& tt'
setenv  'run ttt& ttt& ttt& ttt& ttt& ttt& ttt& 
ttt& ttt'
setenv u 'run & & & & & & 
& & & & '
run u

Regards,
Ran


Re: [PATCH] usb: xhci: fix event trb handling missed

2020-11-09 Thread Bin Meng
Hi Ran,

On Tue, Nov 10, 2020 at 3:36 PM Ran Wang  wrote:
>
> Hi Bin,
>
> On Tuesday, November 10, 2020 1:43 PM Bin Meng wrote:
> >
> > Hi Ran,
> >
> > On Tue, Nov 10, 2020 at 1:30 PM Bin Meng  wrote:
> > >
> > > Hi Ran,
> > >
> > > On Tue, Sep 22, 2020 at 1:02 PM Ran Wang  wrote:
> > > >
> > > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it
> > > > will
> > >
> > > typo: function
>
> Got it.
>
> > > > send request in more than 1 Transfer TRB by chaining them, but then
> > > > handle only 1 event TRB to mark request completed.
> > > >
> > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe
> > > > xhci controller will generated more than 1 event TRB sometimes, this
> > > > cause that
> > >
> > > I am not sure if it's fair to say this is Layerscape specific
> > > behavior. Based on the xHCI spec, the spec indicates 1 event trb
> > > should be generated when IOC/ISP flag is set to 1 or an error occurs.
> >
> > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have:
> >
> > /* Only set interrupt on short packet for IN endpoints */
> > if (usb_pipein(pipe))
> > field |= TRB_ISP;
> >
> > Can you verify that if removing the above codes, and without your changes in
> > this patch, the original issue can be resolved on LS1028?
>
> Bingo, removing above codes can resolve my issue, too

Thank you for your testing.

>
> > > I will see if I can reproduce your issue on an x86 board.
> > >
> >
> > Note this patch does not apply on top of u-boot/master. Please rebase.
>
> Sure, I can rebase my patch, but I am nor sure my solution is still worth:
> xHCI says "The detection of a USB Short Packet (i.e. the actual number of
> bytes received was less than the expected number of bytes defined by the
> Transfer TRB) during a transfer does not necessarily generate an Event. "
>

Yes, that's what the spec says. So in your case, can you add some logs
to verify that there is a transfer event trb generated by the xHC and
the completion code is 13 (short packet)? You can add some debug
output in record_transfer_result().

> So does it mean above workaround you suggest would not violate spec, either
> (although current Linux kernel driver still set ISP for this case, but may 
> have
> a more robust mechanism for event TRB handling)?

I need to do more testing to see if this is a LS1028 behavior or
generic xHCI behavior, ie: on an x86 board. Do you happen to know if
there is any errata document for LS1028 that is related to this?

>
>
> > > > function mishandle event TRB in next round call, then system hang
> > > > due to
> > > > BUG() checking.
> > > >
> > > > This patch adds a loop to make sure the event TRB for last Transfer
> > > > TRB has to be handled in time.
> > > >
> > > > Signed-off-by: Ran Wang 
> > > > ---
> > > >  drivers/usb/host/xhci-ring.c | 17 ++---
> > > >  1 file changed, 14 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/usb/host/xhci-ring.c
> > > > b/drivers/usb/host/xhci-ring.c index 092ed6e..d77e058 100644
> > > > --- a/drivers/usb/host/xhci-ring.c
> > > > +++ b/drivers/usb/host/xhci-ring.c
> > > > @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev,
> > unsigned long pipe,
> > > > int ret;
> > > > u32 trb_fields[4];
> > > > u64 val_64 = virt_to_phys(buffer);
> > > > +   void *last_transfer_trb_addr;
> > > > +   int available_length;
> > > >
> > > > debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n",
> > > > udev, pipe, buffer, length);
> > > >
> > > > +   available_length = length;
> > > > ep_index = usb_pipe_ep_index(pipe);
> > > > virt_dev = ctrl->devs[slot_id];
> > > >
> > > > @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned
> > long pipe,
> > > > trb_fields[2] = length_field;
> > > > trb_fields[3] = field | (TRB_NORMAL <<
> > > > TRB_TYPE_SHIFT);
> > > >
> > > > -   queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
> > > > +   last_transfer_trb_addr = queue_trb(ctrl, ring,
> > > > + (num_trbs > 1), trb_fields);
> > > >
> > > > --num_trbs;
> > > >
> > > > @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev,
> > > > unsigned long pipe,
> > > >
> > > > giveback_first_trb(udev, ep_index, start_cycle, start_trb);
> > > >
> > > > +again:
> > > > event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
> > > > if (!event) {
> > > > debug("XHCI bulk transfer timed out,
> > > > aborting...\n"); @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct
> > usb_device *udev, unsigned long pipe,
> > > > udev->act_len = 0;
> > > > return -ETIMEDOUT;
> > > > }
> > > > -   field = le32_to_cpu(event->trans_event.flags);
> > > >
> > > > +   if ((void *)event->trans_event.buffer !=
> > > > + last_transfer_trb_addr) {
> > >
> > > This should be:
> > >
> > > 

RE: [PATCH] usb: xhci: fix event trb handling missed

2020-11-09 Thread Ran Wang
Hi Bin,

On Tuesday, November 10, 2020 1:43 PM Bin Meng wrote:
> 
> Hi Ran,
> 
> On Tue, Nov 10, 2020 at 1:30 PM Bin Meng  wrote:
> >
> > Hi Ran,
> >
> > On Tue, Sep 22, 2020 at 1:02 PM Ran Wang  wrote:
> > >
> > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it
> > > will
> >
> > typo: function

Got it.

> > > send request in more than 1 Transfer TRB by chaining them, but then
> > > handle only 1 event TRB to mark request completed.
> > >
> > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe
> > > xhci controller will generated more than 1 event TRB sometimes, this
> > > cause that
> >
> > I am not sure if it's fair to say this is Layerscape specific
> > behavior. Based on the xHCI spec, the spec indicates 1 event trb
> > should be generated when IOC/ISP flag is set to 1 or an error occurs.
> 
> Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have:
> 
> /* Only set interrupt on short packet for IN endpoints */
> if (usb_pipein(pipe))
> field |= TRB_ISP;
> 
> Can you verify that if removing the above codes, and without your changes in
> this patch, the original issue can be resolved on LS1028?

Bingo, removing above codes can resolve my issue, too

> > I will see if I can reproduce your issue on an x86 board.
> >
> 
> Note this patch does not apply on top of u-boot/master. Please rebase.

Sure, I can rebase my patch, but I am nor sure my solution is still worth:
xHCI says "The detection of a USB Short Packet (i.e. the actual number of
bytes received was less than the expected number of bytes defined by the
Transfer TRB) during a transfer does not necessarily generate an Event. "

So does it mean above workaround you suggest would not violate spec, either
(although current Linux kernel driver still set ISP for this case, but may have
a more robust mechanism for event TRB handling)?


> > > function mishandle event TRB in next round call, then system hang
> > > due to
> > > BUG() checking.
> > >
> > > This patch adds a loop to make sure the event TRB for last Transfer
> > > TRB has to be handled in time.
> > >
> > > Signed-off-by: Ran Wang 
> > > ---
> > >  drivers/usb/host/xhci-ring.c | 17 ++---
> > >  1 file changed, 14 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/usb/host/xhci-ring.c
> > > b/drivers/usb/host/xhci-ring.c index 092ed6e..d77e058 100644
> > > --- a/drivers/usb/host/xhci-ring.c
> > > +++ b/drivers/usb/host/xhci-ring.c
> > > @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev,
> unsigned long pipe,
> > > int ret;
> > > u32 trb_fields[4];
> > > u64 val_64 = virt_to_phys(buffer);
> > > +   void *last_transfer_trb_addr;
> > > +   int available_length;
> > >
> > > debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n",
> > > udev, pipe, buffer, length);
> > >
> > > +   available_length = length;
> > > ep_index = usb_pipe_ep_index(pipe);
> > > virt_dev = ctrl->devs[slot_id];
> > >
> > > @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned
> long pipe,
> > > trb_fields[2] = length_field;
> > > trb_fields[3] = field | (TRB_NORMAL <<
> > > TRB_TYPE_SHIFT);
> > >
> > > -   queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
> > > +   last_transfer_trb_addr = queue_trb(ctrl, ring,
> > > + (num_trbs > 1), trb_fields);
> > >
> > > --num_trbs;
> > >
> > > @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev,
> > > unsigned long pipe,
> > >
> > > giveback_first_trb(udev, ep_index, start_cycle, start_trb);
> > >
> > > +again:
> > > event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
> > > if (!event) {
> > > debug("XHCI bulk transfer timed out,
> > > aborting...\n"); @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct
> usb_device *udev, unsigned long pipe,
> > > udev->act_len = 0;
> > > return -ETIMEDOUT;
> > > }
> > > -   field = le32_to_cpu(event->trans_event.flags);
> > >
> > > +   if ((void *)event->trans_event.buffer !=
> > > + last_transfer_trb_addr) {
> >
> > This should be:
> >
> > if ((void *)le64_to_cpu(event->trans_event.buffer) !=
> > last_transfer_trb_addr)

Got it, will update this in next version (if you think my solution is still 
acceptable for this issue).

Thanks & regards,
Ran

> > > +   available_length -=
> > > +
> (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len));
> > > +   xhci_acknowledge_event(ctrl);
> > > +   goto again;
> > > +   }
> > > +
> > > +   field = le32_to_cpu(event->trans_event.flags);
> > > BUG_ON(TRB_TO_SLOT_ID(field) != slot_id);
> > > BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
> > > BUG_ON(*(void
> **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) -
> > > buffer > (size_t)length);
> > >
> > > - 

Re: [PATCH] usb: xhci: fix event trb handling missed

2020-11-09 Thread Bin Meng
Hi Ran,

On Tue, Nov 10, 2020 at 1:30 PM Bin Meng  wrote:
>
> Hi Ran,
>
> On Tue, Sep 22, 2020 at 1:02 PM Ran Wang  wrote:
> >
> > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will
>
> typo: function
>
> > send request in more than 1 Transfer TRB by chaining them, but then handle
> > only 1 event TRB to mark request completed.
> >
> > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci
> > controller will generated more than 1 event TRB sometimes, this cause that
>
> I am not sure if it's fair to say this is Layerscape specific
> behavior. Based on the xHCI spec, the spec indicates 1 event trb
> should be generated when IOC/ISP flag is set to 1 or an error occurs.

Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have:

/* Only set interrupt on short packet for IN endpoints */
if (usb_pipein(pipe))
field |= TRB_ISP;

Can you verify that if removing the above codes, and without your
changes in this patch, the original issue can be resolved on LS1028?

> I will see if I can reproduce your issue on an x86 board.
>

Note this patch does not apply on top of u-boot/master. Please rebase.

> > function mishandle event TRB in next round call, then system hang due to
> > BUG() checking.
> >
> > This patch adds a loop to make sure the event TRB for last Transfer TRB has
> > to be handled in time.
> >
> > Signed-off-by: Ran Wang 
> > ---
> >  drivers/usb/host/xhci-ring.c | 17 ++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> > index 092ed6e..d77e058 100644
> > --- a/drivers/usb/host/xhci-ring.c
> > +++ b/drivers/usb/host/xhci-ring.c
> > @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned 
> > long pipe,
> > int ret;
> > u32 trb_fields[4];
> > u64 val_64 = virt_to_phys(buffer);
> > +   void *last_transfer_trb_addr;
> > +   int available_length;
> >
> > debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n",
> > udev, pipe, buffer, length);
> >
> > +   available_length = length;
> > ep_index = usb_pipe_ep_index(pipe);
> > virt_dev = ctrl->devs[slot_id];
> >
> > @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long 
> > pipe,
> > trb_fields[2] = length_field;
> > trb_fields[3] = field | (TRB_NORMAL << TRB_TYPE_SHIFT);
> >
> > -   queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
> > +   last_transfer_trb_addr = queue_trb(ctrl, ring, (num_trbs > 
> > 1), trb_fields);
> >
> > --num_trbs;
> >
> > @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long 
> > pipe,
> >
> > giveback_first_trb(udev, ep_index, start_cycle, start_trb);
> >
> > +again:
> > event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
> > if (!event) {
> > debug("XHCI bulk transfer timed out, aborting...\n");
> > @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned 
> > long pipe,
> > udev->act_len = 0;
> > return -ETIMEDOUT;
> > }
> > -   field = le32_to_cpu(event->trans_event.flags);
> >
> > +   if ((void *)event->trans_event.buffer != last_transfer_trb_addr) {
>
> This should be:
>
> if ((void *)le64_to_cpu(event->trans_event.buffer) != last_transfer_trb_addr)
>
> > +   available_length -=
> > +   
> > (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len));
> > +   xhci_acknowledge_event(ctrl);
> > +   goto again;
> > +   }
> > +
> > +   field = le32_to_cpu(event->trans_event.flags);
> > BUG_ON(TRB_TO_SLOT_ID(field) != slot_id);
> > BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
> > BUG_ON(*(void **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) -
> > buffer > (size_t)length);
> >
> > -   record_transfer_result(udev, event, length);
> > +   record_transfer_result(udev, event, available_length);
> > xhci_acknowledge_event(ctrl);
> > xhci_inval_cache((uintptr_t)buffer, length);
> >

Regards,
Bin


Re: [PATCH] usb: xhci: fix event trb handling missed

2020-11-09 Thread Bin Meng
Hi Ran,

On Tue, Sep 22, 2020 at 1:02 PM Ran Wang  wrote:
>
> In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will

typo: function

> send request in more than 1 Transfer TRB by chaining them, but then handle
> only 1 event TRB to mark request completed.
>
> However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci
> controller will generated more than 1 event TRB sometimes, this cause that

I am not sure if it's fair to say this is Layerscape specific
behavior. Based on the xHCI spec, the spec indicates 1 event trb
should be generated when IOC/ISP flag is set to 1 or an error occurs.
I will see if I can reproduce your issue on an x86 board.

> function mishandle event TRB in next round call, then system hang due to
> BUG() checking.
>
> This patch adds a loop to make sure the event TRB for last Transfer TRB has
> to be handled in time.
>
> Signed-off-by: Ran Wang 
> ---
>  drivers/usb/host/xhci-ring.c | 17 ++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 092ed6e..d77e058 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long 
> pipe,
> int ret;
> u32 trb_fields[4];
> u64 val_64 = virt_to_phys(buffer);
> +   void *last_transfer_trb_addr;
> +   int available_length;
>
> debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n",
> udev, pipe, buffer, length);
>
> +   available_length = length;
> ep_index = usb_pipe_ep_index(pipe);
> virt_dev = ctrl->devs[slot_id];
>
> @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long 
> pipe,
> trb_fields[2] = length_field;
> trb_fields[3] = field | (TRB_NORMAL << TRB_TYPE_SHIFT);
>
> -   queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
> +   last_transfer_trb_addr = queue_trb(ctrl, ring, (num_trbs > 
> 1), trb_fields);
>
> --num_trbs;
>
> @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long 
> pipe,
>
> giveback_first_trb(udev, ep_index, start_cycle, start_trb);
>
> +again:
> event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
> if (!event) {
> debug("XHCI bulk transfer timed out, aborting...\n");
> @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long 
> pipe,
> udev->act_len = 0;
> return -ETIMEDOUT;
> }
> -   field = le32_to_cpu(event->trans_event.flags);
>
> +   if ((void *)event->trans_event.buffer != last_transfer_trb_addr) {

This should be:

if ((void *)le64_to_cpu(event->trans_event.buffer) != last_transfer_trb_addr)

> +   available_length -=
> +   
> (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len));
> +   xhci_acknowledge_event(ctrl);
> +   goto again;
> +   }
> +
> +   field = le32_to_cpu(event->trans_event.flags);
> BUG_ON(TRB_TO_SLOT_ID(field) != slot_id);
> BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
> BUG_ON(*(void **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) -
> buffer > (size_t)length);
>
> -   record_transfer_result(udev, event, length);
> +   record_transfer_result(udev, event, available_length);
> xhci_acknowledge_event(ctrl);
> xhci_inval_cache((uintptr_t)buffer, length);
>

Regards,
Bin


RE: [PATCH] usb: xhci: fix event trb handling missed

2020-11-08 Thread Ran Wang
Hi Bin,

On Friday, November 6, 2020 3:54 PM, Bin Meng wrote:
> To: Marek Vasut 
> Cc: Ran Wang ; U-Boot Mailing List
> 
> Subject: Re: [PATCH] usb: xhci: fix event trb handling missed
> 
> On Tue, Sep 22, 2020 at 7:41 PM Marek Vasut  wrote:
> >
> > On 9/22/20 6:54 AM, Ran Wang wrote:
> > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it
> > > will send request in more than 1 Transfer TRB by chaining them, but
> > > then handle only 1 event TRB to mark request completed.
> > >
> > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe
> > > xhci controller will generated more than 1 event TRB sometimes, this
> > > cause that function mishandle event TRB in next round call, then
> > > system hang due to
> > > BUG() checking.
> > >
> > > This patch adds a loop to make sure the event TRB for last Transfer
> > > TRB has to be handled in time.
> >
> > Bin, can you please take a look at this one ? Thanks
> 
> Sorry I missed this one. Will take a look at this soon.

Kindly remind that this patch is to resolve issue I ever sent to community for 
help: 
https://lists.denx.de/pipermail/u-boot/2020-September/426981.html

Thanks & Regards,
Ran



Re: [PATCH] usb: xhci: fix event trb handling missed

2020-11-05 Thread Bin Meng
On Tue, Sep 22, 2020 at 7:41 PM Marek Vasut  wrote:
>
> On 9/22/20 6:54 AM, Ran Wang wrote:
> > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will
> > send request in more than 1 Transfer TRB by chaining them, but then handle
> > only 1 event TRB to mark request completed.
> >
> > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci
> > controller will generated more than 1 event TRB sometimes, this cause that
> > function mishandle event TRB in next round call, then system hang due to
> > BUG() checking.
> >
> > This patch adds a loop to make sure the event TRB for last Transfer TRB has
> > to be handled in time.
>
> Bin, can you please take a look at this one ? Thanks

Sorry I missed this one. Will take a look at this soon.

Regards,
Bin


Re: [PATCH] usb: xhci: fix event trb handling missed

2020-09-22 Thread Marek Vasut
On 9/22/20 6:54 AM, Ran Wang wrote:
> In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will
> send request in more than 1 Transfer TRB by chaining them, but then handle
> only 1 event TRB to mark request completed.
> 
> However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci
> controller will generated more than 1 event TRB sometimes, this cause that
> function mishandle event TRB in next round call, then system hang due to
> BUG() checking.
> 
> This patch adds a loop to make sure the event TRB for last Transfer TRB has
> to be handled in time.

Bin, can you please take a look at this one ? Thanks