RE: [PATCH v7 09/10] usb: dwc3: Check for IOC/LST bit in both event->status and TRB->ctrl fields

2018-12-10 Thread Felipe Balbi

Hi,

Anurag Kumar Vulisha  writes:
>>>>> Thanks for reviewing this patch. Let's consider an example where a
>>>>> request has num_sgs > 0, each sg is mapped to a TRB, and the last
>>>>> TRB has the IOC bit set. Once the controller is done with the
>>>>> transfer, it generates XferInProgress for the last TRB (since the IOC
>>>>> bit is set). As part of the TRB reclaim process,
>>>>> dwc3_gadget_ep_reclaim_trb_sg() calls
>>>>> dwc3_gadget_ep_reclaim_completed_trb() req->num_sgs times. Since
>>>>> the event already has the IOC bit set, the loop exits at the very
>>>>> first TRB and the remaining TRBs (mapped to the sglist) are left
>>>>> unhandled.
>>>>> To avoid this we modified the code to exit only if both the TRB and
>>>>> the event have the IOC bit set.

>>>>Seems like IOC case should just test for chain flag as well:

>>>
>>> Okay. Along with this logic, the code for updating the chain bit should
>>> also be modified, I guess.
>>
>>not really
>>
>>> Since the IOC bit is also set when there are not enough TRBs available,
>>> the code should be modified to not set the DWC3_TRB_CTRL_CHN bit when the
>>> IOC bit is set. I will make the changes below along with your
>>> suggestions and resend the patches.
>>
>>no. Actually I don't think we're allowed to split a scatter/gather like
>>that. I did that quite a while ago, but I don't think we're allowed to
>>do so. What we should do, in that case, is not even queue that request
>>until we have enough for all members of the scatter/gather. But that's a
>>separate patch, anyway.
>>
>
> Okay. I have a doubt here: not pushing the request until all sgs are
> mapped to enough TRBs might reduce the driver complexity, but it would
> also reduce performance (since we are waiting until enough TRBs are
> available). Are we okay with that?

The only other way would be to copy the data over to a contiguous
buffer. That will also reduce performance. I think we need to consider
how frequently this may actually happen. I dare say we don't have any
USB function in the kernel as of today that can, easily and frequently,
fall into such a situation. Besides, the performance loss can be
amortized by a deeper request queue.

IMO, this is a minor problem. But, certainly, if you have the setup,
_do_ run some benchmarking and report your findings :-)

-- 
balbi




RE: [PATCH v7 09/10] usb: dwc3: Check for IOC/LST bit in both event->status and TRB->ctrl fields

2018-12-10 Thread Anurag Kumar Vulisha
Hi Felipe,

>-----Original Message-----
>From: Felipe Balbi [mailto:ba...@kernel.org]
>Sent: Monday, December 10, 2018 12:24 PM
>To: Anurag Kumar Vulisha; Greg Kroah-Hartman; Shuah Khan; Alan Stern;
>Johan Hovold; Jaejoong Kim; Benjamin Herrenschmidt; Roger Quadros;
>Manu Gautam; martin.peter...@oracle.com; Bart Van Assche; Mike Christie;
>Matthew Wilcox; Colin Ian King
>Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org;
>v.anuragku...@gmail.com; Thinh Nguyen; Tejas Joglekar;
>Ajay Yugalkishore Pandey
>Subject: RE: [PATCH v7 09/10] usb: dwc3: Check for IOC/LST bit in both
>event->status and TRB->ctrl fields
>
>
>Hi,
>
>Anurag Kumar Vulisha  writes:
>> Hi Felipe,
>>
>>>-----Original Message-----
>>>From: Felipe Balbi [mailto:ba...@kernel.org]
>>>Sent: Friday, December 07, 2018 11:42 AM
>>>To: Anurag Kumar Vulisha; Greg Kroah-Hartman; Shuah Khan; Alan Stern;
>>>Johan Hovold; Jaejoong Kim; Benjamin Herrenschmidt; Roger Quadros;
>>>Manu Gautam; martin.peter...@oracle.com; Bart Van Assche; Mike Christie;
>>>Matthew Wilcox; Colin Ian King
>>>Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org;
>>>v.anuragku...@gmail.com; Thinh Nguyen; Tejas Joglekar;
>>>Ajay Yugalkishore Pandey
>>>Subject: RE: [PATCH v7 09/10] usb: dwc3: Check for IOC/LST bit in both
>>>event->status and TRB->ctrl fields
>>>
>>>
>>>Hi,
>>>
>>>Anurag Kumar Vulisha  writes:
>>>>>> @@ -2286,7 +2286,12 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
>>>>>>     if (event->status & DEPEVT_STATUS_SHORT && !chain)
>>>>>>         return 1;
>>>>>>
>>>>>> -   if (event->status & (DEPEVT_STATUS_IOC | DEPEVT_STATUS_LST))
>>>>>> +   if ((event->status & DEPEVT_STATUS_IOC) &&
>>>>>> +       (trb->ctrl & DWC3_TRB_CTRL_IOC))
>>>>>> +       return 1;
>>>>>
>>>>>this shouldn't be necessary. According to databook, event->status
>>>>>contains the bits from the completed TRB.  Which means that
>>>>>event->status & IOC will always be equal to trb->ctrl & IOC.
>>>>>
>>>> Thanks for reviewing this patch. Let's consider an example where a
>>>> request has num_sgs > 0, each sg is mapped to a TRB, and the last
>>>> TRB has the IOC bit set. Once the controller is done with the
>>>> transfer, it generates XferInProgress for the last TRB (since the IOC
>>>> bit is set). As part of the TRB reclaim process,
>>>> dwc3_gadget_ep_reclaim_trb_sg() calls
>>>> dwc3_gadget_ep_reclaim_completed_trb() req->num_sgs times. Since
>>>> the event already has the IOC bit set, the loop exits at the very
>>>> first TRB and the remaining TRBs (mapped to the sglist) are left
>>>> unhandled.
>>>> To avoid this we modified the code to exit only if both the TRB and
>>>> the event have the IOC bit set.
>>>
>>>Seems like IOC case should just test for chain flag as well:
>>>
>>
>> Okay. Along with this logic, the code for updating the chain bit should
>> also be modified, I guess.
>
>not really
>
>> Since the IOC bit is also set when there are not enough TRBs available,
>> the code should be modified to not set the DWC3_TRB_CTRL_CHN bit when the
>> IOC bit is set. I will make the changes below along with your
>> suggestions and resend the patches.
>
>no. Actually I don't think we're allowed to split a scatter/gather like
>that. I did that quite a while ago, but I don't think we're allowed to
>do so. What we should do, in that case, is not even queue that request
>until we have enough for all members of the scatter/gather. But that's a
>separate patch, anyway.
>

Okay. I have a doubt here: not pushing the request until all sgs are
mapped to enough TRBs might reduce the driver complexity, but it would
also reduce performance (since we are waiting until enough TRBs are
available). Are we okay with that?

Thanks,
Anurag Kumar Vulisha  


RE: [PATCH v7 09/10] usb: dwc3: Check for IOC/LST bit in both event->status and TRB->ctrl fields

2018-12-09 Thread Felipe Balbi

Hi,

Anurag Kumar Vulisha  writes:
> Hi Felipe,
>
>>-----Original Message-----
>>From: Felipe Balbi [mailto:ba...@kernel.org]
>>Sent: Friday, December 07, 2018 11:42 AM
>>To: Anurag Kumar Vulisha; Greg Kroah-Hartman; Shuah Khan; Alan Stern;
>>Johan Hovold; Jaejoong Kim; Benjamin Herrenschmidt; Roger Quadros;
>>Manu Gautam; martin.peter...@oracle.com; Bart Van Assche; Mike Christie;
>>Matthew Wilcox; Colin Ian King
>>Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org;
>>v.anuragku...@gmail.com; Thinh Nguyen; Tejas Joglekar;
>>Ajay Yugalkishore Pandey
>>Subject: RE: [PATCH v7 09/10] usb: dwc3: Check for IOC/LST bit in both
>>event->status and TRB->ctrl fields
>>
>>
>>Hi,
>>
>>Anurag Kumar Vulisha  writes:
>>>>> @@ -2286,7 +2286,12 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
>>>>>     if (event->status & DEPEVT_STATUS_SHORT && !chain)
>>>>>         return 1;
>>>>>
>>>>> -   if (event->status & (DEPEVT_STATUS_IOC | DEPEVT_STATUS_LST))
>>>>> +   if ((event->status & DEPEVT_STATUS_IOC) &&
>>>>> +       (trb->ctrl & DWC3_TRB_CTRL_IOC))
>>>>> +       return 1;

>>>>this shouldn't be necessary. According to databook, event->status
>>>>contains the bits from the completed TRB.  Which means that
>>>>event->status & IOC will always be equal to trb->ctrl & IOC.

>>> Thanks for reviewing this patch. Let's consider an example where a
>>> request has num_sgs > 0, each sg is mapped to a TRB, and the last
>>> TRB has the IOC bit set. Once the controller is done with the
>>> transfer, it generates XferInProgress for the last TRB (since the IOC
>>> bit is set). As part of the TRB reclaim process,
>>> dwc3_gadget_ep_reclaim_trb_sg() calls
>>> dwc3_gadget_ep_reclaim_completed_trb() req->num_sgs times. Since
>>> the event already has the IOC bit set, the loop exits at the very
>>> first TRB and the remaining TRBs (mapped to the sglist) are left
>>> unhandled.
>>> To avoid this we modified the code to exit only if both the TRB and
>>> the event have the IOC bit set.
>>
>>Seems like IOC case should just test for chain flag as well:
>>
>
> Okay. Along with this logic, the code for updating the chain bit should
> also be modified, I guess.

not really

> Since the IOC bit is also set when there are not enough TRBs available,
> the code should be modified to not set the DWC3_TRB_CTRL_CHN bit when the
> IOC bit is set. I will make the changes below along with your
> suggestions and resend the patches.

no. Actually I don't think we're allowed to split a scatter/gather like
that. I did that quite a while ago, but I don't think we're allowed to
do so. What we should do, in that case, is not even queue that request
until we have enough for all members of the scatter/gather. But that's a
separate patch, anyway.
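
As a rough sketch of the "don't queue until everything fits" idea, the
check could look something like the following. This is a user-space model
under assumptions, not driver code: model_can_queue_sg() and
model_prepare_requests() are hypothetical names that do not exist in
gadget.c, and the number of free TRBs is passed in as a plain argument
instead of coming from dwc3_calc_trbs_left().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: a scatter/gather request may be queued only when
 * every one of its sg entries can get its own TRB.
 */
static bool model_can_queue_sg(unsigned int trbs_left, unsigned int num_sgs)
{
    return num_sgs <= trbs_left;
}

/*
 * Hypothetical model of walking the pending list in order. Requests are
 * queued whole or not at all; the walk stops at the first request that
 * does not fit so that ordering is preserved. Returns how many requests
 * were queued.
 */
static unsigned int model_prepare_requests(const unsigned int *num_sgs,
                                           unsigned int nreq,
                                           unsigned int trbs_left)
{
    unsigned int queued = 0;

    assert(num_sgs != NULL || nreq == 0);

    for (unsigned int i = 0; i < nreq; i++) {
        if (!model_can_queue_sg(trbs_left, num_sgs[i]))
            break;              /* leave it pending until TRBs free up */

        trbs_left -= num_sgs[i];
        queued++;
    }

    return queued;
}
```

With pending requests of 3, 4 and 2 sg entries and 8 free TRBs, the first
two are queued and the third waits, so no scatter/gather list is ever
split across a ring-full boundary.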

-- 
balbi




RE: [PATCH v7 09/10] usb: dwc3: Check for IOC/LST bit in both event->status and TRB->ctrl fields

2018-12-08 Thread Anurag Kumar Vulisha


Hi Felipe,

>-----Original Message-----
>From: Felipe Balbi [mailto:ba...@kernel.org]
>Sent: Friday, December 07, 2018 11:42 AM
>To: Anurag Kumar Vulisha; Greg Kroah-Hartman; Shuah Khan; Alan Stern;
>Johan Hovold; Jaejoong Kim; Benjamin Herrenschmidt; Roger Quadros;
>Manu Gautam; martin.peter...@oracle.com; Bart Van Assche; Mike Christie;
>Matthew Wilcox; Colin Ian King
>Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org;
>v.anuragku...@gmail.com; Thinh Nguyen; Tejas Joglekar;
>Ajay Yugalkishore Pandey
>Subject: RE: [PATCH v7 09/10] usb: dwc3: Check for IOC/LST bit in both
>event->status and TRB->ctrl fields
>
>
>Hi,
>
>Anurag Kumar Vulisha  writes:
>>>> @@ -2286,7 +2286,12 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
>>>>     if (event->status & DEPEVT_STATUS_SHORT && !chain)
>>>>         return 1;
>>>>
>>>> -   if (event->status & (DEPEVT_STATUS_IOC | DEPEVT_STATUS_LST))
>>>> +   if ((event->status & DEPEVT_STATUS_IOC) &&
>>>> +       (trb->ctrl & DWC3_TRB_CTRL_IOC))
>>>> +       return 1;
>>>
>>>this shouldn't be necessary. According to databook, event->status
>>>contains the bits from the completed TRB.  Which means that
>>>event->status & IOC will always be equal to trb->ctrl & IOC.
>>>
>> Thanks for reviewing this patch. Let's consider an example where a
>> request has num_sgs > 0, each sg is mapped to a TRB, and the last
>> TRB has the IOC bit set. Once the controller is done with the
>> transfer, it generates XferInProgress for the last TRB (since the IOC
>> bit is set). As part of the TRB reclaim process,
>> dwc3_gadget_ep_reclaim_trb_sg() calls
>> dwc3_gadget_ep_reclaim_completed_trb() req->num_sgs times. Since
>> the event already has the IOC bit set, the loop exits at the very
>> first TRB and the remaining TRBs (mapped to the sglist) are left
>> unhandled.
>> To avoid this we modified the code to exit only if both the TRB and
>> the event have the IOC bit set.
>
>Seems like IOC case should just test for chain flag as well:
>

Okay. Along with this logic, the code for updating the chain bit should
also be modified, I guess. Since the IOC bit is also set when there are
not enough TRBs available, the code should be modified to not set the
DWC3_TRB_CTRL_CHN bit when the IOC bit is set. I will make the changes
below along with your suggestions and resend the patches.

@@ -998,7 +998,7 @@ static void __dwc3_prepare_one_trb(struct dwc3_ep *dep, struct dwc3_trb *trb,
            (dwc3_calc_trbs_left(dep) == 1))
        trb->ctrl |= DWC3_TRB_CTRL_IOC;
 
-   if (chain)
+   if (chain && !(trb->ctrl & DWC3_TRB_CTRL_IOC))
        trb->ctrl |= DWC3_TRB_CTRL_CHN;
 
    if (usb_endpoint_xfer_bulk(dep->endpoint.desc) && dep->stream_capable)
@@ -2372,7 +2372,7 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
    if (event->status & DEPEVT_STATUS_SHORT && !chain)
        return 1;
 
-   if (event->status & DEPEVT_STATUS_IOC)
+   if (event->status & DEPEVT_STATUS_IOC && !chain)
        return 1;
 
    return 0;
@@ -2399,7 +2399,7 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
        req->num_pending_sgs--;
 
        ret = dwc3_gadget_ep_reclaim_completed_trb(dep, req,
-               trb, event, status, true);
+               trb, event, status, trb->ctrl & DWC3_TRB_CTRL_CHN);
        if (ret)
            break;
    }

Thanks,
Anurag Kumar Vulisha

>modified   drivers/usb/dwc3/gadget.c
>@@ -2372,7 +2372,7 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
>    if (event->status & DEPEVT_STATUS_SHORT && !chain)
>        return 1;
>
>-   if (event->status & DEPEVT_STATUS_IOC)
>+   if (event->status & DEPEVT_STATUS_IOC && !chain)
>        return 1;
>
>    return 0;
>
>--
>balbi


RE: [PATCH v7 09/10] usb: dwc3: Check for IOC/LST bit in both event->status and TRB->ctrl fields

2018-12-06 Thread Felipe Balbi


Hi,

Anurag Kumar Vulisha  writes:
>>> @@ -2286,7 +2286,12 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
>>>     if (event->status & DEPEVT_STATUS_SHORT && !chain)
>>>         return 1;
>>>
>>> -   if (event->status & (DEPEVT_STATUS_IOC | DEPEVT_STATUS_LST))
>>> +   if ((event->status & DEPEVT_STATUS_IOC) &&
>>> +       (trb->ctrl & DWC3_TRB_CTRL_IOC))
>>> +       return 1;
>>
>>this shouldn't be necessary. According to databook, event->status
>>contains the bits from the completed TRB.  Which means that
>>event->status & IOC will always be equal to trb->ctrl & IOC.
>>
> Thanks for reviewing this patch. Let's consider an example where a
> request has num_sgs > 0, each sg is mapped to a TRB, and the last
> TRB has the IOC bit set. Once the controller is done with the
> transfer, it generates XferInProgress for the last TRB (since the IOC
> bit is set). As part of the TRB reclaim process,
> dwc3_gadget_ep_reclaim_trb_sg() calls
> dwc3_gadget_ep_reclaim_completed_trb() req->num_sgs times. Since
> the event already has the IOC bit set, the loop exits at the very
> first TRB and the remaining TRBs (mapped to the sglist) are left
> unhandled.
> To avoid this we modified the code to exit only if both the TRB and
> the event have the IOC bit set.

Seems like IOC case should just test for chain flag as well:

modified   drivers/usb/dwc3/gadget.c
@@ -2372,7 +2372,7 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
    if (event->status & DEPEVT_STATUS_SHORT && !chain)
        return 1;
 
-   if (event->status & DEPEVT_STATUS_IOC)
+   if (event->status & DEPEVT_STATUS_IOC && !chain)
        return 1;
 
    return 0;
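
For illustration only, here is a minimal user-space C model of the reclaim
loop discussed above. It is a sketch, not driver code: the model_* names
and the bit values are made up for this example, and it assumes the
per-TRB chain bit is what gets passed down as "chain" (the sg loop in the
driver passed a constant true at this point). It only models the IOC exit
condition, not the SHORT or HWO handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins; these are NOT the driver's real bit positions. */
#define MODEL_EVT_IOC   (1u << 0)   /* models DEPEVT_STATUS_IOC */
#define MODEL_TRB_CHN   (1u << 1)   /* models DWC3_TRB_CTRL_CHN */

struct model_trb {
    unsigned int ctrl;
};

/*
 * Models the tail of dwc3_gadget_ep_reclaim_completed_trb().
 * A nonzero return means "stop reclaiming TRBs for this request".
 * check_chain selects between the old test (IOC only) and the
 * suggested one (IOC && !chain).
 */
static int model_reclaim_one(unsigned int event_status, bool chain,
                             bool check_chain)
{
    if (!(event_status & MODEL_EVT_IOC))
        return 0;
    if (check_chain && chain)
        return 0;       /* more TRBs of this sg list follow */
    return 1;
}

/*
 * Models the loop in dwc3_gadget_ep_reclaim_trb_sg().
 * Returns how many TRBs were reclaimed before the loop stopped.
 */
static size_t model_reclaim_sg(const struct model_trb *trbs, size_t num_sgs,
                               unsigned int event_status, bool check_chain)
{
    size_t reclaimed = 0;

    assert(trbs != NULL || num_sgs == 0);

    for (size_t i = 0; i < num_sgs; i++) {
        bool chain = (trbs[i].ctrl & MODEL_TRB_CHN) != 0;

        reclaimed++;    /* this TRB is handled before the exit check */
        if (model_reclaim_one(event_status, chain, check_chain))
            break;
    }

    return reclaimed;
}
```

For a three-entry sg list where the first two TRBs have the chain bit set
and the event reports IOC, the old test stops after one TRB, while the
"!chain" variant reclaims all three.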

-- 
balbi