Re: [PATCH] usb: xhci: add quirk flag for broken stop command on AMD platforms

2017-05-26 Thread Mathias Nyman

On 25.05.2017 16:17, Shyam Sundar S K wrote:



On 5/22/2017 8:26 PM, Shyam Sundar S K wrote:


On 5/22/2017 6:49 PM, Mathias Nyman wrote:

On 22.05.2017 11:56, Shyam Sundar S K wrote:

Hi Mathias,


On 5/19/2017 12:43 PM, Mathias Nyman wrote:

On 18.05.2017 16:46, Alan Stern wrote:

On Thu, 18 May 2017, Shyam Sundar S K wrote:


on AMD platforms with SNPS 3.1 USB controller has an issue
if the stop EP command is issued when the controller is not
in running state. If issued, it is leading to a critical RTL bug
because of which controller goes into irrecoverable state.

This patch adds a appropriate checks to make sure that scenario
does not happen.

Signed-off-by: Shyam Sundar S K 
Signed-off-by: Nehal Shah 
---
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1819,6 +1819,7 @@ struct xhci_hcd {
/* For controller with a broken Port Disable implementation */
#define XHCI_BROKEN_PORT_PED(1 << 25)
#define XHCI_LIMIT_ENDPOINT_INTERVAL_7(1 << 26)
+#define XHCI_BROKEN_STOP(1 << 27)

Does there really need to be a quirk flag for this?  I should think
that you never want to issue a STOP EP command while the controller is
not running, no matter what kind of controller it is.

Alan Stern


If it's about controller not running then there shouldn't be any problems.
We shouldn't issue a stop endpoint command if controller is halted.

If this is about issuing a stop endpoint command while endpoint isn't
running, then fully working controllers should just respond with a command
completion with "context state error" status.

As per SNPS the controller is responding with "Context State Error", however 
the same is not getting
reflected when we check the cmd->status in the xhci driver.


Anyway, as Alan said the quirk is probably unnecessary here.

OK. We will take care of this.


We shouldn't need to
stop endpoints that are not running. Only problem I see here is that the
endpoint state in the output endpoint context might not be up to date. If driver
just restarted the endpoint by ringing the doorbell, the output context state
might not be updated yet.

Before issuing the stop end point command, we checked the state of the endpoint 
and it looks the state of
the end point is EP_STATE_STOPPED. If the output endpoint context is not 
updated is there a better way
to retrieve the EP state before issuing the stop end point command ?

Not really, checking endpoint context and possible a software variable kept up 
to date
by driver to keep track of doorbell. Perhaps checking endpoint ctx is enough 
for now

So, is it OK to guard the stop endpoint by checking the EP context before 
queuing it?


How does this SNPS 3.1 controller react if the endpoint just halted or moved to
error state just before controller runs the stop endpoint command? Still 
triggers
the RTL bug?

As per SNPS analysis.

1) Driver issues STOP ENDPOINT command  and the EP is in Running state.
2) HW executes the STOP ENDPOINT command successfully
3) Driver again issues STOP ENDPOINT command.
4) Since the EP is already halted/stopped, HW completes the command execution 
and reports “device context error” completion code. This is as per the spec.
5) However HW on receiving the second command additionally marks EP to Flow 
control state in HW which is RTL bug
6) The above bug causes the HW not to respond to any further doorbells that are 
rung by the driver. This causes the EP to not functional anymore and causes 
gross functional failures.


What happens if endpoint ctx shows endpoint is in the halted or error state 
when stop endpoint command is issued?
  still RTL bug?

Yes. That's right. If EP context shows as halted/stopped/error and we issue a 
stop endpoint command it is triggering the RTL bug. Since the tapeout has 
already happened and there is no way to fix this
from SNPS side, they are asking for a SW workaround i.e.  "issuing the stop endpoint 
command only when the EP context state is running."

So, it is OK to have this patch submission which will check for EP context 
before queuing the stop endpoint command ?


I'm talking about the in xhci spec 4.6.9:

" A Busy endpoint may asynchronously transition from the Running to the Halted 
or Error state due
to error conditions detected while processing TRBs. A possible race condition 
may occur if
software, thinking an endpoint is in the Running state, issues a Stop Endpoint 
Command however
at the same time the xHC asynchronously transitions the endpoint to the Halted 
or Error state. In
this case, a Context State Error may be generated for the command completion. 
Software may
verify that this case occurred by inspecting the EP State for Halted or Error 
when a Stop Endpoint
Command results in a Context State Error."


Since we are novice, can you please help to understand what is the intuition 
behind sending two stop endpoint commands ?

No need for two stop endpoint commands, that can be fixed in the 

Re: [PATCH] usb: xhci: add quirk flag for broken stop command on AMD platforms

2017-05-25 Thread Shyam Sundar S K


On 5/22/2017 8:26 PM, Shyam Sundar S K wrote:
>
> On 5/22/2017 6:49 PM, Mathias Nyman wrote:
>> On 22.05.2017 11:56, Shyam Sundar S K wrote:
>>> Hi Mathias,
>>>
>>>
>>> On 5/19/2017 12:43 PM, Mathias Nyman wrote:
 On 18.05.2017 16:46, Alan Stern wrote:
> On Thu, 18 May 2017, Shyam Sundar S K wrote:
>
>> on AMD platforms with SNPS 3.1 USB controller has an issue
>> if the stop EP command is issued when the controller is not
>> in running state. If issued, it is leading to a critical RTL bug
>> because of which controller goes into irrecoverable state.
>>
>> This patch adds a appropriate checks to make sure that scenario
>> does not happen.
>>
>> Signed-off-by: Shyam Sundar S K 
>> Signed-off-by: Nehal Shah 
>> ---
>> --- a/drivers/usb/host/xhci.h
>> +++ b/drivers/usb/host/xhci.h
>> @@ -1819,6 +1819,7 @@ struct xhci_hcd {
>>/* For controller with a broken Port Disable implementation */
>>#define XHCI_BROKEN_PORT_PED(1 << 25)
>>#define XHCI_LIMIT_ENDPOINT_INTERVAL_7(1 << 26)
>> +#define XHCI_BROKEN_STOP(1 << 27)
> Does there really need to be a quirk flag for this?  I should think
> that you never want to issue a STOP EP command while the controller is
> not running, no matter what kind of controller it is.
>
> Alan Stern
>
 If it's about controller not running then there shouldn't be any problems.
 We shouldn't issue a stop endpoint command if controller is halted.

 If this is about issuing a stop endpoint command while endpoint isn't
 running, then fully working controllers should just respond with a command
 completion with "context state error" status.
>>> As per SNPS the controller is responding with "Context State Error", 
>>> however the same is not getting
>>> reflected when we check the cmd->status in the xhci driver.
>>>
 Anyway, as Alan said the quirk is probably unnecessary here.
>>> OK. We will take care of this.
>>>
 We shouldn't need to
 stop endpoints that are not running. Only problem I see here is that the
 endpoint state in the output endpoint context might not be up to date. If 
 driver
 just restarted the endpoint by ringing the doorbell, the output context 
 state
 might not be updated yet.
>>> Before issuing the stop end point command, we checked the state of the 
>>> endpoint and it looks the state of
>>> the end point is EP_STATE_STOPPED. If the output endpoint context is not 
>>> updated is there a better way
>>> to retrieve the EP state before issuing the stop end point command ?
>> Not really, checking endpoint context and possible a software variable kept 
>> up to date
>> by driver to keep track of doorbell. Perhaps checking endpoint ctx is enough 
>> for now
> So, is it OK to guard the stop endpoint by checking the EP context before 
> queuing it?
>
 How does this SNPS 3.1 controller react if the endpoint just halted or 
 moved to
 error state just before controller runs the stop endpoint command? Still 
 triggers
 the RTL bug?
>>> As per SNPS analysis.
>>>
>>> 1) Driver issues STOP ENDPOINT command  and the EP is in Running state.
>>> 2) HW executes the STOP ENDPOINT command successfully
>>> 3) Driver again issues STOP ENDPOINT command.
>>> 4) Since the EP is already halted/stopped, HW completes the command 
>>> execution and reports “device context error” completion code. This is as 
>>> per the spec.
>>> 5) However HW on receiving the second command additionally marks EP to Flow 
>>> control state in HW which is RTL bug
>>> 6) The above bug causes the HW not to respond to any further doorbells that 
>>> are rung by the driver. This causes the EP to not functional anymore and 
>>> causes gross functional failures.
>>>
>> What happens if endpoint ctx shows endpoint is in the halted or error state 
>> when stop endpoint command is issued?
>>  still RTL bug?
> Yes. That's right. If EP context shows as halted/stopped/error and we issue a 
> stop endpoint command it is triggering the RTL bug. Since the tapeout has 
> already happened and there is no way to fix this
> from SNPS side, they are asking for a SW workaround i.e.  "issuing the stop 
> endpoint command only when the EP context state is running."
>
> So, it is OK to have this patch submission which will check for EP context 
> before queuing the stop endpoint command ?
>
 I'm talking about the in xhci spec 4.6.9:

 " A Busy endpoint may asynchronously transition from the Running to the 
 Halted or Error state due
 to error conditions detected while processing TRBs. A possible race 
 condition may occur if
 software, thinking an endpoint is in the Running state, issues a Stop 
 Endpoint Command however
 at the same time the xHC asynchronously transitions the endpoint to the 
 Halted or Error state. In
 this case, a 

Re: [PATCH] usb: xhci: add quirk flag for broken stop command on AMD platforms

2017-05-22 Thread Shyam Sundar S K


On 5/22/2017 6:49 PM, Mathias Nyman wrote:
> On 22.05.2017 11:56, Shyam Sundar S K wrote:
>> Hi Mathias,
>>
>>
>> On 5/19/2017 12:43 PM, Mathias Nyman wrote:
>>> On 18.05.2017 16:46, Alan Stern wrote:
 On Thu, 18 May 2017, Shyam Sundar S K wrote:

> on AMD platforms with SNPS 3.1 USB controller has an issue
> if the stop EP command is issued when the controller is not
> in running state. If issued, it is leading to a critical RTL bug
> because of which controller goes into irrecoverable state.
>
> This patch adds a appropriate checks to make sure that scenario
> does not happen.
>
> Signed-off-by: Shyam Sundar S K 
> Signed-off-by: Nehal Shah 
> ---

> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1819,6 +1819,7 @@ struct xhci_hcd {
>/* For controller with a broken Port Disable implementation */
>#define XHCI_BROKEN_PORT_PED(1 << 25)
>#define XHCI_LIMIT_ENDPOINT_INTERVAL_7(1 << 26)
> +#define XHCI_BROKEN_STOP(1 << 27)

 Does there really need to be a quirk flag for this?  I should think
 that you never want to issue a STOP EP command while the controller is
 not running, no matter what kind of controller it is.

 Alan Stern

>>>
>>> If it's about controller not running then there shouldn't be any problems.
>>> We shouldn't issue a stop endpoint command if controller is halted.
>>>
>>> If this is about issuing a stop endpoint command while endpoint isn't
>>> running, then fully working controllers should just respond with a command
>>> completion with "context state error" status.
>>
>> As per SNPS the controller is responding with "Context State Error", however 
>> the same is not getting
>> reflected when we check the cmd->status in the xhci driver.
>>
>>> Anyway, as Alan said the quirk is probably unnecessary here.
>>
>> OK. We will take care of this.
>>
>>> We shouldn't need to
>>> stop endpoints that are not running. Only problem I see here is that the
>>> endpoint state in the output endpoint context might not be up to date. If 
>>> driver
>>> just restarted the endpoint by ringing the doorbell, the output context 
>>> state
>>> might not be updated yet.
>>
>> Before issuing the stop end point command, we checked the state of the 
>> endpoint and it looks the state of
>> the end point is EP_STATE_STOPPED. If the output endpoint context is not 
>> updated is there a better way
>> to retrieve the EP state before issuing the stop end point command ?
>
> Not really, checking endpoint context and possible a software variable kept 
> up to date
> by driver to keep track of doorbell. Perhaps checking endpoint ctx is enough 
> for now
So, is it OK to guard the stop endpoint by checking the EP context before 
queuing it?

>
>>
>>>
>>> How does this SNPS 3.1 controller react if the endpoint just halted or 
>>> moved to
>>> error state just before controller runs the stop endpoint command? Still 
>>> triggers
>>> the RTL bug?
>>
>> As per SNPS analysis.
>>
>> 1) Driver issues STOP ENDPOINT command  and the EP is in Running state.
>> 2) HW executes the STOP ENDPOINT command successfully
>> 3) Driver again issues STOP ENDPOINT command.
>> 4) Since the EP is already halted/stopped, HW completes the command 
>> execution and reports “device context error” completion code. This is as per 
>> the spec.
>> 5) However HW on receiving the second command additionally marks EP to Flow 
>> control state in HW which is RTL bug
>> 6) The above bug causes the HW not to respond to any further doorbells that 
>> are rung by the driver. This causes the EP to not functional anymore and 
>> causes gross functional failures.
>>
>
> What happens if endpoint ctx shows endpoint is in the halted or error state 
> when stop endpoint command is issued?
>  still RTL bug?

Yes. That's right. If EP context shows as halted/stopped/error and we issue a 
stop endpoint command it is triggering the RTL bug. Since the tapeout has 
already happened and there is no way to fix this
from SNPS side, they are asking for a SW workaround i.e.  "issuing the stop 
endpoint command only when the EP context state is running."

So, it is OK to have this patch submission which will check for EP context 
before queuing the stop endpoint command ?

>
>>
>>>
>>> I'm talking about the in xhci spec 4.6.9:
>>>
>>> " A Busy endpoint may asynchronously transition from the Running to the 
>>> Halted or Error state due
>>> to error conditions detected while processing TRBs. A possible race 
>>> condition may occur if
>>> software, thinking an endpoint is in the Running state, issues a Stop 
>>> Endpoint Command however
>>> at the same time the xHC asynchronously transitions the endpoint to the 
>>> Halted or Error state. In
>>> this case, a Context State Error may be generated for the command 
>>> completion. Software may
>>> verify that this case 

Re: [PATCH] usb: xhci: add quirk flag for broken stop command on AMD platforms

2017-05-22 Thread Mathias Nyman

On 22.05.2017 11:56, Shyam Sundar S K wrote:

Hi Mathias,


On 5/19/2017 12:43 PM, Mathias Nyman wrote:

On 18.05.2017 16:46, Alan Stern wrote:

On Thu, 18 May 2017, Shyam Sundar S K wrote:


on AMD platforms with SNPS 3.1 USB controller has an issue
if the stop EP command is issued when the controller is not
in running state. If issued, it is leading to a critical RTL bug
because of which controller goes into irrecoverable state.

This patch adds a appropriate checks to make sure that scenario
does not happen.

Signed-off-by: Shyam Sundar S K 
Signed-off-by: Nehal Shah 
---



--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1819,6 +1819,7 @@ struct xhci_hcd {
   /* For controller with a broken Port Disable implementation */
   #define XHCI_BROKEN_PORT_PED(1 << 25)
   #define XHCI_LIMIT_ENDPOINT_INTERVAL_7(1 << 26)
+#define XHCI_BROKEN_STOP(1 << 27)


Does there really need to be a quirk flag for this?  I should think
that you never want to issue a STOP EP command while the controller is
not running, no matter what kind of controller it is.

Alan Stern



If it's about controller not running then there shouldn't be any problems.
We shouldn't issue a stop endpoint command if controller is halted.

If this is about issuing a stop endpoint command while endpoint isn't
running, then fully working controllers should just respond with a command
completion with "context state error" status.


As per SNPS the controller is responding with "Context State Error", however 
the same is not getting
reflected when we check the cmd->status in the xhci driver.


Anyway, as Alan said the quirk is probably unnecessary here.


OK. We will take care of this.


We shouldn't need to
stop endpoints that are not running. Only problem I see here is that the
endpoint state in the output endpoint context might not be up to date. If driver
just restarted the endpoint by ringing the doorbell, the output context state
might not be updated yet.


Before issuing the stop end point command, we checked the state of the endpoint 
and it looks the state of
the end point is EP_STATE_STOPPED. If the output endpoint context is not 
updated is there a better way
to retrieve the EP state before issuing the stop end point command ?


Not really, checking endpoint context and possible a software variable kept up 
to date
by driver to keep track of doorbell. Perhaps checking endpoint ctx is enough 
for now





How does this SNPS 3.1 controller react if the endpoint just halted or moved to
error state just before controller runs the stop endpoint command? Still 
triggers
the RTL bug?


As per SNPS analysis.

1) Driver issues STOP ENDPOINT command  and the EP is in Running state.
2) HW executes the STOP ENDPOINT command successfully
3) Driver again issues STOP ENDPOINT command.
4) Since the EP is already halted/stopped, HW completes the command execution 
and reports “device context error” completion code. This is as per the spec.
5) However HW on receiving the second command additionally marks EP to Flow 
control state in HW which is RTL bug
6) The above bug causes the HW not to respond to any further doorbells that are 
rung by the driver. This causes the EP to not functional anymore and causes 
gross functional failures.



What happens if endpoint ctx shows endpoint is in the halted or error state 
when stop endpoint command is issued?
 still RTL bug?





I'm talking about the in xhci spec 4.6.9:

" A Busy endpoint may asynchronously transition from the Running to the Halted 
or Error state due
to error conditions detected while processing TRBs. A possible race condition 
may occur if
software, thinking an endpoint is in the Running state, issues a Stop Endpoint 
Command however
at the same time the xHC asynchronously transitions the endpoint to the Halted 
or Error state. In
this case, a Context State Error may be generated for the command completion. 
Software may
verify that this case occurred by inspecting the EP State for Halted or Error 
when a Stop Endpoint
Command results in a Context State Error."



Since we are novice, can you please help to understand what is the intuition 
behind sending two stop endpoint commands ?


No need for two stop endpoint commands, that can be fixed in the driver.

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: xhci: add quirk flag for broken stop command on AMD platforms

2017-05-22 Thread Shyam Sundar S K
Hi Mathias,


On 5/19/2017 12:43 PM, Mathias Nyman wrote:
> On 18.05.2017 16:46, Alan Stern wrote:
>> On Thu, 18 May 2017, Shyam Sundar S K wrote:
>>
>>> on AMD platforms with SNPS 3.1 USB controller has an issue
>>> if the stop EP command is issued when the controller is not
>>> in running state. If issued, it is leading to a critical RTL bug
>>> because of which controller goes into irrecoverable state.
>>>
>>> This patch adds a appropriate checks to make sure that scenario
>>> does not happen.
>>>
>>> Signed-off-by: Shyam Sundar S K 
>>> Signed-off-by: Nehal Shah 
>>> ---
>>
>>> --- a/drivers/usb/host/xhci.h
>>> +++ b/drivers/usb/host/xhci.h
>>> @@ -1819,6 +1819,7 @@ struct xhci_hcd {
>>>   /* For controller with a broken Port Disable implementation */
>>>   #define XHCI_BROKEN_PORT_PED(1 << 25)
>>>   #define XHCI_LIMIT_ENDPOINT_INTERVAL_7(1 << 26)
>>> +#define XHCI_BROKEN_STOP(1 << 27)
>>
>> Does there really need to be a quirk flag for this?  I should think
>> that you never want to issue a STOP EP command while the controller is
>> not running, no matter what kind of controller it is.
>>
>> Alan Stern
>>
>
> If it's about controller not running then there shouldn't be any problems.
> We shouldn't issue a stop endpoint command if controller is halted.
>
> If this is about issuing a stop endpoint command while endpoint isn't
> running, then fully working controllers should just respond with a command
> completion with "context state error" status.

As per SNPS the controller is responding with "Context State Error", however 
the same is not getting
reflected when we check the cmd->status in the xhci driver.

> Anyway, as Alan said the quirk is probably unnecessary here. 

OK. We will take care of this.

> We shouldn't need to
> stop endpoints that are not running. Only problem I see here is that the
> endpoint state in the output endpoint context might not be up to date. If 
> driver
> just restarted the endpoint by ringing the doorbell, the output context state
> might not be updated yet.

Before issuing the stop end point command, we checked the state of the endpoint 
and it looks the state of
the end point is EP_STATE_STOPPED. If the output endpoint context is not 
updated is there a better way
to retrieve the EP state before issuing the stop end point command ?

>
> How does this SNPS 3.1 controller react if the endpoint just halted or moved 
> to
> error state just before controller runs the stop endpoint command? Still 
> triggers
> the RTL bug?

As per SNPS analysis.

1) Driver issues STOP ENDPOINT command  and the EP is in Running state.
2) HW executes the STOP ENDPOINT command successfully
3) Driver again issues STOP ENDPOINT command.
4) Since the EP is already halted/stopped, HW completes the command execution 
and reports “device context error” completion code. This is as per the spec.
5) However HW on receiving the second command additionally marks EP to Flow 
control state in HW which is RTL bug
6) The above bug causes the HW not to respond to any further doorbells that are 
rung by the driver. This causes the EP to not functional anymore and causes 
gross functional failures.


>
> I'm talking about the in xhci spec 4.6.9:
>
> " A Busy endpoint may asynchronously transition from the Running to the 
> Halted or Error state due
> to error conditions detected while processing TRBs. A possible race condition 
> may occur if
> software, thinking an endpoint is in the Running state, issues a Stop 
> Endpoint Command however
> at the same time the xHC asynchronously transitions the endpoint to the 
> Halted or Error state. In
> this case, a Context State Error may be generated for the command completion. 
> Software may
> verify that this case occurred by inspecting the EP State for Halted or Error 
> when a Stop Endpoint
> Command results in a Context State Error."
>

Since we are novice, can you please help to understand what is the intuition 
behind sending two stop endpoint commands ?

> -Mathias

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: xhci: add quirk flag for broken stop command on AMD platforms

2017-05-19 Thread Mathias Nyman

On 18.05.2017 16:46, Alan Stern wrote:

On Thu, 18 May 2017, Shyam Sundar S K wrote:


on AMD platforms with SNPS 3.1 USB controller has an issue
if the stop EP command is issued when the controller is not
in running state. If issued, it is leading to a critical RTL bug
because of which controller goes into irrecoverable state.

This patch adds a appropriate checks to make sure that scenario
does not happen.

Signed-off-by: Shyam Sundar S K 
Signed-off-by: Nehal Shah 
---



--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1819,6 +1819,7 @@ struct xhci_hcd {
  /* For controller with a broken Port Disable implementation */
  #define XHCI_BROKEN_PORT_PED  (1 << 25)
  #define XHCI_LIMIT_ENDPOINT_INTERVAL_7(1 << 26)
+#define XHCI_BROKEN_STOP   (1 << 27)


Does there really need to be a quirk flag for this?  I should think
that you never want to issue a STOP EP command while the controller is
not running, no matter what kind of controller it is.

Alan Stern



If it's about controller not running then there shouldn't be any problems.
We shouldn't issue a stop endpoint command if controller is halted.

If this is about issuing a stop endpoint command while endpoint isn't
running, then fully working controllers should just respond with a command
completion with "context state error" status.

Anyway, as Alan said the quirk is probably unnecessary here. We shouldn't need 
to
stop endpoints that are not running. Only problem I see here is that the
endpoint state in the output endpoint context might not be up to date. If driver
just restarted the endpoint by ringing the doorbell, the output context state
might not be updated yet.

How does this SNPS 3.1 controller react if the endpoint just halted or moved to
error state just before controller runs the stop endpoint command? Still 
triggers
the RTL bug?

I'm talking about the in xhci spec 4.6.9:

" A Busy endpoint may asynchronously transition from the Running to the Halted 
or Error state due
to error conditions detected while processing TRBs. A possible race condition 
may occur if
software, thinking an endpoint is in the Running state, issues a Stop Endpoint 
Command however
at the same time the xHC asynchronously transitions the endpoint to the Halted 
or Error state. In
this case, a Context State Error may be generated for the command completion. 
Software may
verify that this case occurred by inspecting the EP State for Halted or Error 
when a Stop Endpoint
Command results in a Context State Error."

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: xhci: add quirk flag for broken stop command on AMD platforms

2017-05-18 Thread Alan Stern
On Thu, 18 May 2017, Shyam Sundar S K wrote:

> on AMD platforms with SNPS 3.1 USB controller has an issue
> if the stop EP command is issued when the controller is not
> in running state. If issued, it is leading to a critical RTL bug
> because of which controller goes into irrecoverable state.
> 
> This patch adds a appropriate checks to make sure that scenario
> does not happen.
> 
> Signed-off-by: Shyam Sundar S K 
> Signed-off-by: Nehal Shah 
> ---

> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1819,6 +1819,7 @@ struct xhci_hcd {
>  /* For controller with a broken Port Disable implementation */
>  #define XHCI_BROKEN_PORT_PED (1 << 25)
>  #define XHCI_LIMIT_ENDPOINT_INTERVAL_7   (1 << 26)
> +#define XHCI_BROKEN_STOP (1 << 27)

Does there really need to be a quirk flag for this?  I should think 
that you never want to issue a STOP EP command while the controller is 
not running, no matter what kind of controller it is.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: xhci: add quirk flag for broken stop command on AMD platforms

2017-05-18 Thread Roger Quadros
Hi,

On 18/05/17 11:53, Shyam Sundar S K wrote:
> on AMD platforms with SNPS 3.1 USB controller has an issue
> if the stop EP command is issued when the controller is not
> in running state. If issued, it is leading to a critical RTL bug
> because of which controller goes into irrecoverable state.
> 
> This patch adds a appropriate checks to make sure that scenario
> does not happen.
> 
> Signed-off-by: Shyam Sundar S K 
> Signed-off-by: Nehal Shah 
> ---
>  drivers/usb/host/xhci-hub.c | 36 ++--
>  drivers/usb/host/xhci-pci.c |  8 
>  drivers/usb/host/xhci.h |  1 +
>  3 files changed, 43 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
> index 5e3e9d4..fec849f 100644
> --- a/drivers/usb/host/xhci-hub.c
> +++ b/drivers/usb/host/xhci-hub.c
> @@ -380,6 +380,7 @@ static int xhci_stop_device(struct xhci_hcd *xhci, int 
> slot_id, int suspend)
>  {
>   struct xhci_virt_device *virt_dev;
>   struct xhci_command *cmd;
> + struct xhci_ep_ctx *ep_ctx;
>   unsigned long flags;
>   int ret;
>   int i;
> @@ -407,11 +408,42 @@ static int xhci_stop_device(struct xhci_hcd *xhci, int 
> slot_id, int suspend)
>   return -ENOMEM;
>  
>   }
> - xhci_queue_stop_endpoint(xhci, command, slot_id, i,
> +
> + /* on AMD platforms with SNPS 3.1 USB controller has
> +  * an issue if the stop EP command is issued when
> +  * the controller is not in running state.
> +  * If issued, its leading to a RTL bug because of
> +  * which controller goes to a bad state.
> +  * Adding a conditional check to prevent this.
> +  */
> +
> + if (xhci->quirks & XHCI_BROKEN_STOP) {
> + ep_ctx = xhci_get_ep_ctx(xhci,
> + virt_dev->out_ctx, i);
> +
> + if (GET_EP_CTX_STATE(ep_ctx) ==
> + EP_STATE_RUNNING) {

In the comment you mention that we need to avoid the stop EP command when
controller is not running but here you are checking if EP is not running.

Is Controller not running and EP not running the same?

If we're not queuing the stop endpoint command do we need still to allocate it
and ring the doorbell?

> + xhci_queue_stop_endpoint(xhci, command,
> + slot_id, i, suspend);
> + }
> + } else {
> + xhci_queue_stop_endpoint(xhci, command,
> + slot_id, i, suspend);
> + }
> + }
> + }
> +
> + if (xhci->quirks & XHCI_BROKEN_STOP) {
> + ep_ctx = xhci_get_ep_ctx(xhci, virt_dev->out_ctx, 0);
> +
> + if (GET_EP_CTX_STATE(ep_ctx) ==  EP_STATE_RUNNING) {
> + xhci_queue_stop_endpoint(xhci, cmd, slot_id, 0,
>suspend);
>   }
> + } else {
> + xhci_queue_stop_endpoint(xhci, cmd, slot_id, 0, suspend);
>   }
> - xhci_queue_stop_endpoint(xhci, cmd, slot_id, 0, suspend);
> +
>   xhci_ring_cmd_db(xhci);
>   spin_unlock_irqrestore(>lock, flags);

What happens to the wait for completion code that is right after this?
Since we didn't send the command there is no point waiting for it to complete 
right?

>  
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index 7b86508..403de8f 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -52,6 +52,8 @@
>  #define PCI_DEVICE_ID_INTEL_BROXTON_M_XHCI   0x0aa8
>  #define PCI_DEVICE_ID_INTEL_BROXTON_B_XHCI   0x1aa8
>  #define PCI_DEVICE_ID_INTEL_APL_XHCI 0x5aa8
> +#define PCI_DEVICE_ID_AMD_XHCI_300x15e0
> +#define PCI_DEVICE_ID_AMD_XHCI_310x15e1
>  
>  static const char hcd_name[] = "xhci_hcd";
>  
> @@ -205,6 +207,12 @@ static void xhci_pci_quirks(struct device *dev, struct 
> xhci_hcd *xhci)
>   if (xhci->quirks & XHCI_RESET_ON_RESUME)
>   xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,
>   "QUIRK: Resetting on resume");
> +
> + if (pdev->vendor == PCI_VENDOR_ID_AMD &&
> + (pdev->device == PCI_DEVICE_ID_AMD_XHCI_30 ||
> + pdev->device == PCI_DEVICE_ID_AMD_XHCI_31))
> + xhci->quirks |= XHCI_BROKEN_STOP;
> +
>  }
>  
>  #ifdef CONFIG_ACPI
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index 73a28a9..c6fa3ca 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@