Re: [RFC PATCH] xhci: do not halt the secondary HCD

2016-10-25 Thread Joel Stanley
On Tue, Sep 20, 2016 at 5:56 PM, Mathias Nyman wrote:
> Quick Googling shows that the TI TUSB 73x0 USB3.0 xHCI host has an issue
> with halting.
>
> Errata says host needs 125us to 1ms between the last control transfer and
> clearing the run/stop bit. (halting the host)
>
> Suggested workaround is to wait at least 2ms before halting the host.
>
> See issue #10 in:
> http://www.ti.com/lit/er/sllz076/sllz076.pdf
>
> It might just be that the patch works because it forces halting the host to
> be done later (secondary hcd -> primary hcd),  giving it enough time after
> the last control transfer.

Well spotted.

I gave this a go, adding a quirk and performing a msleep:

+++ b/drivers/usb/host/xhci.c
@@ -109,6 +109,10 @@ int xhci_halt(struct xhci_hcd *xhci)
 {
 	int ret;
 	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "// Halt the HC");
+
+	if (xhci->quirks & XHCI_HALT_DELAY_QUIRK)
+		msleep(2);
+
 	xhci_quiesce(xhci);

However it didn't help.

Are we guaranteed that transfers are not in flight at that point?

>
>>> a first step.
>>>
>>> load primary
>>> load secondary  (starts the xhci controller)
>>> ...
>>> unload secondary (halts the controller)
>>> unload primary   (free memory)
>
>
> Now thinking about it, it doesn't really make sense to halt the host
> controller hardware before removing the primary hcd. It will just cause
> devices under the primary (USB2) to be removed uncleanly. So basically the
> idea of the workaround makes sense, it just needs to be cleaned up from a
> workaround to intended behavior.

Great. When you say clean up, do you just mean tidying the comments?

Cheers,

Joel


>
> We might also need an additional quirk for TI TUSB 73x0 that adds a msleep()
> before the xhci_halt, even if it's moved to the last hcd removed.
>
> -Mathias


Re: [RFC PATCH] xhci: do not halt the secondary HCD

2016-09-20 Thread Mathias Nyman

On 19.09.2016 11:23, Joel Stanley wrote:
> Hi Mathias,
>
> On Mon, Sep 19, 2016 at 4:33 PM, Greg KH wrote:
>> On Mon, Sep 19, 2016 at 04:05:45PM +0930, Joel Stanley wrote:
>>> We can't halt the secondary HCD, because it's also the primary HCD,
>>> which will cause problems if we have devices attached to the primary
>>> HCD, like a keyboard.
>>>
>>> We've been carrying this in our Linux-as-a-bootloader environment for a little
>>> while now. The machines all have the same TI TUSB73x0 part, and when we kexec
>>> the devices don't come back until a system power cycle.
>>>
>>> I'd like some advice on an acceptable way to upstream the fix, so that the xhci
>>> device survives kexec.
>>
>> Any reason you didn't cc: Mathias?
>
> Fat fingers - I missed him when grabbing names from get_maintainers.
> Thanks for adding him in.
>
> On Mon, Sep 19, 2016 at 5:11 PM, Mathias Nyman wrote:
>> What kernel version is this?
>
> This patch is against 4.4.21. I've tested newer releases but haven't
> seen any improvement.
>
>> As Greg said there are fixes in this area in the 4.8 latest rc kernel.
>>
>> If that doesn't work then we need to figure out what the real issue is.
>
> No dice on 4.8-rc7 (without any patches).
>
> Here's 4.8-rc7 loading:
>
> [3.699524] xhci_hcd 0021:09:00.0: xHCI Host Controller
> [3.699556] xhci_hcd 0021:09:00.0: new USB bus registered, assigned bus number 1
> [3.699640] xhci_hcd 0021:09:00.0: Using 64-bit DMA iommu bypass
> [3.699697] xhci_hcd 0021:09:00.0: hcc params 0x0270f06d hci version 0x96 quirks 0x
> [3.700286] hub 1-0:1.0: USB hub found
> [3.700299] hub 1-0:1.0: 4 ports detected
> [3.700493] xhci_hcd 0021:09:00.0: xHCI Host Controller
> [3.700522] xhci_hcd 0021:09:00.0: new USB bus registered, assigned bus number 2
> [3.700552] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
> [3.700733] hub 2-0:1.0: USB hub found
> [3.700748] hub 2-0:1.0: 4 ports detected
>
> Then we kexec into the second kernel. Here's what the second kernel
> prints when trying to bring the controller up:
>
> [1.588272] xhci_hcd 0021:09:00.0: xHCI Host Controller
> [1.588282] xhci_hcd 0021:09:00.0: new USB bus registered, assigned bus number 1
> [1.619279] xhci_hcd 0021:09:00.0: Host not halted after 16000 microseconds.
> [1.619281] xhci_hcd 0021:09:00.0: can't setup: -110
> [1.619447] xhci_hcd 0021:09:00.0: USB bus 1 deregistered
> [1.619457] xhci_hcd 0021:09:00.0: init 0021:09:00.0 fail, -110
> [1.619571] xhci_hcd: probe of 0021:09:00.0 failed with error -110


Quick Googling shows that the TI TUSB 73x0 USB3.0 xHCI host has an issue with
halting.

Errata says host needs 125us to 1ms between the last control transfer and
clearing the run/stop bit. (halting the host)

Suggested workaround is to wait at least 2ms before halting the host.

See issue #10 in:
http://www.ti.com/lit/er/sllz076/sllz076.pdf

It might just be that the patch works because it forces halting the host to
be done later (secondary hcd -> primary hcd), giving it enough time after the
last control transfer.

>> a first step.
>>
>> load primary
>> load secondary  (starts the xhci controller)
>> ...
>> unload secondary (halts the controller)
>> unload primary   (free memory)

Now thinking about it, it doesn't really make sense to halt the host controller
hardware before removing the primary hcd. It will just cause devices under the
primary (USB2) to be removed uncleanly. So basically the idea of the workaround
makes sense, it just needs to be cleaned up from a workaround to intended
behavior.

We might also need an additional quirk for TI TUSB 73x0 that adds a msleep()
before the xhci_halt, even if it's moved to the last hcd removed.
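
A rough userspace sketch of what such a quirk-gated delay could look like.
XHCI_HALT_DELAY_QUIRK is a hypothetical quirk name, and the bit value, the
struct and halt_delay_us() are all made up for illustration; none of this is
existing driver code. The 2 ms figure is the workaround value from the errata.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical quirk bit; a real patch would need a free bit in xhci->quirks. */
#define XHCI_HALT_DELAY_QUIRK	(1u << 0)

/* Stand-in for struct xhci_hcd, reduced to the one field used here. */
struct xhci_model {
	unsigned int quirks;
};

/*
 * Return how long to wait, in microseconds, before clearing Run/Stop.
 * 2000 us follows the errata's suggested workaround of at least 2 ms.
 */
static unsigned int halt_delay_us(const struct xhci_model *xhci)
{
	assert(xhci != NULL);

	if (xhci->quirks & XHCI_HALT_DELAY_QUIRK)
		return 2000;
	return 0;
}
```

In the driver the returned value would presumably feed a sleep ahead of
xhci_halt(), wherever the halt ends up being done.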

-Mathias


Re: [RFC PATCH] xhci: do not halt the secondary HCD

2016-09-19 Thread Sergei Shtylyov

Hello.

On 9/19/2016 9:35 AM, Joel Stanley wrote:


> We can't halt the secondary HCD, because it's also the primary HCD,
> which will cause problems if we have devices attached to the primary
> HCD, like a keyboard.
>
> We've been carrying this in our Linux-as-a-bootloader environment for a little
> while now. The machines all have the same TI TUSB73x0 part, and when we kexec
> the devices don't come back until a system power cycle.
>
> I'd like some advice on an acceptable way to upstream the fix, so that the xhci
> device survives kexec.
>
> Signed-off-by: Joel Stanley
> ---
>  drivers/usb/host/xhci.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index adc169d2fd76..ec92a843325b 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -682,6 +682,21 @@ void xhci_stop(struct usb_hcd *hcd)
>
>  	mutex_lock(&xhci->mutex);
>
> +	/*
> +	 * We can't halt the secondary HCD, because it's also the primary
> +	 * HCD, which will cause problems if we have devices attached to the
> +	 * primary HCD, like a keyboard.
> +	 */
> +	if (!usb_hcd_is_primary_hcd(hcd)) {
> +		/* The shared_hcd is going to be deallocated shortly (the USB
> +		 * core only calls this function when allocation fails in
> +		 * usb_add_hcd(), or usb_remove_hcd() is called).  So we need
> +		 * to unset xHCI's pointer.  */

   Please format this comment the same way as the comment above it.

> +		xhci->shared_hcd = NULL;
> +		mutex_unlock(&xhci->mutex);
> +		return;
> +	}
> +
>  	if (!(xhci->xhc_state & XHCI_STATE_HALTED)) {
>  		spin_lock_irq(&xhci->lock);


[...]
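
For reference, the inner comment written in the same multi-line style as the
one above it might look like the sketch below. The two structs and the function
are stand-ins so that the fragment compiles on its own; they are hypothetical
and are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel structures, reduced to what the fragment needs. */
struct usb_hcd_model {
	int is_primary;
};

struct xhci_model {
	struct usb_hcd_model *shared_hcd;
};

/* Returns 1 if this was the secondary HCD and the caller should stop here. */
static int stop_is_secondary_only(struct xhci_model *xhci,
				  struct usb_hcd_model *hcd)
{
	assert(xhci != NULL && hcd != NULL);

	if (!hcd->is_primary) {
		/*
		 * The shared_hcd is going to be deallocated shortly (the USB
		 * core only calls this function when allocation fails in
		 * usb_add_hcd(), or usb_remove_hcd() is called).  So we need
		 * to unset xHCI's pointer.
		 */
		xhci->shared_hcd = NULL;
		return 1;
	}
	return 0;
}
```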

MBR, Sergei



Re: [RFC PATCH] xhci: do not halt the secondary HCD

2016-09-19 Thread Joel Stanley
Hi Mathias,

On Mon, Sep 19, 2016 at 4:33 PM, Greg KH wrote:
> On Mon, Sep 19, 2016 at 04:05:45PM +0930, Joel Stanley wrote:
>> We can't halt the secondary HCD, because it's also the primary HCD,
>> which will cause problems if we have devices attached to the primary
>> HCD, like a keyboard.
>>
>> We've been carrying this in our Linux-as-a-bootloader environment for a little
>> while now. The machines all have the same TI TUSB73x0 part, and when we kexec
>> the devices don't come back until a system power cycle.
>>
>> I'd like some advice on an acceptable way to upstream the fix, so that the xhci
>> device survives kexec.
>
> Any reason you didn't cc: Mathias?

Fat fingers - I missed him when grabbing names from get_maintainers.
Thanks for adding him in.

On Mon, Sep 19, 2016 at 5:11 PM, Mathias Nyman wrote:
> What kernel version is this?

This patch is against 4.4.21. I've tested newer releases but haven't
seen any improvement.

> As Greg said there are fixes in this area in the 4.8 latest rc kernel.
>
> If that doesn't work then we need to figure out what the real issue is.

No dice on 4.8-rc7 (without any patches).

Here's 4.8-rc7 loading:

[3.699524] xhci_hcd 0021:09:00.0: xHCI Host Controller
[3.699556] xhci_hcd 0021:09:00.0: new USB bus registered, assigned bus number 1
[3.699640] xhci_hcd 0021:09:00.0: Using 64-bit DMA iommu bypass
[3.699697] xhci_hcd 0021:09:00.0: hcc params 0x0270f06d hci version 0x96 quirks 0x
[3.700286] hub 1-0:1.0: USB hub found
[3.700299] hub 1-0:1.0: 4 ports detected
[3.700493] xhci_hcd 0021:09:00.0: xHCI Host Controller
[3.700522] xhci_hcd 0021:09:00.0: new USB bus registered, assigned bus number 2
[3.700552] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[3.700733] hub 2-0:1.0: USB hub found
[3.700748] hub 2-0:1.0: 4 ports detected
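
As an aside, the "hcc params" and "hci version" values in that log can be
decoded by hand. A small sketch, assuming the HCCPARAMS1 layout from the xHCI
specification (bit 0 is the 64-bit addressing capability, bit 2 the context
size, bits 15:12 MaxPSASize, bits 31:16 the extended capabilities pointer in
dwords) and a BCD-encoded HCIVERSION. The helper names are made up here and
are not the driver's macros.

```c
#include <stdint.h>

/* Bit 0: AC64, controller can use 64-bit addresses. */
static unsigned int hcc_ac64(uint32_t hcc)    { return hcc & 0x1; }

/* Bit 2: CSZ, set when the controller uses 64-byte contexts. */
static unsigned int hcc_csz(uint32_t hcc)     { return (hcc >> 2) & 0x1; }

/* Bits 15:12: MaxPSASize, maximum primary stream array size. */
static unsigned int hcc_max_psa(uint32_t hcc) { return (hcc >> 12) & 0xf; }

/* Bits 31:16: xECP, extended capabilities pointer (in 32-bit words). */
static unsigned int hcc_xecp(uint32_t hcc)    { return hcc >> 16; }

/* HCIVERSION is BCD: high byte is the major, low byte the minor revision. */
static unsigned int hci_major(uint16_t ver)   { return ver >> 8; }
static unsigned int hci_minor(uint16_t ver)   { return ver & 0xff; }
```

For 0x0270f06d this gives AC64 = 1, which matches the "Using 64-bit DMA" line,
and version 0x96 reads as xHCI 0.96.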

Then we kexec into the second kernel. Here's what the second kernel
prints when trying to bring the controller up:

[1.588272] xhci_hcd 0021:09:00.0: xHCI Host Controller
[1.588282] xhci_hcd 0021:09:00.0: new USB bus registered, assigned bus number 1
[1.619279] xhci_hcd 0021:09:00.0: Host not halted after 16000 microseconds.
[1.619281] xhci_hcd 0021:09:00.0: can't setup: -110
[1.619447] xhci_hcd 0021:09:00.0: USB bus 1 deregistered
[1.619457] xhci_hcd 0021:09:00.0: init 0021:09:00.0 fail, -110
[1.619571] xhci_hcd: probe of 0021:09:00.0 failed with error -110

Note that the second kernel is a distro one (Ubuntu 4.4.0-36-generic).

> xhci hardware is really just one controller. The split into primary and
> secondary HCD is software only. We always load the primary HCD first (USB2)
> and the secondary second (USB3). We unload them in reverse order, and need
> to stop the xhci (halt the hcd) as a first step.
>
> load primary
> load secondary  (starts the xhci controller)
> ...
> unload secondary (halts the controller)
> unload primary   (free memory)

Thanks for the explanation. I wasn't the author of the first hack we
put in our tree, but I have rewritten it as we rebase on the stable
tree regularly.

So the hack as I sent it doesn't halt the controller when the secondary is
unloaded, and lets the primary unload path halt it instead. Any theory as to
why this helps?

Cheers,

Joel


Re: [RFC PATCH] xhci: do not halt the secondary HCD

2016-09-19 Thread Mathias Nyman

On 19.09.2016 09:35, Joel Stanley wrote:

> We can't halt the secondary HCD, because it's also the primary HCD,
> which will cause problems if we have devices attached to the primary
> HCD, like a keyboard.
>
> We've been carrying this in our Linux-as-a-bootloader environment for a little
> while now. The machines all have the same TI TUSB73x0 part, and when we kexec
> the devices don't come back until a system power cycle.
>
> I'd like some advice on an acceptable way to upstream the fix, so that the xhci
> device survives kexec.
>
> Signed-off-by: Joel Stanley
> ---


What kernel version is this?

As Greg said there are fixes in this area in the 4.8 latest rc kernel.

If that doesn't work then we need to figure out what the real issue is.

xhci hardware is really just one controller. The split into primary and
secondary HCD is software only. We always load the primary HCD first (USB2)
and the secondary second (USB3). We unload them in reverse order, and need
to stop the xhci (halt the hcd) as a first step.

load primary
load secondary  (starts the xhci controller)
...
unload secondary (halts the controller)
unload primary   (free memory)
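
The ordering above, and what the RFC patch changes about it, can be modelled
in a few lines of userspace C. This is only a sketch: the struct and function
names are hypothetical, and the real start and stop paths do far more than
flip a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one xHCI controller shared by two software HCDs. */
struct xhci_model {
	bool running;		/* controller Run/Stop state */
	bool primary_loaded;	/* USB2 HCD registered */
	bool secondary_loaded;	/* USB3 HCD registered */
};

static void load_primary(struct xhci_model *x)
{
	x->primary_loaded = true;
}

/* The controller is started once the secondary HCD has been added. */
static void load_secondary(struct xhci_model *x)
{
	assert(x->primary_loaded);
	x->secondary_loaded = true;
	x->running = true;
}

/*
 * halt_on_secondary == true models the existing behaviour (halt when the
 * secondary goes away); false models the RFC patch, which leaves the halt
 * to the primary unload path.
 */
static void unload_secondary(struct xhci_model *x, bool halt_on_secondary)
{
	x->secondary_loaded = false;
	if (halt_on_secondary)
		x->running = false;
}

static void unload_primary(struct xhci_model *x)
{
	assert(!x->secondary_loaded);
	x->running = false;	/* no effect if already halted */
	x->primary_loaded = false;
}
```

With halt_on_secondary set, the controller is already halted while the primary
(USB2) HCD is still registered; with it cleared, the halt happens only when the
last HCD goes away.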

-Mathias




Re: [RFC PATCH] xhci: do not halt the secondary HCD

2016-09-19 Thread Greg KH
On Mon, Sep 19, 2016 at 04:05:45PM +0930, Joel Stanley wrote:
> We can't halt the secondary HCD, because it's also the primary HCD,
> which will cause problems if we have devices attached to the primary
> HCD, like a keyboard.
> 
> We've been carrying this in our Linux-as-a-bootloader environment for a little
> while now. The machines all have the same TI TUSB73x0 part, and when we kexec
> the devices don't come back until a system power cycle.
> 
> I'd like some advice on an acceptable way to upstream the fix, so that the xhci
> device survives kexec.

Any reason you didn't cc: Mathias?

And have you tried 4.8-rc kernels?  I thought we just fixed an issue
around secondary HCDs...

thanks,

greg k-h


[RFC PATCH] xhci: do not halt the secondary HCD

2016-09-19 Thread Joel Stanley
We can't halt the secondary HCD, because it's also the primary HCD,
which will cause problems if we have devices attached to the primary
HCD, like a keyboard.

We've been carrying this in our Linux-as-a-bootloader environment for a little
while now. The machines all have the same TI TUSB73x0 part, and when we kexec
the devices don't come back until a system power cycle.

I'd like some advice on an acceptable way to upstream the fix, so that the xhci
device survives kexec.

Signed-off-by: Joel Stanley 
---
 drivers/usb/host/xhci.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index adc169d2fd76..ec92a843325b 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -682,6 +682,21 @@ void xhci_stop(struct usb_hcd *hcd)
 
 	mutex_lock(&xhci->mutex);
 
+	/*
+	 * We can't halt the secondary HCD, because it's also the primary
+	 * HCD, which will cause problems if we have devices attached to the
+	 * primary HCD, like a keyboard.
+	 */
+	if (!usb_hcd_is_primary_hcd(hcd)) {
+		/* The shared_hcd is going to be deallocated shortly (the USB
+		 * core only calls this function when allocation fails in
+		 * usb_add_hcd(), or usb_remove_hcd() is called).  So we need
+		 * to unset xHCI's pointer.  */
+		xhci->shared_hcd = NULL;
+		mutex_unlock(&xhci->mutex);
+		return;
+	}
+
 	if (!(xhci->xhc_state & XHCI_STATE_HALTED)) {
 		spin_lock_irq(&xhci->lock);
 
@@ -693,11 +708,6 @@ void xhci_stop(struct usb_hcd *hcd)
 		spin_unlock_irq(&xhci->lock);
 	}
 
-	if (!usb_hcd_is_primary_hcd(hcd)) {
-		mutex_unlock(&xhci->mutex);
-		return;
-	}
-
 	xhci_cleanup_msix(xhci);
 
 	/* Deleting Compliance Mode Recovery Timer */
-- 
2.9.3


