Re: [PATCH] xhci-pci: Allow host runtime PM as default for Intel Alder Lake xHCI

2021-04-07 Thread Mathias Nyman
On 7.4.2021 2.35, Azhar Shaikh wrote:
> From: Abhijeet Rao 
> 
> In the same way as Intel Tiger Lake TCSS (Type-C Subsystem) the Alder Lake
> TCSS xHCI needs to be runtime suspended whenever possible to allow the
> TCSS hardware block to enter D3cold and thus save energy.
> 
> Signed-off-by: Abhijeet Rao 
> Signed-off-by: Nikunj A. Dadhania 
> Signed-off-by: Azhar Shaikh 
> ---
>  drivers/usb/host/xhci-pci.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Thanks, Added to queue

-Mathias


Re: [PATCH 15/17] usb: xhci-mtk: support to build xhci-mtk-hcd.ko

2021-03-08 Thread Mathias Nyman
On 7.3.2021 18.00, Greg Kroah-Hartman wrote:
> On Fri, Mar 05, 2021 at 05:02:53PM +0800, Chunfeng Yun wrote:
>> Currently xhci-hcd.ko building depends on USB_XHCI_MTK, this
>> is not flexible for some cases. For example:
>> USB_XHCI_HCD is y, and USB_XHCI_MTK is m, then we can't
>> implement extended functions if only update xhci-mtk.ko
>> This patch is used to remove the dependence.
>>
>> Signed-off-by: Chunfeng Yun 
> 
> Oh nice, I tried to unwind this once, but did not succeed.
> 
> Mathias, any objection to this?  I think this is the only patch in this
> series that touches the non-mtk code, want me to just queue it up in my
> tree, or are you going to send it to me through your patches?
> 

No objection, Chunfeng Yun sent v2 already, it looks good to me.
Easier (for me) if you queue up the whole series directly

Thanks
- Mathias



Re: [PATCH v2 15/18] usb: xhci-mtk: support to build xhci-mtk-hcd.ko

2021-03-08 Thread Mathias Nyman
On 8.3.2021 4.52, Chunfeng Yun wrote:
> Currently xhci-hcd.ko building depends on USB_XHCI_MTK, this
> is not flexible for some cases. For example:
> USB_XHCI_HCD is y, and USB_XHCI_MTK is m, then we can't
> implement extended functions if only update xhci-mtk.ko
> This patch is used to remove the dependence.
> 
> Signed-off-by: Chunfeng Yun 
> ---
> v2: no changes
> ---
>  drivers/usb/host/Makefile   |  6 ++---
>  drivers/usb/host/xhci-mem.c |  2 +-
>  drivers/usb/host/xhci-mtk-sch.c | 48 +++--
>  drivers/usb/host/xhci-mtk.c |  2 ++
>  drivers/usb/host/xhci-mtk.h | 33 +++
>  drivers/usb/host/xhci-ring.c|  1 -
>  drivers/usb/host/xhci.c | 30 +
>  drivers/usb/host/xhci.h |  8 ++
>  8 files changed, 64 insertions(+), 66 deletions(-)
> 

Acked-by: Mathias Nyman 


Re: [RFC PATCH] USB:XHCI:Modify XHCI driver for USB2.0 controller

2021-02-26 Thread Mathias Nyman
On 26.2.2021 10.21, Longfang Liu wrote:
> Our current XHCI hardware controller has been customized to only
> support USB 2.0 ports. When using the current xhci driver, an xhci
> controller device and an ehci controller device will be created
> automatically. We want the driver to create only one ehci controller.
> After modifying the driver as follows, an error will occur.
> Is there any other modification method?

The xhci driver relies on the existence of both a main and a shared hcd.
One hcd for handing USB 2 (and slower) and the other for USB 3 devices.

As one example xhci_run(hcd) needs to be called for both hcds, first call
sets up things, second one calls xhci_start() that makes the controller run.

It's probably possible to modify the driver to support xHCI hosts with only
USB 2 ports, but requires a lot more work.

> 
> Signed-off-by: Longfang Liu 
> ---
>  drivers/usb/host/xhci-pci.c | 15 ++-
>  1 file changed, 2 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index ef513c2..7296aad 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -364,26 +364,15 @@ static int xhci_pci_probe(struct pci_dev *dev, const 
> struct pci_device_id *id)
>   /* USB 2.0 roothub is stored in the PCI device now. */
>   hcd = dev_get_drvdata(>dev);
>   xhci = hcd_to_xhci(hcd);
> - xhci->shared_hcd = usb_create_shared_hcd(_pci_hc_driver, >dev,
> -  pci_name(dev), hcd);
> - if (!xhci->shared_hcd) {
> - retval = -ENOMEM;
> - goto dealloc_usb2_hcd;
> - }
> -
> + xhci->shared_hcd = NULL;
>   retval = xhci_ext_cap_init(xhci);
>   if (retval)
>   goto put_usb3_hcd;
>  
> - retval = usb_add_hcd(xhci->shared_hcd, dev->irq,
> + retval = usb_add_hcd(xhci->main_hcd, dev->irq,
>   IRQF_SHARED);
>   if (retval)
>   goto put_usb3_hcd;
> - /* Roothub already marked as USB 3.0 speed */
> -
> - if (!(xhci->quirks & XHCI_BROKEN_STREAMS) &&
> - HCC_MAX_PSA(xhci->hcc_params) >= 4)
> - xhci->shared_hcd->can_do_streams = 1;
>  
>   /* USB-2 and USB-3 roothubs initialized, allow runtime pm suspend */
>   pm_runtime_put_noidle(>dev);
> 

Something like the above could of course not be accepted upstream. We can't 
break existing
functionality to support one modified xHCI.

Thanks
Mathias


Re: [PATCH] xhci-pci: Set AMD Renoir USB controller to D3 when shutdown

2021-02-10 Thread Mathias Nyman
On 9.2.2021 10.37, Greg Kroah-Hartman wrote:
> On Fri, Feb 05, 2021 at 02:50:15PM +0800, Kai-Heng Feng wrote:
>> On Fri, Feb 5, 2021 at 2:45 PM Aaron Ma  wrote:
>>>
>>>
>>> On 2/5/21 12:27 PM, Kai-Heng Feng wrote:
 Can you please test the following patch, which should address the root 
 cause:
 https://lore.kernel.org/linux-acpi/20201201213019.1558738-1-furq...@google.com/

 It also helps another AMD laptop on S5:
 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1912935

>>>
>>> No, this patch doesn't help on ThinkPad AMD platform.
>>
>> Thanks for the confirmation!
>>
>> Acked-by: Kai-Heng Feng 
> 
> Mathias, want me to take this in my tree now, or are you going to send
> me more patches for 5.12-rc1?
> 

Nothing more for 5.12-rc1 from me.

Could this be a PCI quirk instead of xhci?
Maybe there is some PCI flag for this already, haven't checked yet.

We want a specific PCI device to go to PCI D3cold at PCI shutdown...

If not, then adding this to xhci is fine for me as well

Thanks
-Mathias 




Re: [RFC PATCH v2 2/3] usb: xhci-mtk: modify the SOF/ITP interval for mt8195

2021-02-08 Thread Mathias Nyman
On 7.2.2021 4.27, Chunfeng Yun wrote:
> Hi Mathias,
> 
> On Wed, 2021-02-03 at 18:26 +0800, Chunfeng Yun wrote:
>> There are 4 USB controllers on MT8195, the controllers (IP1~IP3,
>> exclude IP0) have a wrong default SOF/ITP interval which is
>> calculated from the frame counter clock 24Mhz by default, but
>> in fact, the frame counter clock is 48Mhz, so we should set
>> the accurate interval according to 48Mhz for those controllers.
>> Note: the first controller no need set it.
>>
>> Signed-off-by: Chunfeng Yun 
>> ---
>> v2: fix typo of comaptible
>> ---
>>  drivers/usb/host/xhci-mtk.c | 63 +
>>  1 file changed, 63 insertions(+)
>>
>> diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
>> index 8f321f39ab96..0a68c4ac8b48 100644
>> --- a/drivers/usb/host/xhci-mtk.c
>> +++ b/drivers/usb/host/xhci-mtk.c
>> @@ -68,11 +68,71 @@
>>  #define SSC_IP_SLEEP_EN BIT(4)
>>  #define SSC_SPM_INT_EN  BIT(1)
>>  
> Can I Read/Write the following xHCI controller's registers  in
> xhci-mtk.c?
> 
> Ideally, xhci-mtk.c should not access them, because xhci-mtk is only a
> glue driver used to initialize clocks/power and IPPC registers which
> don't belong to xHCI controller.
> 

These *_EOF registers look like they are Mediatek vendor specific registers
and not part of public xHCI register-level spec. 
So I think accessing them from xhci-mtk.c makes sense.

If those register offsets are hardcoded like this in the Mediatek spec then
this is fine, but if those offsets are found from a vendor specific xHCI
extended capability entry (see xhci spec section 7) then we should dig them out
from there. 
>> +/* xHCI csr */
>> +#define LS_EOF  0x930
>> +#define LS_EOF_OFFSET   0x89
>> +
>> +#define FS_EOF  0x934
>> +#define FS_EOF_OFFSET   0x2e
>> +
>> +#define SS_GEN1_EOF 0x93c
>> +#define SS_GEN1_EOF_OFFSET  0x78
>> +
>> +#define HFCNTR_CFG  0x944
>> +#define ITP_DELTA_CLK   (0xa << 1)
>> +#define ITP_DELTA_CLK_MASK  GENMASK(5, 1)
>> +#define FRMCNT_LEV1_RANG(0x12b << 8)
>> +#define FRMCNT_LEV1_RANG_MASK   GENMASK(19, 8)
>> +
>> +#define SS_GEN2_EOF 0x990
>> +#define SS_GEN2_EOF_OFFSET  0x3c
>> +#define EOF_OFFSET_MASK GENMASK(11, 0)
>> +
>>  enum ssusb_uwk_vers {
>>  SSUSB_UWK_V1 = 1,
>>  SSUSB_UWK_V2,
>>  };
>>  
>> +/*
>> + * MT8195 has 4 controllers, the controller1~3's default SOF/ITP interval
>> + * is calculated from the frame counter clock 24M, but in fact, the clock
>> + * is 48M, so need change the interval.
>> + */
>> +static void xhci_mtk_set_frame_interval(struct xhci_hcd_mtk *mtk)
>> +{
>> +struct device *dev = mtk->dev;
>> +struct usb_hcd *hcd = mtk->hcd;
>> +u32 value;
>> +
>> +if (!of_device_is_compatible(dev->of_node, "mediatek,mt8195-xhci"))
>> +return;
>> +
>> +value = readl(hcd->regs + HFCNTR_CFG);
>> +value &= ~(ITP_DELTA_CLK_MASK | FRMCNT_LEV1_RANG_MASK);
>> +value |= (ITP_DELTA_CLK | FRMCNT_LEV1_RANG);
>> +writel(value, hcd->regs + HFCNTR_CFG);
>> +
>> +value = readl(hcd->regs + LS_EOF);
>> +value &= ~EOF_OFFSET_MASK;
>> +value |= LS_EOF_OFFSET;
>> +writel(value, hcd->regs + LS_EOF);
>> +
>> +value = readl(hcd->regs + FS_EOF);
>> +value &= ~EOF_OFFSET_MASK;
>> +value |= FS_EOF_OFFSET;
>> +writel(value, hcd->regs + FS_EOF);
>> +
>> +value = readl(hcd->regs + SS_GEN1_EOF);
>> +value &= ~EOF_OFFSET_MASK;
>> +value |= SS_GEN1_EOF_OFFSET;
>> +writel(value, hcd->regs + SS_GEN1_EOF);
>> +
>> +value = readl(hcd->regs + SS_GEN2_EOF);
>> +value &= ~EOF_OFFSET_MASK;
>> +value |= SS_GEN2_EOF_OFFSET;
>> +writel(value, hcd->regs + SS_GEN2_EOF);

Minor nit about names,
Register offsets from MMIO start are named *_EOF while clock multipliers? are 
named *_EOF_OFFSET.
This was a bit confusing

Thanks
-Mathias


Re: [PATCH v2] usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 3720

2021-02-03 Thread Mathias Nyman
On 2.2.2021 20.41, Greg Kroah-Hartman wrote:
> On Mon, Feb 01, 2021 at 04:08:03PM +0100, Pali Rohár wrote:
>> Older ATF does not provide SMC call for USB 3.0 phy power on functionality
>> and therefore initialization of xhci-hcd is failing when older version of
>> ATF is used. In this case phy_power_on() function returns -EOPNOTSUPP.
>>
>> [3.108467] mvebu-a3700-comphy d0018300.phy: unsupported SMC call, try 
>> updating your firmware
>> [3.117250] phy phy-d0018300.phy.0: phy poweron failed --> -95
>> [3.123465] xhci-hcd: probe of d0058000.usb failed with error -95
>>
>> This patch introduces a new plat_setup callback for xhci platform drivers
>> which is called prior calling usb_add_hcd() function. This function at its
>> beginning skips PHY init if hcd->skip_phy_initialization is set.
>>
>> Current init_quirk callback for xhci platform drivers is called from
>> xhci_plat_setup() function which is called after chip reset completes.
>> It happens in the middle of the usb_add_hcd() function and therefore this
>> callback cannot be used for setting if PHY init should be skipped or not.
>>
>> For Armada 3720 this patch introduce a new xhci_mvebu_a3700_plat_setup()
>> function configured as a xhci platform plat_setup callback. This new
>> function calls phy_power_on() and in case it returns -EOPNOTSUPP then
>> XHCI_SKIP_PHY_INIT quirk is set to instruct xhci-plat to skip PHY
>> initialization.
>>
>> This patch fixes above failure by ignoring 'not supported' error in
>> xhci-hcd driver. In this case it is expected that phy is already power on.
>>
>> It fixes initialization of xhci-hcd on Espressobin boards where is older
>> Marvell's Arm Trusted Firmware without SMC call for USB 3.0 phy power.
>>
>> This is regression introduced in commit bd3d25b07342 ("arm64: dts: marvell:
>> armada-37xx: link USB hosts with their PHYs") where USB 3.0 phy was defined
>> and therefore xhci-hcd on Espressobin with older ATF started failing.
>>
>> Signed-off-by: Pali Rohár 
>> Tested-by: Tomasz Maciej Nowak 
>> Fixes: bd3d25b07342 ("arm64: dts: marvell: armada-37xx: link USB hosts with 
>> their PHYs")
>> Cc:  # 5.1+: ea17a0f153af: phy: marvell: comphy: 
>> Convert internal SMCC firmware return codes to errno
>> Cc:  # 5.1+: f768e718911e: usb: host: xhci-plat: add 
>> priv quirk for skip PHY initialization
> 
> 
> Mathias, any objection for me taking this now to get into 5.11-final?
> 
> thanks,
> 
> greg k-h
> 

No objections, looks good to me.

Acked-by: Mathias Nyman 


Re: [PATCH 0/4] add xhci hooks for USB offload

2021-01-27 Thread Mathias Nyman
On 28.1.2021 5.38, Howard Yen wrote:
> On Tue, Jan 26, 2021 at 10:19 PM Greg KH  wrote:
>>
>> On Fri, Jan 22, 2021 at 05:32:58PM +0200, Mathias Nyman wrote:
>>>
>>> Ok, before adding hooks like this I think we need to see how they are used.
>>> Do you have the rest of the patches that go on top of this series?
>>>
>>> Maybe it could make sense to use overrides for the functions in struct 
>>> hc_driver
>>> instead in some cases? There is support for that already.
>>
>> What overrides could be done for these changes?  At first glance that
>> would seem to require a lot of duplicated code in whatever override
>> happens to be needed.
>>
>> thanks,
>>
>> greg k-h
> 
> This patch series is all the changes for the offload hooks currently.
> 
> I thought about this, but if I tried to override the functions in
> struct hc_driver, that'll need to
> copy many code to the override function, and it won't follow the
> latest change in the core
> xhci driver.
> 
> 
> - Howard

Ok, I see. 

The point I'm trying to make is that there is no way for me to know if
these hooks are the right solution before I see any code using them.

Is the offloading code ready and public somewhere?

Thanks
-Mathias 



Re: [PATCH v6] usb: xhci-mtk: fix unreleased bandwidth data

2021-01-26 Thread Mathias Nyman
On 26.1.2021 16.13, Greg Kroah-Hartman wrote:
> On Wed, Jan 13, 2021 at 06:05:11PM +0800, Ikjoon Jang wrote:
>> xhci-mtk needs XHCI_MTK_HOST quirk functions in add_endpoint() and
>> drop_endpoint() to handle its own sw bandwidth management.
>>
>> It stores bandwidth data into an internal table every time
>> add_endpoint() is called, and drops those in drop_endpoint().
>> But when bandwidth allocation fails at one endpoint, all earlier
>> allocation from the same interface could still remain at the table.
>>
>> This patch moves bandwidth management codes to check_bandwidth() and
>> reset_bandwidth() path. To do so, this patch also adds those functions
>> to xhci_driver_overrides and lets mtk-xhci to release all failed
>> endpoints in reset_bandwidth() path.
>>
>> Fixes: 08e469de87a2 ("usb: xhci-mtk: supports bandwidth scheduling with 
>> multi-TT")
>> Signed-off-by: Ikjoon Jang 
> 
> Mathias, any objection to me taking this patch, or do you have others
> being queued up for 5.11-final?
> 

No objections, haven't tried it out but it looks good to me.

If I finish some additional small fix for 5.11-final I can make it on top of 
this

-Mathias 


Re: [PATCH 0/4] add xhci hooks for USB offload

2021-01-22 Thread Mathias Nyman
On 20.1.2021 12.04, Howard Yen wrote:
> On Tue, Jan 19, 2021 at 8:47 PM Mathias Nyman  wrote:
>>
>> On 19.1.2021 12.10, Howard Yen wrote:
>>> To let the xhci driver support USB offload, add hooks for vendor to have
>>> customized behavior for the initialization, memory allocation, irq work, and
>>> device context synchronization. Detail is in each patch commit message.
>>
>> Is this related to the usb audio sideband capability that was added to the 
>> xHCI specification?
>> If yes, then we should probably implement the generic parts first, and then 
>> add
>> the vendor specific hooks.
>>
>> -Mathias
>>
>>
> 
> This is for offloading, no matter what type of offloading.
> I made the hooks generically and can be used for usb audio on the xhci
> which is not including the usb audio sideband capability.
> 

Ok, before adding hooks like this I think we need to see how they are used.
Do you have the rest of the patches that go on top of this series?

Maybe it could make sense to use overrides for the functions in struct hc_driver
instead in some cases? There is support for that already.

Thanks
-Mathias  



Re: [PATCH 0/4] add xhci hooks for USB offload

2021-01-19 Thread Mathias Nyman
On 19.1.2021 12.10, Howard Yen wrote:
> To let the xhci driver support USB offload, add hooks for vendor to have
> customized behavior for the initialization, memory allocation, irq work, and 
> device context synchronization. Detail is in each patch commit message.

Is this related to the usb audio sideband capability that was added to the xHCI 
specification?
If yes, then we should probably implement the generic parts first, and then add
the vendor specific hooks.

-Mathias
 



Re: [PATCH v2] usb: host: xhci-plat: fix support for XHCI_SKIP_PHY_INIT quirk

2021-01-15 Thread Mathias Nyman
On 14.1.2021 1.20, Pali Rohár wrote:
> On Thursday 24 December 2020 05:59:05 Peter Chen wrote:
>> On 20-12-23 17:18:47, Pali Rohár wrote:
>>> Currently init_quirk callbacks for xhci platform drivers are called
>>> xhci_plat_setup() function which is called after chip reset completes.
>>> It happens in the middle of the usb_add_hcd() function.
>>>
>>> But XHCI_SKIP_PHY_INIT quirk is checked in the xhci_plat_probe() function
>>> prior calling usb_add_hcd() function. Therefore this XHCI_SKIP_PHY_INIT
>>> currently does nothing as prior xhci_plat_setup() it is not set.
>>>
>>> Quirk XHCI_SKIP_PHY_INIT is only setting hcd->skip_phy_initialization value
>>> which really needs to be set prior calling usb_add_hcd() as this function
>>> at its beginning skips PHY init if this member is set.
>>>
>>> This patch fixes implementation of the XHCI_SKIP_PHY_INIT quirk by calling
>>> init_quirk callbacks (via xhci_priv_init_quirk()) prior checking if
>>> XHCI_SKIP_PHY_INIT is set. Also checking if either xhci->quirks or
>>> priv->quirks contains this XHCI_SKIP_PHY_INIT quirk.
>>>
>>> Signed-off-by: Pali Rohár 
>>>
>>> ---
>>> Changes in v2:
>>> * Check also xhci->quirks as xhci_priv_init_quirk() callbacks are setting 
>>> xhci->quirks
>>> * Tested with "usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 
>>> 3720" patch
>>> * Removed Fixes: line
>>> ---
>>>  drivers/usb/host/xhci-plat.c | 16 
>>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
>>> index 4d34f6005381..0eab7cb5a767 100644
>>> --- a/drivers/usb/host/xhci-plat.c
>>> +++ b/drivers/usb/host/xhci-plat.c
>>> @@ -89,13 +89,6 @@ static void xhci_plat_quirks(struct device *dev, struct 
>>> xhci_hcd *xhci)
>>>  /* called during probe() after chip reset completes */
>>>  static int xhci_plat_setup(struct usb_hcd *hcd)
>>>  {
>>> -   int ret;
>>> -
>>> -
>>> -   ret = xhci_priv_init_quirk(hcd);
>>> -   if (ret)
>>> -   return ret;
>>> -
>>> return xhci_gen_setup(hcd, xhci_plat_quirks);
>>>  }
>>>  
>>> @@ -330,7 +323,14 @@ static int xhci_plat_probe(struct platform_device 
>>> *pdev)
>>>  
>>> hcd->tpl_support = of_usb_host_tpl_support(sysdev->of_node);
>>> xhci->shared_hcd->tpl_support = hcd->tpl_support;
>>> -   if (priv && (priv->quirks & XHCI_SKIP_PHY_INIT))
>>> +
>>> +   if (priv) {
>>> +   ret = xhci_priv_init_quirk(hcd);
>>> +   if (ret)
>>> +   goto disable_usb_phy;
>>> +   }
>>> +
>>> +   if ((xhci->quirks & XHCI_SKIP_PHY_INIT) || (priv && (priv->quirks & 
>>> XHCI_SKIP_PHY_INIT)))
>>> hcd->skip_phy_initialization = 1;
>>
>> I am not sure if others agree with you move the position of
>> xhci_priv_init_quirk, Let's see Mathias opinion.
> 
> Hello! Do you have an opinion how to handle this issue? As currently it
> is needed for another patch which is fixing issue/regression in xhci-mvebu:
> https://lore.kernel.org/linux-usb/20201223162403.10897-1-p...@kernel.org/
> 

I can see the benefit in this. 
In the xhci-plat case usb_create_hcd and usb_add_hcd are separate steps, and
we could both copy the xhci_plat_priv .quirks and run the .init_qurks before
adding the hcd.
I guess the current way is inherited from pci case where the earliest place
to do this after hcd is created is the hcd->driver->reset callback (which is
set to xhci_pci_setup() or xhci_plat_setup()).

xhci-rcar.c is using the .init_quirk to load firmware, we need to check with
them if this change is ok. (added Yoshihiro Shimoda to cc)
Their firmware would be loaded before phy parts are initialized, usb bus
registered, or roothub device allocated.

Thanks
-Mathias


Re: [PATCH v5] usb: xhci-mtk: fix unreleased bandwidth data

2021-01-08 Thread Mathias Nyman
On 8.1.2021 8.11, Chunfeng Yun wrote:
> On Thu, 2021-01-07 at 13:09 +0200, Mathias Nyman wrote:
>> On 29.12.2020 8.24, Ikjoon Jang wrote:
>>> xhci-mtk has hooks on add_endpoint() and drop_endpoint() from xhci
>>> to handle its own sw bandwidth managements and stores bandwidth data
>>> into internal table every time add_endpoint() is called,
>>> so when bandwidth allocation fails at one endpoint, all earlier
>>> allocation from the same interface could still remain at the table.
>>>
>>> This patch adds two more hooks from check_bandwidth() and
>>> reset_bandwidth(), and make mtk-xhci to releases all failed endpoints
>>> from reset_bandwidth().
>>>
>>> Fixes: 08e469de87a2 ("usb: xhci-mtk: supports bandwidth scheduling with 
>>> multi-TT")
>>> Signed-off-by: Ikjoon Jang 
>>>
>>
>> ...
>>
>>>
>>> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
>>> index d4a8d0efbbc4..e1fcd3cf723f 100644
>>> --- a/drivers/usb/host/xhci.c
>>> +++ b/drivers/usb/host/xhci.c
>>> @@ -2882,6 +2882,12 @@ static int xhci_check_bandwidth(struct usb_hcd *hcd, 
>>> struct usb_device *udev)
>>> xhci_dbg(xhci, "%s called for udev %p\n", __func__, udev);
>>> virt_dev = xhci->devs[udev->slot_id];
>>>  
>>> +   if (xhci->quirks & XHCI_MTK_HOST) {
>>> +   ret = xhci_mtk_check_bandwidth(hcd, udev);
>>> +   if (ret < 0)
>>> +   return ret;
>>> +   }
>>> +
>>
>> Just noticed that XHCI_MTK_HOST quirk is only set in xhci-mtk.c.
>> xhci-mtk.c calls xhci_init_driver(..., xhci_mtk_overrides) with a .reset 
>> override function.
>>
>> why not add override functions for .check_bandwidth and .reset_bandwidth to 
>> xhci_mtk_overrides instead?
>>
>> Another patch to add similar overrides for .add_endpoint and .drop_endpoint 
>> should probably be
>> done so that we can get rid of the xhci_mtk_add/drop_ep_quirk() calls in 
>> xhci.c as well
> You mean, we can export xhci_add/drop_endpoint()?

I think so, unless you have a better idea.
I prefer exporting the generic add/drop_endpoint functions rather than the 
vendor specific quirk functions.

-Mathias
  


Re: [PATCH v5] usb: xhci-mtk: fix unreleased bandwidth data

2021-01-07 Thread Mathias Nyman
On 29.12.2020 8.24, Ikjoon Jang wrote:
> xhci-mtk has hooks on add_endpoint() and drop_endpoint() from xhci
> to handle its own sw bandwidth managements and stores bandwidth data
> into internal table every time add_endpoint() is called,
> so when bandwidth allocation fails at one endpoint, all earlier
> allocation from the same interface could still remain at the table.
> 
> This patch adds two more hooks from check_bandwidth() and
> reset_bandwidth(), and make mtk-xhci to releases all failed endpoints
> from reset_bandwidth().
> 
> Fixes: 08e469de87a2 ("usb: xhci-mtk: supports bandwidth scheduling with 
> multi-TT")
> Signed-off-by: Ikjoon Jang 
> 

...

> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index d4a8d0efbbc4..e1fcd3cf723f 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -2882,6 +2882,12 @@ static int xhci_check_bandwidth(struct usb_hcd *hcd, 
> struct usb_device *udev)
>   xhci_dbg(xhci, "%s called for udev %p\n", __func__, udev);
>   virt_dev = xhci->devs[udev->slot_id];
>  
> + if (xhci->quirks & XHCI_MTK_HOST) {
> + ret = xhci_mtk_check_bandwidth(hcd, udev);
> + if (ret < 0)
> + return ret;
> + }
> +

Just noticed that XHCI_MTK_HOST quirk is only set in xhci-mtk.c.
xhci-mtk.c calls xhci_init_driver(..., xhci_mtk_overrides) with a .reset 
override function.

why not add override functions for .check_bandwidth and .reset_bandwidth to 
xhci_mtk_overrides instead?

Another patch to add similar overrides for .add_endpoint and .drop_endpoint 
should probably be
done so that we can get rid of the xhci_mtk_add/drop_ep_quirk() calls in xhci.c 
as well

Thanks
-Mathias



Re: How to enable auto-suspend by default

2020-11-24 Thread Mathias Nyman
On 23.11.2020 15.54, Hans de Goede wrote:
> Hi,
> 
> On 11/11/20 3:31 PM, Mika Westerberg wrote:
>> On Wed, Nov 11, 2020 at 12:27:32PM +0100, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 11/10/20 6:25 PM, Mika Westerberg wrote:
 On Tue, Nov 10, 2020 at 04:02:33PM +, Limonciello, Mario wrote:
>>
>> On Tue, Nov 10, 2020 at 11:57:07AM +0100, Bastien Nocera wrote:
>>> Hey,
>>>
>>> systemd has been shipping this script to enable auto-suspend on a
>>> number of USB and PCI devices:
>>>
>> https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspen
>> d_rules.py
>>>
>>> The problem here is twofold. First, the list of devices is updated from
>>> ChromeOS, and the original list obviously won't be updated by ChromeOS
>>> developers unless a device listed exists in a ChromeBook computer,
>>> which means a number of devices that do support autosuspend aren't
>>> listed.
>>>
>>> The other problem is that this list needs to exist at all, and that it
>>> doesn't seem possible for device driver developers (at various levels
>>> of the stack) to opt-in to auto-suspend when all the variants of the
>>> device (or at least detectable ones) support auto-suspend.
>>
>> A driver can say they support autosuspend today, but I think you are
>> concerned about the devices that are controlled by class-compliant
>> drivers, right?  And for those, no, we can't do this in the kernel as
>> there are just too many broken devices out there.
>>
>
> I guess what Bastien is getting at is for newer devices supported by class
> drivers rather than having to store an allowlist in udev rules, can we set
> the allowlist in the kernel instead.  Then distributions that either don't
> use systemd or don't regularly update udev rules from systemd can take
> advantage of better defaults on modern hardware.
>
> The one item that stood out to me in that rules file was 8086:a0ed.
> It's listed as "Volteer XHCI", but that same device ID is actually present
> in an XPS 9310 in front of me as well and used by the xhci-pci kernel 
> module.
>
> Given we're effectively ending up with the combination of runtime PM 
> turned
> on by udev rules, do we need something like this for that ID:
>
> https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400d8a
>
> @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?

 I think this one is the TGL PCH xHCI. The quirk currently for xHCI
 controllers that are part of the TCSS (Type-C SubSystem) where it is
 important to put all devices into low power mode whenever possible,
 otherwise it keeps the whole block on.
>>>
>>> Note that there are currently some IDs missing from the xHCIs which
>>> are part of the TCSS too. At least the id for the xHCI in the thunderbolt
>>> controller on the Lenovo T14 gen 1 is missing. I started a discussion
>>> about extending the kernel quirk list for this vs switching to hwdb
>>> a while a go:
>>>
>>> https://lore.kernel.org/linux-usb/b8b21ba3-0a8a-ff54-5e12-cf8960651...@redhat.com/
>>>
>>> The conclusion back then was to switch to hwdb, but I never got around to 
>>> this.
>>
>> The reason I've added these to the xHCI driver is that it works even if
>> you are running some really small userspace (like busybox). Also for the
>> xHCI in TCSS we know for sure that it fully supports D3cold.
>>
>> (The one you refer above is actually mistake from my side as I never
>>  tested Alpine Ridge LP controller which I think this is).
> 
> Ok, so I'll submit a patch adding the 15c1 product-id for the
> INTEL_ALPINE_RIDGE_LP_2C_XHCI controller to the list of ids for which we
> set the XHCI_DEFAULT_PM_RUNTIME_ALLOW quirk. To fix the much too high
> idle-power consumption problem on devices with this Alpine Ridge variant.

Thanks

> 
 Typically we haven't done that for PCH side xHCI controllers though, but
 I don't see why not if it works that is. Adding Mathias to comment more
 on that since he is the xHCI maintainer.
>>>
>>> If we are also going to enable this for the non TCSS Intel XHCI controllers,
>>> maybe just uncondtionally enable it for all Intel XHCI controllers, or
>>> if necessary do a deny-list for some older models and enable it for anything
>>> not on the deny-list (so all newer models). That should avoid the game of
>>> whack-a-mole which we will have with this otherwise.
>>
>> This is really up to Mathias to decide. I'm fine either way :)
> 
> Ok, Matthias what do you think about this?

I don't think we are ready to enable runtime pm as default for all Intel xHCI 
controllers.
The risk of xHCI not waking up when user plugs a mouse/keyboard, making the 
system unusable
just seems too high compared to the powersaving benefit.

The powersaving benefit from autosuspending the TCSS xHCI is a lot better, and 
we, (Mika mostly)
has 

Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller

2020-11-24 Thread Mathias Nyman
On 24.11.2020 12.31, Hans de Goede wrote:
> Hi,
> 
> On 11/24/20 11:27 AM, Christoph Hellwig wrote:
>> On Mon, Nov 23, 2020 at 03:49:09PM +0100, Hans de Goede wrote:
>>> Hi,
>>>
>>> +Cc Christoph Hellwig 
>>>
>>> Christoph, this is still an issue, so I've been looking around a bit and 
>>> think this
>>> might have something to do with the dma-mapping-5.10 changes.
>>>
>>> Do you have any suggestions to debug this, or is it time to do a git bisect
>>> on this before 5.10 ships with regression?
>>
>> Given that DMAR prefix this seems to be about using intel-iommu + bounce
>> buffering for external devices.  I can't really think of anything specific
>> in 5.10 related to that, so maybe you'll need to bisect.
>>
>> I doub this means we are actually leaking swiotlb buffers, so while
>> I'm pretty sure we broke something in lower layers this also means
>> xhci doesn't handle swiotlb operation very gracefully in general.

Can't think of any xhci change since 5.9 that would cause this.
It's possible there's some underlying xhci issue the 5.10 dma-mapping
changes reveal.

> 
> Ok, I've re-arranged my schedule a bit so that I have time to bisect this
> tomorrow, so with some luck I will be able to provide info on which commit
> introduced this issue tomorrow around the end of the day.

Thanks for looking into it.

-Mathias



Re: [PATCH v2] usb: xhci: Workaround for S3 issue on AMD SNPS 3.0 xHC

2020-10-26 Thread Mathias Nyman
On 23.10.2020 16.15, Sandeep Singh wrote:
> From: Sandeep Singh 
> 
> On some platform of AMD, S3 fails with HCE and SRE errors. To fix this,
> need to disable a bit which is enable in sparse controller.
> 
> Signed-off-by: Sanket Goswami 
> Signed-off-by: Sandeep Singh 
> ---
> Changes since v1:(https://lkml.org/lkml/2020/10/23/368)
>   -> Add xhci.h changes
> 

Added to queue.
This looks like it should go to stable as well.

-Mathias


Re: [PATCH] usb: remove unneeded break

2020-10-26 Thread Mathias Nyman
On 19.10.2020 18.02, t...@redhat.com wrote:
> From: Tom Rix 
> 
> A break is not needed if it is preceded by a return or goto
> 
> Signed-off-by: Tom Rix 
> ---
>  drivers/usb/gadget/function/f_hid.c | 9 -
>  drivers/usb/host/xhci-mem.c | 1 -
>  drivers/usb/misc/iowarrior.c| 3 ---
>  drivers/usb/serial/iuu_phoenix.c| 2 --
>  drivers/usb/storage/freecom.c   | 1 -
>  5 files changed, 16 deletions(-)

Would probably be better to split this into several patches.
The xhci change looks good (as a separate patch)

-Mathias


Re: [PATCH] xhci: Fix sizeof() mismatch

2020-10-26 Thread Mathias Nyman
On 8.10.2020 20.11, Colin King wrote:
> From: Colin Ian King 
> 
> An incorrect sizeof() is being used, sizeof(rhub->ports) is not
> correct, it should be sizeof(*rhub->ports).  This bug did not
> cause any issues because it just so happens the sizes are the same.
> 
> Addresses-Coverity: ("Sizeof not portable (SIZEOF_MISMATCH)")
> Fixes: bcaa9d5c5900 ("xhci: Create new structures to store xhci port 
> information")
> Signed-off-by: Colin Ian King 
> ---

Thanks, added

-Mathias



Re: [patch 05/12] usb: xhci: Remove in_interrupt() checks

2020-10-23 Thread Mathias Nyman
On 14.10.2020 17.52, Thomas Gleixner wrote:
> From: Ahmed S. Darwish 
> 
> The usage of in_interrupt() in drivers is phased out for various reasons.
> 
> xhci_set_hc_event_deq() has an !in_interrupt() check which is pointless
> because the function is only invoked from xhci_mem_init() which is clearly
> task context as it does GFP_KERNEL allocations. Remove it.
> 
> xhci_urb_enqueue() prints a debug message if an URB is submitted after the
> underlying hardware was suspended. But that warning is only issued when
> in_interrupt() is true, which makes no sense. Simply return -ESHUTDOWN and
> be done with it.
> 
> Signed-off-by: Ahmed S. Darwish 
> Signed-off-by: Sebastian Andrzej Siewior 
> Signed-off-by: Thomas Gleixner 
> Cc: Mathias Nyman 
> Cc: Greg Kroah-Hartman 
> Cc: linux-...@vger.kernel.org
> ---
>  drivers/usb/host/xhci-mem.c |2 +-
>  drivers/usb/host/xhci.c |6 ++----
>  2 files changed, 3 insertions(+), 5 deletions(-)

Acked-by: Mathias Nyman 



Re: [PATCH v3] xhci: Prevent runtime suspend on Etron EJ168

2020-09-28 Thread Mathias Nyman
On 28.9.2020 12.10, Kai-Heng Feng wrote:
> 
> 
>> On Jun 8, 2020, at 11:56, Kai-Heng Feng  wrote:
>>
>>
>>
>>> On May 5, 2020, at 01:16, Kai-Heng Feng  wrote:
>>>
>>> Etron EJ168 USB 3.0 Host Controller stops working after S3, if it was
>>> runtime suspended previously:
>>> [  370.080359] pci :02:00.0: can't change power state from D3cold to D0 
>>> (config space inaccessible)
>>> [  370.080477] xhci_hcd :04:00.0: can't change power state from D3cold 
>>> to D0 (config space inaccessible)
>>> [  370.080532] pcieport :00:1c.0: DPC: containment event, status:0x1f05 
>>> source:0x0200
>>> [  370.080533] pcieport :00:1c.0: DPC: ERR_FATAL detected
>>> [  370.080536] xhci_hcd :04:00.0: can't change power state from D3hot 
>>> to D0 (config space inaccessible)
>>> [  370.080552] xhci_hcd :04:00.0: AER: can't recover (no error_detected 
>>> callback)
>>> [  370.080566] usb usb3: root hub lost power or was reset
>>> [  370.080566] usb usb4: root hub lost power or was reset
>>> [  370.080572] xhci_hcd :04:00.0: Host halt failed, -19
>>> [  370.080574] xhci_hcd :04:00.0: Host not accessible, reset failed.
>>> [  370.080575] xhci_hcd :04:00.0: PCI post-resume error -19!
>>> [  370.080586] xhci_hcd :04:00.0: HC died; cleaning up
>>>
>>> This can be fixed by not runtime suspend the controller at all.
>>>
>>> So disable runtime suspend for EJ168 xHCI device.
>>>
>>> Signed-off-by: Kai-Heng Feng 
>>
>> A gentle ping...
> 
> Another gentle ping...

Thanks, somehow I didn't notice this earlier.

Was the rootcause ever investigated?
Preventing runtime suspend looks like a quick fix to get rid of the issue, but 
possibly just hides some other underlying power management problem

-Mathias


Re: [PATCH] xhci: workaround for S3 issue on AMD SNPS 3.0 xHC

2020-09-16 Thread Mathias Nyman
On 31.8.2020 9.52, Nehal Bakulchandra Shah wrote:
> From: Nehal Bakulchandra Shah 
> 
> On some platform of AMD, S3 fails with HCE and SRE errors.To fix this,
> sparse controller enable bit has to be disabled.
> 
> Signed-off-by: Nehal Bakulchandra Shah 
> ---
>  drivers/usb/host/xhci-pci.c | 12 
>  drivers/usb/host/xhci.h |  1 +
>  2 files changed, 13 insertions(+)
> 
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index 3feaafebfe58..865a16e6c1ed 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -160,6 +160,9 @@ static void xhci_pci_quirks(struct device *dev, struct 
> xhci_hcd *xhci)
>   (pdev->device == 0x15e0 || pdev->device == 0x15e1))
>   xhci->quirks |= XHCI_SNPS_BROKEN_SUSPEND;
>  
> + if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->device == 0x15e5)
> + xhci->quirks |= XHCI_DISABLE_SPARSE;
> +
>   if (pdev->vendor == PCI_VENDOR_ID_AMD)
>   xhci->quirks |= XHCI_TRUST_TX_LENGTH;
>  
> @@ -371,6 +374,15 @@ static int xhci_pci_probe(struct pci_dev *dev, const 
> struct pci_device_id *id)
>   /* USB 2.0 roothub is stored in the PCI device now. */
>   hcd = dev_get_drvdata(>dev);
>   xhci = hcd_to_xhci(hcd);
> +
> + if (xhci->quirks & XHCI_DISABLE_SPARSE) {
> + u32 reg;
> +
> + reg = readl(hcd->regs + 0xC12C);
> + reg &=  ~BIT(17);
> + writel(reg, hcd->regs + 0xC12C);
> + }
> +

Is disabling the bit once in probe enough?
xHC will be reset after hibernation as well, does this bit need to be cleared 
after that?

Also consider turning this into a separate function with a proper description,
see xhci_pme_quirk() for example. Avoids cluttering probe.
Actually if this bit only needs to be cleared once then that function could be 
called from xhci_pci_setup()

-Mathias


Re: [PATCH] xhci: workaround for S3 issue on AMD SNPS 3.0 xHC

2020-09-16 Thread Mathias Nyman
On 16.9.2020 11.06, Greg KH wrote:
> On Mon, Aug 31, 2020 at 06:52:46AM +, Nehal Bakulchandra Shah wrote:
>> From: Nehal Bakulchandra Shah 
>>
>> On some platform of AMD, S3 fails with HCE and SRE errors.To fix this,
>> sparse controller enable bit has to be disabled.
>>
>> Signed-off-by: Nehal Bakulchandra Shah 
>> ---
>>  drivers/usb/host/xhci-pci.c | 12 
>>  drivers/usb/host/xhci.h |  1 +
>>  2 files changed, 13 insertions(+)
>>
>> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c  
>> @@ -371,6 +374,15 @@ static int xhci_pci_probe(struct pci_dev *dev, const 
>> struct pci_device_id *id)
>>  /* USB 2.0 roothub is stored in the PCI device now. */
>>  hcd = dev_get_drvdata(>dev);
>>  xhci = hcd_to_xhci(hcd);
>> +
>> +if (xhci->quirks & XHCI_DISABLE_SPARSE) {
>> +u32 reg;
>> +
>> +reg = readl(hcd->regs + 0xC12C);
>> +reg &=  ~BIT(17); 
>
> And is this the proper place to be testing for quirks and applying them?
> Why not do the above in the xhci_pci_quirks() call?

xhci_pci_quirks() is a confusing name, it actually only sets quirk flags based 
on PCI
vendor/device.

Anyway, point is still valid, this level of bitops in probe isn't very nice.
Turn it into a separate function, and call it from xhci_pci_setup(), or if flag
needs to be cleared more often, and is related to S3 problems then possibly 
from xhci_pci_suspend()

-Mathias


Re: kworker/0:3+pm hogging CPU

2020-08-31 Thread Mathias Nyman
On 29.8.2020 18.59, Alan Stern wrote:
> On Sat, Aug 29, 2020 at 11:50:03AM +0200, Salvatore Bonaccorso wrote:
>> Hi Alan,
>>
>> I'm following up on this thread because a user in Debian (Dirk, Cc'ed)
>> as well encountered the same/similar issue:
>>
>> On Tue, Jul 21, 2020 at 10:33:25AM -0400, Alan Stern wrote:
>>> On Tue, Jul 21, 2020 at 07:59:17AM +0200, Michal Hocko wrote:
> Sorry, my mistake.  The module name needs to be "xhci_hcd" with an '_' 
> character, not a '-' character -- the same as what shows up in the lsmod 
> output.


 [14766.973734] xhci_hcd :00:14.0: Get port status 2-1 read: 0xe88, 
 return 0x88
 [14766.973738] xhci_hcd :00:14.0: Get port status 2-2 read: 0xe88, 
 return 0x88
 [14766.973742] xhci_hcd :00:14.0: Get port status 2-3 read: 0xe0002a0, 
 return 0x2a0
 [14766.973746] xhci_hcd :00:14.0: Get port status 2-4 read: 0xe0002a0, 
 return 0x2a0
 [14766.973750] xhci_hcd :00:14.0: Get port status 2-5 read: 0xe0002a0, 
 return 0x2a0
 [14766.973754] xhci_hcd :00:14.0: Get port status 2-6 read: 0xe0002a0, 
 return 0x2a0
 [14766.973759] xhci_hcd :00:14.0: Get port status 2-1 read: 0xe88, 
 return 0x88
 [14766.973763] xhci_hcd :00:14.0: Get port status 2-2 read: 0xe88, 
 return 0x88
>>>
>>> According to the xHCI specification, those 02a0 values are normal and 
>>> the 0088 values indicate the port is disabled and has an over-current 
>>> condition.  I don't know about the e000 bits in the upper part of the 
>>> word; according to my copy of the spec those bits should be 0.

That's a 0x0e88 where the 0e00 bits are the wake bits. Leading zeroes are 
not shown. 

>>>
>>> If your machine has only two physical SuperSpeed (USB-3) ports then 
>>> perhaps the other four ports are internally wired in a way that creates 
>>> a permanent over-current indication.
>>>
 [14766.973771] xhci_hcd :00:14.0: set port remote wake mask, actual 
 port 0 status  = 0xe88
 [14766.973780] xhci_hcd :00:14.0: set port remote wake mask, actual 
 port 1 status  = 0xe88
 [14766.973789] xhci_hcd :00:14.0: set port remote wake mask, actual 
 port 2 status  = 0xe0002a0
 [14766.973798] xhci_hcd :00:14.0: set port remote wake mask, actual 
 port 3 status  = 0xe0002a0
 [14766.973807] xhci_hcd :00:14.0: set port remote wake mask, actual 
 port 4 status  = 0xe0002a0
 [14766.973816] xhci_hcd :00:14.0: set port remote wake mask, actual 
 port 5 status  = 0xe0002a0
 [14766.973830] xhci_hcd :00:14.0: Bus suspend bailout, port 
 over-current detected

 Repeating again and again. The last message suggests a HW problem? But
 why does the kernel try the same thing over and over?
>>>
>>> Because over-current is supposed to be a transient condition that goes 
>>> away quickly.  It means there's a short circuit or something similar.
>>
>> Dirk exprienced the same issue aand enabled dynamic debugging showed
>> similar pattern. His dmesg excerpt is attached.
>>
>> The Debian report is at https://bugs.debian.org/966703
>>
>> What could be tracked down is that the issue is uncovered since
>> e9fb08d617bf ("xhci: prevent bus suspend if a roothub port detected a
>> over-current condition") which was applied in 5.7-rc3 and backported
>> to several stable releases (v5.6.8, v5.4.36 and v4.19.119).
>>
>> Dirk found additionally:
>>
>>> I just found out, that if none of the two USB ports is connected, there
>>> are two kworker processes with permanently high CPU load, if one USB
>>> port is connected and the other not, there is one such kworker process,
>>> and if both USB ports are connected, there is no kworker process with
>>> high CPU load.
>>> I think, this supports your suspicion that these kworker processes are
>>> connected with the overcurrent condition for both USB ports that I also
>>> see in the dmesg output.
>>
>> Reverting the above commit covers the problem again. But I'm not
>> exprienced enough here to claim if this is a HW issue or if the Kernel
>> should handle the situation otherwise. Is there anything else Dirk can
>> provide?
> 
> It is undoubtedly a hardware issue.  The dmesg extract shows that ports 
> 1-10, 1-11, and 2-5 (which is probably the same port as one of the 
> others) have overcurrent conditions; I'm guessing that these are the 
> ports which have external connections.
> 
> What were the devices Dirk plugged in that got rid of the kworker 
> processes?  In particular, were they USB-2 or USB-3?  (The dmesg log for 
> when the devices were first attached can answer these questions.)
> 
> As far as I know, there is no way for the kernel to work around this 
> problem.  Preventing the controller from going into runtime suspend is 
> probably the best solution.
> 
> Perhaps Mathias (the xhci-hcd maintainer) will have more suggestions.

In the original case the over-current condition was 

Re: [PATCH] xhci: Always restore EP_SOFT_CLEAR_TOGGLE even if ep reset failed

2020-08-21 Thread Mathias Nyman
On 21.8.2020 10.31, Greg KH wrote:
> On Fri, Aug 21, 2020 at 03:06:52PM +0800, Ding Hui wrote:
>> Some devices driver call libusb_clear_halt when target ep queue
>> is not empty. (eg. spice client connected to qemu for usb redir)
>>
>> Before commit f5249461b504 ("xhci: Clear the host side toggle
>> manually when endpoint is soft reset"), that works well.
>> But now, we got the error log:
>>
>> EP not empty, refuse reset
>>
>> xhch_endpoint_reset failed and left ep_state's EP_SOFT_CLEAR_TOGGLE
>> bit is still on
>>
>> So all the subsequent urb sumbit to the ep will fail with the
>> warn log:
>>
>> Can't enqueue URB while manually clearing toggle
>>
>> We need restore ep_state EP_SOFT_CLEAR_TOGGLE bit after
>> xhci_endpoint_reset, even if it is failed.
>>
>> Signed-off-by: Ding Hui 

Thanks, nice catch.

>> ---
>>  drivers/usb/host/xhci.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> Shouldn't this have a Fixes: tag on it and be backported to the affected
> stable trees?

It should, but I like this patch and want it in, so I'll add the tags this time.

Thanks
-Mathias


Re: [PATCH] x86/hotplug: Silence APIC only after all irq's are migrated

2020-08-20 Thread Mathias Nyman
On 15.8.2020 0.38, Ashok Raj wrote:
> When offlining CPU's, fixup_irqs() migrates all interrupts away from the
> outgoing CPU to an online CPU. Its always possible the device sent an
> interrupt to the previous CPU destination. Pending interrupt bit in IRR in
> lapic identifies such interrupts. apic_soft_disable() will not capture any
> new interrupts in IRR. This causes interrupts from device to be lost during
> cpu offline. The issue was found when explicitly setting MSI affinity to a
> CPU and immediately offlining it. It was simple to recreate with a USB
> ethernet device and doing I/O to it while the CPU is offlined. Lost
> interrupts happen even when Interrupt Remapping is enabled.
> 
> Current code does apic_soft_disable() before migrating interrupts.
> 
> native_cpu_disable()
> {
>   ...
>   apic_soft_disable();
>   cpu_disable_common();
> --> fixup_irqs(); // Too late to capture anything in IRR.
> }
> 
> Just fliping the above call sequence seems to hit the IRR checks
> and the lost interrupt is fixed for both legacy MSI and when
> interrupt remapping is enabled.
> 
> 
> Fixes: 60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead")
> Link: https://lore.kernel.org/lkml/875zdarr4h@nanos.tec.linutronix.de/
> Signed-off-by: Ashok Raj 
> 
> To: linux-kernel@vger.kernel.org
> To: Thomas Gleixner 
> Cc: Sukumar Ghorai 
> Cc: Srikanth Nandamuri 
> Cc: Evan Green 
> Cc: Mathias Nyman 
> Cc: Bjorn Helgaas 
> Cc: sta...@vger.kernel.org
> ---

This fixes the lost xhci interrupt for me.
Before this patch a msi interupt was lost after ~200 cycles of toggling CPUs
offline/online under heavy usb traffic.

With this patch I ran 3x2000 cycles without any issues (Comet lake, patch on 
top of 5.8)
Tried both with and without CONFIG_IRQ_REMAP. No issues seen.

Tested-by: Mathias Nyman 


Re: [PATCH] xhci: Do warm-reset when both CAS and XDEV_RESUME are set

2020-08-11 Thread Mathias Nyman
On 10.8.2020 17.07, Kai-Heng Feng wrote:
> Sometimes re-plugging a USB device during system sleep renders the device
> useless:
> [  173.418345] xhci_hcd :00:14.0: Get port status 2-4 read: 0x14203e2, 
> return 0x10262
> ...
> [  176.496485] usb 2-4: Waited 2000ms for CONNECT
> [  176.496781] usb usb2-port4: status .0262 after resume, -19
> [  176.497103] usb 2-4: can't resume, status -19
> [  176.497438] usb usb2-port4: logical disconnect
> 
> Because PLS equals to XDEV_RESUME, xHCI driver reports U3 to usbcore,
> despite of CAS bit is flagged.
> 
> So proritize CAS over XDEV_RESUME to let usbcore handle warm-reset for
> the port.
> 
> Signed-off-by: Kai-Heng Feng 
> ---

Thanks, nice catch.
Adding to queue

-Mathias



Re: [PATCH v2 0/2] Small fixes for ASMedia host controllers

2020-07-28 Thread Mathias Nyman
On 28.7.2020 14.16, Greg KH wrote:
> On Mon, Jul 27, 2020 at 11:24:06PM -0500, Forest Crossman wrote:
>> The first patch just defines some host controller device IDs to make the
>> code a bit easier to read (since the controller part number is not
>> always the same as the DID) and to prepare for the next patch.
>>
>> The second patch defines a new device ID for the ASM1142 and enables the
>> XHCI_NO_64BIT_SUPPORT quirk for that device, since it has the same
>> problem with truncating the higher bits as the ASM2142/ASM3142.
>>
>>
>> Changes since v1:
>>  - Added changelog text to the first patch.
>>
>>
>> Forest Crossman (2):
>>   usb: xhci: define IDs for various ASMedia host controllers
>>   usb: xhci: Fix ASMedia ASM1142 DMA addressing
>>
>>  drivers/usb/host/xhci-pci.c | 10 +++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> Mathias, any objection for me just taking these now?
> 

No objection

Acked-by: Mathias Nyman 

Note that I'll be away most of this and the the next week so response will be 
slow. 

Thanks
-Mathias


Re: [PATCH 02/30] usb: host: pci-quirks: Demote function header from kerneldoc to comment block

2020-07-03 Thread Mathias Nyman
On 2.7.2020 17.45, Lee Jones wrote:
> quirk_usb_handoff_xhci()'s function header is the only one across
> the sourcefile which is denoted as a kerneldoc header.  Despite
> no attempt to document its arguments.  Drop it down in status from
> kerneldoc to a standard comment block to match the other headers
> in the file.
> 
> Fixes the following W=1 kernel build warning:
> 
>  drivers/usb/host/pci-quirks.c:1145: warning: Function parameter or member 
> 'pdev' not described in 'quirk_usb_handoff_xhci'
> 
> Cc: Mathias Nyman 
> Cc: Martin Mares 
> Cc: aleksey_gore...@phoenix.com
> Signed-off-by: Lee Jones 

Acked-by: Mathias Nyman 





Re: [PATCH] xhci: Make debug message consistent with bus and port number

2020-06-30 Thread Mathias Nyman
On 30.6.2020 14.46, Kai-Heng Feng wrote:
> 
>>
>> Added to my for-usb-next branch, (which I'll need to rebase on 5.8-rc1 once 
>> released)
> 
> Hmm, not seeing this patch from mainline, next or xhci tree..
> 

Apparently I never pushed it to kernel.org, now it should be in my for-usb-next 
branch

-Mathias


Re: XHCI vs PCM2903B/PCM2904 part 2

2020-06-30 Thread Mathias Nyman
On 30.6.2020 16.08, Rik van Riel wrote:
> I misread the code, it's not a bitfield, so state 1 means an endpoint marked 
> with running state. The next urb is never getting a response, though.
> 
> However, the xhci spec says an endpoint is halted upon a babble error.

I was looking at the same, so according to specs this state shouldn't be 
possible. 

> 
> The code right above the babble handling case adds halted into the endpoint 
> state itself. Does the code handling the babble error need to do something 
> similar to trigger cleanup elsewhere? 

It's a flag to prevent ringing the doorbell for a halted endpoint.
Anyway, reset endpoint is meant to recover an endpoint in a halted state.
Resetting non-halted endpoints will just lead to a context state error, and
besides, isoc endpoints shouldn't halt.

Anyways, I haven't got any better idea at the moment.
You can try and see what a forced reset does with:

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 2c255d0620b0..d79aca0df6d4 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1900,8 +1900,7 @@ static int xhci_requires_manual_halt_cleanup(struct 
xhci_hcd *xhci,
 * endpoint anyway.  Check if a babble halted the
 * endpoint.
 */
-   if (GET_EP_CTX_STATE(ep_ctx) == EP_STATE_HALTED)
-   return 1;
+   return 1;
 
return 0;
 }

Traces also showed thet endpoint doorbell was rang after th babble error, so
we know that didn't help restarting the endpoint.

-Mathias


Re: [PATCH 2/2] xhci: Poll for U0 after disabling USB2 LPM

2020-06-23 Thread Mathias Nyman
On 19.6.2020 17.19, Kai-Heng Feng wrote:
> Hi Mathias,
> 
>> On Jun 9, 2020, at 18:15, Kai-Heng Feng  wrote:
>>
>>
>>
>>> On Jun 8, 2020, at 19:21, Mathias Nyman  
>>> wrote:
>>>
>>> On 20.5.2020 13.18, Kai-Heng Feng wrote:
>>>> USB2 devices with LPM enabled may interrupt the system suspend:
>>>> [  932.510475] usb 1-7: usb suspend, wakeup 0
>>>> [  932.510549] hub 1-0:1.0: hub_suspend
>>>> [  932.510581] usb usb1: bus suspend, wakeup 0
>>>> [  932.510590] xhci_hcd :00:14.0: port 9 not suspended
>>>> [  932.510593] xhci_hcd :00:14.0: port 8 not suspended
>>>> ..
>>>> [  932.520323] xhci_hcd :00:14.0: Port change event, 1-7, id 7, 
>>>> portsc: 0x400e03
>>>
>>> 400e03 = Connected, Enabled, U0 with port ink state change flag (PLC) set.
>>>
>>>> ..
>>>> [  932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
>>>> [  932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
>>>> [  932.591418] PM: Device :00:14.0 failed to suspend async: error -16
>>>>
>>>> During system suspend, USB core will let HC suspends the device if it
>>>> doesn't have remote wakeup enabled and doesn't have any children.
>>>> However, from the log above we can see that the usb 1-7 doesn't get bus
>>>> suspended due to not in U0. After a while the port finished U2 -> U0
>>>> transition, interrupts the suspend process.
>>>
>>> In USB2 HW link PM the PLC flag should not be set in U2Exit -> U0 
>>> transitions.
>>> Only case we should see a port change event is U2Entry -> U0 due to STALL or
>>> error/timeout. (see xhci 4.23.5.1.1.1)
>>>
>>>>
>>>> The observation is that after disabling LPM, port doesn't transit to U0
>>>> immediately and can linger in U2. xHCI spec 4.23.5.2 states that the
>>>> maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for
>>>> the affected device is advertised as 400us, which is still not enough
>>>> based on my testing result.
>>>>
>>>> So let's use the maximum permitted latency, 1, to poll for U0
>>>> status to solve the issue.
>>>
>>> I don't recall all details, but it could be that disabling LPM before 
>>> suspend
>>> is unnecessary. 
>>> At least xhci should be able to set a port to U3 from U1 and U2 (see xhci 
>>> 4.15.1)
>>> so that is one change that could be done to xhci_bus_suspend()
>>
>> Yes, put the device to U3 directly does the trick.
> 
> As discussed, will you pick this series over the v2?
> Or is there anything I should improve for this one?
> 
> Kai-Heng

Added this to queueI

Thanks
-Mathias


Re: [PATCH v3 5/9] usb: xhci-pci: Add support for reset controllers

2020-06-15 Thread Mathias Nyman
On 12.6.2020 20.13, Nicolas Saenz Julienne wrote:
> Some atypical users of xhci-pci might need to manually reset their xHCI
> controller before starting the HCD setup. Check if a reset controller
> device is available to the PCI bus and trigger a reset.
> 
> Signed-off-by: Nicolas Saenz Julienne 

Acked-by: Mathias Nyman 


Re: [PATCH 1/2] xhci: Suspend ports to U3 directly from U1 or U2

2020-06-10 Thread Mathias Nyman
On 10.6.2020 18.43, Kai-Heng Feng wrote:
> 
> 
>> On Jun 10, 2020, at 22:32, Alan Stern  wrote:
>>
>> On Wed, Jun 10, 2020 at 02:42:30PM +0800, Kai-Heng Feng wrote:
>>> xHCI spec "4.15.1 Port Suspend" states that port can be put to U3 as long
>>> as Enabled bit is set and from U0, U1 or U2 state.
>>>
>>> Currently only USB_PORT_FEAT_LINK_STATE puts port to U3 directly, let's
>>> do the same for USB_PORT_FEAT_SUSPEND and bus suspend case.
>>>
>>> This is particularly useful for USB2 devices, which may take a very long
>>> time to switch USB2 LPM on and off.
>>
>> Have these two patches been tested with a variety of USB-2.0 and USB-2.1 
>> devices?
> 
> I tested some laptops around and they work fine.
> Only internally connected USB devices like USB Bluetooth and USB Camera have 
> USB2 LPM enabled, so this patch won't affect external connected devices.
> 

Took a fresh look at the USB2 side and it's not as clear as the USB3 case, where
we know the hub must support transition to U3 from any other state. [1]

Supporting link state transition to U3 (USB2 L2) from any other U state for 
USB2 seems
to be xHCI specific feature. xHC hardware will make sure it goes via the U0 
state.

I have no clue about other hosts (or hubs), USB2 LPM ECN just shows that link
state transitions to L1 or L2 should always goes via L0.
It's possible this has to be done in software by disabling USB2 LPM before 
suspending the device.

So I guess the original suggestion to wait for link state to reach U0 before
is a better solution. Sorry about my hasty suggestion.

Kai-Heng, does it help if you fist manually set the link to U0 before disabling
USB2 LPM (set PLS to 0 before clearing the HLE bit). Does it transition to U0
any faster, or get rid of the extra port event with PLC:U0?

-Mathias

 
[1]
USB3.1 spec (10.16.2.10) Set Port feature:   
   "If the value is 3, then host software wants to selectively suspend the
device connected to this port. The hub shall transition the link to U3
from any of the other U states using allowed link state transitions.
If the port is not already in the U0 state, then it shall transition the
port to the U0 state and then initiate the transition to U3.


Re: [PATCH 2/2] xhci: Poll for U0 after disabling USB2 LPM

2020-06-08 Thread Mathias Nyman
On 20.5.2020 13.18, Kai-Heng Feng wrote:
> USB2 devices with LPM enabled may interrupt the system suspend:
> [  932.510475] usb 1-7: usb suspend, wakeup 0
> [  932.510549] hub 1-0:1.0: hub_suspend
> [  932.510581] usb usb1: bus suspend, wakeup 0
> [  932.510590] xhci_hcd :00:14.0: port 9 not suspended
> [  932.510593] xhci_hcd :00:14.0: port 8 not suspended
> ..
> [  932.520323] xhci_hcd :00:14.0: Port change event, 1-7, id 7, portsc: 
> 0x400e03

400e03 = Connected, Enabled, U0 with port ink state change flag (PLC) set.

> ..
> [  932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
> [  932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
> [  932.591418] PM: Device :00:14.0 failed to suspend async: error -16
> 
> During system suspend, USB core will let HC suspends the device if it
> doesn't have remote wakeup enabled and doesn't have any children.
> However, from the log above we can see that the usb 1-7 doesn't get bus
> suspended due to not in U0. After a while the port finished U2 -> U0
> transition, interrupts the suspend process.

In USB2 HW link PM the PLC flag should not be set in U2Exit -> U0 transitions.
Only case we should see a port change event is U2Entry -> U0 due to STALL or
error/timeout. (see xhci 4.23.5.1.1.1)

> 
> The observation is that after disabling LPM, port doesn't transit to U0
> immediately and can linger in U2. xHCI spec 4.23.5.2 states that the
> maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for
> the affected device is advertised as 400us, which is still not enough
> based on my testing result.
> 
> So let's use the maximum permitted latency, 1, to poll for U0
> status to solve the issue.

I don't recall all details, but it could be that disabling LPM before suspend
is unnecessary. 
At least xhci should be able to set a port to U3 from U1 and U2 (see xhci 
4.15.1)
so that is one change that could be done to xhci_bus_suspend()

Also just noticed that we are not really checking L1S field in PORTPMSC 
register, 
this should tell if there was an issue with USB2 lpm state transitions, and
perhaps we should disable lpm for that device. 

Does the L1S field show anything unuaual in your case?
That could explain the port event with the PLC bit set.

I think we can avoid a readl_poll_timeout() solution in this case.

-Mathias


Re: [PATCH] xhci: Make debug message consistent with bus and port number

2020-06-08 Thread Mathias Nyman
On 8.6.2020 6.57, Kai-Heng Feng wrote:
> 
> 
>> On May 7, 2020, at 18:35, Mathias Nyman  
>> wrote:
>>
>> On 7.5.2020 11.21, Greg Kroah-Hartman wrote:
>>> On Thu, May 07, 2020 at 03:58:36PM +0800, Kai-Heng Feng wrote:
>>>>
>>>>
>>>>> On May 7, 2020, at 15:31, Greg Kroah-Hartman  
>>>>> wrote:
>>>>>
>>>>> On Thu, May 07, 2020 at 03:15:01PM +0800, Kai-Heng Feng wrote:
>>>>>>
>>>>>>
>>>>>>> On May 7, 2020, at 14:45, Greg Kroah-Hartman 
>>>>>>>  wrote:
>>>>>>>
>>>>>>> On Thu, May 07, 2020 at 02:17:55PM +0800, Kai-Heng Feng wrote:
>>>>>>>> Current xhci debug message doesn't always output bus number, so it's
>>>>>>>> hard to figure out it's from USB2 or USB3 root hub.
>>>>>>>>
>>>>>>>> In addition to that, some port numbers are offset to 0 and others are
>>>>>>>> offset to 1. Use the latter to match the USB core.
>>>>>>>>
>>>>>>>> So use "bus number - port index + 1" to make debug message consistent.
>>>>>>>>
>>>>>>>> Signed-off-by: Kai-Heng Feng 
>>>>>>>> ---
>>>>>>>> drivers/usb/host/xhci-hub.c | 41 +
>>>>>>>> 1 file changed, 23 insertions(+), 18 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
>>>>>>>> index f37316d2c8fa..83088c262cc4 100644
>>>>>>>> --- a/drivers/usb/host/xhci-hub.c
>>>>>>>> +++ b/drivers/usb/host/xhci-hub.c
>>>>>>>> @@ -1241,7 +1241,8 @@ int xhci_hub_control(struct usb_hcd *hcd, u16 
>>>>>>>> typeReq, u16 wValue,
>>>>>>>>temp = readl(ports[wIndex]->addr);
>>>>>>>>/* Disable port */
>>>>>>>>if (link_state == USB_SS_PORT_LS_SS_DISABLED) {
>>>>>>>> -  xhci_dbg(xhci, "Disable port %d\n", 
>>>>>>>> wIndex);
>>>>>>>> +  xhci_dbg(xhci, "Disable port %d-%d\n",
>>>>>>>> +   hcd->self.busnum, wIndex + 1);
>>>>>>>
>>>>>>> Shouldn't xhci_dbg() show the bus number already?  
>>>>>>
>>>>>> It's the PCI bus number, different to USB2/USB3 root hub bus number...
>>>>>
>>>>> But if this is using dev_dbg(), and it is, then you know how to look
>>>>> that up by seeing where that device is in sysfs at that point in time.
>>>>>
>>>>> So why add this again?
>>>>
>>>> xHCI has two HCD, one for USB2 and one for USB3.
>>>> If both of their port with same number are in use, for instance, port 1, 
>>>> then they are port 1-1 and port 2-1.
>>>> Right now the debug message only show "Port 1", we still can't find the 
>>>> corresponding port via sysfs with insufficient info.
>>>
>>> Look at the full kernel log line, the xhci hcd device should be showing
>>> you unique information.  If not, something else is wrong.
>>>
>>
>> What Kai-Heng suggest here makes sense, and is useful.
>> We use similar style debugging in other places, and it is helpful as it 
>> matches
>> usb core debugging style.
>>
>> This might seem odd but reason is that the xHC controller is one device which
>> doesn't really separate USB2 and USB3.
>> All ports are for example in one long array.
>>
>> On the xhci driver side things look very different. We register two HCD's,
>> one for usb 2 and one for USB 3. In many cases the debugging is not tied to 
>> a HCD
>> in any way,  (starting, stopping controller, command completion interrupts 
>> etc),
>> other cases the debugging is very much tied to a specific hcd,
>> for example when we are handling a port requsts for the roothub.
> 
> A gentle ping...
> 

Added to my for-usb-next branch, (which I'll need to rebase on 5.8-rc1 once 
released)

-Mathias



Re: [PATCH v4] usb: host: xhci-mtk: avoid runtime suspend when removing hcd

2020-06-04 Thread Mathias Nyman
On 4.6.2020 6.01, Macpaul Lin wrote:
> When runtime suspend was enabled, runtime suspend might happen
> when xhci is removing hcd. This might cause kernel panic when hcd
> has been freed but runtime pm suspend related handle need to
> reference it.
> 
> Signed-off-by: Macpaul Lin 
> Reviewed-by: Chunfeng Yun 
> Cc: sta...@vger.kernel.org
> ---
> Changes for v3:
>   - Replace better sequence for disabling the pm_runtime suspend.
> Changes for v4:
>   - Thanks for Sergei's review, typo in commit description has been corrected.
> 

Thanks, added to queue

-Mathias


Re: [PATCH] usb: host: xhci-mtk: avoid runtime suspend when removing hcd

2020-06-03 Thread Mathias Nyman
On 29.5.2020 7.29, Macpaul Lin wrote:
> When runtime suspend was enabled, runtime suspend might happened
> when xhci is removing hcd. This might cause kernel panic when hcd
> has been freed but runtime pm suspend related handle need to
> reference it.
> 
> Change-Id: I70a5dc8006207caeecbac6955ce8e5345dcc70e6
> Signed-off-by: Macpaul Lin 
> ---
>  drivers/usb/host/xhci-mtk.c |5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
> index bfbdb3c..641d24e 100644
> --- a/drivers/usb/host/xhci-mtk.c
> +++ b/drivers/usb/host/xhci-mtk.c
> @@ -587,6 +587,9 @@ static int xhci_mtk_remove(struct platform_device *dev)
>   struct xhci_hcd *xhci = hcd_to_xhci(hcd);
>   struct usb_hcd  *shared_hcd = xhci->shared_hcd;
>  
> + pm_runtime_put_sync(>dev);

Might runtime suspend here.
It's a lot better than before, no panic as hcd isn't released, but a bit 
unnecessary.

how about this sequence instead:
pm_runtime_disable()
pm_runtime_put_noidle()

> + pm_runtime_disable(>dev);
> +

-Mathias


Re: XHCI vs PCM2903B/PCM2904 part 2

2020-05-25 Thread Mathias Nyman
On 21.5.2020 6.45, Rik van Riel wrote:
> On Wed, 2020-05-20 at 16:34 -0400, Alan Stern wrote:
>> On Wed, May 20, 2020 at 03:21:44PM -0400, Rik van Riel wrote:
>>>
>>> Interesting. That makes me really curious why things are
>>> getting stuck, now...
>>
>> This could be a bug in xhci-hcd.  Perhaps the controller's endpoint 
>> state needs to be updated after one of these errors occurs.  Mathias 
>> will know all about that.
> 
> I am seeing something potentially interesting in the
> giant trace. First the final enqueue/dequeue before
> the babble error:
> 
>   -0 [005] d.s. 776367.638233: xhci_inc_enq: ISOC
> 33a6879e: enq 0x001014070420(0x00101407) deq
> 0x001014070360(0x00101407) segs 2 stream 0 free_trbs 497
> bounce 196 cycle 1
> 
> The next reference to 0x001014070360 is the babble error,
> and some info on the ISOC buffer itself:
> 
>   -0 [005] d.h. 776367.639187: xhci_handle_event:
> EVENT: TRB 001014070360 status 'Babble Detected' len 196 slot 15 ep
> 9 type 'Transfer Event' flags e:C
>   -0 [005] d.h. 776367.639195: xhci_handle_transfer:
> ISOC: Buffer 000e2676f400 length 196 TD size 0 intr 0 type 'Isoch'
> flags b:i:I:c:s:I:e:C
>n
> Immediately after the babble error, the next request is enqueued,
> and the doorbell is rung:
> 
>   -0 [005] d.h. 776367.639196: xhci_inc_deq: ISOC 
> 33a6879e: enq 0x001014070420(0x00101407) deq 
> 0x001014070370(0x00101407) segs 2 stream 0 free_trbs 498 bounce 
> 196 cycle 1
>   -0 [005] d.h. 776367.639197: xhci_urb_giveback: 
> ep4in-isoc: urb 72126553 pipe 135040 slot 15 length 196/196 sgs 0/0 
> stream 0 flags 0206
>   -0 [005] d.h. 776367.639197: xhci_inc_deq: EVENT 
> 97f84b16: enq 0x0010170b5000(0x0010170b5000) deq 
> 0x0010170b5670(0x0010170b5000) segs 1 stream 0 free_trbs 254 bounce 0 
> cycle 1
>   -0 [005] ..s. 776367.639212: xhci_urb_enqueue: 
> ep4in-isoc: urb 72126553 pipe 135040 slot 15 length 0/196 sgs 0/0 
> stream 0 flags 0206
>   -0 [005] d.s. 776367.639214: xhci_queue_trb: ISOC: Buffer 
> 000e2676f400 length 196 TD size 0 intr 0 type 'Isoch' flags 
> b:i:I:c:s:I:e:c
>   -0 [005] d.s. 776367.639214: xhci_inc_enq: ISOC 
> 33a6879e: enq 0x001014070430(0x00101407) deq 
> 0x001014070370(0x00101407) segs 2 stream 0 free_trbs 497 bounce 
> 196 cycle 1
>   -0 [005] d.s. 776367.639215: xhci_ring_ep_doorbell: Ring 
> doorbell for Slot 15 ep4in
> 
> However, after that point, no more xhci_handle_transfer: ISOC
> lines ar seen in the log. The doorbell line above is the last
> line in the log for ep4in.
> 
> Is this some area where USB3 and USB2 behave differently?

It acts as if the endpoint is no longer running.

If the endpoint would be halted then xhci_requires_manual_halt_cleanup() should
reset the endpoints and it would show in the traces.

Could you add the code below and take new traces, it will show the endpoint
state after the Babble error.

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 0fda0c0f4d31..373d89ef7275 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2455,6 +2455,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
case COMP_BABBLE_DETECTED_ERROR:
xhci_dbg(xhci, "Babble error for slot %u ep %u on endpoint\n",
 slot_id, ep_index);
+   trace_xhci_handle_tx_event(ep_ctx);
status = -EOVERFLOW;
break;
/* Completion codes for endpoint error state */
diff --git a/drivers/usb/host/xhci-trace.h b/drivers/usb/host/xhci-trace.h
index b19582b2a72c..5081df079f4a 100644
--- a/drivers/usb/host/xhci-trace.h
+++ b/drivers/usb/host/xhci-trace.h
@@ -360,6 +360,11 @@ DEFINE_EVENT(xhci_log_ep_ctx, xhci_add_endpoint,
TP_ARGS(ctx)
 );
 
+DEFINE_EVENT(xhci_log_ep_ctx, xhci_handle_tx_event,
+   TP_PROTO(struct xhci_ep_ctx *ctx),
+   TP_ARGS(ctx)
+);
+
 DECLARE_EVENT_CLASS(xhci_log_slot_ctx,
TP_PROTO(struct xhci_slot_ctx *ctx),
TP_ARGS(ctx),




Re: XHCI vs PCM2903B/PCM2904 part 2

2020-05-20 Thread Mathias Nyman
On 20.5.2020 14.26, Rik van Riel wrote:
> After a few more weeks of digging, I have come to the tentative
> conclusion that either the XHCI driver, or the USB sound driver,
> or both, fail to handle USB errors correctly.
> 
> I have some questions at the bottom, after a (brief-ish) explanation
> of exactly what seems to go wrong.
> 
> TL;DR: arecord from a misbehaving device can hang forever
> after a USB error, due to poll on /dev/snd/timer never returning.
> 
> The details: under some mysterious circumstances, the PCM290x
> family sound chips can send more data than expected during an
> isochronous transfer, leading to a babble error. Those
> circumstances seem to in part depend on the USB host controller
> and/or the electrical environment, since the chips work just
> fine for most people.
> 
> Receiving data past the end of the isochronous transfer window
> scheduled for a device results in the XHCI controller throwing
> a babble error, which moves the endpoint into halted state.
> 
> This is followed by the host controller software sending a
> reset endpoint command, and moving the endpoint into stopped
> state, as specified on pages 164-165 of the XHCI specification.

Note that isoch endpoints should generate buffer overrun instead of
babble detect error on TD babble conditions. 
See xHCI spec 6.4.5 additional note 115.

Or maybe a frame babble could halt an isoc endpoint, see xhci 4.10.2.4.1
but then you should see a port error and port going to disabled state.

Any logs of this?

mount -t debugfs none /sys/kernel/debug
echo 'module xhci_hcd =p' >/sys/kernel/debug/dynamic_debug/control
echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control
echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
< trigger the issue >
Send output of dmesg
Send content of /sys/kernel/debug/tracing/trace

> 
> However, the USB sound driver seems to have no idea that this
> error happened. The function retire_capture_urb looks at the
> status of each isochronous frame, but seems to be under the
> assumption that the sound device just keeps on running.
> 
> The function snd_complete_urb seems to only detect that the
> device is not running if usb_submit_urb returns a failure.
> 
> err = usb_submit_urb(urb, GFP_ATOMIC);
> if (err == 0)
> return;
> 
> usb_audio_err(ep->chip, "cannot submit urb (err = %d)\n", err);
> 
> if (ep->data_subs && ep->data_subs->pcm_substream) {
> substream = ep->data_subs->pcm_substream;
> snd_pcm_stop_xrun(substream);
> }
> 
> However, the XHCI driver will happily submit an URB to a
> stopped device. Looking at the call trace usb_submit_urb ->
> xhci_urb_enqueue -> xhci_queue_isoc_tx_prepare -> prepare_ring,
> you can see this code:
> 
> /* Make sure the endpoint has been added to xHC schedule */
> switch (ep_state) {
> ...
> case EP_STATE_HALTED:
> xhci_dbg(xhci, "WARN halted endpoint, queueing URB 
> anyway.\n");
> case EP_STATE_STOPPED:
> case EP_STATE_RUNNING:
> break;
> 
> This leads me to a few questions:
> - should retire_capture_urb call snd_pcm_stop_xrun,
>   or another function like it, if it sees certain
>   errors in the iso frame in the URB?
> - should snd_complete_urb do something with these
>   errors, too, in case they happen on the sync frames
>   and not the data frames?
> - does the XHCI code need to ring the doorbell when
>   submitting an URB to a stopped device, or is it
>   always up to the higher-level driver to fully reset
>   the device before it can do anything useful?

xhci driver should ring the doorbell.

xhci_queue_isoc_tx()
  giveback_first_trb()
xhci_ring_ep_doorbell()

when we are talking about babble or transaction errors the device might be
completely unaware of the situation. Device side of the endpoint is not 
necessarily halted.
xHC host will only halt its internal endpoint state, and it needs a reset
endopoint command from the xhci driver to clear the internal halt state. 

-Mathias

> - if a device in stopped state does not do anything
>   useful, should usb_submit_urb return an error?
> - how should the USB sound driver recover from these
>   occasional and/or one-off errors? stop the sound
>   stream, or try to reinitialize the device and start
>   recording again?
> 
> I am willing to write patches and can test with my
> setup, but both the sound code and the USB code are
> new to me so I would like to know what direction I
> should go in :)
> 



Re: [PATCH] xhci: Fix log mistake of xhci_start

2020-05-15 Thread Mathias Nyman
On 15.5.2020 8.45, jiahao wrote:
> It is obvious that XCHI_MAX_HALT_USEC is usec,
>  not milliseconds; Replace 'milliseconds' with
> 'usec' of the debug message.
> 
> Signed-off-by: jiahao 
> ---
>  drivers/usb/host/xhci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index bee5dec..d011472 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -147,7 +147,7 @@ int xhci_start(struct xhci_hcd *xhci)
>   STS_HALT, 0, XHCI_MAX_HALT_USEC);
>   if (ret == -ETIMEDOUT)
>   xhci_err(xhci, "Host took too long to start, "
> - "waited %u microseconds.\n",
> + "waited %u usec.\n",

It already says "microseconds", no need to change it

-Mathias



Re: [PATCH v13 0/5] usb: xhci: Add support for Renesas USB controllers

2020-05-13 Thread Mathias Nyman
On 13.5.2020 15.40, Greg Kroah-Hartman wrote:
> On Wed, May 13, 2020 at 03:19:29PM +0300, Mathias Nyman wrote:
>> On 6.5.2020 9.00, Vinod Koul wrote:
>>> This series add support for Renesas USB controllers uPD720201 and uPD720202.
>>> These require firmware to be loaded and in case devices have ROM those can
>>> also be programmed if empty. If ROM is programmed, it runs from ROM as well.
>>>
>>> This includes patches from Christian which supported these controllers w/o
>>> ROM and later my patches for ROM support and debugfs hook for rom erase and
>>> export of xhci-pci functions.
>>>
>>
>> First four patches look ok to me, but 5/5 that adds rom erase debugfs support
>> still needs some work.
>>
>> If you prefer I can take the first four, maybe we can get them to 5.8, and 
>> then
>> later add that debugs rom erase support.
>>
>> Let me know what you prefer
> 
> Oops, I just added all of these to my testing tree :)
> 
> What's wrong with the debugfs patch?  I can drop it, but it seemed
> simple enough to me.

Added "usb_renesas" directory under debugfs root when we have easily accessible
debugfs/usb/xhci directory to use as parent. 

Also not checking if adding directory failed (if debufs not supported)
 
-Mathias



Re: [PATCH v13 5/5] usb: xhci: provide a debugfs hook for erasing rom

2020-05-13 Thread Mathias Nyman
On 6.5.2020 9.00, Vinod Koul wrote:
> run "echo 1 > /sys/kernel/debug/renesas-usb/rom_erase" to erase
> firmware when driver is loaded.
> 
> Subsequent init of driver shall reload the firmware
> 
> Signed-off-by: Vinod Koul 
> ---
>  drivers/usb/host/xhci-pci-renesas.c | 33 +
>  1 file changed, 33 insertions(+)
> 
> diff --git a/drivers/usb/host/xhci-pci-renesas.c 
> b/drivers/usb/host/xhci-pci-renesas.c
> index f7d2445d30ec..fa32ec352dc8 100644
> --- a/drivers/usb/host/xhci-pci-renesas.c
> +++ b/drivers/usb/host/xhci-pci-renesas.c
> @@ -2,6 +2,7 @@
>  /* Copyright (C) 2019-2020 Linaro Limited */
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -170,6 +171,8 @@ static int renesas_fw_verify(const void *fw_data,
>   return 0;
>  }
>  
> +static void debugfs_init(struct pci_dev *pdev);
> +
>  static bool renesas_check_rom(struct pci_dev *pdev)
>  {
>   u16 rom_status;
> @@ -183,6 +186,7 @@ static bool renesas_check_rom(struct pci_dev *pdev)
>   rom_status &= RENESAS_ROM_STATUS_ROM_EXISTS;
>   if (rom_status) {
>   dev_dbg(>dev, "External ROM exists\n");
> + debugfs_init(pdev);
>   return true; /* External ROM exists */
>   }
>  
> @@ -449,6 +453,34 @@ static void renesas_rom_erase(struct pci_dev *pdev)
>   dev_dbg(>dev, "ROM Erase... Done success\n");
>  }
>  
> +static int debugfs_rom_erase(void *data, u64 value)
> +{
> + struct pci_dev *pdev = data;
> +
> + if (value == 1) {
> + dev_dbg(>dev, "Userspace requested ROM erase\n");
> + renesas_rom_erase(pdev);
> + return 0;
> + }
> + return -EINVAL;
> +}
> +DEFINE_DEBUGFS_ATTRIBUTE(rom_erase_ops, NULL, debugfs_rom_erase, "%llu\n");
> +
> +static struct dentry *debugfs_root;
> +
> +static void debugfs_init(struct pci_dev *pdev)
> +{
> + debugfs_root = debugfs_create_dir("renesas_usb", NULL);

This will create a renesas_usb directory right under debugfs root.
xhci has its own struct dentry xhci_debugfs_root; 
Use that as parent instead
 
Also note that debugs_create_dir() can fail, for example if debugfs isn't 
supported.

- Mathias



Re: [PATCH v13 0/5] usb: xhci: Add support for Renesas USB controllers

2020-05-13 Thread Mathias Nyman
On 6.5.2020 9.00, Vinod Koul wrote:
> This series add support for Renesas USB controllers uPD720201 and uPD720202.
> These require firmware to be loaded and in case devices have ROM those can
> also be programmed if empty. If ROM is programmed, it runs from ROM as well.
> 
> This includes patches from Christian which supported these controllers w/o
> ROM and later my patches for ROM support and debugfs hook for rom erase and
> export of xhci-pci functions.
> 

First four patches look ok to me, but 5/5 that adds rom erase debugfs support
still needs some work.

If you prefer I can take the first four, maybe we can get them to 5.8, and then
later add that debugs rom erase support.

Let me know what you prefer

-Mathias



Re: [PATCH v8 4/4] USB: pci-quirks: Add Raspberry Pi 4 quirk

2020-05-13 Thread Mathias Nyman
On 11.5.2020 17.25, Lorenzo Pieralisi wrote:
> On Tue, May 05, 2020 at 06:13:17PM +0200, Nicolas Saenz Julienne wrote:
>> On the Raspberry Pi 4, after a PCI reset, VL805's firmware may either be
>> loaded directly from an EEPROM or, if not present, by the SoC's
>> VideoCore. Inform VideoCore that VL805 was just reset.
>>
>> Also, as this creates a dependency between USB_PCI and VideoCore's
>> firmware interface, and since USB_PCI can't be set as a module neither
>> this can. Reflect that on the firmware interface Kconfg.
>>
>> Signed-off-by: Nicolas Saenz Julienne 
>> ---
>>
>> Changes since v5:
>>  - Fix Kconfig issue with allmodconfig
>>
>> Changes since v4:
>>  - Do not split up error message
>>
>> Changes since v3:
>>  - Add more complete error message
>>
>> Changes since v1:
>>  - Make RASPBERRYPI_FIRMWARE dependent on this quirk to make sure it
>>gets compiled when needed.
>>
>>  drivers/firmware/Kconfig  |  3 ++-
>>  drivers/usb/host/pci-quirks.c | 16 
>>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> Hi Mathias,
> 
> I would need your ACK to merge this series, thanks.
> 
> Lorenzo

Ah, yes, of course

Acked-by: Mathias Nyman 



Re: [PATCH] xhci: Make debug message consistent with bus and port number

2020-05-07 Thread Mathias Nyman
On 7.5.2020 11.21, Greg Kroah-Hartman wrote:
> On Thu, May 07, 2020 at 03:58:36PM +0800, Kai-Heng Feng wrote:
>>
>>
>>> On May 7, 2020, at 15:31, Greg Kroah-Hartman  
>>> wrote:
>>>
>>> On Thu, May 07, 2020 at 03:15:01PM +0800, Kai-Heng Feng wrote:


> On May 7, 2020, at 14:45, Greg Kroah-Hartman  
> wrote:
>
> On Thu, May 07, 2020 at 02:17:55PM +0800, Kai-Heng Feng wrote:
>> Current xhci debug message doesn't always output bus number, so it's
>> hard to figure out it's from USB2 or USB3 root hub.
>>
>> In addition to that, some port numbers are offset to 0 and others are
>> offset to 1. Use the latter to match the USB core.
>>
>> So use "bus number - port index + 1" to make debug message consistent.
>>
>> Signed-off-by: Kai-Heng Feng 
>> ---
>> drivers/usb/host/xhci-hub.c | 41 +
>> 1 file changed, 23 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
>> index f37316d2c8fa..83088c262cc4 100644
>> --- a/drivers/usb/host/xhci-hub.c
>> +++ b/drivers/usb/host/xhci-hub.c
>> @@ -1241,7 +1241,8 @@ int xhci_hub_control(struct usb_hcd *hcd, u16 
>> typeReq, u16 wValue,
>>  temp = readl(ports[wIndex]->addr);
>>  /* Disable port */
>>  if (link_state == USB_SS_PORT_LS_SS_DISABLED) {
>> -xhci_dbg(xhci, "Disable port %d\n", 
>> wIndex);
>> +xhci_dbg(xhci, "Disable port %d-%d\n",
>> + hcd->self.busnum, wIndex + 1);
>
> Shouldn't xhci_dbg() show the bus number already?  

 It's the PCI bus number, different to USB2/USB3 root hub bus number...
>>>
>>> But if this is using dev_dbg(), and it is, then you know how to look
>>> that up by seeing where that device is in sysfs at that point in time.
>>>
>>> So why add this again?
>>
>> xHCI has two HCD, one for USB2 and one for USB3.
>> If both of their port with same number are in use, for instance, port 1, 
>> then they are port 1-1 and port 2-1.
>> Right now the debug message only show "Port 1", we still can't find the 
>> corresponding port via sysfs with insufficient info.
> 
> Look at the full kernel log line, the xhci hcd device should be showing
> you unique information.  If not, something else is wrong.
> 

What Kai-Heng suggest here makes sense, and is useful.
We use similar style debugging in other places, and it is helpful as it matches
usb core debugging style.

This might seem odd but reason is that the xHC controller is one device which
doesn't really separate USB2 and USB3.
All ports are for example in one long array.

On the xhci driver side things look very different. We register two HCD's,
one for usb 2 and one for USB 3. In many cases the debugging is not tied to a 
HCD
in any way,  (starting, stopping controller, command completion interrupts etc),
other cases the debugging is very much tied to a specific hcd,
for example when we are handling a port requsts for the roothub.

Thanks
-Mathias




Re: [PATCH v12 2/5] usb: renesas-xhci: Add the renesas xhci driver

2020-05-04 Thread Mathias Nyman
On 30.4.2020 19.59, Vinod Koul wrote:
> From: Christian Lamparter 
> 
> This add a new driver for renesas xhci which is basically a firmware
> loader for uPD720201 and uPD720202 w/o ROM. The xhci-pci driver will
> invoke this driver for loading/unloading on relevant devices.
> 
> This patch adds a firmware loader for the uPD720201K8-711-BAC-A
> and uPD720202K8-711-BAA-A variant. Both of these chips are listed
> in Renesas' R19UH0078EJ0500 Rev.5.00 "User's Manual: Hardware" as
> devices which need the firmware loader on page 2 in order to
> work as they "do not support the External ROM".
> 
> The "Firmware Download Sequence" is describe in chapter
> "7.1 FW Download Interface" R19UH0078EJ0500 Rev.5.00 page 131.
> 
> The firmware "K2013080.mem" is available from a USB3.0 Host to
> PCIe Adapter (PP2U-E card) "Firmware download" archive. An
> alternative version can be sourced from Netgear's WNDR4700 GPL
> archives.
> 
> The release notes of the PP2U-E's "Firmware Download" ver 2.0.1.3
> (2012-06-15) state that the firmware is for the following devices:
>  - uPD720201 ES 2.0 sample whose revision ID is 2.
>  - uPD720201 ES 2.1 sample & CS sample & Mass product, ID is 3.
>  - uPD720202 ES 2.0 sample & CS sample & Mass product, ID is 2.
> 
> Signed-off-by: Christian Lamparter 
> Signed-off-by: Bjorn Andersson 
> [vkoul: fixed comments:
>   used macros for timeout count and delay
>   removed renesas_fw_alive_check
>   cleaned renesas_fw_callback
>   removed recursion for renesas_fw_download
>   add register defines and field names
>   move to a separate file
>   make fw loader as sync probe so that we execute in probe and
> prevent race
>   export symbols for xhci-pci to use]
> Signed-off-by: Vinod Koul 
> ---
>  drivers/usb/host/Makefile   |   2 +-
>  drivers/usb/host/xhci-pci-renesas.c | 365 
>  drivers/usb/host/xhci-pci.h |  16 ++
>  3 files changed, 382 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/usb/host/xhci-pci-renesas.c
>  create mode 100644 drivers/usb/host/xhci-pci.h
> 
> diff --git a/drivers/usb/host/Makefile b/drivers/usb/host/Makefile
> index b191361257cc..f3a5a2f01874 100644
> --- a/drivers/usb/host/Makefile
> +++ b/drivers/usb/host/Makefile
> @@ -70,7 +70,7 @@ obj-$(CONFIG_USB_OHCI_HCD_DAVINCI)  += ohci-da8xx.o
>  obj-$(CONFIG_USB_UHCI_HCD)   += uhci-hcd.o
>  obj-$(CONFIG_USB_FHCI_HCD)   += fhci.o
>  obj-$(CONFIG_USB_XHCI_HCD)   += xhci-hcd.o
> -obj-$(CONFIG_USB_XHCI_PCI)   += xhci-pci.o
> +obj-$(CONFIG_USB_XHCI_PCI)   += xhci-pci.o xhci-pci-renesas.o

Hmm, now we end up with two modules, xhci-pci and xhci-pci-renesas, even if
xhci-pci-renesas just includes helper functions to load firmware for renesas.

My kbuild knowledge is limited, but one way to solve this would be to 
rename xhci-pci.c to xhci-pci-core.c and add:

xhci-pci-y := xhci-pci-core.o xhci-pci-renesas.o

unless someone can suggest a better way to solve this

-Mathias



Re: [PATCH] xhci: Prevent runtime suspend all the time with XHCI_RESET_ON_RESUME quirk

2020-05-04 Thread Mathias Nyman
On 4.5.2020 13.02, Kai-Heng Feng wrote:
> 
> 
>> On May 4, 2020, at 17:47, Oliver Neukum  wrote:
>>
>> Am Montag, den 04.05.2020, 17:19 +0800 schrieb Kai-Heng Feng:
>>> Etron EJ168 USB 3.0 Host Controller stops working after S3, if it was
>>> runtime suspended previously:
>>> [  370.080359] pci :02:00.0: can't change power state from D3cold to D0 
>>> (config space inaccessible)
>>
>> Apparently this controller has issues with D3cold
>>
>>> [  370.080477] xhci_hcd :04:00.0: can't change power state from D3cold 
>>> to D0 (config space inaccessible)
>>> [  370.080532] pcieport :00:1c.0: DPC: containment event, status:0x1f05 
>>> source:0x0200
>>> [  370.080533] pcieport :00:1c.0: DPC: ERR_FATAL detected
>>> [  370.080536] xhci_hcd :04:00.0: can't change power state from D3hot 
>>> to D0 (config space inaccessible)
>>> [  370.080552] xhci_hcd :04:00.0: AER: can't recover (no error_detected 
>>> callback)
>>> [  370.080566] usb usb3: root hub lost power or was reset
>>> [  370.080566] usb usb4: root hub lost power or was reset
>>> [  370.080572] xhci_hcd :04:00.0: Host halt failed, -19
>>> [  370.080574] xhci_hcd :04:00.0: Host not accessible, reset failed.
>>> [  370.080575] xhci_hcd :04:00.0: PCI post-resume error -19!
>>> [  370.080586] xhci_hcd :04:00.0: HC died; cleaning up
>>>
>>> This can be fixed by not runtime suspend the controller at all.
>>>
>>> So instead of conditionally runtime suspend the controller, always
>>> prevent runtime suspend with XHCI_RESET_ON_RESUME quirk.
>>
>> What does that do to other controllers that can do runtime suspend
>> under the current scheme?
> 
> Ok, I'll add a new quirk specific to this controller.
> 
> Kai-Heng

Host shouldn't runtime suspend by default unless set by userspace, or
it has XHCI_DEFAULT_PM_RUNTIME_ALLOW quirk set.  

-Mathias


Re: [PATCH v11 2/5] usb: renesas-xhci: Add the renesas xhci driver

2020-04-30 Thread Mathias Nyman
On 30.4.2020 13.10, Vinod Koul wrote:
> From: Christian Lamparter 
> 
> This add a new driver for renesas xhci which is basically a firmware
> loader for uPD720201 and uPD720202 w/o ROM. It uses xhci-pci driver for
> most of the work.
> 
> This is added in Makefile before the xhci-pci driver so that it first
> get probed for renesas devices before the xhci-pci driver has a chance
> to claim any device.

Most of the above is no longer correct

> 
> This patch adds a firmware loader for the uPD720201K8-711-BAC-A
> and uPD720202K8-711-BAA-A variant. Both of these chips are listed
> in Renesas' R19UH0078EJ0500 Rev.5.00 "User's Manual: Hardware" as
> devices which need the firmware loader on page 2 in order to
> work as they "do not support the External ROM".
> 
> The "Firmware Download Sequence" is describe in chapter
> "7.1 FW Download Interface" R19UH0078EJ0500 Rev.5.00 page 131.
> 
> The firmware "K2013080.mem" is available from a USB3.0 Host to
> PCIe Adapter (PP2U-E card) "Firmware download" archive. An
> alternative version can be sourced from Netgear's WNDR4700 GPL
> archives.
> 
> The release notes of the PP2U-E's "Firmware Download" ver 2.0.1.3
> (2012-06-15) state that the firmware is for the following devices:
>  - uPD720201 ES 2.0 sample whose revision ID is 2.
>  - uPD720201 ES 2.1 sample & CS sample & Mass product, ID is 3.
>  - uPD720202 ES 2.0 sample & CS sample & Mass product, ID is 2.
> 
> Signed-off-by: Christian Lamparter 
> Signed-off-by: Bjorn Andersson 
> [vkoul: fixed comments:
>   used macros for timeout count and delay
>   removed renesas_fw_alive_check
>   cleaned renesas_fw_callback
>   removed recursion for renesas_fw_download
>   add register defines and field names
>   move to a separate file
>   make fw loader as sync probe so that we execute in probe and
> prevent race]
> Signed-off-by: Vinod Koul 
> ---
>  drivers/usb/host/Makefile   |   3 +-
>  drivers/usb/host/xhci-pci-renesas.c | 361 
>  drivers/usb/host/xhci-pci.h |  16 ++
>  3 files changed, 379 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/usb/host/xhci-pci-renesas.c
>  create mode 100644 drivers/usb/host/xhci-pci.h
> 
> diff --git a/drivers/usb/host/Makefile b/drivers/usb/host/Makefile
> index b191361257cc..c3a79f626393 100644
> --- a/drivers/usb/host/Makefile
> +++ b/drivers/usb/host/Makefile
> @@ -70,7 +70,8 @@ obj-$(CONFIG_USB_OHCI_HCD_DAVINCI)  += ohci-da8xx.o
>  obj-$(CONFIG_USB_UHCI_HCD)   += uhci-hcd.o
>  obj-$(CONFIG_USB_FHCI_HCD)   += fhci.o
>  obj-$(CONFIG_USB_XHCI_HCD)   += xhci-hcd.o
> -obj-$(CONFIG_USB_XHCI_PCI)   += xhci-pci.o
> +usb-xhci-pci-objs:= xhci-pci.o xhci-pci-renesas.o
> +obj-$(CONFIG_USB_XHCI_PCI)   += usb-xhci-pci.o

I don't think it's a good idea to rename the xhci-pci module to usb-xhci-pci

does

xhci-pci-y  := xhci-pci.o xhci-pci-renesas.o
obj-$(CONFIG_USB_XHCI_PCI)  += xhci-pci.o

cause some kbuild issues?

-Mathias


Re: [PATCH v10 3/5] usb: xhci: Add support for Renesas controller with memory

2020-04-30 Thread Mathias Nyman
On 30.4.2020 9.20, Vinod Koul wrote:
> On 29-04-20, 19:58, Vinod Koul wrote:
>> On 29-04-20, 16:53, Mathias Nyman wrote:
>>> On 24.4.2020 13.14, Vinod Koul wrote:
> 
>>>>/* Prevent runtime suspending between USB-2 and USB-3 initialization */
>>>>pm_runtime_get_noresume(>dev);
>>>> @@ -388,6 +401,9 @@ static void xhci_pci_remove(struct pci_dev *dev)
>>>>  {
>>>>struct xhci_hcd *xhci;
>>>>  
>>>> +  if (renesas_device)
>>>> +  renesas_xhci_pci_exit(dev);
>>>> +
>>>
>>> Ah, I see, what we really should do is make sure the quirks in the driver 
>>> data get
>>> added to xhci->quirks, and then just check for the correct quirk in 
>>> xhci_pci_remove.
>>
>> Ah sure that does sound better, I will update this as well and send an
>> update with these changes
> 
> This works for me.. But I have kept the code as in the xhci_pci_probe(),
> ofcourse removed bool renesas_device.

That's fine, xhci is just hcd->hcd_priv, and it doesn't exists before
usb_hcd_pci_probe() is called

usb_hcd_pci_probe()
  usb_create_hcd()
hcd = kzalloc(sizeof(*hcd) + driver->hcd_priv_size, GFP_KERNEL);

-Mathias




Re: [PATCH v10 4/5] usb: renesas-xhci: Add ROM loader for uPD720201

2020-04-29 Thread Mathias Nyman
On 24.4.2020 13.14, Vinod Koul wrote:
> uPD720201 supports ROM and allows software to program the ROM and boot
> from it. Add support for detecting if ROM is present, if so load the ROM
> if not programmed earlier.
> 
> Signed-off-by: Vinod Koul 
> Cc: Yoshihiro Shimoda 
> Cc: Christian Lamparter 
> ---
>  drivers/usb/host/xhci-pci-renesas.c | 342 +++-
>  1 file changed, 341 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/xhci-pci-renesas.c 
> b/drivers/usb/host/xhci-pci-renesas.c
> index 402e86912c9f..6bb537999754 100644
> --- a/drivers/usb/host/xhci-pci-renesas.c
> +++ b/drivers/usb/host/xhci-pci-renesas.c
> @@ -50,6 +50,22 @@
>  #define RENESAS_RETRY1
>  #define RENESAS_DELAY10
>  
> +#define ROM_VALID_01 0x2013
> +#define ROM_VALID_02 0x2026
> +
> +static int renesas_verify_fw_version(struct pci_dev *pdev, u32 version)
> +{
> + switch (version) {
> + case ROM_VALID_01:
> + case ROM_VALID_02:
> + return 0;
> + default:
> + dev_err(>dev, "FW has invalid version :%d\n", version);
> + return 1;
> + }
> + return -EINVAL;

This never returns -EINVAL
Maybe just get rid of the default case and print
the error message before returning

> +}
> +
>  static int renesas_fw_download_image(struct pci_dev *dev,
>const u32 *fw,
>size_t step)
> @@ -144,10 +160,84 @@ static int renesas_fw_verify(const void *fw_data,
>   return 0;
>  }
>  
> -static int renesas_fw_check_running(struct pci_dev *pdev)
> +static bool renesas_check_rom(struct pci_dev *pdev)
>  {
> + u16 rom_status;
> + int retval;
> +
> + /* Check if external ROM exists */
> + retval = pci_read_config_word(pdev, RENESAS_ROM_STATUS, _status);
> + if (retval)
> + return false;
> +
> + rom_status &= RENESAS_ROM_STATUS_ROM_EXISTS;
> + if (rom_status) {
> + dev_dbg(>dev, "External ROM exists\n");
> + return true; /* External ROM exists */
> + }
> +
> + return false;
> +}
> +
> +static int renesas_check_rom_state(struct pci_dev *pdev)
> +{
> + u16 rom_state;
> + u32 version;
>   int err;
> +
> + /* check FW version */
> + err = pci_read_config_dword(pdev, RENESAS_FW_VERSION, );
> + if (err)
> + return pcibios_err_to_errno(err);
> +
> + version &= RENESAS_FW_VERSION_FIELD;
> + version = version >> RENESAS_FW_VERSION_OFFSET;
> +
> + err = renesas_verify_fw_version(pdev, version);
> + if (err)
> + return err;
> +
> + /*
> +  * Test if ROM is present and loaded, if so we can skip everything
> +  */
> + err = pci_read_config_word(pdev, RENESAS_ROM_STATUS, _state);
> + if (err)
> + return pcibios_err_to_errno(err);
> +
> + if (rom_state & BIT(15)) {
> + /* ROM exists */
> + dev_dbg(>dev, "ROM exists\n");
> +
> + /* Check the "Result Code" Bits (6:4) and act accordingly */
> + switch (rom_state & RENESAS_ROM_STATUS_RESULT) {
> + case RENESAS_ROM_STATUS_SUCCESS:
> + return 0;
> +
> + case RENESAS_ROM_STATUS_NO_RESULT: /* No result yet */
> + return 0;
> +
> + case RENESAS_ROM_STATUS_ERROR: /* Error State */
> + default: /* All other states are marked as "Reserved states" */
> + dev_err(>dev, "Invalid ROM..");
> + break;
> + }
> + }
> +
> + return -EIO;
> +}
> +
> +static int renesas_fw_check_running(struct pci_dev *pdev)
> +{
>   u8 fw_state;
> + int err;
> +
> + /* Check if device has ROM and loaded, if so skip everything */
> + err = renesas_check_rom(pdev);
> + if (err) { /* we have rom */
> + err = renesas_check_rom_state(pdev);
> + if (!err)
> + return err;
> + }
>  
>   /*
>* Test if the device is actually needing the firmware. As most
> @@ -303,11 +393,261 @@ static int renesas_fw_download(struct pci_dev *pdev,
>   return 0;
>  }
>  
> +static void renesas_rom_erase(struct pci_dev *pdev)
> +{
> + int retval, i;
> + u8 status;
> +
> + dev_dbg(>dev, "Performing ROM Erase...\n");
> + retval = pci_write_config_dword(pdev, RENESAS_DATA0,
> + RENESAS_ROM_ERASE_MAGIC);
> + if (retval) {
> + dev_err(>dev, "ROM erase, magic word write failed: %d\n",
> + pcibios_err_to_errno(retval));
> + return;
> + }
> +
> + retval = pci_read_config_byte(pdev, RENESAS_ROM_STATUS, );
> + if (retval) {
> + dev_err(>dev, "ROM status read failed: %d\n",
> + pcibios_err_to_errno(retval));
> + return;
> + }
> + status |= RENESAS_ROM_STATUS_ERASE;
> + retval = pci_write_config_byte(pdev, 

Re: [PATCH v10 3/5] usb: xhci: Add support for Renesas controller with memory

2020-04-29 Thread Mathias Nyman
On 24.4.2020 13.14, Vinod Koul wrote:
> Some rensas controller like uPD720201 and uPD720202 need firmware to be
> loaded. Add these devices in table and invoke renesas firmware loader
> functions to check and load the firmware into device memory when
> required.
> 
> Signed-off-by: Vinod Koul 
> ---
>  drivers/usb/host/xhci-pci.c | 28 
>  drivers/usb/host/xhci.h |  1 +
>  2 files changed, 29 insertions(+)
> 
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index b6c2f5c530e3..f26cf072836d 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -15,6 +15,7 @@
>  
>  #include "xhci.h"
>  #include "xhci-trace.h"
> +#include "xhci-pci.h"
>  
>  #define SSIC_PORT_NUM2
>  #define SSIC_PORT_CFG2   0x880c
> @@ -319,6 +320,8 @@ static int xhci_pci_setup(struct usb_hcd *hcd)
>   return xhci_pci_reinit(xhci, pdev);
>  }
>  
> +static bool renesas_device;

hmm, we shouldn't need this

> +
>  /*
>   * We need to register our own PCI probe function (instead of the USB core's
>   * function) in order to create a second roothub under xHCI.
> @@ -328,6 +331,16 @@ static int xhci_pci_probe(struct pci_dev *dev, const 
> struct pci_device_id *id)
>   int retval;
>   struct xhci_hcd *xhci;
>   struct usb_hcd *hcd;
> + struct xhci_driver_data *driver_data;
> +
> + renesas_device = false;
> + driver_data = (struct xhci_driver_data *)id->driver_data;
> + if (driver_data && driver_data->quirks & XHCI_RENESAS_FW_QUIRK) {
> + retval = renesas_xhci_check_request_fw(dev, id);
> + if (retval)
> + return retval;
> + renesas_device = true;
> + }
>  
>   /* Prevent runtime suspending between USB-2 and USB-3 initialization */
>   pm_runtime_get_noresume(>dev);
> @@ -388,6 +401,9 @@ static void xhci_pci_remove(struct pci_dev *dev)
>  {
>   struct xhci_hcd *xhci;
>  
> + if (renesas_device)
> + renesas_xhci_pci_exit(dev);
> +

Ah, I see, what we really should do is make sure the quirks in the driver data 
get
added to xhci->quirks, and then just check for the correct quirk in 
xhci_pci_remove.

if (xhci->quirks & XHCI_RENESAS_FW_QUIRK)
renesas_xhci_pci_exit(dev);
 

Heikki Krogerus did some work on this a long time ago, below code is based on 
his
work. It needs to be added:
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index f26cf072836d..5ae4fc10fc31 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -88,8 +88,16 @@ static int xhci_pci_reinit(struct xhci_hcd *xhci, struct 
pci_dev *pdev)
 
 static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
 {
-   struct pci_dev  *pdev = to_pci_dev(dev);
+   struct pci_dev  *pdev = to_pci_dev(dev);
+   struct xhci_driver_data *driver_data;
+   const struct pci_device_id  *id;
 
+   id = pci_match_id(pdev->driver->id_table, pdev);
+
+   if (id && id->driver_data) {
+   driver_data = (struct xhci_driver_data *)id->driver_data;
+   xhci->quirks |= driver_data->quirks;
+   }
/* Look for vendor-specific quirks */
if (pdev->vendor == PCI_VENDOR_ID_FRESCO_LOGIC &&
(pdev->device == PCI_DEVICE_ID_FRESCO_LOGIC_PDK ||


-Mathias


Re: [PATCH] usb: xhci: fix Immediate Data Transfer endianness

2019-10-23 Thread Mathias Nyman

On 20.10.2019 4.53, Samuel Holland wrote:

The arguments to queue_trb are always byteswapped to LE for placement in
the ring, but this should not happen in the case of immediate data; the
bytes copied out of transfer_buffer are already in the correct order.
Add a complementary byteswap so the bytes end up in the ring correctly.

This was observed on BE ppc64 with a "Texas Instruments TUSB73x0
SuperSpeed USB 3.0 xHCI Host Controller [104c:8241]" as a ch341
usb-serial adapter ("1a86:7523 QinHeng Electronics HL-340 USB-Serial
adapter") always transmitting the same character (generally NUL) over
the serial link regardless of the key pressed.



Thanks, nice catch.

It's unfortunate that we ended up with a situation where this fix is the
least intrusive one.
With IDT we would just want to memcpy() bytes an not care about endianness,
but on BE we end up storing data bytes in a u64, and start with a complementary 
u64
byteswap to counter a later u32 byteswap done after splitting the u64 to upper
and lower 32 bit parts.

This because that TRB field is normally used for 64bit data buffer pointers,
and all code is written to support that

adding to queue

-Mathias


Re: [PATCH v3] usb: Add a new quirk to let buggy hub enable and disable LPM during suspend and resume

2019-10-21 Thread Mathias Nyman

On 18.10.2019 21.59, Greg Kroah-Hartman wrote:

On Thu, Oct 17, 2019 at 02:33:00PM +0800, Kai-Heng Feng wrote:




On Oct 4, 2019, at 03:04, Alan Stern  wrote:

On Fri, 4 Oct 2019, Kai-Heng Feng wrote:


Dell WD15 dock has a topology like this:
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 1M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/7p, 5000M
|__ Port 2: Dev 3, If 0, Class=Vendor Specific Class, Driver=r8152, 
5000M

Their IDs:
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 004 Device 002: ID 0424:5537 Standard Microsystems Corp.
Bus 004 Device 004: ID 0bda:8153 Realtek Semiconductor Corp.

Ethernet cannot be detected after plugging ethernet cable to the dock,
the hub and roothub get runtime resumed and runtime suspended
immediately:
...
[  433.315169] xhci_hcd :3a:00.0: hcd_pci_runtime_resume: 0
[  433.315204] usb usb4: usb auto-resume
[  433.315226] hub 4-0:1.0: hub_resume
[  433.315239] xhci_hcd :3a:00.0: Get port status 4-1 read: 0x10202e2, 
return 0x10343
[  433.315264] usb usb4-port1: status 0343 change 0001
[  433.315279] xhci_hcd :3a:00.0: clear port1 connect change, portsc: 
0x10002e2
[  433.315293] xhci_hcd :3a:00.0: Get port status 4-2 read: 0x2a0, return 
0x2a0
[  433.317012] xhci_hcd :3a:00.0: xhci_hub_status_data: stopping port 
polling.
[  433.422282] xhci_hcd :3a:00.0: Get port status 4-1 read: 0x10002e2, 
return 0x343

At this point the SMSC hub (usb 4-1) enters into compliance mode
(USB_SS_PORT_LS_COMP_MOD), and USB core tries to warm-reset it,

[  433.422307] usb usb4-port1: do warm reset
[  433.422311] usb 4-1: device reset not allowed in state 8
[  433.422339] hub 4-0:1.0: state 7 ports 2 chg 0002 evt 
[  433.422346] xhci_hcd :3a:00.0: Get port status 4-1 read: 0x10002e2, 
return 0x343
[  433.422356] usb usb4-port1: do warm reset
[  433.422358] usb 4-1: device reset not allowed in state 8
[  433.422428] xhci_hcd :3a:00.0: set port remote wake mask, actual port 0 
status  = 0xf0002e2
[  433.422455] xhci_hcd :3a:00.0: set port remote wake mask, actual port 1 
status  = 0xe0002a0
[  433.422465] hub 4-0:1.0: hub_suspend
[  433.422475] usb usb4: bus auto-suspend, wakeup 1
[  433.426161] xhci_hcd :3a:00.0: xhci_hub_status_data: stopping port 
polling.
[  433.466209] xhci_hcd :3a:00.0: port 0 polling in bus suspend, waiting
[  433.510204] xhci_hcd :3a:00.0: port 0 polling in bus suspend, waiting
[  433.554051] xhci_hcd :3a:00.0: port 0 polling in bus suspend, waiting
[  433.598235] xhci_hcd :3a:00.0: port 0 polling in bus suspend, waiting
[  433.642154] xhci_hcd :3a:00.0: port 0 polling in bus suspend, waiting
[  433.686204] xhci_hcd :3a:00.0: port 0 polling in bus suspend, waiting
[  433.730205] xhci_hcd :3a:00.0: port 0 polling in bus suspend, waiting
[  433.774203] xhci_hcd :3a:00.0: port 0 polling in bus suspend, waiting
[  433.818207] xhci_hcd :3a:00.0: port 0 polling in bus suspend, waiting
[  433.862040] xhci_hcd :3a:00.0: port 0 polling in bus suspend, waiting
[  433.862053] xhci_hcd :3a:00.0: xhci_hub_status_data: stopping port 
polling.
[  433.862077] xhci_hcd :3a:00.0: xhci_suspend: stopping port polling.
[  433.862096] xhci_hcd :3a:00.0: // Setting command ring address to 
0x8578fc001
[  433.862312] xhci_hcd :3a:00.0: hcd_pci_runtime_suspend: 0
[  433.862445] xhci_hcd :3a:00.0: PME# enabled
[  433.902376] xhci_hcd :3a:00.0: restoring config space at offset 0xc (was 
0x0, writing 0x20)
[  433.902395] xhci_hcd :3a:00.0: restoring config space at offset 0x4 (was 
0x10, writing 0x100403)
[  433.902490] xhci_hcd :3a:00.0: PME# disabled
[  433.902504] xhci_hcd :3a:00.0: enabling bus mastering
[  433.902547] xhci_hcd :3a:00.0: // Setting command ring address to 
0x8578fc001
[  433.902649] pcieport :00:1b.0: PME: Spurious native interrupt!
[  433.902839] xhci_hcd :3a:00.0: Port change event, 4-1, id 3, portsc: 
0xb0202e2
[  433.902842] xhci_hcd :3a:00.0: resume root hub
[  433.902845] xhci_hcd :3a:00.0: handle_port_status: starting port polling.
[  433.902877] xhci_hcd :3a:00.0: xhci_resume: starting port polling.
[  433.902889] xhci_hcd :3a:00.0: xhci_hub_status_data: stopping port 
polling.
[  433.902891] xhci_hcd :3a:00.0: hcd_pci_runtime_resume: 0
[  433.902919] usb usb4: usb wakeup-resume
[  433.902942] usb usb4: usb auto-resume
[  433.902966] hub 4-0:1.0: hub_resume
...

However the warm-reset never success, the asserted PCI PME keeps the
runtime-resume, warm-reset and runtime-suspend loop which never bring it back
and causing spurious interrupts floods.

After some trial and errors, the issue goes away if LPM on the SMSC hub
is disabled. Digging further, enabling and disabling LPM during runtime
resume and runtime suspend respectively can solve the issue.

So bring back the old LPM behavior as a quirk and use it for the SMSC
hub to solve the issue.

Fixes: 

Re: [PATCH] xhci: Don't use soft retry if slot id > 0

2019-10-14 Thread Mathias Nyman

On 13.10.2019 3.33, Bernhard Gebetsberger wrote:

According to the xhci specification(chapter 4.6.8.1) soft retry
shouldn't be used if the slot id is higher than 0. Currently some usb
devices break on some systems because soft retry is being used when
there is a transaction error, without checking the slot id.


That is not correct

Specs say that soft retry should not be used if we are dealing with a
FS/LS device behind a HS hub, this can be checked from the "TT Hub Slot ID"
field in the slot context, which we do.

In xhci all devices have a slot id, so this change would prevent
soft retry almost completely.

Specs 4.6.8.1:
"Soft Retry attempts shall not be performed if the device is behind a
TT in a HS Hub (i.e. TT Hub Slot ID > ‘0’)."

-Mathias


Re: [BISECTED] Suspend / USB broken on XPS 9370 + TB16

2019-10-07 Thread Mathias Nyman

On 7.10.2019 10.21, Carlo Caione wrote:

Hi,
I bisected an issue down to commit f7fac17ca925 "xhci: Convert xhci_handshake() to 
use readl_poll_timeout_atomic()".

Setup:
XPS 9370 + Thunderbolt dock Dell TB16

Issue:
The laptop is unable to go to sleep. It never really goes to sleep and after a 
few seconds the USB dies.

Log:
https://termbin.com/icix

Cheers!



Does the below patch help?  Greg just applied it to his usb-linus branch.

ac343366846a xhci: Increase STS_SAVE timeout in xhci_suspend()
Link: 
https://lore.kernel.org/r/1570190373-30684-8-git-send-email-mathias.ny...@linux.intel.com

-Mathias


Re: [PATCH v1 4/4] usb: host: xhci-tegra: Switch to use %ptT

2019-10-01 Thread Mathias Nyman

On 1.10.2019 14.56, Andy Shevchenko wrote:

On Fri, Jan 04, 2019 at 09:30:09PM +0200, Andy Shevchenko wrote:

Use %ptT instead of open coded variant to print content of
time64_t type in human readable format.


Any comments on this?


Untested, but looks ok to me.

xhci-tegra.c is written by Thierry Reding, so its more up to him





Cc: Mathias Nyman 
Cc: Thierry Reding 
Cc: Jonathan Hunter 
Signed-off-by: Andy Shevchenko 
---
  drivers/usb/host/xhci-tegra.c | 6 +-
  1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c
index 938ff06c0349..ed3eea3876e2 100644
--- a/drivers/usb/host/xhci-tegra.c
+++ b/drivers/usb/host/xhci-tegra.c
@@ -820,7 +820,6 @@ static int tegra_xusb_load_firmware(struct tegra_xusb 
*tegra)
const struct firmware *fw;
unsigned long timeout;
time64_t timestamp;
-   struct tm time;
u64 address;
u32 value;
int err;
@@ -925,11 +924,8 @@ static int tegra_xusb_load_firmware(struct tegra_xusb 
*tegra)
}
  
  	timestamp = le32_to_cpu(header->fwimg_created_time);

-   time64_to_tm(timestamp, 0, );
  
-	dev_info(dev, "Firmware timestamp: %ld-%02d-%02d %02d:%02d:%02d UTC\n",

-time.tm_year + 1900, time.tm_mon + 1, time.tm_mday,
-time.tm_hour, time.tm_min, time.tm_sec);
+   dev_info(dev, "Firmware timestamp: %ptT UTC\n", );
  
  	return 0;

  }
--
2.19.2







Re: [PATCH] xhci: Increase STS_SAVE timeout in xhci_suspend()

2019-09-27 Thread Mathias Nyman

On 26.9.2019 20.57, Kai-Heng Feng wrote:

After commit f7fac17ca925 ("xhci: Convert xhci_handshake() to use
readl_poll_timeout_atomic()"), ASMedia xHCI may fail to suspend.

Although the algorithms are essentially the same, the old max timeout is
(usec + usec * time of doing readl()), and the new max timeout is just
usec, which is much less than the old one.

Increase the timeout to make ASMedia xHCI able to suspend again.

BugLink: https://bugs.launchpad.net/bugs/1844021
Fixes: f7fac17ca925 ("xhci: Convert xhci_handshake() to use 
readl_poll_timeout_atomic()")
Signed-off-by: Kai-Heng Feng 
---


Thanks, adding to queue with stable tag for v5.2+ kernels

-Mathias


Re: [PATCH] xhci: fix memleak on setup address fails.

2019-08-30 Thread Mathias Nyman

On 26.8.2019 9.41, Ikjoon Jang wrote:

On Wed, Aug 14, 2019 at 9:57 PM Mathias Nyman
 wrote:


On 11.8.2019 11.22, Ikjoon Jang wrote:

Xhci re-enables a slot on transaction error in set_address using
xhci_disable_slot() + xhci_alloc_dev().

But in this case, xhci_alloc_dev() creates debugfs entries upon an
existing device without cleaning up old entries, thus memory leaks.

So this patch simply moves calling xhci_debugfs_free_dev() from
xhci_free_dev() to xhci_disable_slot().



Othwerwise this looks good, but xhci_alloc_dev() will call xhci_disable_slot()
in some failure cases before the slot debugfs entry is created.

In these cases xhci_debugfs_remove_slot() will be called without
xhci_debugfs_create_slot() ever being called.

This might not be an issue as xhci_debugfs_remove_slot() checks
if (!dev || !dev->debugfs_private) before doing anything, but should
be checked out.



I checked out the case by adding simple fault injection on xhci_alloc_dev(),
to simulate xhci_debugfs_remove_slot() can be called without
xhci_debugfs_create_slot() being called.



Thanks, patch sent forward

-Mathias


Re: [PATCH v2 4/7] usb: mtk-xhci: support ip-sleep wakeup for MT8183

2019-08-29 Thread Mathias Nyman

On 28.8.2019 10.34, Chunfeng Yun wrote:

Support USB wakeup by ip-sleep mode for MT8183, it's similar to
MT8173

Signed-off-by: Chunfeng Yun 


Acked-by: Mathias Nyman 


Re: [RESEND PATCH v2 2/2] usb: xhci-mtk: add an optional xhci_ck clock

2019-08-23 Thread Mathias Nyman

On 23.8.2019 9.40, Chunfeng Yun wrote:

Some SoCs may have an optional clock xhci_ck (125M or 200M), it
usually uses the same PLL as sys_ck, so support it.

Signed-off-by: Chunfeng Yun 


Acked-by: Mathias Nyman 


Re: [PATCH v4 2/4] usb: xhci: Use register defined and field names

2019-06-28 Thread Mathias Nyman

On 26.6.2019 10.55, Vinod Koul wrote:

Instead of using register values and fields lets define them and
use in the driver.

Signed-off-by: Vinod Koul 
Cc: Yoshihiro Shimoda 
Cc: Christian Lamparter 
Tested-by: Christian Lamparter 
---
  drivers/usb/host/xhci-pci.c | 60 ++---
  1 file changed, 43 insertions(+), 17 deletions(-)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 237df5c47fca..0f2574b42cb1 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -57,6 +57,27 @@
  #define PCI_DEVICE_ID_AMD_PROMONTORYA_1   0x43bc
  #define PCI_DEVICE_ID_ASMEDIA_1042A_XHCI  0x1142
  
+#define RENESAS_FW_VERSION0x6C

+#define RENESAS_ROM_CONFIG 0xF0
+#define RENESAS_FW_STATUS  0xF4
+#define RENESAS_FW_STATUS_MSB  0xF5
+#define RENESAS_ROM_STATUS 0xF6
+#define RENESAS_ROM_STATUS_MSB 0xF7
+#define RENESAS_DATA0  0xF8
+#define RENESAS_DATA1  0xFC
+
+#define RENESAS_FW_VERSION_FIELD   GENMASK(23, 7)
+#define RENESAS_FW_VERSION_OFFSET  8
+
+#define RENESAS_FW_STATUS_DOWNLOAD_ENABLE  BIT(0)
+#define RENESAS_FW_STATUS_LOCK BIT(1)
+#define RENESAS_FW_STATUS_RESULT   GENMASK(6, 4)
+  #define RENESAS_FW_STATUS_INVALID0
+  #define RENESAS_FW_STATUS_SUCCESSBIT(4)
+  #define RENESAS_FW_STATUS_ERROR  BIT(5)
+#define RENESAS_FW_STATUS_SET_DATA0BIT(8)
+#define RENESAS_FW_STATUS_SET_DATA1BIT(9)
+
  #define RENESAS_RETRY 1
  #define RENESAS_DELAY 10
  
@@ -347,7 +368,8 @@ static int renesas_fw_download_image(struct pci_dev *dev,
  
  	/* step+1. Read "Set DATAX" and confirm it is cleared. */

for (i = 0; i < RENESAS_RETRY; i++) {
-   err = pci_read_config_byte(dev, 0xF5, _status);
+   err = pci_read_config_byte(dev, RENESAS_FW_STATUS_MSB,
+  _status);
if (err)
return pcibios_err_to_errno(err);
if (!(fw_status & BIT(data0_or_data1)))
@@ -362,7 +384,8 @@ static int renesas_fw_download_image(struct pci_dev *dev,
 * step+2. Write FW data to "DATAX".
 * "LSB is left" => force little endian
 */
-   err = pci_write_config_dword(dev, data0_or_data1 ? 0xFC : 0xF8,
+   err = pci_write_config_dword(dev, data0_or_data1 ?
+RENESAS_DATA1 : RENESAS_DATA0,
 (__force u32)cpu_to_le32(fw[step]));
if (err)
return pcibios_err_to_errno(err);
@@ -370,7 +393,8 @@ static int renesas_fw_download_image(struct pci_dev *dev,
udelay(100);
  
  	/* step+3. Set "Set DATAX". */

-   err = pci_write_config_byte(dev, 0xF5, BIT(data0_or_data1));
+   err = pci_write_config_byte(dev, RENESAS_FW_STATUS_MSB,
+   BIT(data0_or_data1));
if (err)
return pcibios_err_to_errno(err);
  
@@ -440,7 +464,7 @@ static int renesas_fw_check_running(struct pci_dev *pdev)

 * BIOSes will initialize the device for us. If the device is
 * initialized.
 */
-   err = pci_read_config_byte(pdev, 0xF4, _state);
+   err = pci_read_config_byte(pdev, RENESAS_FW_STATUS, _state);
if (err)
return pcibios_err_to_errno(err);
  
@@ -449,10 +473,10 @@ static int renesas_fw_check_running(struct pci_dev *pdev)

 * ready we can simply continue. If the FW is not ready, we have
 * to give up.
 */
-   if (fw_state & BIT(1)) {
+   if (fw_state & RENESAS_FW_STATUS_LOCK) {
dev_dbg(>dev, "FW Download Lock is engaged.");
  
-		if (fw_state & BIT(4))

+   if (fw_state & RENESAS_FW_STATUS_SUCCESS)
return 0;
  
  		dev_err(>dev,

@@ -465,33 +489,33 @@ static int renesas_fw_check_running(struct pci_dev *pdev)
 * with it and it can't be resetted, we have to give up too... and
 * ask for a forgiveness and a reboot.
 */
-   if (fw_state & BIT(0)) {
+   if (fw_state & RENESAS_FW_STATUS_DOWNLOAD_ENABLE) {
dev_err(>dev,
"FW Download Enable is stale. Giving Up (poweroff/reboot 
needed).");
return -EIO;
}
  
  	/* Otherwise, Check the "Result Code" Bits (6:4) and act accordingly */

-   switch ((fw_state & 0x70)) {
+   switch (fw_state & RENESAS_FW_STATUS_RESULT) {
case 0: /* No result yet */
dev_dbg(>dev, "FW is not ready/loaded yet.");
  
  		/* tell the caller, that this device needs the firmware. */

return 1;
  
-	

Re: [PATCH v4 1/4] usb: xhci: add firmware loader for uPD720201 and uPD720202 w/o ROM

2019-06-28 Thread Mathias Nyman

On 26.6.2019 10.55, Vinod Koul wrote:

From: Christian Lamparter 

This patch adds a firmware loader for the uPD720201K8-711-BAC-A
and uPD720202K8-711-BAA-A variant. Both of these chips are listed
in Renesas' R19UH0078EJ0500 Rev.5.00 "User's Manual: Hardware" as
devices which need the firmware loader on page 2 in order to
work as they "do not support the External ROM".

The "Firmware Download Sequence" is describe in chapter
"7.1 FW Download Interface" R19UH0078EJ0500 Rev.5.00 page 131.

The firmware "K2013080.mem" is available from a USB3.0 Host to
PCIe Adapter (PP2U-E card) "Firmware download" archive. An
alternative version can be sourced from Netgear's WNDR4700 GPL
archives.

The release notes of the PP2U-E's "Firmware Download" ver 2.0.1.3
(2012-06-15) state that the firmware is for the following devices:
  - uPD720201 ES 2.0 sample whose revision ID is 2.
  - uPD720201 ES 2.1 sample & CS sample & Mass product, ID is 3.
  - uPD720202 ES 2.0 sample & CS sample & Mass product, ID is 2.

Cc: Yoshihiro Shimoda 
Signed-off-by: Christian Lamparter 
Signed-off-by: Bjorn Andersson 
[vkoul: fixed comments:
used macros for timeout count and delay
removed renesas_fw_alive_check
cleaned renesas_fw_callback
removed recurion for renesas_fw_download
added MODULE_FIRMWARE]
Tested-by: Christian Lamparter 
Signed-off-by: Vinod Koul 
---
  drivers/usb/host/xhci-pci.c | 454 
  1 file changed, 454 insertions(+)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index c2fe218e051f..237df5c47fca 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -12,6 +12,8 @@
  #include 
  #include 
  #include 
+#include 
+#include 
  
  #include "xhci.h"

  #include "xhci-trace.h"
@@ -55,6 +57,9 @@
  #define PCI_DEVICE_ID_AMD_PROMONTORYA_1   0x43bc
  #define PCI_DEVICE_ID_ASMEDIA_1042A_XHCI  0x1142
  
+#define RENESAS_RETRY	1

+#define RENESAS_DELAY  10
+
  static const char hcd_name[] = "xhci_hcd";
  
  static struct hc_driver __read_mostly xhci_pci_hc_driver;

@@ -279,6 +284,429 @@ static void xhci_pme_acpi_rtd3_enable(struct pci_dev *dev)
  static void xhci_pme_acpi_rtd3_enable(struct pci_dev *dev) { }
  #endif /* CONFIG_ACPI */
  
+static const struct renesas_fw_entry {

+   const char *firmware_name;
+   u16 device;
+   u8 revision;
+   u16 expected_version;
+} renesas_fw_table[] = {
+   /*
+* Only the uPD720201K8-711-BAC-A or uPD720202K8-711-BAA-A
+* are listed in R19UH0078EJ0500 Rev.5.00 as devices which
+* need the software loader.
+*
+* PP2U/ReleaseNote_USB3-201-202-FW.txt:
+*
+* Note: This firmware is for the following devices.
+*  - uPD720201 ES 2.0 sample whose revision ID is 2.
+*  - uPD720201 ES 2.1 sample & CS sample & Mass product, ID is 3.
+*  - uPD720202 ES 2.0 sample & CS sample & Mass product, ID is 2.
+*/
+   { "K2013080.mem", 0x0014, 0x02, 0x2013 },
+   { "K2013080.mem", 0x0014, 0x03, 0x2013 },
+   { "K2013080.mem", 0x0015, 0x02, 0x2013 },
+};
+
+MODULE_FIRMWARE("K2013080.mem");
+
+static const struct renesas_fw_entry *renesas_needs_fw_dl(struct pci_dev *dev)
+{
+   const struct renesas_fw_entry *entry;
+   size_t i;
+
+   /* This loader will only work with a RENESAS device. */
+   if (!(dev->vendor == PCI_VENDOR_ID_RENESAS))
+   return NULL;
+
+   for (i = 0; i < ARRAY_SIZE(renesas_fw_table); i++) {
+   entry = _fw_table[i];
+   if (entry->device == dev->device &&
+   entry->revision == dev->revision)
+   return entry;
+   }
+
+   return NULL;
+}
+
+static int renesas_fw_download_image(struct pci_dev *dev,
+const u32 *fw,
+size_t step)
+{
+   size_t i;
+   int err;
+   u8 fw_status;
+   bool data0_or_data1;
+
+   /*
+* The hardware does alternate between two 32-bit pages.
+* (This is because each row of the firmware is 8 bytes).
+*
+* for even steps we use DATA0, for odd steps DATA1.
+*/
+   data0_or_data1 = (step & 1) == 1;
+
+   /* step+1. Read "Set DATAX" and confirm it is cleared. */
+   for (i = 0; i < RENESAS_RETRY; i++) {
+   err = pci_read_config_byte(dev, 0xF5, _status);
+   if (err)
+   return pcibios_err_to_errno(err);
+   if (!(fw_status & BIT(data0_or_data1)))
+   break;
+
+   udelay(RENESAS_DELAY);
+   }
+   if (i == RENESAS_RETRY)
+   return -ETIMEDOUT;
+
+   /*
+* step+2. Write FW data to "DATAX".
+* "LSB is left" => force little endian
+*/
+   err = pci_write_config_dword(dev, data0_or_data1 ? 0xFC : 0xF8,
+

Re: [PATCH] xhci: clear port_remote_wakeup after resume failure

2019-06-04 Thread Mathias Nyman

On 27.5.2019 14.28, Nicolas Saenz Julienne wrote:

Hi Matthias,
thanks for the review.

On Mon, 2019-05-27 at 14:16 +0300, Mathias Nyman wrote:

On 24.5.2019 17.52, Nicolas Saenz Julienne wrote:

This was seen on a Dell Precision 5520 using it's WD15 dock. The dock's
Ethernet device interfaces with the laptop through one of it's USB3
ports. While idle, the Ethernet device and HCD are suspended by runtime
PM, being the only device connected on the bus. Then, both are resumed on
behalf of the Ethernet device, which has remote wake-up capabilities.

The Ethernet device was observed to randomly disconnect from the USB
port shortly after submitting it's remote wake-up request. Probably a
weird timing issue yet to be investigated. This causes runtime PM to
busyloop causing some tangible CPU load. The reason is the port gets
stuck in the middle of a remote wake-up operation, waiting for the
device to switch to U0. This never happens, leaving "port_remote_wakeup"
enabled, and automatically triggering a failure on any further suspend
operation.

This patch clears "port_remote_wakeup" upon detecting a device with a
wrong resuming port state (see Table 4-9 in 4.15.2.3). Making sure the
above mentioned situation doesn't trigger a PM busyloop.



There was a similar case where the USB3 link had transitioned to a
lower power U1 or U2 state after resume, before driver read the state,
leaving port_remote_wakeup flag uncleared.

This was fixed in 5.1 kernel by commit:

6cbcf59 xhci: Fix port resume done detection for SS ports with LPM enable

Can you check if you have it?
It should be in recent stable releases as well.


I was aware of that patch, unfortunately it doesn't address the same issue. In
my case I never get a second port status event (so no PLC == 1 or any state
change seen in PLS). The device simply disconnects from the bus.



I see, ok, then we need to clear the flag in the hub thread.

But to me it looks like this patch could cause a small race risk in the 
successful
device initiated resume cases.

If the hub thread, i.e. the get_port_status() function, notices the U0 state 
before
the interrupt handler, i.e. handle_port_status() function, then 
port_remote_wakeup
flag is cleared in the hub thread and the wakeup notification is never called 
from
handle_port_status().

Would it be enough to just check for (port_remote_wakeup flag && !PORT_CONNECT) 
in the hub thread?
USB3 PORT_CONNECT bit is lost in most error cases.

-Mathias


Re: [PATCH v11 2/2] usb: xhci: Add Clear_TT_Buffer

2019-06-03 Thread Mathias Nyman

On 3.6.2019 13.53, Jim Lin wrote:

USB 2.0 specification chapter 11.17.5 says "as part of endpoint halt
processing for full-/low-speed endpoints connected via a TT, the host
software must use the Clear_TT_Buffer request to the TT to ensure
that the buffer is not in the busy state".

In our case, a full-speed speaker (ConferenceCam) is behind a high-
speed hub (ConferenceCam Connect), sometimes once we get STALL on a
request we may continue to get STALL with the folllowing requests,
like Set_Interface.

Here we invoke usb_hub_clear_tt_buffer() to send Clear_TT_Buffer
request to the hub of the device for the following Set_Interface
requests to the device to get ACK successfully.

Signed-off-by: Jim Lin 
---
v2: xhci_clear_tt_buffer_complete: add static, shorter indentation
 , remove its claiming in xhci.h
v3: Add description for clearing_tt (xhci.h)
v4: Remove clearing_tt flag because hub_tt_work has hub->tt.lock
 to protect for Clear_TT_Buffer to be run serially.
 Remove xhci_clear_tt_buffer_complete as it's not necessary.
 Same reason as the above.
 Extend usb_hub_clear_tt_buffer parameter
v5: Not extending usb_hub_clear_tt_buffer parameter
 Add description.
v6: Remove unused parameter slot_id from xhci_clear_hub_tt_buffer
v7: Add devaddr field in "struct usb_device"
v8: split as two patches
v9: no change flag
v10: Add EP_CLEARING_TT flag
v11: Add spin_lock/unlock in xhci_clear_tt_buffer_complete


  drivers/usb/host/xhci-ring.c | 27 ++-
  drivers/usb/host/xhci.c  | 21 +
  drivers/usb/host/xhci.h  |  5 +
  3 files changed, 52 insertions(+), 1 deletion(-)



xhci parts look good to me,

In case Greg wants to pick up both the core changes (1/2) and
the xhci changes (2/2) at the same time:

Acked-by: Mathias Nyman 




Re: [PATCH] xhci: clear port_remote_wakeup after resume failure

2019-05-27 Thread Mathias Nyman

On 24.5.2019 17.52, Nicolas Saenz Julienne wrote:

This was seen on a Dell Precision 5520 using it's WD15 dock. The dock's
Ethernet device interfaces with the laptop through one of it's USB3
ports. While idle, the Ethernet device and HCD are suspended by runtime
PM, being the only device connected on the bus. Then, both are resumed on
behalf of the Ethernet device, which has remote wake-up capabilities.

The Ethernet device was observed to randomly disconnect from the USB
port shortly after submitting it's remote wake-up request. Probably a
weird timing issue yet to be investigated. This causes runtime PM to
busyloop causing some tangible CPU load. The reason is the port gets
stuck in the middle of a remote wake-up operation, waiting for the
device to switch to U0. This never happens, leaving "port_remote_wakeup"
enabled, and automatically triggering a failure on any further suspend
operation.

This patch clears "port_remote_wakeup" upon detecting a device with a
wrong resuming port state (see Table 4-9 in 4.15.2.3). Making sure the
above mentioned situation doesn't trigger a PM busyloop.



There was a similar case where the USB3 link had transitioned to a
lower power U1 or U2 state after resume, before driver read the state,
leaving port_remote_wakeup flag uncleared.

This was fixed in 5.1 kernel by commit:

6cbcf59 xhci: Fix port resume done detection for SS ports with LPM enable

Can you check if you have it?
It should be in recent stable releases as well.

-Mathias


Re: [RFC] xhci: clear port_remote_wakeup on device disconnection

2019-03-20 Thread Mathias Nyman

On 19.3.2019 16.54, Nicolas Saenz Julienne wrote:

Hi Oliver, thanks for the review!

On Tue, 2019-03-19 at 12:01 +0100, Oliver Neukum wrote:

On Mo, 2019-03-18 at 18:00 +0100, Nicolas Saenz Julienne wrote:

This patch clears "port_remote_wakeup" upon detecting a device
disconnection. Making sure the above mentioned situation doesn't trigger
a PM busyloop.


Hi,

that is an interesting race condition.



Turns out that port_remote_wakeup was only properly cleared in case of a
successful resume where port stays in U0, or in case of an error where port
goes to inactive state.

In all other cases port_remote_wakeup bit was left uncleared.


Signed-off-by: Nicolas Saenz Julienne 
---
  drivers/usb/host/xhci-hub.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index e2eece693655..bea853f45aec 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -942,6 +942,9 @@ static void xhci_get_usb3_port_status(struct xhci_port
*port, u32 *status,
bus_state->suspended_ports &= ~(1 << portnum);
}
  
+	if (!(portsc & PORT_CONNECT))

+   bus_state->port_remote_wakeup &= ~(1 << portnum);
+


A check like this should cover the failing resume cases, thanks

We also saw a case when a successful resume was missed as port went into U1
Link power save state (resume->U0->U1) before xhci driver read the port status.

A patch adding U1 and U2 to successful resume states is pending in my 
for-usb-linus
branch for this.



Why are you putting that logic into xhci_get_usb3_port_status()?
It looks to me like there is already something related in the caller


You're right, xhci_get_usb3_port_status() is not the ideal spot I'll move the
code there.



Sounds good, thanks

-Mathias


Re: Linux Kernel - problem

2019-03-04 Thread Mathias Nyman

On 4.3.2019 14.17, Cezary Lenkiewicz wrote:

Hello,
I have problem with Euphony system - this is linux for audio.
I talking with Euphony support but they say that the problem is in the linux 
kernel.

Please show my conversation with euphony:

https://euphony-audio.com/hesk/ticket.php?track=T4N-M6G-XHAE=missionfanboy%40gmail.com=49662

Can you fix this PLEASE ?


Feb 28 17:11:34 euphony kernel: xhci_hcd :04:00.0: ERROR: unexpected 
command completion code 0x11.
Feb 28 17:11:34 euphony kernel: usb 5-1.1: Not enough bandwidth for altsetting 1

Completion code 0x11 (17) is parameter error, this means some value in
a context parameter is invalid.

Many of these context parameter are taken from the descriptors of the
attached usb device  (device, interface, endpoint descriptors)

More details are needed, output of "lsusb -v", and dynamic debug of
xhci and usbcore would be a good start

-Mathias




Re: [PATCH] xhci: use iopoll for xhci_handshake

2019-02-28 Thread Mathias Nyman

On 28.2.2019 9.09, Greg Kroah-Hartman wrote:

On Wed, Feb 27, 2019 at 03:19:17PM -0700, Daniel Kurtz wrote:

In cases such as xhci_abort_cmd_ring(), xhci_handshake() is called with
a spin lock held (and local interrupts disabled) with a huge 5 second
timeout.  This can translates to 5 million calls to udelay(1).  By its
very nature, udelay() is not meant to be precise, it only guarantees to
delay a minimum of 1 microsecond. Therefore the actual delay of
xhci_handshake() can be significantly longer.  If the average udelay(1)
is greater than 2.2 us, the total time in xhci_handshake() - with
interrupts disabled can be > 11 seconds triggering the kernel's soft lockup
detector.

To avoid this, let's replace the open coded io polling loop with one from
iopoll.h that uses a loop timed with the more presumably reliable ktime
infrastructure.

Signed-off-by: Daniel Kurtz 


Looks sane to me, nice fixup.

Reviewed-by: Greg Kroah-Hartman 

Is this causing problems on older kernels/devices today such that we
should backport this?



A very similar patch was submitted some weeks ago by Andrey Smirnov.

https://lore.kernel.org/lkml/20190208014816.21869-1-andrew.smir...@gmail.com/

His commit message only mentions that readl_poll_timeout_atomic() does the same 
job,
not about any issues with the loop,  so I was going to send it forward to 
usb-next
after 5.1-rc (to 5.2).

-Mathias


Re: [PATCH 4.4 026/143] usb: hub: delay hub autosuspend if USB3 port is still link training

2019-02-19 Thread Mathias Nyman

On 18.2.2019 17.39, Alan Stern wrote:

On Mon, 18 Feb 2019, Greg Kroah-Hartman wrote:


4.4-stable review patch.  If anyone has any objections, please let me know.

--

[ Upstream commit e86108940e541febf35813402ff29fa6f4a9ac0b ]

When initializing a hub we want to give a USB3 port in link training
the same debounce delay time before autosuspening the hub as already
trained, connected enabled ports.

USB3 ports won't reach the enabled state with "current connect status" and
"connect status change" bits set until the USB3 link training finishes.

Catching the port in link training (polling) and adding the debounce delay
prevents unnecessary failed attempts to autosuspend the hub.

Signed-off-by: Mathias Nyman 
Acked-by: Alan Stern 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 


We should be careful with this commit; it has caused problems for some
people.  Mathias has been working to fix them, but this commit
shouldn't go into -stable until the fixes are also ready to go.



This commit should be fine, it gives ports in link training a bit more time
to finish. The commit causing issues is related, preventing runtime suspend
if ports are link training:

commit 2f31a67f01a8beb22cae754c53522cb61a005750
Author: Mathias Nyman 
Date:   Thu Nov 15 11:38:41 2018 +0200

usb: xhci: Prevent bus suspend if a port connect change or polling state is 
detected

Turns out it causes suspend issues on some MacBooks

-Mathias


Re: [RFC PATCH 0/2] Fix for the internal card reader and suspend on MacBooks

2019-02-14 Thread Mathias Nyman

On 13.2.2019 23.13, Ivan Mironov wrote:

Hi all,

There is a known problem on some MacBooks: internal card reader
disappears after the first suspend/resume and all subsequent attempts
to suspend laptop are failing.

This was reported[1][2] and even discussed in the mailing lists[3],
without any real solution. After trying various things[4], including
some existing quirks, I discovered that switching link state to DISABLED
before suspend both fixes the card reader device and allows any subsequent
suspend to succeed.

First patch adds code for this new quirk, and second patch enables this
quirk for card reader device which is used in my macbook.

I'm not really familiar with either USB standards or kernel code to
support them, so this patch series is RFC. I'm especially unsure with the
"resume" part, because I implemented it by trial and error mostly.
However, I'm using kernel with these patches and it works for me.

Also, feel free to suggest other kernel patches or existing quirks or
quirk combinations to fix the same problem.

Oh, and by the way: I've checked schematics of various macbooks available
on the internet. It seems that the actual chip is Genesys Logic GL3219,
probably just with the custom ID. What I found curious, is that USB 2.0
pins of this chip (D+ and D-) are not really connected anywhere, but
instead shorted through the resistor. Could it be possible that this
somehow messes up some logic inside the device, host controller or
linux kernel?



This card reader prevents second system suspend on latest kernels, see thread:
https://marc.info/?l=linux-usb=154816680816246=2

In that case the card reader fails to resume from usb3 U3 suspend state,
and ends up stuck in USB3 polling state, which now prevents suspend

Could you try a testpatch (attached) to see if it helps?

Thanks
-Mathias
>From 6b3b9f3d41b8ac9cf993bf4b88a20e30c99e3b7f Mon Sep 17 00:00:00 2001
From: Mathias Nyman 
Date: Thu, 14 Feb 2019 15:40:12 +0200
Subject: [PATCH] usb: warm reset USB3 ports stuck in polling

warm reset USB3 ports stuck in polling after 360ms.

In the polling state USB3 ports are link training, which should
not take longer than 360ms according to USB3 specification
tPollingLFPSTimeout value.

The card reader connected to xhci in MacBookPro is found stuck
in this state after resuming from suspend.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/core/hub.c | 36 +---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 8d4631c..448884d 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1151,9 +1151,10 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 		 */
 		if (hub_is_superspeed(hdev) &&
 		((portstatus & USB_PORT_STAT_LINK_STATE) ==
-		USB_SS_PORT_LS_POLLING))
+		USB_SS_PORT_LS_POLLING)) {
 			need_debounce_delay = true;
-
+			set_bit(port1, hub->event_bits);
+		}
 		/* Clear status-change flags; we'll debounce later */
 		if (portchange & USB_PORT_STAT_C_CONNECTION) {
 			need_debounce_delay = true;
@@ -2697,6 +2698,9 @@ static unsigned hub_is_wusb(struct usb_hub *hub)
 #define HUB_LONG_RESET_TIME	200
 #define HUB_RESET_TIMEOUT	800
 
+#define HUB_LFPS_TIME		24
+#define HUB_LFPS_TIMEOUT	360
+
 /*
  * "New scheme" enumeration causes an extra state transition to be
  * exposed to an xhci host and causes USB3 devices to receive control
@@ -2737,6 +2741,31 @@ static bool hub_port_warm_reset_required(struct usb_hub *hub, int port1,
 		|| link_state == USB_SS_PORT_LS_COMP_MOD;
 }
 
+static bool hub_port_stuck_in_polling(struct usb_hub *hub, int port1,
+  u16 portstatus)
+{
+	u16 link_state;
+	u16 portchange;
+	int lfps_delay = 0;
+
+	if (!hub_is_superspeed(hub->hdev))
+		return false;
+
+	link_state = portstatus & USB_PORT_STAT_LINK_STATE;
+
+	while (link_state == USB_SS_PORT_LS_POLLING) {
+		msleep(HUB_LFPS_TIME);
+
+		hub_port_status(hub, port1, , );
+		link_state = portstatus & USB_PORT_STAT_LINK_STATE;
+
+		lfps_delay += HUB_LFPS_TIME;
+		if (lfps_delay > HUB_LFPS_TIMEOUT)
+			return true;
+	}
+	return false;
+}
+
 static int hub_port_wait_reset(struct usb_hub *hub, int port1,
 			struct usb_device *udev, unsigned int delay, bool warm)
 {
@@ -5329,7 +5358,8 @@ static void port_event(struct usb_hub *hub, int port1)
 	 * Warm reset a USB3 protocol port if it's in
 	 * SS.Inactive state.
 	 */
-	if (hub_port_warm_reset_required(hub, port1, portstatus)) {
+	if (hub_port_warm_reset_required(hub, port1, portstatus) ||
+	hub_port_stuck_in_polling(hub, port1, portstatus)) {
 		dev_dbg(_dev->dev, "do warm reset\n");
 		if (!udev || !(portstatus & USB_PORT_STAT_CONNECTION)
 || udev->state == USB_STATE_NOTATTACHED) {
-- 
2.7.4



Re: [PATCH 1/6] dt-bindings: usb: xhci-tegra: Add Tegra186 support

2019-02-13 Thread Mathias Nyman

On 07.02.2019 13:14, Thierry Reding wrote:

On Fri, Jan 25, 2019 at 12:30:08PM +0100, Thierry Reding wrote:

From: Thierry Reding 

Extend the bindings to cover the set of features found in Tegra186.

Signed-off-by: Thierry Reding 
---
  .../devicetree/bindings/usb/nvidia,tegra124-xusb.txt  | 4 
  1 file changed, 4 insertions(+)


Hi Greg, Mathias,

do you have any comments on this series or is it good to go into v5.1?



xhci parts look good to me, (patches 2/6 and 3/6)
Sorry about the delay.

Acked-by: Mathias Nyman 


Re: [PATCH 2/2] drivers: xhci: Add quirk to reset xHCI port PHY

2019-02-08 Thread Mathias Nyman

On 07.02.2019 17:17, Srinath Mannam wrote:

Hi Mathias,

Thanks for review, please see my comments below inline.

On Thu, Feb 7, 2019 at 8:32 PM Mathias Nyman
 wrote:


On 05.02.2019 08:18, Srinath Mannam wrote:

Add a quirk to reset xHCI port PHY on port disconnect event.
Stingray USB HS PHY has an issue, that USB High Speed device detected
at Full Speed after the same port has connected to Full speed device.
This problem can be resolved with that port PHY reset on disconnect.

Signed-off-by: Srinath Mannam 
Reviewed-by: Ray Jui 
---
   drivers/usb/core/hcd.c   |  6 ++
   drivers/usb/core/phy.c   | 21 +
   drivers/usb/core/phy.h   |  1 +
   drivers/usb/host/xhci-plat.c |  3 +++
   drivers/usb/host/xhci-ring.c |  9 ++---
   drivers/usb/host/xhci.h  |  1 +
   include/linux/usb/hcd.h  |  1 +
   7 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 015b126..e2b87a6 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -2663,6 +2663,12 @@ int usb_hcd_find_raw_port_number(struct usb_hcd *hcd, 
int port1)
   return hcd->driver->find_raw_port_number(hcd, port1);
   }

+int usb_hcd_phy_port_reset(struct usb_hcd *hcd, int port)
+{
+ return usb_phy_roothub_port_reset(hcd->phy_roothub, port);
+}
+EXPORT_SYMBOL_GPL(usb_hcd_phy_port_reset);
+
   static int usb_hcd_request_irqs(struct usb_hcd *hcd,
   unsigned int irqnum, unsigned long irqflags)
   {
diff --git a/drivers/usb/core/phy.c b/drivers/usb/core/phy.c
index 38b2c77..c64767d 100644
--- a/drivers/usb/core/phy.c
+++ b/drivers/usb/core/phy.c
@@ -162,6 +162,27 @@ void usb_phy_roothub_power_off(struct usb_phy_roothub 
*phy_roothub)
   }
   EXPORT_SYMBOL_GPL(usb_phy_roothub_power_off);

+int usb_phy_roothub_port_reset(struct usb_phy_roothub *phy_roothub, int port)
+{
+ struct usb_phy_roothub *roothub_entry;
+ struct list_head *head;
+ int i = 0;
+
+ if (!phy_roothub)
+ return -EINVAL;
+
+ head = _roothub->list;
+
+ list_for_each_entry(roothub_entry, head, list) {
+ if (i == port)
+ return phy_reset(roothub_entry->phy);
+ i++;
+ }


I'm not that familiar with SoC's that have several PHYs per controller,
but this looks odd.

For the above code to work wouldn't it require that each port has their own PHY,
and the PHYs are added to the list of usb_phy_roothub is in the same order as 
usb ports?

Or is there something I don't understand here?


In our SOC (Stingray), xHCI controller has three ports and each port
connected to separate PHY.
Stingray xHCI controller supports both SS and HS ports and connected
separate PHYs.
We passed PHY phandlers in xHCI DT node in the order of port numbers.
as shown below xHCI DT node.
So that all PHYs added to usb_phy_roothub are in order of port numbers.
  xhci1: usb@11000 {
 compatible = "generic-xhci";
 reg = <0x00011000 0x1000>;
 interrupts = ;
 phys = <_phy1>, <>, <_phy0>;
 phy-names = "phy0", "phy1", "phy2";
 dma-coherent;
 status = "disabled";
};
But we have issue with HS PHYs, so that those PHYs are required to reset.


This is very specific to your SOC.
A quick grep shows most xhci controllers have one or two PHYs in total.

The suggested usb_phy_roothub_port_reset() above will only work for your SOC.
I think it would be better to have a stingray specific quirk/workaround where 
you
can find the right PHY to reset based on somthing like port number and PHY name.

For a generic solution we would need a better way to map usb ports to the PHY 
it uses.

-Mathias


Re: [PATCH 2/2] drivers: xhci: Add quirk to reset xHCI port PHY

2019-02-07 Thread Mathias Nyman

On 05.02.2019 08:18, Srinath Mannam wrote:

Add a quirk to reset xHCI port PHY on port disconnect event.
Stingray USB HS PHY has an issue, that USB High Speed device detected
at Full Speed after the same port has connected to Full speed device.
This problem can be resolved with that port PHY reset on disconnect.

Signed-off-by: Srinath Mannam 
Reviewed-by: Ray Jui 
---
  drivers/usb/core/hcd.c   |  6 ++
  drivers/usb/core/phy.c   | 21 +
  drivers/usb/core/phy.h   |  1 +
  drivers/usb/host/xhci-plat.c |  3 +++
  drivers/usb/host/xhci-ring.c |  9 ++---
  drivers/usb/host/xhci.h  |  1 +
  include/linux/usb/hcd.h  |  1 +
  7 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 015b126..e2b87a6 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -2663,6 +2663,12 @@ int usb_hcd_find_raw_port_number(struct usb_hcd *hcd, 
int port1)
return hcd->driver->find_raw_port_number(hcd, port1);
  }
  
+int usb_hcd_phy_port_reset(struct usb_hcd *hcd, int port)

+{
+   return usb_phy_roothub_port_reset(hcd->phy_roothub, port);
+}
+EXPORT_SYMBOL_GPL(usb_hcd_phy_port_reset);
+
  static int usb_hcd_request_irqs(struct usb_hcd *hcd,
unsigned int irqnum, unsigned long irqflags)
  {
diff --git a/drivers/usb/core/phy.c b/drivers/usb/core/phy.c
index 38b2c77..c64767d 100644
--- a/drivers/usb/core/phy.c
+++ b/drivers/usb/core/phy.c
@@ -162,6 +162,27 @@ void usb_phy_roothub_power_off(struct usb_phy_roothub 
*phy_roothub)
  }
  EXPORT_SYMBOL_GPL(usb_phy_roothub_power_off);
  
+int usb_phy_roothub_port_reset(struct usb_phy_roothub *phy_roothub, int port)

+{
+   struct usb_phy_roothub *roothub_entry;
+   struct list_head *head;
+   int i = 0;
+
+   if (!phy_roothub)
+   return -EINVAL;
+
+   head = _roothub->list;
+
+   list_for_each_entry(roothub_entry, head, list) {
+   if (i == port)
+   return phy_reset(roothub_entry->phy);
+   i++;
+   }


I'm not that familiar with SoC's that have several PHYs per controller,
but this looks odd.

For the above code to work wouldn't it require that each port has their own PHY,
and the PHYs are added to the list of usb_phy_roothub is in the same order as 
usb ports?

Or is there something I don't understand here?

-Mathias



Re: [PATCH] xhci: Use ffs() to find page size in xhci_mem_init()

2019-02-07 Thread Mathias Nyman

On 07.02.2019 12:58, Felipe Balbi wrote:


Hi,

Mathias Nyman  writes:

Get page size order using ffs() instead of open coding it with a loop.

Signed-off-by: Andrey Smirnov 
Cc: Mathias Nyman 
Cc: Greg Kroah-Hartman 
Cc: linux-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
drivers/usb/host/xhci-mem.c | 6 +-
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 36a3eb8849f1..44b43c3d819f 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2362,11 +2362,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
page_size = readl(>op_regs->page_size);
xhci_dbg_trace(xhci, trace_xhci_dbg_init,
"Supported page size register = 0x%x", page_size);
-   for (i = 0; i < 16; i++) {
-   if ((0x1 & page_size) != 0)
-   break;
-   page_size = page_size >> 1;
-   }
+   i = ffs(page_size);
if (i < 16)
xhci_dbg_trace(xhci, trace_xhci_dbg_init,
"Supported page size of %iK", (1 << (i+12)) / 1024);


Hi

using ffs() is a welcome change, but it will give different a result than the 
loop.

*old loop
 valid page_size value if i < 16
*ffs()
 valid page_size value if i >= 1 and i < 17


off-by-one, just use i = ffs() - 1. Or use __ffs().


and handle the page_size == 0 case.


Can it be zero in real life, or are you protecting against academic
possibility that's never going to happen in HW?



whole page_size check is not really doing much, just printing out
different debug messages.

-Mathias



Re: [PATCH] xhci: Use ffs() to find page size in xhci_mem_init()

2019-02-07 Thread Mathias Nyman

On 07.02.2019 11:06, Felipe Balbi wrote:


Hi,

Mathias Nyman  writes:

On 07.02.2019 02:03, Andrey Smirnov wrote:

Get page size order using ffs() instead of open coding it with a loop.

Signed-off-by: Andrey Smirnov 
Cc: Mathias Nyman 
Cc: Greg Kroah-Hartman 
Cc: linux-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
   drivers/usb/host/xhci-mem.c | 6 +-
   1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 36a3eb8849f1..44b43c3d819f 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2362,11 +2362,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
page_size = readl(>op_regs->page_size);
xhci_dbg_trace(xhci, trace_xhci_dbg_init,
"Supported page size register = 0x%x", page_size);
-   for (i = 0; i < 16; i++) {
-   if ((0x1 & page_size) != 0)
-   break;
-   page_size = page_size >> 1;
-   }
+   i = ffs(page_size);
if (i < 16)
xhci_dbg_trace(xhci, trace_xhci_dbg_init,
"Supported page size of %iK", (1 << (i+12)) / 1024);


Hi

using ffs() is a welcome change, but it will give different a result than the 
loop.

*old loop
valid page_size value if i < 16
*ffs()
valid page_size value if i >= 1 and i < 17


off-by-one, just use i = ffs() - 1. Or use __ffs().



and handle the page_size == 0 case.

-Mathias  


Re: [PATCH] xhci: Use ffs() to find page size in xhci_mem_init()

2019-02-07 Thread Mathias Nyman

On 07.02.2019 02:03, Andrey Smirnov wrote:

Get page size order using ffs() instead of open coding it with a loop.

Signed-off-by: Andrey Smirnov 
Cc: Mathias Nyman 
Cc: Greg Kroah-Hartman 
Cc: linux-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
  drivers/usb/host/xhci-mem.c | 6 +-
  1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 36a3eb8849f1..44b43c3d819f 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2362,11 +2362,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
page_size = readl(>op_regs->page_size);
xhci_dbg_trace(xhci, trace_xhci_dbg_init,
"Supported page size register = 0x%x", page_size);
-   for (i = 0; i < 16; i++) {
-   if ((0x1 & page_size) != 0)
-   break;
-   page_size = page_size >> 1;
-   }
+   i = ffs(page_size);
if (i < 16)
xhci_dbg_trace(xhci, trace_xhci_dbg_init,
"Supported page size of %iK", (1 << (i+12)) / 1024);


Hi

using ffs() is a welcome change, but it will give different a result than the 
loop.

*old loop
  valid page_size value if i < 16
*ffs()
  valid page_size value if i >= 1 and i < 17
 
-Mathias


 



Re: [PATCH] xhci: Convert xhci_handshake() to use readl_poll_timeout()

2019-02-07 Thread Mathias Nyman

On 07.02.2019 02:03, Andrey Smirnov wrote:

Xhci_handshake() implements the algorithm already captured by
readl_poll_timeout(). Convert the former to use the latter to avoid
repetition.


readl_poll_timeout() doesn't really work here as it might sleep.

iopoll.h:

/**
 * readx_poll_timeout - Periodically poll an address until a condition is met 
or a timeout occurs
 *
...  
 * Returns 0 on success and -ETIMEDOUT upon a timeout. In either

 * case, the last read value at @addr is stored in @val. Must not
 * be called from atomic context if sleep_us or timeout_us are used.
 
-Mathias


Re: kernel: xhci_hcd 0000:00:14.0: ERROR unknown event type 37 - Kernel 4.19.13

2019-01-24 Thread Mathias Nyman

On 10.01.2019 00:11, Nathan Royce wrote:

Wow, my system got wrecked (exaggeration) during this latest stretch...
Pulseaudio was stretched to the limit and beyond and was forced to
restart. Anything that was producing audio had to be restarted to get
it back.
This time was much like the first time and went from timestamp
573100.060927 (line 1) to 572506.604155 (line 11069), where 100%
(literally) of it was that event 37 in the journal, no other kernel
log entries except for the systemd-hostnamed audit before it all went
down.
And as usual, it was my USB TV tuner (tvheadend really) giving the
Poll Timeout log entries.
Those same uploaded trace files will be updated with the latest bugout.



Hi.

Finally had a chance to look at this. Sorry about the delay.

Logs show event ring is full:

573047.104801: xhci_handle_event: EVENT: TRB  status 'Event 
Ring Full Error' len 0 slot 0 ep 0 type 'Host Controller Event' flags e:C

It's filled with 0 length short transfer events due to a the following loop:

1. Class driver asks for 58658 bytes from device (queues BULK IN URB)
2. short transfer event, xhci interrupts, saying we got less than 58658 bytes.
   We actually got 0 bytes.
3. return URB with zero bytes to class driver
4. Class driver immediately queues a new URB, asking for 58658 bytes (see step 
1)

Last 6ms before event ring is full this looped 255 times.

one cycle of the loop in log:

573047.104748: xhci_handle_event: EVENT: TRB 00020b267770 status 'Short 
Packet' len 58658 slot 4 ep 7 type 'Transfer Event' flags e:C
573047.104749: xhci_handle_transfer: BULK: Buffer 1de2 length 58658 
TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c
573047.104749: xhci_inc_deq: BULK 02f14758: enq 
0x00020b267860(0x00020b267000) deq 
0x00020b267780(0x00020b267000) segs 2 stream 0 free_trbs 495 bounce 512 
cycle 0
573047.104752: xhci_urb_giveback: ep3in-bulk: urb 0bdcbe77 pipe 
3225519232 slot 4 length 0/58658 sgs 0/0 stream 0 flags 00010200
573047.104758: xhci_urb_enqueue: ep3in-bulk: urb 0bdcbe77 pipe 
3225519232 slot 4 length 0/58658 sgs 0/0 stream 0 flags 00010200
573047.104758: xhci_queue_trb: BULK: Buffer 1de2 length 58658 TD 
size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
573047.104759: xhci_inc_enq: BULK 02f14758: enq 
0x00020b267870(0x00020b267000) deq 
0x00020b267780(0x00020b267000) segs 2 stream 0 free_trbs 494 bounce 512 
cycle 0
573047.104759: xhci_inc_deq: EVENT 22f906c2: enq 
0x00020b317000(0x00020b317000) deq 
0x00020b3178d0(0x00020b317000) segs 1 stream 0 free_trbs 254 bounce 0 
cycle 1

I'll continue digging into this

-Mathias  







Re: kernel: xhci_hcd 0000:00:14.0: ERROR unknown event type 37 - Kernel 4.19.13

2019-01-02 Thread Mathias Nyman

On 01.01.2019 20:57, Nathan Royce wrote:

Kernel 4.19.13

00:14.0 USB controller: Intel Corporation 9 Series Chipset Family USB
xHCI Controller

Around 400 "unknown event type 37" messages logged in a 2 second span.
*
Jan 01 02:08:07 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
Jan 01 02:08:00 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
Jan 01 02:07:56 computername kernel: xhci_hcd :00:14.0: ERROR
unknown event type 37
...
Jan 01 02:07:55 computername kernel: xhci_hcd :00:14.0: ERROR
unknown event type 37
Jan 01 02:07:55 computername kernel: xhci_hcd :00:14.0: ERROR
unknown event type 37
Jan 01 02:07:52 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
Jan 01 02:07:44 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
*

I question whether this also caused kemleak to crash as well (will
post after this).

Regarding my tv tuner, it isn't supported by the kernel specifically,
but is close enough that all I have to do is alter a single source
file to include my device's pid, and it works just fine almost all of
the time.



The event type 37 is a host controller event, most likely a event ring full 
error.

So there are probably so many events that we fill the event ring before we can 
handle them.

Could you take traces of this?
Note that the trace file will be huge.

mount -t debugfs none /sys/kernel/debug
echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable

copy the traces somewhere safe once the error is triggered:
cp /sys/kernel/debug/tracing/trace  /

-Mathias


Re: [PATCH] xhci: fix 'broken_suspend' placement in struct xchi_hcd

2018-12-17 Thread Mathias Nyman

On 17.12.2018 15:37, Nicolas Saenz Julienne wrote:

As commented in the struct's definition there shouldn't be anything
underneath it's 'priv[0]' member as it would break some macros.

The patch converts the broken_suspend into a bit-field and relocates it
next to to the rest of bit-fields.

Fixes: a7d57abcc8a5 ("xhci: workaround CSS timeout on AMD SNPS 3.0 xHC")
Reported-by: Oliver Neukum  
Signed-off-by: Nicolas Saenz Julienne 
---
  drivers/usb/host/xhci.h | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index c3515bad5dbb..011dd45f8718 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1863,6 +1863,8 @@ struct xhci_hcd {
unsignedsw_lpm_support:1;
/* support xHCI 1.0 spec USB2 hardware LPM */
unsignedhw_lpm_support:1;
+   /* Broken Suspend flag for SNPS Suspend resume issue */
+   unsignedbroken_suspend:1;
/* cached usb2 extened protocol capabilites */
u32 *ext_caps;
unsigned intnum_ext_caps;
@@ -1880,8 +1882,6 @@ struct xhci_hcd {
void*dbc;
/* platform-specific data -- must come last */
unsigned long   priv[0] __aligned(sizeof(s64));
-   /* Broken Suspend flag for SNPS Suspend resume issue */
-   u8  broken_suspend;
  };
  
  /* Platform specific overrides to generic XHCI hc_driver ops */




Thanks, not sure how I missed that.

Greg, in case you want to pick this simple fix to 4.20 still:
Acked-by: Mathias Nyman 

Or prefer me to resend it?

-Mathias


Re: [PATCH 5/6] xhci: Use device_iommu_mapped()

2018-12-17 Thread Mathias Nyman

On 11.12.2018 15:43, Joerg Roedel wrote:

From: Joerg Roedel 

Replace the dev->iommu_group check with a proper function
call that better reprensents its purpose.

Cc: Mathias Nyman 
Acked-by: Robin Murphy 
Signed-off-by: Joerg Roedel 
---
  drivers/usb/host/xhci.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index dae3be1b9c8f..8eacd2ed412b 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -244,7 +244,7 @@ static void xhci_zero_64b_regs(struct xhci_hcd *xhci)
 * an iommu. Doing anything when there is no iommu is definitely
 * unsafe...
 */
-   if (!(xhci->quirks & XHCI_ZERO_64B_REGS) || !dev->iommu_group)
+   if (!(xhci->quirks & XHCI_ZERO_64B_REGS) || !device_iommu_mapped(dev))
return;
  
  	xhci_info(xhci, "Zeroing 64bit base registers, expecting fault\n");




Acked-by: Mathias Nyman 


Re: [PATCH 4.19 014/110] usb: xhci: Prevent bus suspend if a port connect change or polling state is detected

2018-12-13 Thread Mathias Nyman

On 13.12.2018 09:36, Greg Kroah-Hartman wrote:

On Wed, Dec 12, 2018 at 11:53:34PM +0100, Thomas Zeitlhofer wrote:

Hello,

On Thu, Nov 29, 2018 at 03:11:45PM +0100, Greg Kroah-Hartman wrote:

4.19-stable review patch.  If anyone has any objections, please let me
know.

--

From: Mathias Nyman 

commit 2f31a67f01a8beb22cae754c53522cb61a005750 upstream.

[...]

on a current Thinkpad X1 Yoga, this breaks resume from hibernate such
that opening the lid has (in the regular use case, see below) no effect
any more:

The system is configured to hibernate when the lid is closed. So, the
expected behavior, which is restored by reverting this patch, is:

close lid => system hibernates
open lid => system resumes

With this patch, the following two cases are observed:

1)
close lid => system hibernates
open lid => system stays off
press power button => system boots and resumes

2)
# systemctl hibernate => system hibernates
close lid
open lid => system resumes



So this is a problem in Linus's tree?  If so, let's work to get this
fixed there first.

If not, then we have other issues :)



That patch incorrectly reacts to USB2 polling states as well,
which could cause issues like this.

Does applying the below code help?

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 94aca1b..01b5818 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -1507,7 +1507,8 @@ int xhci_bus_suspend(struct usb_hcd *hcd)
portsc_buf[port_index] = 0;
 
/* Bail out if a USB3 port has a new device in link training */

-   if ((t1 & PORT_PLS_MASK) == XDEV_POLLING) {
+   if ((hcd->speed >= HCD_USB3) &&
+   (t1 & PORT_PLS_MASK) == XDEV_POLLING) {
bus_state->bus_suspended = 0;
spin_unlock_irqrestore(>lock, flags);
xhci_dbg(xhci, "Bus suspend bailout, port in 
polling\n");


Re: [PATCH v2] usb: xhci: fix small typo

2018-12-11 Thread Mathias Nyman

On 10.12.2018 18:34, Frank Lee wrote:

ping..



Thanks
Adding to queue for 4.22

-Mathias


Re: [PATCH] USB: quirks: add NO_LPM quirk for Logitech Flare|Meetup|Brio|Rally

2018-12-10 Thread Mathias Nyman

On 08.12.2018 00:18, Kyle Williams wrote:

On Tue, Dec 04, 2018 at 04:36:18PM -0500, Alan Stern wrote:

On Tue, 4 Dec 2018, Kyle Williams wrote:


Description: Some USB device / host controller combinations seem to have
problems with Link Power management. In particular it is described that
the combination of certain Logitech devices and other powered media
devices such as the Atrus device causes 'not enough bandwidth for
new device state'error.

This patch creates quirk entries for the tested Logitech device
indicating LPM should remain disabled for the device.

Signed-off-by: Kyle Williams 
---
  drivers/usb/core/quirks.c | 16 
  1 file changed, 16 insertions(+)

diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
index 0690fcff0ea2..9403edee4797 100644
--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -246,6 +246,22 @@ static const struct usb_device_id usb_quirk_list[] = {
 /* Logitech Harmony 700-series */
 { USB_DEVICE(0x046d, 0xc122), .driver_info = USB_QUIRK_DELAY_INIT },

+   /* Logitech Flare */
+   { USB_DEVICE(0x046d, 0x0876), .driver_info = USB_QUIRK_NO_LPM },


This entry is out of order with the preceding entry.  And some of the
new entries below are out of order with each other (entries are
supposed to be sorted by Vendor ID, then Product ID).

Also, perhaps instead of adding all these new entries, we should set
the NO_LPM quirk flag for all Logitech devices?

Alan Stern

Setting USB_QUIRK_NO_LPM for all Logitech devices instead of specific
ones seem to be a better solution as there are a lot of other devices
that have the issue as well

Kyle Williams


I recently found a cause for the "not enough bandwidth for new device state" 
error.
Patch just got applied to v4.20-rc6
0472bf0 xhci: Prevent U1/U2 link pm states if exit latency is too long

Does it work for your Logitech devices?

-Mathias



Re: [PATCH v3] xhci: Add quirk to workaround the errata seen on Cavium Thunder-X2 Soc

2018-10-30 Thread Mathias Nyman

On 30.10.2018 12:48, Cherian, George wrote:

Implement workaround for ThunderX2 Errata-129 (documented in
CN99XX Known Issues" available at Cavium support site).
As per ThunderX2errata-129, USB 2 device may come up as USB 1
if a connection to a USB 1 device is followed by another connection to
a USB 2 device, the link will come up as USB 1 for the USB 2 device.

Resolution: Reset the PHY after the USB 1 device is disconnected.
The PHY reset sequence is done using private registers in XHCI register
space. After the PHY is reset we check for the PLL lock status and retry
the operation if it fails. From our tests, retrying 4 times is sufficient.

Add a new quirk flag XHCI_RESET_PLL_ON_DISCONNECT to invoke the workaround
in handle_xhci_port_status().

Signed-off-by: George Cherian 


Thanks, adding to queue

-Mathias


Re: [PATCH v3] xhci: Add quirk to workaround the errata seen on Cavium Thunder-X2 Soc

2018-10-30 Thread Mathias Nyman

On 30.10.2018 12:48, Cherian, George wrote:

Implement workaround for ThunderX2 Errata-129 (documented in
CN99XX Known Issues" available at Cavium support site).
As per ThunderX2errata-129, USB 2 device may come up as USB 1
if a connection to a USB 1 device is followed by another connection to
a USB 2 device, the link will come up as USB 1 for the USB 2 device.

Resolution: Reset the PHY after the USB 1 device is disconnected.
The PHY reset sequence is done using private registers in XHCI register
space. After the PHY is reset we check for the PLL lock status and retry
the operation if it fails. From our tests, retrying 4 times is sufficient.

Add a new quirk flag XHCI_RESET_PLL_ON_DISCONNECT to invoke the workaround
in handle_xhci_port_status().

Signed-off-by: George Cherian 


Thanks, adding to queue

-Mathias


Re: [PATCH V2 0/5] Tegra xHCI genpd support

2018-10-16 Thread Mathias Nyman

On 15.10.2018 15:40, Jon Hunter wrote:

Hi Mathias,

On 28/09/18 15:11, Jon Hunter wrote:

This series add genpd support for the Tegra xHCI device that requires
more than one power-domain to operate.

This series is making changes to more than one subsystem and at first
glance may look like a maintainers nightmare. However, my proposal is
this, once reviewed and people are happy ...
1. Patches 1-3 should be merged first. Patches 1 can go via the Tegra
tree and 2-3 via the USB tree.
2. Once patches 1-3 are accepted, then 4 and 5 can be merged via the
Tegra tree.

Changes since V1:
- Dropped unneeded patch to export symbol for genpd API
- Added patch to power-off XUSB domains
- Slight re-work to Tegra xHCI driver changes to simplify clean-up path.

Jon Hunter (5):
   dt-bindings: usb: xhci-tegra: Add power-domain details
   usb: xhci: tegra: Power-off power-domains on removal
   usb: xhci: tegra: Add genpd support
   soc/tegra: pmc: Don't power-up XUSB power-domains
   arm64: dts: tegra210: Add power-domains for xHCI


Let me know you are ok to pick up patches 2 and 3 for v4.20.



Patches 2 and 3 look good to me.
I'll resend them to Greg

-Mathias


Re: [PATCH V2 0/5] Tegra xHCI genpd support

2018-10-16 Thread Mathias Nyman

On 15.10.2018 15:40, Jon Hunter wrote:

Hi Mathias,

On 28/09/18 15:11, Jon Hunter wrote:

This series add genpd support for the Tegra xHCI device that requires
more than one power-domain to operate.

This series is making changes to more than one subsystem and at first
glance may look like a maintainers nightmare. However, my proposal is
this, once reviewed and people are happy ...
1. Patches 1-3 should be merged first. Patches 1 can go via the Tegra
tree and 2-3 via the USB tree.
2. Once patches 1-3 are accepted, then 4 and 5 can be merged via the
Tegra tree.

Changes since V1:
- Dropped unneeded patch to export symbol for genpd API
- Added patch to power-off XUSB domains
- Slight re-work to Tegra xHCI driver changes to simplify clean-up path.

Jon Hunter (5):
   dt-bindings: usb: xhci-tegra: Add power-domain details
   usb: xhci: tegra: Power-off power-domains on removal
   usb: xhci: tegra: Add genpd support
   soc/tegra: pmc: Don't power-up XUSB power-domains
   arm64: dts: tegra210: Add power-domains for xHCI


Let me know you are ok to pick up patches 2 and 3 for v4.20.



Patches 2 and 3 look good to me.
I'll resend them to Greg

-Mathias


Re: [PATCH] usb: xhci: dbc: Don't decrement runtime PM counter if DBC is not started

2018-06-25 Thread Mathias Nyman

On 25.06.2018 08:20, Kai-Heng Feng wrote:

pm_runtime_put_sync() gets called everytime in xhci_dbc_stop().

If dbc is not started, this makes the runtime PM counter incorrectly
becomes 0, and calls autosuspend function. Then we'll keep seeing this:
[54664.762220] xhci_hcd :00:14.0: Root hub is not suspended

So only calls pm_runtime_put_sync() when dbc was started.

Signed-off-by: Kai-Heng Feng 


Thanks, nice catch.
Adding to queue

-Mathias



Re: [PATCH] usb: xhci: dbc: Don't decrement runtime PM counter if DBC is not started

2018-06-25 Thread Mathias Nyman

On 25.06.2018 08:20, Kai-Heng Feng wrote:

pm_runtime_put_sync() gets called everytime in xhci_dbc_stop().

If dbc is not started, this makes the runtime PM counter incorrectly
becomes 0, and calls autosuspend function. Then we'll keep seeing this:
[54664.762220] xhci_hcd :00:14.0: Root hub is not suspended

So only calls pm_runtime_put_sync() when dbc was started.

Signed-off-by: Kai-Heng Feng 


Thanks, nice catch.
Adding to queue

-Mathias



Re: [PATCH] xhci: Fix USB ports for Dell Inspiron 5775

2018-04-10 Thread Mathias Nyman

On 10.04.2018 07:56, Kai Heng Feng wrote:

Hi Matthias,


On Mar 18, 2018, at 11:11 PM, Kai-Heng Feng  wrote:

The Dell Inspiron 5775 is a Raven Ridge. The Enable Slot command timed
out when a USB device gets plugged:
[ 212.156326] xhci_hcd :03:00.3: Error while assigning device slot ID
[ 212.156340] xhci_hcd :03:00.3: Max number of devices this xHCI host 
supports is 64.
[ 212.156348] usb usb2-port3: couldn't allocate usb_device

AMD suggests that a delay before xHC suspends can fix the issue.

I can confirm it fixes the issue, so use the suspend delay quirk for
Raven Ridge's xHC.


I am hoping this patch can get merged in v4.17...

Thanks,
Kai-Heng



Added to queue, sending after rc1

-Mathias


Re: [PATCH] xhci: Fix USB ports for Dell Inspiron 5775

2018-04-10 Thread Mathias Nyman

On 10.04.2018 07:56, Kai Heng Feng wrote:

Hi Matthias,


On Mar 18, 2018, at 11:11 PM, Kai-Heng Feng  wrote:

The Dell Inspiron 5775 is a Raven Ridge. The Enable Slot command timed
out when a USB device gets plugged:
[ 212.156326] xhci_hcd :03:00.3: Error while assigning device slot ID
[ 212.156340] xhci_hcd :03:00.3: Max number of devices this xHCI host 
supports is 64.
[ 212.156348] usb usb2-port3: couldn't allocate usb_device

AMD suggests that a delay before xHC suspends can fix the issue.

I can confirm it fixes the issue, so use the suspend delay quirk for
Raven Ridge's xHC.


I am hoping this patch can get merged in v4.17...

Thanks,
Kai-Heng



Added to queue, sending after rc1

-Mathias


Re: Intel GemniLake xHCI connected devices can never wake up the system from suspend

2018-03-16 Thread Mathias Nyman

Adding Rafael directly to CC

In short, if _S3D and _S3W are missing in DSDT then a PCI device
stays in D0 during suspend in Linux, but goes to D3 in Windows.

USB wake doesn't work in Geminilake because of this.

Should this be changed? reasoning below.

On 16.03.2018 10:23, Daniel Drake wrote:

I've studied the ACPI spec trying to understand better, but I'm
struggling with the question:
What is the maximum number (lowest power) permitted device power state
for a device that is configured as able to wake the system from S3,
**that does not implement the _S3W method**?


Actually the ACPI spec has an answer for the case when _S3D is present.
The lack of clarity is only over the situation when both _S3D and _S3W
are missing - like on the platforms being worked on here.

The _S3D docs say:

If the device can wake the system from the S3 system sleeping state (see
_PRW) then the device must support wake in the D-state returned by this
object. However, OSPM cannot assume wake from the S3 system sleeping state
is supported in any deeper D-state unless specified by a corresponding
_S3W object


Looking at the design of the existing Linux code, it seems like this
"max = min" assignment that is causing us trouble originates directly
from an attempt to implement that logic: if we didn't get a response from
_S3W, then we must clamp ourselves to the data we got from _S3D.

If I modify the Linux code to be a little more specific in that logic
(only applying when we actually got something from _S3D) then the
problematic behaviour is avoided and USB wakeups work.

I feel that this change makes the Linux implementation more directly
mirror the wording in the ACPI spec and it's associated lack of clarity
for when both methods are missing. Thoughts?

---
  drivers/acpi/device_pm.c | 11 ---
  1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index a4c8ad98560d..44f12c5c75ee 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -543,6 +543,7 @@ static int acpi_dev_pm_get_state(struct device *dev, struct 
acpi_device *adev,
unsigned long long ret;
int d_min, d_max;
bool wakeup = false;
+   acpi_status sxd_status;
acpi_status status;
  
  	/*

@@ -565,8 +566,8 @@ static int acpi_dev_pm_get_state(struct device *dev, struct 
acpi_device *adev,
 * provided if AE_NOT_FOUND is returned.
 */
ret = d_min;
-   status = acpi_evaluate_integer(handle, method, NULL, );
-   if ((ACPI_FAILURE(status) && status != AE_NOT_FOUND)
+   sxd_status = acpi_evaluate_integer(handle, method, NULL, );
+   if ((ACPI_FAILURE(sxd_status) && sxd_status != AE_NOT_FOUND)
|| ret > ACPI_STATE_D3_COLD)
return -ENODATA;
  
@@ -599,7 +600,11 @@ static int acpi_dev_pm_get_state(struct device *dev, struct acpi_device *adev,

method[3] = 'W';
status = acpi_evaluate_integer(handle, method, NULL, );
if (status == AE_NOT_FOUND) {
-   if (target_state > ACPI_STATE_S0)
+   /* No _SxW. In this case, the ACPI spec says that we
+* must not go into any power state deeper than the
+* value returned from _SxD.
+*/
+   if (sxd_status == AE_OK && target_state > ACPI_STATE_S0)
d_max = d_min;
} else if (ACPI_SUCCESS(status) && ret <= ACPI_STATE_D3_COLD) {
/* Fall back to D3cold if ret is not a valid state. */





Re: Intel GemniLake xHCI connected devices can never wake up the system from suspend

2018-03-16 Thread Mathias Nyman

Adding Rafael directly to CC

In short, if _S3D and _S3W are missing in DSDT then a PCI device
stays in D0 during suspend in Linux, but goes to D3 in Windows.

USB wake doesn't work in Geminilake because of this.

Should this be changed? reasoning below.

On 16.03.2018 10:23, Daniel Drake wrote:

I've studied the ACPI spec trying to understand better, but I'm
struggling with the question:
What is the maximum number (lowest power) permitted device power state
for a device that is configured as able to wake the system from S3,
**that does not implement the _S3W method**?


Actually the ACPI spec has an answer for the case when _S3D is present.
The lack of clarity is only over the situation when both _S3D and _S3W
are missing - like on the platforms being worked on here.

The _S3D docs say:

If the device can wake the system from the S3 system sleeping state (see
_PRW) then the device must support wake in the D-state returned by this
object. However, OSPM cannot assume wake from the S3 system sleeping state
is supported in any deeper D-state unless specified by a corresponding
_S3W object


Looking at the design of the existing Linux code, it seems like this
"max = min" assignment that is causing us trouble originates directly
from an attempt to implement that logic: if we didn't get a response from
_S3W, then we must clamp ourselves to the data we got from _S3D.

If I modify the Linux code to be a little more specific in that logic
(only applying when we actually got something from _S3D) then the
problematic behaviour is avoided and USB wakeups work.

I feel that this change makes the Linux implementation more directly
mirror the wording in the ACPI spec and it's associated lack of clarity
for when both methods are missing. Thoughts?

---
  drivers/acpi/device_pm.c | 11 ---
  1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index a4c8ad98560d..44f12c5c75ee 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -543,6 +543,7 @@ static int acpi_dev_pm_get_state(struct device *dev, struct 
acpi_device *adev,
unsigned long long ret;
int d_min, d_max;
bool wakeup = false;
+   acpi_status sxd_status;
acpi_status status;
  
  	/*

@@ -565,8 +566,8 @@ static int acpi_dev_pm_get_state(struct device *dev, struct 
acpi_device *adev,
 * provided if AE_NOT_FOUND is returned.
 */
ret = d_min;
-   status = acpi_evaluate_integer(handle, method, NULL, );
-   if ((ACPI_FAILURE(status) && status != AE_NOT_FOUND)
+   sxd_status = acpi_evaluate_integer(handle, method, NULL, );
+   if ((ACPI_FAILURE(sxd_status) && sxd_status != AE_NOT_FOUND)
|| ret > ACPI_STATE_D3_COLD)
return -ENODATA;
  
@@ -599,7 +600,11 @@ static int acpi_dev_pm_get_state(struct device *dev, struct acpi_device *adev,

method[3] = 'W';
status = acpi_evaluate_integer(handle, method, NULL, );
if (status == AE_NOT_FOUND) {
-   if (target_state > ACPI_STATE_S0)
+   /* No _SxW. In this case, the ACPI spec says that we
+* must not go into any power state deeper than the
+* value returned from _SxD.
+*/
+   if (sxd_status == AE_OK && target_state > ACPI_STATE_S0)
d_max = d_min;
} else if (ACPI_SUCCESS(status) && ret <= ACPI_STATE_D3_COLD) {
/* Fall back to D3cold if ret is not a valid state. */





Re: Intel GemniLake xHCI connected devices can never wake up the system from suspend

2018-03-15 Thread Mathias Nyman

On 15.03.2018 15:28, Chris Chiu wrote:

On Thu, Mar 15, 2018 at 7:46 PM, Mathias Nyman
<mathias.ny...@linux.intel.com> wrote:

On 15.03.2018 06:40, Chris Chiu wrote:


Hi,
  I  have a ASUS AIO V222GA and another Acer Desktop XC-830 both
have Intel CPU J5005 and they both hit the same problem. The XHCI
connected USB keyboard/mouse can never wakeup the system from suspend.
It reminds me that similiar thing happens on ApolloLake too which
needs the XHCI_PME_STUCK_QUIRK to get it work. It's also mentioned in

https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/pentium-celeron-n-series-spec-update.pdf
page #14 for N-seris intel CPU. And I also find the same problem
description in
https://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/silver-celeron-spec-update.pdf
page #16 for J-series Intel CPU. Seems that they have different
workaround so I can not simply apply XHCI_PME_STUCK_QUIRK to make it
work.

  Anyone can help here?



N-Series
CHP8: USB xHCI Controller May Not Re-Enter D3 State After a USB Wake Event
- needs XHCI_PME_STUCK_QUIRK in driver (sets bit 28 at offset 80a4)

Intel® Pentium® Silver N5000
Intel® Pentium® Silver J5005
Intel® Celeron® N4000 and N4100
Intel® Celeron® J4105 and J4005

USB xHCI Controller May Not Re-enter a D3 State After a USB Wake Even
Need to clear PME_EN bit of of the standard PCI PM_CSR register.
I think Linux does this anyway (clears enabling PME when reaching D0)
So if I remember correct there was no specific workaround needed for this.

what is the PCI ID of your xhci controller? (lspci -nn)


They're both 8086:31a8. So you mean from the workaround description, it should
work w/o any extra code? "Software should clear bit 8 (PME_EN) of PM_CSR" has
been handled somewhere else?



One other possible cause is that xHCI never reaches PCI device D3 suspend
state during system suspend.
xHC can't generate PME# wake event from normal running PCI device D0 state.

PCI code in Linux will check with ACPI about the lowest possible D state
when suspending,
If there is something missing from the xHCI entry in ACPI DSDT table it
might select D0.
as the suspend state, causing wake failure.



Here's the DSDT ASL of the ASUS machine.
https://gist.github.com/mschiu77/8e9c8a0e5a98b70a6dfff45620340bf1


Scope (_SB.PCI0.XHC) has _PS0 method, so Linux will look for a _S3W to get the
lowest possible D state in S3, but_S3W is missing, so Linux pci-acpi code will
probably default to D0




ASUS said the BIOS has no problem on USB wakeup under Windows so I don't think
there's any update. Anything else could be cause for this?


Linux and Windows probably check different DSDT values

You can try to force your xHC to D3 during suspend with a hack:

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 7815768..da46c52 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -488,6 +488,10 @@ static pci_power_t acpi_pci_choose_state(struct pci_dev 
*pdev)
else
d_max = ACPI_STATE_D3_COLD;
acpi_state = acpi_pm_device_sleep_state(>dev, NULL, d_max);
+
+   if (pdev->device == 0x31a8)
+   acpi_state = ACPI_STATE_D3_HOT;
+
if (acpi_state < 0)
return PCI_POWER_ERROR;
 
and see if usb wake starts to work


-Mathias


Re: Intel GemniLake xHCI connected devices can never wake up the system from suspend

2018-03-15 Thread Mathias Nyman

On 15.03.2018 15:28, Chris Chiu wrote:

On Thu, Mar 15, 2018 at 7:46 PM, Mathias Nyman
 wrote:

On 15.03.2018 06:40, Chris Chiu wrote:


Hi,
  I  have a ASUS AIO V222GA and another Acer Desktop XC-830 both
have Intel CPU J5005 and they both hit the same problem. The XHCI
connected USB keyboard/mouse can never wakeup the system from suspend.
It reminds me that similiar thing happens on ApolloLake too which
needs the XHCI_PME_STUCK_QUIRK to get it work. It's also mentioned in

https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/pentium-celeron-n-series-spec-update.pdf
page #14 for N-seris intel CPU. And I also find the same problem
description in
https://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/silver-celeron-spec-update.pdf
page #16 for J-series Intel CPU. Seems that they have different
workaround so I can not simply apply XHCI_PME_STUCK_QUIRK to make it
work.

  Anyone can help here?



N-Series
CHP8: USB xHCI Controller May Not Re-Enter D3 State After a USB Wake Event
- needs XHCI_PME_STUCK_QUIRK in driver (sets bit 28 at offset 80a4)

Intel® Pentium® Silver N5000
Intel® Pentium® Silver J5005
Intel® Celeron® N4000 and N4100
Intel® Celeron® J4105 and J4005

USB xHCI Controller May Not Re-enter a D3 State After a USB Wake Even
Need to clear PME_EN bit of of the standard PCI PM_CSR register.
I think Linux does this anyway (clears enabling PME when reaching D0)
So if I remember correct there was no specific workaround needed for this.

what is the PCI ID of your xhci controller? (lspci -nn)


They're both 8086:31a8. So you mean from the workaround description, it should
work w/o any extra code? "Software should clear bit 8 (PME_EN) of PM_CSR" has
been handled somewhere else?



One other possible cause is that xHCI never reaches PCI device D3 suspend
state during system suspend.
xHC can't generate PME# wake event from normal running PCI device D0 state.

PCI code in Linux will check with ACPI about the lowest possible D state
when suspending,
If there is something missing from the xHCI entry in ACPI DSDT table it
might select D0.
as the suspend state, causing wake failure.



Here's the DSDT ASL of the ASUS machine.
https://gist.github.com/mschiu77/8e9c8a0e5a98b70a6dfff45620340bf1


Scope (_SB.PCI0.XHC) has _PS0 method, so Linux will look for a _S3W to get the
lowest possible D state in S3, but_S3W is missing, so Linux pci-acpi code will
probably default to D0




ASUS said the BIOS has no problem on USB wakeup under Windows so I don't think
there's any update. Anything else could be cause for this?


Linux and Windows probably check different DSDT values

You can try to force your xHC to D3 during suspend with a hack:

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 7815768..da46c52 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -488,6 +488,10 @@ static pci_power_t acpi_pci_choose_state(struct pci_dev 
*pdev)
else
d_max = ACPI_STATE_D3_COLD;
acpi_state = acpi_pm_device_sleep_state(>dev, NULL, d_max);
+
+   if (pdev->device == 0x31a8)
+   acpi_state = ACPI_STATE_D3_HOT;
+
if (acpi_state < 0)
return PCI_POWER_ERROR;
 
and see if usb wake starts to work


-Mathias


Re: Intel GemniLake xHCI connected devices can never wake up the system from suspend

2018-03-15 Thread Mathias Nyman

On 15.03.2018 06:40, Chris Chiu wrote:

Hi,
 I  have a ASUS AIO V222GA and another Acer Desktop XC-830 both
have Intel CPU J5005 and they both hit the same problem. The XHCI
connected USB keyboard/mouse can never wakeup the system from suspend.
It reminds me that similiar thing happens on ApolloLake too which
needs the XHCI_PME_STUCK_QUIRK to get it work. It's also mentioned in
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/pentium-celeron-n-series-spec-update.pdf
page #14 for N-seris intel CPU. And I also find the same problem
description in 
https://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/silver-celeron-spec-update.pdf
page #16 for J-series Intel CPU. Seems that they have different
workaround so I can not simply apply XHCI_PME_STUCK_QUIRK to make it
work.

 Anyone can help here?



N-Series
CHP8: USB xHCI Controller May Not Re-Enter D3 State After a USB Wake Event
- needs XHCI_PME_STUCK_QUIRK in driver (sets bit 28 at offset 80a4)

Intel® Pentium® Silver N5000
Intel® Pentium® Silver J5005
Intel® Celeron® N4000 and N4100
Intel® Celeron® J4105 and J4005

USB xHCI Controller May Not Re-enter a D3 State After a USB Wake Even
Need to clear PME_EN bit of of the standard PCI PM_CSR register.
I think Linux does this anyway (clears enabling PME when reaching D0)
So if I remember correct there was no specific workaround needed for this.

what is the PCI ID of your xhci controller? (lspci -nn)

One other possible cause is that xHCI never reaches PCI device D3 suspend state 
during system suspend.
xHC can't generate PME# wake event from normal running PCI device D0 state.

PCI code in Linux will check with ACPI about the lowest possible D state when 
suspending,
If there is something missing from the xHCI entry in ACPI DSDT table it might 
select D0.
as the suspend state, causing wake failure.

Is there a BIOS update available for your ASUS and Acer?

-Mathias

 
 


Re: Intel GemniLake xHCI connected devices can never wake up the system from suspend

2018-03-15 Thread Mathias Nyman

On 15.03.2018 06:40, Chris Chiu wrote:

Hi,
 I  have a ASUS AIO V222GA and another Acer Desktop XC-830 both
have Intel CPU J5005 and they both hit the same problem. The XHCI
connected USB keyboard/mouse can never wakeup the system from suspend.
It reminds me that similiar thing happens on ApolloLake too which
needs the XHCI_PME_STUCK_QUIRK to get it work. It's also mentioned in
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/pentium-celeron-n-series-spec-update.pdf
page #14 for N-seris intel CPU. And I also find the same problem
description in 
https://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/silver-celeron-spec-update.pdf
page #16 for J-series Intel CPU. Seems that they have different
workaround so I can not simply apply XHCI_PME_STUCK_QUIRK to make it
work.

 Anyone can help here?



N-Series
CHP8: USB xHCI Controller May Not Re-Enter D3 State After a USB Wake Event
- needs XHCI_PME_STUCK_QUIRK in driver (sets bit 28 at offset 80a4)

Intel® Pentium® Silver N5000
Intel® Pentium® Silver J5005
Intel® Celeron® N4000 and N4100
Intel® Celeron® J4105 and J4005

USB xHCI Controller May Not Re-enter a D3 State After a USB Wake Even
Need to clear PME_EN bit of of the standard PCI PM_CSR register.
I think Linux does this anyway (clears enabling PME when reaching D0)
So if I remember correct there was no specific workaround needed for this.

what is the PCI ID of your xhci controller? (lspci -nn)

One other possible cause is that xHCI never reaches PCI device D3 suspend state 
during system suspend.
xHC can't generate PME# wake event from normal running PCI device D0 state.

PCI code in Linux will check with ACPI about the lowest possible D state when 
suspending,
If there is something missing from the xHCI entry in ACPI DSDT table it might 
select D0.
as the suspend state, causing wake failure.

Is there a BIOS update available for your ASUS and Acer?

-Mathias

 
 


Re: [PATCH 2/3] usb: xhci: tegra: Add runtime PM support

2018-03-09 Thread Mathias Nyman

On 09.03.2018 10:36, Thierry Reding wrote:

On Thu, Mar 08, 2018 at 09:31:07PM +, Jon Hunter wrote:


On 01/03/18 14:18, Mathias Nyman wrote:

On 14.02.2018 18:34, Jon Hunter wrote:

Add runtime PM support to the Tegra XHCI driver and move the function
calls to enable/disable the clocks, regulators and PHY into the runtime
PM callbacks.

Signed-off-by: Jon Hunter <jonath...@nvidia.com>
---
   drivers/usb/host/xhci-tegra.c | 80
++-
   1 file changed, 56 insertions(+), 24 deletions(-)

diff --git a/drivers/usb/host/xhci-tegra.c
b/drivers/usb/host/xhci-tegra.c
index 02b0b24faa58..42aa67858b53 100644
--- a/drivers/usb/host/xhci-tegra.c
+++ b/drivers/usb/host/xhci-tegra.c
@@ -18,6 +18,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 
   #include 
@@ -1067,22 +1068,12 @@ static int tegra_xusb_probe(struct
platform_device *pdev)
    */
   platform_set_drvdata(pdev, tegra);
   -    err = tegra_xusb_clk_enable(tegra);
-    if (err) {
-    dev_err(>dev, "failed to enable clocks: %d\n", err);
-    goto put_usb2;
-    }
-
-    err = regulator_bulk_enable(tegra->soc->num_supplies,
tegra->supplies);
-    if (err) {
-    dev_err(>dev, "failed to enable regulators: %d\n", err);
-    goto disable_clk;
-    }
+    pm_runtime_enable(>dev);
   -    err = tegra_xusb_phy_enable(tegra);
+    err = pm_runtime_get_sync(>dev);
   if (err < 0) {


Does this mean that if runtime PM is disabled then clocks and regulator
will never be enabled
for Tegra xhci?

How about keeping the clock and regualtor enabling in probe, and instead
add something like:

pm_runtime_set_active(>dev);
pm_runtime_enable(>dev);
pm_runtime_get_noresume(>dev);


For 64-bit Tegra there is a dependency on CONFIG_PM, but for 32-bit
AFAIK there is not and so yes we should handle the case when PM_RUNTIME
is disabled.

Typically we do something like ...

 pm_runtime_enable(>dev);
 if (!pm_runtime_enabled(>dev))
ret = tegra_xusb_runtime_resume(>dev);
 else
 ret = pm_runtime_get_sync(>dev);

That way we can keep the regulator and clock stuff in the handler. I
will update this series.


Is there any good reason why we don't depend on PM for 32-bit as well?
I'm not aware of any differences in drivers that are 32-bit specific for
Tegra, and I'm not even sure the !PM case gets any testing at all. And
even if, do we really still want to support that?

I don't see any advantage these days for having it disabled.


I don't know much about Tegra, but I'd still like to turn this question around:

Is there any reason why clks and regulators can't initially be turned on in 
probe,
and then let runtime PM handle them later if PM is supported?

Shouldn't this work in all cases, and it avoids creating new dependencies?

Thanks
Mathias


  1   2   3   4   5   6   7   8   9   10   >