Re: [bugzilla-dae...@bugzilla.kernel.org: [Bug 197159] New: Xhci host controller not responding starting kernel 4.13]

2017-10-11 Thread Mason
On 10/10/2017 01:38, Bjorn Helgaas wrote:
> On Mon, Oct 09, 2017 at 10:45:39PM +0200, Mason wrote:
>> On 09/10/2017 19:01, Bjorn Helgaas wrote:
>> ...
> 
>>> In that thread, Mason reported a regression that looks similar, but as
>>> far as I can tell, we never identified a root cause.
>>>
>>>   1) The problem Mason reported was on a Tango platform, which has a
>>>  known hardware issue that corrupts data when simultaneous config
>>>  and MMIO accesses occur.  You're seeing the problem on a
>>>  different platform, which is very helpful.
>>
>> As mentioned here:
>> https://www.mail-archive.com/linux-usb@vger.kernel.org/msg94020.html
>>
>> When I disable the AER driver, not a single config space access
>> occurs when a USB drive is unplugged. So I'm 99.99% sure that
>> the issue is NOT caused by tango's bad design. (I got the vibe
>> that nobody cared about tango's issue because it was assumed
>> that the design flaw was responsible for it.)
> 
> I agree; I don't think this is Tango's fault.
> 
> Can you test fe190ed0d602 and d9f11ba9f107 to determine whether
> d9f11ba9f107 is the culprit?  If it is the culprit, can you try reverting
> it on a current kernel to see if that fixes it?
> 
> If d9f11ba9f107 is not the culprit, can you bisect to discover exactly
> where it broke?

NOTE: tango PCIe support was not upstream in v4.11.
Therefore, for all my tests, I cherry-pick the following patches
on top of the reference mentioned:

Tweak tango4_defconfig
Disable sibling calls optimization
Proper init
Set up static virt_to_phys mapping for llad
Show BROKEN drivers
8759 PCIe DT node
PCI: Change pci_host_common_probe() visibility
PCI: Improve __pci_read_base() robustness
PCI: Include pci.h for struct pci_ops definition
PCI: tango: Add MSI controller support
PCI: tango: Add Sigma Designs Tango SMP8759 PCIe host bridge support


Below are the results of my tests.

*
v4.11-rc5 : I CAN PLUG/UNPLUG SEVERAL TIMES
*

[9.598044] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
[9.637244] usb-storage 2-2:1.0: USB Mass Storage device detected
[9.643687] scsi host0: usb-storage 2-2:1.0
[   10.670685] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  PQ: 0 ANSI: 6
[   10.681244] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 GB/7.20 GiB)
[   10.692182] sd 0:0:0:0: [sda] Write Protect is off
[   10.697566] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   10.710124]  sda: sda1
[   10.719317] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   10.751942] random: fast init done
[   14.743212] xhci_hcd :01:00.0: Cannot set link state.
[   14.748699] usb usb2-port2: cannot disable (err = -32)
[   14.754068] usb 2-2: USB disconnect, device number 2
[   21.198040] usb 2-2: new SuperSpeed USB device number 3 using xhci_hcd
[   21.237490] usb-storage 2-2:1.0: USB Mass Storage device detected
[   21.243920] scsi host0: usb-storage 2-2:1.0
[   22.270751] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  PQ: 0 ANSI: 6
[   22.281255] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 GB/7.20 GiB)
[   22.292173] sd 0:0:0:0: [sda] Write Protect is off
[   22.297555] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   22.309951]  sda: sda1
[   22.314312] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   26.586603] xhci_hcd :01:00.0: Cannot set link state.
[   26.592391] usb usb2-port2: cannot disable (err = -32)
[   26.597605] usb 2-2: USB disconnect, device number 3
[   31.811351] usb 2-2: new SuperSpeed USB device number 4 using xhci_hcd
[   31.850819] usb-storage 2-2:1.0: USB Mass Storage device detected
[   31.857291] scsi host0: usb-storage 2-2:1.0
[   32.884087] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  PQ: 0 ANSI: 6
[   32.893255] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 GB/7.20 GiB)
[   32.901928] sd 0:0:0:0: [sda] Write Protect is off
[   32.907476] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   32.919278]  sda: sda1
[   32.923519] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   35.979867] xhci_hcd :01:00.0: Cannot set link state.
[   35.985346] usb usb2-port2: cannot disable (err = -32)
[   35.990635] usb 2-2: USB disconnect, device number 4




26bba5c767de : KERNEL PANIC AT BOOT


[1.156024] xhci_hcd :01:00.0: xHCI Host Controller
[1.161332] xhci_hcd :01:00.0: new USB bus registered, assigned bus number 1
[1.174358] xhci_hcd :01:00.0: hcc params 0x014051cf hci version 0x100 quirks 0x0010
[1.183767] hub 1-0:1.0: USB hub found
[1.187589] hub 1-0:1.0

Re: [bugzilla-dae...@bugzilla.kernel.org: [Bug 197159] New: Xhci host controller not responding starting kernel 4.13]

2017-10-09 Thread Mason
On 09/10/2017 19:01, Bjorn Helgaas wrote:

> [+cc linux-pci, linux-usb, Mason, Mathias, Lukas, Greg, Felipe, Alan]
> 
> - Forwarded message from bugzilla-dae...@bugzilla.kernel.org -
>>
>> Date: Sun, 08 Oct 2017 13:28:13 +
>> From: bugzilla-dae...@bugzilla.kernel.org
>> To: bugzilla@gmail.com
>> Subject: [Bug 197159] New: Xhci host controller not responding starting 
>> kernel 4.13
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=197159
>>
>> Bug ID: 197159
>>Summary: Xhci host controller not responding starting kernel
>> 4.13
>>Product: Drivers
>>Version: 2.5
>> Kernel Version: 4.13
>>   Hardware: Intel
>> OS: Linux
>>   Tree: Mainline
>> Status: NEW
>>   Severity: blocking
>>   Priority: P1
>>  Component: PCI
>>   Assignee: drivers_...@kernel-bugs.osdl.org
>>   Reporter: nik...@vividvisions.se
>> Regression: No
>>
>> When booting with a Expresscard USB 3.0 adapter (NEC UPD720202 Chip), the
>> following error is generated:
>>
>> "xhci_hcd :05:00.0: xHCI host controller not responding, assumed dead"
>>
>> This card still works fine with kernel 4.9.
> 
> Thanks very much for the bug report, and sorry for the regression.
> 
> Can you please collect the complete dmesg log and "lspci -vv" output
> and attach them to the bugzilla?
> 
>> Additionally, for some reason this also interferes with LUKS on an LVM
>> partition; password does not work and computer becomes stuck at this point.
>> This works as normal if card is removed and computer is rebooted.
>>
>> Can we please have Expresscard USB 3.0 functionality back in the kernel?
>>
>> This problem has been described elsewhere, but couldn't find any kernel bug
>> report for it. See this link for further information:
>>
>> http://patchwork.ozlabs.org/patch/804867/

Alternate links:
https://www.spinics.net/lists/linux-pci/msg64202.html
https://www.mail-archive.com/linux-usb@vger.kernel.org/msg93821.html

> In that thread, Mason reported a regression that looks similar, but as
> far as I can tell, we never identified a root cause.
> 
>   1) The problem Mason reported was on a Tango platform, which has a
>  known hardware issue that corrupts data when simultaneous config
>  and MMIO accesses occur.  You're seeing the problem on a
>  different platform, which is very helpful.

As mentioned here:
https://www.mail-archive.com/linux-usb@vger.kernel.org/msg94020.html

When I disable the AER driver, not a single config space access
occurs when a USB drive is unplugged. So I'm 99.99% sure that
the issue is NOT caused by tango's bad design. (I got the vibe
that nobody cared about tango's issue because it was assumed
that the design flaw was responsible for it.)

>   2) Mathias suggested d9f11ba9f107 ("xhci: Rework how we handle
>  unresponsive or hoptlug removed hosts"), which appeared in
>  v4.12-rc1, as a possible culprit, but I don't see a bisection
>  that definitively identifies this commit.
> 
>  Is it possible for you to test both fe190ed0d602 ("xhci: Do not
>  halt the host until both HCD have disconnected their devices.")
>  and d9f11ba9f107 ("xhci: Rework how we handle unresponsive or
>  hoptlug removed hosts") so we can tell for sure whether
>  d9f11ba9f107 broke it?
> 
>   3) Mason did report:
>v4.11.12 OK
>v4.12-rc1 KO
>  I assume "KO" means broken (unless that's a typo for "OK"?).  If
>  it means "broken", he did at least confirm that the problem first
>  appeared in v4.12-rc1.

Yes, KO (as in boxing knockout) means "broken", "double-plus-ungood".
It might be unfortunate to use OK and KO, as you point out.

Regards.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible regression between 4.9 and 4.13

2017-08-31 Thread Mason
On 30/08/2017 11:06, Greg Kroah-Hartman wrote:

> On Wed, Aug 30, 2017 at 10:55:37AM +0200, Mason wrote:
>
>> On 30/08/2017 08:02, Greg Kroah-Hartman wrote:
>>
>>> To get back to the original issue here, the hardware seems to have died,
>>> the driver stops talking to it, and all is good.  The "regression" here
>>> is that we now properly can determine that the hardware is crap.
>>
>> Before 4.12, when I unplugged my USB3 Flash drive, Linux would
>> detect a few "Uncorrected Non-Fatal errors" via AER, but it was
>> still possible to plug the drive back in.
>>
>> Since 4.12, once I unplug the drive, the whole USB3 card is marked
>> as dead (all 4 ports), and I can no longer plug anything in (not even
>> the USB2 drive that didn't have any issues, IIRC).
>>
>> It seems a bit premature to "mark as dead" something that remains
>> functional, doesn't it?
> 
> I agree, but if the device sends all ones, it's a good indication it is
> really dead, right?  Or something is wrong with it.

I wouldn't call it dead if I can plug the drive back in, and have
it working... But I agree that something fishy is happening...

>> Disclaimer, there are many variables in this setup, and I've only
>> tested a small fraction of the problem space: only one system,
>> only one USB3 board, only one USB3 Flash drive.
> 
> Did you ever happen to narrow this down to a single git commit using
> 'git bisect'?  I can't remember what happened in the beginning of this
> thread...

Mathias pointed out d9f11ba9f107aa335091ab8d7ba5eea714e46e8b

>>> So, how do you think we should proceed, delay a bit longer before saying
>>> the device is gone?  How long is "long enough"?  How many bus errors are
>>> we allowed to tolerate (hint, the PCI spec says none...)
>>>
>>> Maybe someone wants to get to the root problem here, why is the hardware
>>> suddenly reporting all 1s?
>>
>> I'm afraid I won't be able to make any progress on this front,
>> unless I can get my hands on a PCIe packet analyzer.
> 
> Odds of that happening are pretty rare, right?  I've never even seen one
> of those...

I had a "Summit T24 Analyzer" on my desk a few months ago, but I was getting
strange results, and the knowledgeable people in my company were not available
at the time.

http://teledynelecroy.com/protocolanalyzer/protocoloverview.aspx?seriesid=445

Regards.


Re: Possible regression between 4.9 and 4.13

2017-08-31 Thread Mason
On 30/08/2017 11:37, Mason wrote:

> On 30/08/2017 11:07, Ard Biesheuvel wrote:
> 
>> Please don't forget to mention that this is quirky hardware that
>> depends on BROKEN because it multiplexes MMIO and config space
>> accesses in the same memory window without any locking whatsoever
>> (which would be difficult to do in the first place because we don't
>> use accessors for MMIO in the kernel).
> 
> You're right, it was in the back of my mind, but I didn't state
> it explicitly for the benefit of linux-usb readers.
> 
>> So how likely is it that you are attempting to read from the xhci
>> BAR window while a config space access is in progress? Any way to
>> instrument this in your driver?
> 
> I logged config space accesses here:
> 
> https://www.spinics.net/lists/arm-kernel/msg602832.html
> 
> IIRC, the config space accesses are generated by the AER ISR.
> So disabling the AER driver should guarantee that no config space
> accesses are occurring when the drive is unplugged.

I checked, and I *did* remember correctly.

Disabling the AER driver results in 0 config space access occurring
when the USB3 drive is unplugged. This confirms that the controller's
broken design (muxing config and mem space) is not responsible for
the glitches occurring on unplug events.

Furthermore, I confirm that once the controller has been deemed "dead",
even USB2 drives are no longer detected, and all USB ports on the PCIe
board are disabled.

Regards.


For reads/writes in config space, I have:

if (do_debug) {
	printk("\t READ: bus=%d devfn=%u where=%d size=%d val=0x%x\n",
		bus->number, devfn, where, size, *val);
	dump_stack();
}

if (do_debug) {
	printk("\tWRITE: bus=%d devfn=%u where=%d size=%d val=0x%x\n",
		bus->number, devfn, where, size, val);
	dump_stack();
}

During setup I do get, e.g.

[7.621417]   READ: bus=1 devfn=0 where=84 size=2 val=0x8
[7.626840] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G C  4.12.0-rc1 #2
[7.634358] Hardware name: Sigma Tango DT
[7.638387] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[7.646171] [] (show_stack) from [] (dump_stack+0x84/0x98)
[7.653429] [] (dump_stack) from [] (smp8759_config_read+0xa0/0xa4)
[7.661474] [] (smp8759_config_read) from [] (pci_bus_read_config_word+0x6c/0x94)
[7.670742] [] (pci_bus_read_config_word) from [] (pci_read_config_word+0x24/0x38)
[7.680097] [] (pci_read_config_word) from [] (__pci_dev_reset+0x11c/0x2fc)
[7.688841] [] (__pci_dev_reset) from [] (pci_probe_reset_function+0xc/0x10)
[7.697673] [] (pci_probe_reset_function) from [] (pci_create_sysfs_dev_files+0x2a8/0x374)
[7.707728] [] (pci_create_sysfs_dev_files) from [] (pci_sysfs_init+0x34/0x54)
[7.716734] [] (pci_sysfs_init) from [] (do_one_initcall+0x44/0x168)
[7.724867] [] (do_one_initcall) from [] (kernel_init_freeable+0x15c/0x1e8)
[7.733611] [] (kernel_init_freeable) from [] (kernel_init+0x8/0x108)
[7.741831] [] (kernel_init) from [] (ret_from_fork+0x14/0x3c)


On plug/unplug events, there are no config space accesses:

[   88.006750] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
[   88.040179] usb 2-2: New USB device found, idVendor=0951, idProduct=1666
[   88.046930] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[   88.054177] usb 2-2: Product: DataTraveler 3.0
[   88.058684] usb 2-2: Manufacturer: Kingston
[   88.062927] usb 2-2: SerialNumber: 002618887865F0C0F8646BFA
[   88.071523] usb-storage 2-2:1.0: USB Mass Storage device detected
[   88.081334] scsi host0: usb-storage 2-2:1.0
[   89.096074] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  PQ: 0 ANSI: 6
[   89.104828] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 GB/7.20 GiB)
[   89.112996] sd 0:0:0:0: [sda] Write Protect is off
[   89.118060] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   89.129463]  sda: sda1
[   89.133104] sd 0:0:0:0: [sda] Attached SCSI removable disk

[  103.375210] xhci_hcd :01:00.0: xHCI host controller not responding, assume dead
[  103.382917] xhci_hcd :01:00.0: HC died; cleaning up
[  103.388281] usb 2-2: USB disconnect, device number 2


Re: Possible regression between 4.9 and 4.13

2017-08-30 Thread Mason
On 30/08/2017 11:07, Ard Biesheuvel wrote:

> Please don't forget to mention that this is quirky hardware that
> depends on BROKEN because it multiplexes MMIO and config space
> accesses in the same memory window without any locking whatsoever
> (which would be difficult to do in the first place because we don't
> use accessors for MMIO in the kernel).

You're right, it was in the back of my mind, but I didn't state
it explicitly for the benefit of linux-usb readers.

> So how likely is it that you are attempting to read from the xhci BAR
> window while a config space access is in progress? Any way to
> instrument this in your driver?

I logged config space accesses here:

https://www.spinics.net/lists/arm-kernel/msg602832.html

IIRC, the config space accesses are generated by the AER ISR.
So disabling the AER driver should guarantee that no config space
accesses are occurring when the drive is unplugged.

Regards.


Re: Possible regression between 4.9 and 4.13

2017-08-30 Thread Mason
On 30/08/2017 08:02, Greg Kroah-Hartman wrote:

> To get back to the original issue here, the hardware seems to have died,
> the driver stops talking to it, and all is good.  The "regression" here
> is that we now properly can determine that the hardware is crap.

Before 4.12, when I unplugged my USB3 Flash drive, Linux would
detect a few "Uncorrected Non-Fatal errors" via AER, but it was
still possible to plug the drive back in.

Since 4.12, once I unplug the drive, the whole USB3 card is marked
as dead (all 4 ports), and I can no longer plug anything in (not even
the USB2 drive that didn't have any issues, IIRC).

It seems a bit premature to "mark as dead" something that remains
functional, doesn't it?

Disclaimer, there are many variables in this setup, and I've only
tested a small fraction of the problem space: only one system,
only one USB3 board, only one USB3 Flash drive.

> So, how do you think we should proceed, delay a bit longer before saying
> the device is gone?  How long is "long enough"?  How many bus errors are
> we allowed to tolerate (hint, the PCI spec says none...)
> 
> Maybe someone wants to get to the root problem here, why is the hardware
> suddenly reporting all 1s?

I'm afraid I won't be able to make any progress on this front,
unless I can get my hands on a PCIe packet analyzer.

Regards.


Re: Possible regression between 4.9 and 4.13

2017-08-28 Thread Mason
On 28/08/2017 10:39, Mathias Nyman wrote:

> Could you take a log with the following added debug, without
> your extra delays? It should show a bit more about the state
> of the controller when we read 0x

I applied the following patch on top of v4.12-rc1

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 5e3e9d4c6956..c7ea7d4c801f 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -23,6 +23,7 @@
 
 #include 
 #include 
+#include 
 
 #include "xhci.h"
 #include "xhci-trace.h"
@@ -1268,7 +1269,10 @@ int xhci_hub_control(struct usb_hcd *hcd, u16 typeReq, 
u16 wValue,
wIndex--;
temp = readl(port_array[wIndex]);
if (temp == ~(u32)0) {
-   xhci_hc_died(xhci);
+   struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
+   xhci_err(xhci, "ClearPortFeat port%d @%p=%x, 
hcd->state:0x%x hcd->flags:0x%x, pci_state 0x%x\n",
+   wIndex, port_array[wIndex], temp, 
hcd->state, hcd->flags, pdev->current_state);
+   WARN_ON(1);
retval = -ENODEV;
break;
}
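
For reference, the condition the patch above instruments can be reproduced outside the kernel. The sketch below is a userspace illustration, not driver code: reg_read_failed() mirrors the `temp == ~(u32)0` test from the diff, while read_reg_tolerant(), struct flaky_reg and flaky_read() are hypothetical names invented here to show what re-reading a few times before giving up might look like. Nothing in this thread says the driver does, or should do, this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ALL_ONES (~(uint32_t)0)

/* Mirrors the check in the diff: an all-ones value is treated as a failed read. */
static bool reg_read_failed(uint32_t val)
{
	return val == ALL_ONES;
}

/* Hypothetical register that fails a fixed number of reads, then recovers. */
struct flaky_reg {
	int failures_left;
	uint32_t value;
};

static uint32_t flaky_read(struct flaky_reg *reg)
{
	if (reg->failures_left > 0) {
		reg->failures_left--;
		return ALL_ONES;
	}
	return reg->value;
}

/*
 * Hypothetical policy (an assumption, not taken from the thread):
 * read up to `tries` times and succeed on the first value that is
 * not all ones. Returns false if every attempt read all ones.
 */
static bool read_reg_tolerant(struct flaky_reg *reg, int tries, uint32_t *out)
{
	assert(tries > 0);
	for (int i = 0; i < tries; i++) {
		uint32_t val = flaky_read(reg);

		if (!reg_read_failed(val)) {
			*out = val;
			return true;
		}
	}
	return false;
}
```

With `tries` set to 1 this reduces to the single check shown in the diff; larger values only illustrate the "how long is long enough" question raised earlier in the thread.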


And here are logs I get when I plug/unplug my USB3 device.

[   14.970148] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
[   15.003487] usb 2-2: New USB device found, idVendor=0951, idProduct=1666
[   15.010237] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[   15.017483] usb 2-2: Product: DataTraveler 3.0
[   15.021990] usb 2-2: Manufacturer: Kingston
[   15.026234] usb 2-2: SerialNumber: 002618887865F0C0F8646BFA
[   15.034830] usb-storage 2-2:1.0: USB Mass Storage device detected
[   15.041269] scsi host0: usb-storage 2-2:1.0
[   16.056140] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  PQ: 0 ANSI: 6
[   16.064979] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 GB/7.20 GiB)
[   16.072978] sd 0:0:0:0: [sda] Write Protect is off
[   16.078076] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   16.089417]  sda: sda1
[   16.093050] sd 0:0:0:0: [sda] Attached SCSI removable disk


[   22.152078] pcieport :00:00.0: AER: Uncorrected (Non-Fatal) error received: id=
[   22.160157] pcieport :00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=(Requester ID)
[   22.172051] pcieport :00:00.0:   device [1105:0024] error status/mask=4000/
[   22.180493] pcieport :00:00.0:[14] Completion Timeout (First)
[   22.187368] pcieport :00:00.0: AER: Device recovery failed
[   22.885269] xhci_hcd :01:00.0: ClearPortFeat port1 @e0852430=, 
hcd->state:0x1 hcd->flags:0x1a5, pci_state 0x0
[   22.896284] [ cut here ]
[   22.900938] WARNING: CPU: 0 PID: 127 at drivers/usb/host/xhci-hub.c:1275 
xhci_hub_control+0x10f4/0x1778
[   22.910377] Modules linked in:
[   22.913447] CPU: 0 PID: 127 Comm: kworker/0:1 Tainted: G C  
4.12.0-rc1 #4
[   22.921314] Hardware name: Sigma Tango DT
[   22.925342] Workqueue: usb_hub_wq hub_event
[   22.929564] [] (unwind_backtrace) from [] 
(show_stack+0x10/0x14)
[   22.937353] [] (show_stack) from [] 
(dump_stack+0x84/0x98)
[   22.944617] [] (dump_stack) from [] (__warn+0xe8/0x100)
[   22.951616] [] (__warn) from [] 
(warn_slowpath_null+0x20/0x28)
[   22.959227] [] (warn_slowpath_null) from [] 
(xhci_hub_control+0x10f4/0x1778)
[   22.968062] [] (xhci_hub_control) from [] 
(usb_hcd_submit_urb+0x264/0x810)
[   22.976719] [] (usb_hcd_submit_urb) from [] 
(usb_submit_urb+0x2b0/0x4b4)
[   22.985201] [] (usb_submit_urb) from [] 
(usb_start_wait_urb+0x4c/0xbc)
[   22.993509] [] (usb_start_wait_urb) from [] 
(usb_control_msg+0xa0/0xcc)
[   23.001904] [] (usb_control_msg) from [] 
(usb_clear_port_feature+0x44/0x4c)
[   23.010648] [] (usb_clear_port_feature) from [] 
(hub_port_reset+0x228/0x51c)
[   23.019479] [] (hub_port_reset) from [] 
(hub_event+0x1f4/0xe64)
[   23.027177] [] (hub_event) from [] 
(process_one_work+0x1d4/0x3ec)
[   23.035049] [] (process_one_work) from [] 
(worker_thread+0x38/0x554)
[   23.043185] [] (worker_thread) from [] 
(kthread+0x108/0x138)
[   23.050620] [] (kthread) from [] 
(ret_from_fork+0x14/0x3c)
[   23.057877] ---[ end trace 5e4494cf1f6e3761 ]---
[   23.062691] xhci_hcd :01:00.0: ClearPortFeat port1 @e0852430=, 
hcd->state:0x1 hcd->flags:0x1a5, pci_state 0x0
[   23.073707] [ cut here ]
[   23.078349] WARNING: CPU: 0 PID: 127 at drivers/usb/host/xhci-hub.c:1275 
xhci_hub_control+0x10f4/0x1778
[   23.087787] Modules linked in:
[   23.090854] CPU: 0 PID: 127 Comm: kworker/0:1 Tainted: GWC  
4.12.0-rc1 #4
[   23.098720] Hardware name: Sigma Tango DT
[   23.102745] Workqueue: usb_hub_wq hub_event
[   23.106953] [] (unwind_backtrace) from [] 
(show_stack+0x10/0x14)
[   23.114737] [] 

Re: Possible regression between 4.9 and 4.13

2017-08-23 Thread Mason
On 23/08/2017 14:41, Mason wrote:

> I compiled a minimal kernel, with lots of irrelevant drivers and
> frameworks left out, including power management. I still get the
> "xHCI host controller not responding, assume dead" issue.

The problem seems to have a timing-related aspect.

I added a bunch of logs (to a slow serial console) and the HC was
not killed. I was able to plug the Flash drive a second time.
(I am logging config space reads and writes.)

[1.098314]   READ: bus=1 devfn=0 where=84 size=2 val=0x8
[1.103779]   READ: bus=1 devfn=0 where=4 size=2 val=0x142
[1.109315]   READ: bus=1 devfn=0 where=61 size=1 val=0x1
[1.114746]   READ: bus=1 devfn=0 where=4 size=2 val=0x142
[1.120311]   READ: bus=1 devfn=0 where=4 size=2 val=0x142
[1.125841]  WRITE: bus=1 devfn=0 where=4 size=2 val=0x146
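
The `where=4 size=2` accesses above target the standard PCI command register (config offset 0x04; `where` is printed in decimal). As a reading aid, the sketch below decodes such a value into bit names. The table follows the standard command register layout; decode_pci_command() and the short names are made up for this example and are not taken from the tango driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct cmd_bit {
	uint16_t mask;
	const char *name;
};

/* Standard PCI command register bits (config offset 0x04). */
static const struct cmd_bit cmd_bits[] = {
	{ 0x0001, "IO" },           /* I/O space enable */
	{ 0x0002, "MEM" },          /* memory space enable */
	{ 0x0004, "MASTER" },       /* bus master enable */
	{ 0x0008, "SPECIAL" },      /* special cycles */
	{ 0x0010, "MWI" },          /* memory write and invalidate */
	{ 0x0020, "VGA_PALETTE" },  /* VGA palette snoop */
	{ 0x0040, "PARITY" },       /* parity error response */
	{ 0x0080, "WAIT" },         /* address/data stepping */
	{ 0x0100, "SERR" },         /* SERR# enable */
	{ 0x0200, "FAST_BACK" },    /* fast back-to-back enable */
	{ 0x0400, "INTX_DISABLE" }, /* INTx emulation disable */
};

/*
 * Hypothetical helper: writes the space-separated names of the bits
 * set in `val` into `buf` and returns the number of characters
 * written. Decoding stops early if `buf` is too small.
 */
static size_t decode_pci_command(uint16_t val, char *buf, size_t len)
{
	size_t used = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(cmd_bits) / sizeof(cmd_bits[0]); i++) {
		int n;

		if (!(val & cmd_bits[i].mask))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " " : "", cmd_bits[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break; /* output truncated, stop here */
		used += (size_t)n;
	}
	return used;
}
```

Applied to the values in the log, 0x142 decodes to "MEM PARITY SERR" and the subsequent write of 0x146 adds "MASTER".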

NB: I added msleep(2500) in usb_add_hcd()

[3.681867] xhci_hcd :01:00.0: xHCI Host Controller
[3.687154] xhci_hcd :01:00.0: new USB bus registered, assigned bus 
number 1
[3.694656]   READ: bus=1 devfn=0 where=96 size=1 val=0x30
[3.705736] xhci_hcd :01:00.0: hcc params 0x014051cf hci version 0x100 
quirks 0x0010
[3.714233]   READ: bus=1 devfn=0 where=12 size=1 val=0x10
[3.719752]   READ: bus=1 devfn=0 where=4 size=2 val=0x146
[3.725269]  WRITE: bus=1 devfn=0 where=4 size=2 val=0x156
[3.730794]   READ: bus=1 devfn=0 where=146 size=2 val=0x7
[3.736314]   READ: bus=1 devfn=0 where=146 size=2 val=0x7
[3.741835]  WRITE: bus=1 devfn=0 where=146 size=2 val=0x7
[3.747354]   READ: bus=1 devfn=0 where=146 size=2 val=0x7
[3.752871]   READ: bus=1 devfn=0 where=148 size=4 val=0x1000
[3.758775]   READ: bus=1 devfn=0 where=146 size=2 val=0x7
[3.764297]  WRITE: bus=1 devfn=0 where=146 size=2 val=0xc007
[3.770108]   READ: bus=1 devfn=0 where=4 size=2 val=0x146
[3.775626]  WRITE: bus=1 devfn=0 where=4 size=2 val=0x546
[3.781146]   READ: bus=1 devfn=0 where=146 size=2 val=0xc007
[3.786925]  WRITE: bus=1 devfn=0 where=146 size=2 val=0x8007
[3.792919] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[3.799756] usb usb1: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[3.807021] usb usb1: Product: xHCI Host Controller
[3.811933] usb usb1: Manufacturer: Linux 4.12.0-rc1 xhci-hcd
[3.817713] usb usb1: SerialNumber: :01:00.0
[3.822773] hub 1-0:1.0: USB hub found
[3.826598] hub 1-0:1.0: 4 ports detected

NB: I added msleep(2500) in usb_add_hcd()

[6.455246] xhci_hcd :01:00.0: xHCI Host Controller
[6.460520] xhci_hcd :01:00.0: new USB bus registered, assigned bus 
number 2
[6.468028] usb usb2: We don't know the algorithms for LPM for this host, 
disabling LPM.
[6.476236] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003
[6.483068] usb usb2: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[6.490334] usb usb2: Product: xHCI Host Controller
[6.495240] usb usb2: Manufacturer: Linux 4.12.0-rc1 xhci-hcd
[6.501020] usb usb2: SerialNumber: :01:00.0
[6.505994] hub 2-0:1.0: USB hub found
[6.509806] hub 2-0:1.0: 4 ports detected
[6.514215] usbcore: registered new interface driver usb-storage
[6.520313] Registering SWP/SWPB emulation handler
[6.525541]   READ: bus=0 devfn=0 where=132 size=4 val=0x8001
[6.531334]   READ: bus=0 devfn=0 where=6 size=2 val=0x4010
[6.536955]   READ: bus=0 devfn=0 where=52 size=1 val=0x50
[6.542484]   READ: bus=0 devfn=0 where=80 size=2 val=0x7805
[6.548180]   READ: bus=0 devfn=0 where=120 size=2 val=0x8001
[6.553969]   READ: bus=0 devfn=0 where=128 size=2 val=0x10
[6.559584]   READ: bus=0 devfn=0 where=124 size=2 val=0x6008
[6.565387]   READ: bus=1 devfn=0 where=164 size=4 val=0x8fc0
[6.571167]   READ: bus=1 devfn=0 where=6 size=2 val=0x10
[6.576609]   READ: bus=1 devfn=0 where=52 size=1 val=0x50
[6.582129]   READ: bus=1 devfn=0 where=80 size=2 val=0x7001
[6.587821]   READ: bus=1 devfn=0 where=112 size=2 val=0x9005
[6.593601]   READ: bus=1 devfn=0 where=144 size=2 val=0xa011
[6.599381]   READ: bus=1 devfn=0 where=160 size=2 val=0x10
[6.604985]   READ: bus=1 devfn=0 where=84 size=2 val=0x8
[6.623665] Freeing unused kernel memory: 9216K


PLUG #1
[   66.783559] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
[   66.816910] usb 2-2: New USB device found, idVendor=0951, idProduct=1666
[   66.823661] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[   66.830909] usb 2-2: Product: DataTraveler 3.0
[   66.835417] usb 2-2: Manufacturer: Kingston
[   66.839660] usb 2-2: SerialNumber: 002618887865F0C0F8646BFA
[   66.848131] usb-storage 2-2:1.0: USB Mass Storage device detected
[   66.854584] scsi host0: usb-storage 2-2:1.0
[   67.869446] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  
PQ: 0 ANSI: 6
[   67.878270] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks:

Re: Possible regression between 4.9 and 4.13

2017-08-23 Thread Mason
On 23/08/2017 13:54, Mason wrote:

> On 23/08/2017 13:11, Mathias Nyman wrote:
> 
>> In this case we read the register when hub thread asks to clear port feature.
>>
>> why portsc returns 0x is a another question, could the hub thread be 
>> running while xhci controller is (in D3)?
>> Was xhci runtime suspended?
> 
> How do I tell?
> Should I disable SUSPEND support and all kinds of power management?

I compiled a minimal kernel, with lots of irrelevant drivers and
frameworks left out, including power management. I still get the
"xHCI host controller not responding, assume dead" issue.

PLUG
[   59.803499] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
[   59.836902] usb 2-2: New USB device found, idVendor=0951, idProduct=1666
[   59.843653] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[   59.850900] usb 2-2: Product: DataTraveler 3.0
[   59.855417] usb 2-2: Manufacturer: Kingston
[   59.859661] usb 2-2: SerialNumber: 002618887865F0C0F8646BFA
[   59.868249] usb-storage 2-2:1.0: USB Mass Storage device detected
[   59.874691] scsi host0: usb-storage 2-2:1.0
[   60.882801] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  PQ: 0 ANSI: 6
[   60.891640] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 GB/7.20 GiB)
[   60.899662] sd 0:0:0:0: [sda] Write Protect is off
[   60.904763] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   60.916154]  sda: sda1
[   60.919798] sd 0:0:0:0: [sda] Attached SCSI removable disk

UNPLUG
[   70.545087] pcieport :00:00.0: AER: Uncorrected (Non-Fatal) error received: id=
[   70.553169] pcieport :00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=(Requester ID)
[   70.565084] pcieport :00:00.0:   device [1105:0024] error status/mask=4000/
[   70.573528] pcieport :00:00.0:[14] Completion Timeout (First)
[   70.580402] pcieport :00:00.0: AER: Device recovery failed

[   71.275253] xhci_hcd :01:00.0: xHCI host controller not responding, assume dead
[   71.282956] xhci_hcd :01:00.0: HC died; cleaning up
[   71.288304] usb 2-2: USB disconnect, device number 2

[   71.293445] pcieport :00:00.0: AER: Uncorrected (Non-Fatal) error 
received: id=
[   71.301851] pcieport :00:00.0: PCIe Bus Error: severity=Uncorrected 
(Non-Fatal), type=Transaction Layer, id=(Requester ID)
[   71.313785] pcieport :00:00.0:   device [1105:0024] error 
status/mask=4000/
[   71.322240] pcieport :00:00.0:[14] Completion Timeout (First)
[   71.329115] pcieport :00:00.0: AER: Device recovery failed

[   71.335042] pcieport :00:00.0: AER: Uncorrected (Non-Fatal) error 
received: id=
[   71.343137] pcieport :00:00.0: PCIe Bus Error: severity=Uncorrected 
(Non-Fatal), type=Transaction Layer, id=(Requester ID)
[   71.354984] pcieport :00:00.0:   device [1105:0024] error 
status/mask=4000/
[   71.363443] pcieport :00:00.0:[14] Completion Timeout (First)
[   71.370289] pcieport :00:00.0: AER: Device recovery failed
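
Each AER report above prints an error status of 4000 (hex) followed by "[14] Completion Timeout", i.e. bit 14 of the AER uncorrectable error status register. The sketch below shows that mapping. The bit names follow the PCIe AER uncorrectable status layout; the function names aer_uncor_name() and lowest_set_bit() are hypothetical, chosen for this example only.

```c
#include <stdint.h>

/* Name of one bit in the AER uncorrectable error status register. */
static const char *aer_uncor_name(unsigned int bit)
{
	switch (bit) {
	case 4:  return "Data Link Protocol";
	case 5:  return "Surprise Down";
	case 12: return "Poisoned TLP";
	case 13: return "Flow Control Protocol";
	case 14: return "Completion Timeout";
	case 15: return "Completer Abort";
	case 16: return "Unexpected Completion";
	case 17: return "Receiver Overflow";
	case 18: return "Malformed TLP";
	case 19: return "ECRC";
	case 20: return "Unsupported Request";
	default: return "Unknown/Reserved";
	}
}

/* Index of the lowest set bit in `status`, or -1 if no bit is set. */
static int lowest_set_bit(uint32_t status)
{
	for (int bit = 0; bit < 32; bit++) {
		if (status & (UINT32_C(1) << bit))
			return bit;
	}
	return -1;
}
```

For the status value in the log, lowest_set_bit(0x4000) yields 14, which aer_uncor_name() maps to "Completion Timeout".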


defconfig for reference

# CONFIG_SWAP is not set
CONFIG_SYSVIPC=y
CONFIG_NO_HZ_IDLE=y
CONFIG_HIGH_RES_TIMERS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_SLAB=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODVERSIONS=y
CONFIG_ARCH_TANGO=y
# CONFIG_ARM_ERRATA_643719 is not set
CONFIG_PCI=y
CONFIG_PCIEPORTBUS=y
CONFIG_PCI_MSI=y
CONFIG_PCIE_TANGO_SMP8759=y
CONFIG_SMP=y
CONFIG_PREEMPT=y
CONFIG_HZ_300=y
CONFIG_AEABI=y
CONFIG_HIGHMEM=y
# CONFIG_ATAGS is not set
CONFIG_ARM_APPENDED_DTB=y
CONFIG_ARM_ATAG_DTB_COMPAT=y
CONFIG_VFP=y
CONFIG_NEON=y
# CONFIG_SUSPEND is not set
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
# CONFIG_INPUT_KEYBOARD is not set
# CONFIG_INPUT_MOUSE is not set
# CONFIG_SERIO is not set
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_RT288X=y
CONFIG_SERIAL_OF_PLATFORM=y
# CONFIG_HW_RANDOM is not set
# CONFIG_HWMON is not set
# CONFIG_HID is not set
# CONFIG_USB_HID is not set
CONFIG_USB=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y
CONFIG_USB_XHCI_HCD=y
CONFIG_USB_STORAGE=y
CONFIG_VFAT_FS=m
CONFIG_TMPFS=y
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_UTF8=m
CONFIG_PRINTK_TIME=y
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible regression between 4.9 and 4.13

2017-08-23 Thread Mason
On 23/08/2017 13:11, Mathias Nyman wrote:

> On 23.08.2017 12:31, Mason wrote:
> 
>> [   46.525247] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
>> [   46.565496] usb-storage 2-2:1.0: USB Mass Storage device detected
>> [   46.571934] scsi host0: usb-storage 2-2:1.0
>> [   47.601227] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0 
>>  PQ: 0 ANSI: 6
>> [   47.611340] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 
>> GB/7.20 GiB)
>> [   47.621624] sd 0:0:0:0: [sda] Write Protect is off
>> [   47.627131] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, 
>> doesn't support DPO or FUA
>> [   47.639637]  sda: sda1
>> [   47.648091] sd 0:0:0:0: [sda] Attached SCSI removable disk
>> [   58.100306] xhci_hcd :01:00.0: xHCI host controller not responding, 
>> assume dead
>> [   58.108021] CPU: 0 PID: 939 Comm: kworker/0:2 Tainted: G C  
>> 4.13.0-rc6 #11
>> [   58.115976] Hardware name: Sigma Tango DT
>> [   58.120016] Workqueue: usb_hub_wq hub_event
>> [   58.124241] [] (unwind_backtrace) from [] 
>> (show_stack+0x10/0x14)
>> [   58.132033] [] (show_stack) from [] 
>> (dump_stack+0x84/0x98)
>> [   58.139302] [] (dump_stack) from [] 
>> (xhci_hc_died.part.9+0x50/0x23c)
>> [   58.147438] [] (xhci_hc_died.part.9) from [] 
>> (xhci_hub_control+0xf3c/0x175c)
>> [   58.156273] [] (xhci_hub_control) from [] 
>> (usb_hcd_submit_urb+0x264/0x814)
>> [   58.164932] [] (usb_hcd_submit_urb) from [] 
>> (usb_start_wait_urb+0x4c/0xbc)
>> [   58.173591] [] (usb_start_wait_urb) from [] 
>> (usb_control_msg+0xa0/0xcc)
>> [   58.181985] [] (usb_control_msg) from [] 
>> (usb_clear_port_feature+0x44/0x4c)
>> [   58.190730] [] (usb_clear_port_feature) from [] 
>> (hub_port_reset+0x228/0x51c)
>> [   58.199561] [] (hub_port_reset) from [] 
>> (hub_event+0x87c/0x108c)
>> [   58.207349] [] (hub_event) from [] 
>> (process_one_work+0x1d8/0x3f0)
>> [   58.215220] [] (process_one_work) from [] 
>> (worker_thread+0x38/0x554)
>> [   58.223354] [] (worker_thread) from [] 
>> (kthread+0x108/0x138)
>> [   58.230789] [] (kthread) from [] 
>> (ret_from_fork+0x14/0x3c)
>> [   58.238056] xhci_hcd :01:00.0: HC died; cleaning up
>> [   58.243391] usb 2-2: USB disconnect, device number 2
> 
> The xhci driver reads 0x from an MMIO-mapped xhci portsc register and
> bails out in:
> xhci-hub.c:
>         temp = readl(port_array[wIndex]);
>         if (temp == ~(u32)0) {
>                 xhci_hc_died(xhci);
>                 retval = -ENODEV;
>                 break;
>         }
> 
> In this case we read the register when hub thread asks to clear port feature.
> 
> Why portsc returns 0x is another question. Could the hub thread be
> running while the xhci controller is in D3?
> Was xhci runtime suspended?

How do I tell?
Should I disable SUSPEND support and all kinds of power management?
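
One way to answer the "runtime suspended?" question from userspace is to read
the device's power/runtime_status attribute in sysfs. The sketch below is a
minimal userspace helper, not kernel code. It assumes the kernel was built
with runtime PM support so that the attribute exists, and it takes the sysfs
path as a parameter because the full PCI address of the controller is not
spelled out in the logs. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if a runtime_status string reads "suspended", else 0.
 * The trailing newline that sysfs emits is ignored. */
int status_is_suspended(const char *text)
{
	size_t len;

	assert(text != NULL);
	len = strcspn(text, "\n");
	return len == strlen("suspended") &&
	       strncmp(text, "suspended", len) == 0;
}

/* Reads the first line of a power/runtime_status file.
 * Returns 1 if suspended, 0 if not, -1 if the file cannot be read
 * (for example because runtime PM is not built in). */
int device_runtime_suspended(const char *path)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return status_is_suspended(buf);
}
```

Calling device_runtime_suspended() on the controller's power/runtime_status
file right before unplugging the drive would show whether the host was asleep
at that moment.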

> There were some pcieport errors in another log you showed; maybe PCI devices
> are not properly recovered and the registers return 0x?

FWIW, I just compiled v4.12-rc1 and I do get the broken behavior.

v4.11.12 = OK
v4.12-rc1 = KO
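
With one known-good and one known-bad version, the remaining work is a
bisection. The sketch below shows only the underlying idea (a binary search
for the first bad revision) over a linear, oldest-to-newest list. It assumes
that once the bug appears it stays, which real history does not always
guarantee, and git bisect additionally copes with merges. All names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Test callback: returns nonzero if revision `index` shows the bug. */
typedef int (*is_bad_fn)(size_t index, void *ctx);

/* Finds the first bad revision, given that good_idx is known good,
 * bad_idx is known bad, and good_idx < bad_idx. Each loop iteration
 * halves the range that still needs testing. */
size_t first_bad(size_t good_idx, size_t bad_idx, is_bad_fn is_bad, void *ctx)
{
	assert(good_idx < bad_idx);
	while (bad_idx - good_idx > 1) {
		size_t mid = good_idx + (bad_idx - good_idx) / 2;

		if (is_bad(mid, ctx))
			bad_idx = mid;
		else
			good_idx = mid;
	}
	return bad_idx;
}

/* Example predicate for experiments: every revision at or after the
 * threshold stored in *ctx counts as bad. */
int bad_from_threshold(size_t index, void *ctx)
{
	return index >= *(size_t *)ctx;
}
```

In practice the predicate is "build, boot, plug, unplug, look at dmesg", so
the number of test cycles grows only with the logarithm of the number of
commits between the two versions.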

PLUG
[   17.226953] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
[   17.267195] usb-storage 2-2:1.0: USB Mass Storage device detected
[   17.273612] scsi host0: usb-storage 2-2:1.0
[   18.296369] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  
PQ: 0 ANSI: 6
[   18.307772] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 
GB/7.20 GiB)
[   18.316991] sd 0:0:0:0: [sda] Write Protect is off
[   18.322588] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA
[   18.334828]  sda: sda1
[   18.339507] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   18.366202] random: fast init done

UNPLUG
[   21.314111] pcieport :00:00.0: AER: Uncorrected (Non-Fatal) error 
received: id=
[   21.322219] pcieport :00:00.0: PCIe Bus Error: severity=Uncorrected 
(Non-Fatal), type=Transaction Layer, id=(Requester ID)
[   21.334039] pcieport :00:00.0:   device [1105:0024] error 
status/mask=4000/
[   21.342453] pcieport :00:00.0:[14] Completion Timeout (First)
[   21.349306] pcieport :00:00.0: AER: Device recovery failed
[   22.055471] xhci_hcd :01:00.0: xHCI host controller not responding, 
assume dead
[   22.063187] xhci_hcd :01:00.0: HC died; cleaning up
[   22.068523] usb 2-2: USB disconnect, device number 2
[   22.073774] pcieport :00:00.0: AER: Uncorrected (Non-Fatal) error 
received: id=
[   22.085369] pcieport :00:00.0: PCIe Bus Error: sever

Re: Possible regression between 4.9 and 4.13

2017-08-23 Thread Mason
On 23/08/2017 09:51, Mathias Nyman wrote:

> very likely cause is the more aggressive detection of pci removed xhci hosts
> 
> See commit d9f11ba9f107aa335091ab8d7ba5eea714e46e8b
>  xhci: Rework how we handle unresponsive or hoptlug removed hosts
> 
> It checks whether an xhci register read returns 0x and assumes the xhci
> host died in that case.

I've just tested 4.11.12 + a few local patches to back-port
PCIe host bridge support.

It "works" as well as 4.9
(i.e. modulo the "AER: Uncorrected (Non-Fatal) error received")

[0.508533] pcie_tango 5000.pcie: simultaneous PCI config and MMIO 
accesses may cause data corruption
[0.519622] OF: PCI: host bridge /soc/pcie@2e000 ranges:
[0.519645] OF: PCI:   MEM 0x5040..0x53ff -> 0x0040
[0.519725] pcie_tango 5000.pcie: ECAM at [mem 0x5000-0x503f] 
for [bus 00-03]
[0.519872] pcie_tango 5000.pcie: PCI host bridge to bus :00
[0.519886] pci_bus :00: root bus resource [bus 00-03]
[0.519898] pci_bus :00: root bus resource [mem 0x5040-0x53ff] 
(bus address [0x0040-0x03ff])
[0.520201] PCI: bus0: Fast back to back transfers disabled
[0.520213] pci :00:00.0: bridge configuration invalid ([bus 00-00]), 
reconfiguring
[0.520922] PCI: bus1: Fast back to back transfers disabled
[0.520964] pci :00:00.0: of_irq_parse_pci: failed with rc=-22
[0.520993] pci :00:00.0: BAR 8: assigned [mem 0x5040-0x504f]
[0.521004] pci :01:00.0: BAR 0: assigned [mem 0x5040-0x50401fff 
64bit]
[0.521025] pci :00:00.0: PCI bridge to [bus 01]
[0.521033] pci :00:00.0:   bridge window [mem 0x5040-0x504f]
[0.521085] pcieport :00:00.0: enabling device (0140 -> 0142)
[0.521282] pcieport :00:00.0: Signaling PME with IRQ 30
[0.521402] pcieport :00:00.0: AER enabled with IRQ 30
[0.521526] pci :01:00.0: enabling device (0140 -> 0142)
...
[1.239706] xhci_hcd :01:00.0: xHCI Host Controller
[1.244998] xhci_hcd :01:00.0: new USB bus registered, assigned bus 
number 1
[1.258048] xhci_hcd :01:00.0: hcc params 0x014051cf hci version 0x100 
quirks 0x0010
[1.267467] hub 1-0:1.0: USB hub found
[1.271287] hub 1-0:1.0: 4 ports detected
[1.275761] xhci_hcd :01:00.0: xHCI Host Controller
[1.281048] xhci_hcd :01:00.0: new USB bus registered, assigned bus 
number 2
[1.288578] usb usb2: We don't know the algorithms for LPM for this host, 
disabling LPM.
[1.297234] hub 2-0:1.0: USB hub found
[1.301042] hub 2-0:1.0: 4 ports detected
[1.305681] usbcore: registered new interface driver usb-storage


PLUG #1
[   26.104607] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
[   26.143799] usb-storage 2-2:1.0: USB Mass Storage device detected
[   26.150253] scsi host0: usb-storage 2-2:1.0
[   27.177298] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  
PQ: 0 ANSI: 6
[   27.187586] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 
GB/7.20 GiB)
[   27.199000] sd 0:0:0:0: [sda] Write Protect is off
[   27.204186] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA
[   27.216322]  sda: sda1
[   27.220584] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   27.252046] random: fast init done

UNPLUG #1
[   37.334040] pcieport :00:00.0: AER: Uncorrected (Non-Fatal) error 
received: id=
[   37.342135] pcieport :00:00.0: PCIe Bus Error: severity=Uncorrected 
(Non-Fatal), type=Transaction Layer, id=(Requester ID)
[   37.353970] pcieport :00:00.0:   device [1105:0024] error 
status/mask=4000/
[   37.362589] pcieport :00:00.0:[14] Completion Timeout (First)
[   37.369485] pcieport :00:00.0: AER: Device recovery failed
[   38.066538] xhci_hcd :01:00.0: Cannot set link state.
[   38.072039] usb usb2-port2: cannot disable (err = -32)
[   38.077348] usb 2-2: USB disconnect, device number 2
[   38.082711] pcieport :00:00.0: AER: Uncorrected (Non-Fatal) error 
received: id=
[   38.094279] pcieport :00:00.0: PCIe Bus Error: severity=Uncorrected 
(Non-Fatal), type=Transaction Layer, id=(Requester ID)
[   38.108006] pcieport :00:00.0:   device [1105:0024] error 
status/mask=4000/
[   38.116878] pcieport :00:00.0:[14] Completion Timeout (First)
[   38.123954] pcieport :00:00.0: AER: Device recovery failed

PLUG #2
[   55.097922] usb 2-2: new SuperSpeed USB device number 3 using xhci_hcd
[   55.137590] usb-storage 2-2:1.0: USB Mass Storage device detected
[   55.144016] scsi host0: usb-storage 2-2:1.0
[   56.163907] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  
PQ: 0 ANSI: 6
[   56.174851] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 
GB/7.20 GiB)
[   56.184218] sd 0:0:0:0: [sda] Write Protect is off
[   56.190162] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA
[   

Re: Possible regression between 4.9 and 4.13

2017-08-23 Thread Mason
On 23/08/2017 09:51, Mathias Nyman wrote:

> very likely cause is the more aggressive detection of pci removed xhci hosts
> 
> See commit d9f11ba9f107aa335091ab8d7ba5eea714e46e8b
>  xhci: Rework how we handle unresponsive or hoptlug removed hosts
> 
> It checks whether an xhci register read returns 0x and assumes the xhci
> host died in that case.
> 
> Could you add something like the below to check what is killing the
> host?
> Or a BUG()/WARN() in xhci_hc_died() to get a backtrace of who called it.

[   46.525247] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
[   46.565496] usb-storage 2-2:1.0: USB Mass Storage device detected
[   46.571934] scsi host0: usb-storage 2-2:1.0
[   47.601227] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  
PQ: 0 ANSI: 6
[   47.611340] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 
GB/7.20 GiB)
[   47.621624] sd 0:0:0:0: [sda] Write Protect is off
[   47.627131] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA
[   47.639637]  sda: sda1
[   47.648091] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   58.100306] xhci_hcd :01:00.0: xHCI host controller not responding, 
assume dead
[   58.108021] CPU: 0 PID: 939 Comm: kworker/0:2 Tainted: G C  
4.13.0-rc6 #11
[   58.115976] Hardware name: Sigma Tango DT
[   58.120016] Workqueue: usb_hub_wq hub_event
[   58.124241] [] (unwind_backtrace) from [] 
(show_stack+0x10/0x14)
[   58.132033] [] (show_stack) from [] 
(dump_stack+0x84/0x98)
[   58.139302] [] (dump_stack) from [] 
(xhci_hc_died.part.9+0x50/0x23c)
[   58.147438] [] (xhci_hc_died.part.9) from [] 
(xhci_hub_control+0xf3c/0x175c)
[   58.156273] [] (xhci_hub_control) from [] 
(usb_hcd_submit_urb+0x264/0x814)
[   58.164932] [] (usb_hcd_submit_urb) from [] 
(usb_start_wait_urb+0x4c/0xbc)
[   58.173591] [] (usb_start_wait_urb) from [] 
(usb_control_msg+0xa0/0xcc)
[   58.181985] [] (usb_control_msg) from [] 
(usb_clear_port_feature+0x44/0x4c)
[   58.190730] [] (usb_clear_port_feature) from [] 
(hub_port_reset+0x228/0x51c)
[   58.199561] [] (hub_port_reset) from [] 
(hub_event+0x87c/0x108c)
[   58.207349] [] (hub_event) from [] 
(process_one_work+0x1d8/0x3f0)
[   58.215220] [] (process_one_work) from [] 
(worker_thread+0x38/0x554)
[   58.223354] [] (worker_thread) from [] 
(kthread+0x108/0x138)
[   58.230789] [] (kthread) from [] 
(ret_from_fork+0x14/0x3c)
[   58.238056] xhci_hcd :01:00.0: HC died; cleaning up
[   58.243391] usb 2-2: USB disconnect, device number 2


Re: Possible regression between 4.9 and 4.13

2017-08-23 Thread Mason
On 23/08/2017 09:51, Mathias Nyman wrote:

> On 23.08.2017 09:07, Felipe Balbi wrote:
>
>> Mason writes:
>>
>>> Any idea what could have changed between 4.9 and 4.13 ?
>>
>> Quite a bit:
>>
>> $ git rev-list --no-merges  --count v4.13-rc6 ^v4.9 -- drivers/usb/host/xhci 
>> drivers/usb/core/
>> 58
> 
> very likely cause is the more aggressive detection of pci removed xhci hosts
> 
> See commit d9f11ba9f107aa335091ab8d7ba5eea714e46e8b
>  xhci: Rework how we handle unresponsive or hoptlug removed hosts
> 
> It checks whether an xhci register read returns 0x and assumes the xhci
> host died in that case.
> 
> Could you add something like the below to check what is killing the
> host?
> Or a BUG()/WARN() in xhci_hc_died() to get a backtrace of who called it.
> 
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 51cd4b8..ade2ad6 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -922,7 +922,8 @@ void xhci_hc_died(struct xhci_hcd *xhci)
>  if (xhci->xhc_state & XHCI_STATE_DYING)
>  return;
>   
> -   xhci_err(xhci, "xHCI host controller not responding, assume dead\n");
> +   xhci_err(xhci, "xHC not responding in %pf, assume controller is dead\n",
> +__builtin_return_address(0));
>  xhci->xhc_state |= XHCI_STATE_DYING;
>   
>  xhci_cleanup_command_queue(xhci);

I'll try some coarse bisection to narrow it down.

$ git describe --contains d9f11ba9f107aa335091ab8d7ba5eea714e46e8b
v4.12-rc1~97^2~39

I'll check 4.11 first.

I wanted to mention that the XHCI setup on 4.9 and 4.13 print
slightly different things (at the beginning).

On 4.9
[1.240322] xhci_hcd :01:00.0: xHCI Host Controller
[1.245617] xhci_hcd :01:00.0: new USB bus registered, assigned bus 
number 1
[1.258691] xhci_hcd :01:00.0: hcc params 0x014051cf hci version 0x100 
quirks 0x0010
[1.268090] hub 1-0:1.0: USB hub found
[1.271905] hub 1-0:1.0: 4 ports detected
[1.276372] xhci_hcd :01:00.0: xHCI Host Controller
[1.281645] xhci_hcd :01:00.0: new USB bus registered, assigned bus 
number 2
[1.289173] usb usb2: We don't know the algorithms for LPM for this host, 
disabling LPM.
[1.297775] hub 2-0:1.0: USB hub found
[1.301577] hub 2-0:1.0: 4 ports detected
[1.306194] usbcore: registered new interface driver usb-storage

On 4.13
[1.222471] pcieport :00:00.0: of_irq_parse_pci: failed with rc=-22
[1.229156] xhci_hcd :01:00.0: Resetting
[2.268836] xhci_hcd :01:00.0: xHCI Host Controller
[2.274126] xhci_hcd :01:00.0: new USB bus registered, assigned bus 
number 1
[2.287222] xhci_hcd :01:00.0: hcc params 0x014051cf hci version 0x100 
quirks 0x0010
[2.296653] hub 1-0:1.0: USB hub found
[2.300478] hub 1-0:1.0: 4 ports detected
[2.304962] xhci_hcd :01:00.0: xHCI Host Controller
[2.310246] xhci_hcd :01:00.0: new USB bus registered, assigned bus 
number 2
[2.317776] usb usb2: We don't know the algorithms for LPM for this host, 
disabling LPM.
[2.326419] hub 2-0:1.0: USB hub found
[2.330229] hub 2-0:1.0: 4 ports detected
[2.334869] usbcore: registered new interface driver usb-storage

FWIW, "of_irq_parse_pci: failed with rc=-22"
seems to come from:

[1.257411] [] (of_irq_parse_pci) from [] 
(of_irq_parse_and_map_pci+0x10/0x2c)
[1.266420] [] (of_irq_parse_and_map_pci) from [] 
(pci_assign_irq+0x78/0xb0)
[1.275254] [] (pci_assign_irq) from [] 
(pci_device_probe+0x18/0x128)
[1.283476] [] (pci_device_probe) from [] 
(driver_probe_device+0x244/0x2c8)

The error logging was added by f1aa54840657f
No, that just turned one specific error into a warning.
Need to dig a bit more.

Regards.


Possible regression between 4.9 and 4.13

2017-08-22 Thread Mason
Hello,

The driver for my system's PCIe host bridge landed recently
(in 4.13) but it was developed on 4.9

I tested the PCIe host bridge by plugging a 4-port USB3 adapter
into the PCIe slot (system at rest) and plugging a USB3 Flash
drive into the USB3 adapter (at run-time).

On 4.9, the setup works (almost perfectly, see below).
On 4.13, once I unplug the Flash drive, the controller port
remains unresponsive.


On 4.9, I said *almost* perfectly, because the pcieport driver
does report a few non-fatal errors when I unplug:

[  193.838504] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
[  193.878081] usb-storage 2-2:1.0: USB Mass Storage device detected
[  193.884547] scsi host0: usb-storage 2-2:1.0
[  194.907936] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  
PQ: 0 ANSI: 6
[  194.920296] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 
GB/7.20 GiB)
[  194.928666] sd 0:0:0:0: [sda] Write Protect is off
[  194.933755] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA
[  194.946074]  sda: sda1
[  194.953608] sd 0:0:0:0: [sda] Attached SCSI removable disk

[  208.930260] pcieport :00:00.0: AER: Uncorrected (Non-Fatal) error 
received: id=
[  208.938342] pcieport :00:00.0: PCIe Bus Error: severity=Uncorrected 
(Non-Fatal), type=Transaction Layer, id=(Requester ID)
[  208.950163] pcieport :00:00.0:   device [1105:0024] error 
status/mask=4000/
[  208.958577] pcieport :00:00.0:[14] Completion Timeout (First)
[  208.965432] pcieport :00:00.0: AER: Device recovery failed
[  209.663733] xhci_hcd :01:00.0: Cannot set link state.
[  209.669194] usb usb2-port2: cannot disable (err = -32)
[  209.674376] usb 2-2: USB disconnect, device number 2
[  209.680481] pcieport :00:00.0: AER: Uncorrected (Non-Fatal) error 
received: id=
[  209.688689] pcieport :00:00.0: PCIe Bus Error: severity=Uncorrected 
(Non-Fatal), type=Transaction Layer, id=(Requester ID)
[  209.700555] pcieport :00:00.0:   device [1105:0024] error 
status/mask=4000/
[  209.708978] pcieport :00:00.0:[14] Completion Timeout (First)
[  209.715845] pcieport :00:00.0: AER: Device recovery failed
[  209.721722] pcieport :00:00.0: AER: Uncorrected (Non-Fatal) error 
received: id=
[  209.729785] pcieport :00:00.0: PCIe Bus Error: severity=Uncorrected 
(Non-Fatal), type=Transaction Layer, id=(Requester ID)
[  209.741602] pcieport :00:00.0:   device [1105:0024] error 
status/mask=4000/
[  209.750027] pcieport :00:00.0:[14] Completion Timeout (First)
[  209.756866] pcieport :00:00.0: AER: Device recovery failed

After that, I can still plug the drive into the same port.

But on 4.13, I get

[   27.330378] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
[   27.369383] usb-storage 2-2:1.0: USB Mass Storage device detected
[   27.375840] scsi host0: usb-storage 2-2:1.0
[   28.403035] scsi 0:0:0:0: Direct-Access Kingston DataTraveler 3.0  
PQ: 0 ANSI: 6
[   28.413326] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 
GB/7.20 GiB)
[   28.423653] sd 0:0:0:0: [sda] Write Protect is off
[   28.429139] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA
[   28.441529]  sda: sda1
[   28.449431] sd 0:0:0:0: [sda] Attached SCSI removable disk

[   90.592134] xhci_hcd :01:00.0: xHCI host controller not responding, 
assume dead
[   90.599857] xhci_hcd :01:00.0: HC died; cleaning up
[   90.605336] usb 2-2: USB disconnect, device number 2
[   90.630414] udevd[955]: inotify_add_watch(6, /dev/sda, 10) failed: No such 
file or directory

Trying to replug into the same port = nothing happens
(Linux did say "assume dead")

Any idea what could have changed between 4.9 and 4.13 ?

Regards.


Re: Neophyte questions about PCIe

2017-03-14 Thread Mason
On 14/03/2017 11:23, David Laight wrote:

> Mason wrote:
> 
>> I'd like to push support for this PCIe controller upstream.
>>
>> Is the code I posted on the right track?
>> Maybe I can post a RFC patch tomorrow?
> 
> I think you need to resolve the problem of config space (and IO) cycles
> before the driver can be deemed usable.

You're alluding to the (unfortunate) muxing of config and mem spaces
on my controller, where concurrent accesses by two different threads
would blow the system up.

You've suggested sending IPIs in the config space accessor, in order
to prevent other CPUs from starting a mem access. But this doesn't
help if a mem access is already in flight, AFAIU.

I fear there is nothing that can be done in SW, short of rewriting
drivers such that mem space accesses are handled by a driver-specific
call-back which could take care of all required locking.

AFAICT, my only (reasonable) option is putting a big fat warning
in the code, and praying that concurrent accesses never happen.
(I'll test with a storage stress test on a USB3 drive.)
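
For illustration, here is roughly what "mem space accesses handled by a
call-back which takes care of locking" would mean. This is a userspace model
under stated assumptions, not a proposal for the driver: the bridge is a
plain struct, the mux is a variable, and a C11 atomic_flag spinlock stands in
for whatever lock the kernel would use. Every name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

enum mux_mode { MUX_MEM = 0, MUX_CONFIG = 1 };

/* Hypothetical stand-in for the bridge: one mux, two address spaces. */
struct fake_bridge {
	atomic_flag lock;
	enum mux_mode mux;
	uint32_t config_space[16];
	uint32_t mem_space[16];
};

void bridge_init(struct fake_bridge *b)
{
	memset(b->config_space, 0, sizeof(b->config_space));
	memset(b->mem_space, 0, sizeof(b->mem_space));
	b->mux = MUX_MEM;
	atomic_flag_clear(&b->lock);
}

static void bridge_lock(struct fake_bridge *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock,
						 memory_order_acquire))
		; /* spin until the other accessor is done */
}

static void bridge_unlock(struct fake_bridge *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Config access: flip the mux, read, flip it back, all under the lock. */
uint32_t bridge_config_read(struct fake_bridge *b, unsigned int idx)
{
	uint32_t val;

	assert(idx < 16);
	bridge_lock(b);
	b->mux = MUX_CONFIG;
	val = b->config_space[idx];
	b->mux = MUX_MEM;
	bridge_unlock(b);
	return val;
}

/* Mem access: only safe because it takes the same lock, so it can
 * never observe the mux in the CONFIG position. */
uint32_t bridge_mem_read(struct fake_bridge *b, unsigned int idx)
{
	uint32_t val;

	assert(idx < 16);
	bridge_lock(b);
	assert(b->mux == MUX_MEM);
	val = b->mem_space[idx];
	bridge_unlock(b);
	return val;
}
```

The model makes the cost obvious: every MMIO read in every endpoint driver
would have to go through bridge_mem_read(), which is exactly the rewrite
that is not realistic for existing drivers such as xhci.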

In parallel, I'm trying to convince management that the HW needs
fixing ASAP.

Regards.


Re: Neophyte questions about PCIe

2017-03-13 Thread Mason
On 13/03/2017 22:40, Bjorn Helgaas wrote:

> On Sat, Mar 11, 2017 at 11:57:56AM +0100, Mason wrote:
>
>> On 10/03/2017 18:49, Mason wrote:
>> 
>>> static void tango_pcie_bar_quirk(struct pci_dev *dev)
>>> {
>>> struct pci_bus *bus = dev->bus;
>>>
>>> printk("%s: bus=%d devfn=%d\n", __func__, bus->number, dev->devfn);
>>>
>>> pci_write_config_dword(dev, PCI_BASE_ADDRESS_0, 0x8004);
>>> }
>>> DECLARE_PCI_FIXUP_FINAL(0x1105, PCI_ANY_ID, tango_pcie_bar_quirk);
>>
>> And this is where the elusive "black magic" happens.
>>
>> Is it "safe" to configure a BAR behind Linux's back?
> 
> No.  Linux maintains a struct resource for every BAR.  This quirk
> makes the BAR out of sync with the resource, so Linux no longer has an
> accurate idea of what bus address space is consumed and what is
> available.

Even when Linux is not able to map the BAR, since it's too
large to fit in the mem window?

> Normally a BAR is for mapping device registers into PCI bus address
> space.  If this BAR controls how the RC forwards PCI DMA transactions
> to RAM, then it's not really a BAR and you should prevent Linux from
> seeing it as a BAR.  You could do this by special-casing it in the
> config accessor so reads return 0 and writes are dropped.  Then you
> could write the register in your host bridge driver safely because the
> PCI core would think the BAR is not implemented.

In fact, that's what I used to do in a previous version :-)
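
A sketch of the approach Bjorn describes (reads of the fake BAR return 0,
writes are dropped), written as a userspace model rather than as a real
pci_ops implementation. Config space is a plain array of dwords, only 32-bit
accesses are modelled, and the function names are invented. Because the boot
log reports reg 0x10 as a 64-bit BAR, the model hides both dwords, 0x10 and
0x14.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_BASE_ADDRESS_0 0x10 /* standard config header offsets */
#define PCI_BASE_ADDRESS_1 0x14
#define CFG_DWORDS 64           /* 256 bytes of config space */

/* True if the dword at `where` belongs to the register being hidden. */
static int is_hidden(int where)
{
	where &= ~3;
	return where == PCI_BASE_ADDRESS_0 || where == PCI_BASE_ADDRESS_1;
}

/* Reads of the hidden register return 0, so a probing PCI core
 * concludes that no BAR is implemented there. */
uint32_t cfg_read32(const uint32_t *cfg, int where)
{
	assert(where >= 0 && where / 4 < CFG_DWORDS);
	if (is_hidden(where))
		return 0;
	return cfg[where / 4];
}

/* Writes to the hidden register are silently dropped. */
void cfg_write32(uint32_t *cfg, int where, uint32_t val)
{
	assert(where >= 0 && where / 4 < CFG_DWORDS);
	if (is_hidden(where))
		return;
	cfg[where / 4] = val;
}
```

The host bridge driver would then program the real register through its own
private path, outside these accessors, so the PCI core never sizes or
reassigns it.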

I'd like to push support for this PCIe controller upstream.

Is the code I posted on the right track?
Maybe I can post a RFC patch tomorrow?

Regards.


Re: Neophyte questions about PCIe

2017-03-11 Thread Mason
On 10/03/2017 18:49, Mason wrote:

> And my current code, to work-around the silicon bugs:
> 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> 
> //#define DEBUG_CONFIG
> 
> static int tango_config_read(struct pci_bus *bus, unsigned int devfn,
>   int where, int size, u32 *val)
> {
>   int ret;
>   void __iomem *pci_conf = (void *)0xf002e048;
> 
> #ifdef DEBUG_CONFIG
>   if (where == PCI_BASE_ADDRESS_0)
>   dump_stack();
> #endif
> 
>   writel(1, pci_conf);

This sets the config/mem mux to CONFIG SPACE.

>   if (devfn != 0) {
>   *val = ~0;
>   return PCIBIOS_DEVICE_NOT_FOUND;
>   }

This works around a silicon bug, where accesses to any device or
function other than 0 return garbage.

>   ret = pci_generic_config_read(bus, devfn, where, size, val);
> 
>   writel(0, pci_conf);

This resets the config/mem mux back to MEM SPACE.

If anything tries to access MEM in that time frame, we're toast.

> #ifdef DEBUG_CONFIG
>   printk("%s: bus=%d where=%d size=%d val=0x%x\n",
>   __func__, bus->number, where, size, *val);
> #endif
> 
>   return ret;
> }
> 
> static int tango_config_write(struct pci_bus *bus, unsigned int devfn,
>int where, int size, u32 val)
> {
>   int ret;
>   void __iomem *pci_conf = (void *)0xf002e048;
> 
> #ifdef DEBUG_CONFIG
>   if (where == PCI_BASE_ADDRESS_0)
>   dump_stack();
> #endif
> 
> #ifdef DEBUG_CONFIG
>   printk("%s: bus=%d where=%d size=%d val=0x%x\n",
>   __func__, bus->number, where, size, val);
> #endif
> 
>   writel(1, pci_conf);
> 
>   ret = pci_generic_config_write(bus, devfn, where, size, val);
> 
>   writel(0, pci_conf);
> 
>   return ret;
> }
> 
> static struct pci_ecam_ops tango_pci_ops = {
>   .bus_shift  = 20,
>   .pci_ops= {
>   .map_bus= pci_ecam_map_bus,
>   .read   = tango_config_read,
>   .write  = tango_config_write,
>   }
> };
> 
> static const struct of_device_id tango_pci_ids[] = {
>   { .compatible = "sigma,smp8759-pcie" },
>   { /* sentinel */ },
> };
> 
> static int tango_pci_probe(struct platform_device *pdev)
> {
>   return pci_host_common_probe(pdev, &tango_pci_ops);
> }
> 
> static struct platform_driver tango_pci_driver = {
>   .probe = tango_pci_probe,
>   .driver = {
>   .name = KBUILD_MODNAME,
>   .of_match_table = tango_pci_ids,
>   },
> };
> 
> builtin_platform_driver(tango_pci_driver);
> 
> #define RIESLING_B 0x24
> 
> /* Root complex reports incorrect device class */
> static void tango_pcie_fixup_class(struct pci_dev *dev)
> {
>   dev->class = PCI_CLASS_BRIDGE_PCI << 8;
> }
> DECLARE_PCI_FIXUP_EARLY(0x1105, RIESLING_B, tango_pcie_fixup_class);

This works around another silicon bug.

> static void tango_pcie_bar_quirk(struct pci_dev *dev)
> {
>   struct pci_bus *bus = dev->bus;
> 
>   printk("%s: bus=%d devfn=%d\n", __func__, bus->number, dev->devfn);
> 
>   pci_write_config_dword(dev, PCI_BASE_ADDRESS_0, 0x8004);
> }
> DECLARE_PCI_FIXUP_FINAL(0x1105, PCI_ANY_ID, tango_pcie_bar_quirk);

And this is where the elusive "black magic" happens.

Is it "safe" to configure a BAR behind Linux's back?

Basically, there seems to be an identity map between RAM and PCI space.
(Is that, perhaps, some kind of default? I would think that the default
would have been defined by the "ranges" prop in the pci DT node.)

So PCI address 0x8000_ maps to CPU address 0x8000_, i.e. the
start of system RAM. And when dev 1 accesses RAM, the RC correctly
forwards the packet to the memory bus.

However, RC BAR0 is limited to 1 GB (split across 8 x 128 MB "regions").

Thus, to properly set this up, I need to account for what memory
Linux is managing, i.e. the mem= command line argument.
(I don't know how to access that at run-time.)

For example, if we have 2 x 512 MB of RAM.
DRAM0 is at [0x8000_, 0xa000_[
DRAM1 is at [0xc000_, 0xe000_[

But a different situation is 1 x 1 GB of RAM.
DRAM0 is at [0x8000_, 0xc000_[

I need to program different region targets.
How to do that in a way that is acceptable upstream?

Regards.


Re: Neophyte questions about PCIe

2017-03-10 Thread Mason
On 10/03/2017 17:45, Mason wrote:

> Time to clean up a million hacks to be able to discuss the finer points.

Here is my current boot log:

[1.133895] OF: PCI: host bridge /soc/pcie@5000 ranges:
[1.139607] pci_add_resource_offset: res=[bus 00-0f] offset=0x0
[1.145659] OF: PCI: Parsing ranges property...
[1.150316] OF: PCI:   MEM 0x5400..0x5fff -> 0x0400
[1.156364] pci_add_resource_offset: res=[mem 0x5400-0x5fff] 
offset=0x5000
[1.164628] pci_tango 5000.pcie: ECAM at [mem 0x5000-0x50ff] for 
[bus 00-0f]
[1.173033] pci_tango 5000.pcie: PCI host bridge to bus :00
[1.179440] pci_bus :00: root bus resource [bus 00-0f]
[1.185056] pci_bus :00: root bus resource [mem 0x5400-0x5fff] 
(bus address [0x0400-0x0fff])
[1.195386] pci_bus :00: scanning bus
[1.199539] pci :00:00.0: [1105:0024] type 01 class 0x048000
[1.205691] pci :00:00.0: calling tango_pcie_fixup_class+0x0/0x10
[1.212277] pci :00:00.0: reg 0x10: [mem 0x-0x3fff 64bit]
[1.219220] pci :00:00.0: calling pci_fixup_ide_bases+0x0/0x40
[1.225570] pci :00:00.0: supports D1 D2
[1.229957] pci :00:00.0: PME# supported from D0 D1 D2 D3hot
[1.236092] pci :00:00.0: PME# disabled
[1.240576] pci_bus :00: fixups for bus
[1.244886] PCI: bus0: Fast back to back transfers disabled
[1.250587] pci :00:00.0: scanning [bus 00-00] behind bridge, pass 0
[1.257420] pci :00:00.0: bridge configuration invalid ([bus 00-00]), 
reconfiguring
[1.265567] pci :00:00.0: scanning [bus 00-00] behind bridge, pass 1
[1.272517] pci_bus :01: busn_res: can not insert [bus 01-ff] under [bus 
00-0f] (conflicts with (null) [bus 00-0f])
[1.283462] pci_bus :01: scanning bus
[1.287623] pci :01:00.0: [1912:0014] type 00 class 0x0c0330
[1.293799] pci :01:00.0: reg 0x10: [mem 0x-0x1fff 64bit]
[1.300799] pci :01:00.0: calling pci_fixup_ide_bases+0x0/0x40
[1.307223] pci :01:00.0: PME# supported from D0 D3hot D3cold
[1.313446] pci :01:00.0: PME# disabled
[1.318053] pci_bus :01: fixups for bus
[1.322362] PCI: bus1: Fast back to back transfers disabled
[1.328060] pci_bus :01: bus scan returning with max=01
[1.333759] pci_bus :01: busn_res: [bus 01-ff] end is updated to 01
[1.340506] pci_bus :00: bus scan returning with max=01
[1.346205] pci :00:00.0: fixup irq: got 0
[1.350765] pci :00:00.0: assigning IRQ 00
[1.355332] pci :01:00.0: fixup irq: got 0
[1.359892] pci :01:00.0: assigning IRQ 00
[1.364479] pci :00:00.0: BAR 0: no space for [mem size 0x4000 64bit]
[1.371748] pci :00:00.0: BAR 0: failed to assign [mem size 0x4000 
64bit]
[1.379369] pci :00:00.0: BAR 8: assigned [mem 0x5400-0x540f]
[1.386291] pci :01:00.0: BAR 0: assigned [mem 0x5400-0x54001fff 
64bit]
[1.393747] pci :00:00.0: PCI bridge to [bus 01]
[1.398833] pci :00:00.0:   bridge window [mem 0x5400-0x540f]
[1.405767] pci :00:00.0: calling tango_pcie_bar_quirk+0x0/0x40
[1.412160] tango_pcie_bar_quirk: bus=0 devfn=0
[1.416843] pcieport :00:00.0: enabling device (0140 -> 0142)
[1.423074] pcieport :00:00.0: enabling bus mastering
[1.428652] altera_irq_domain_alloc: ENTER
[1.432876] FOO-msi 2e080.msi: msi#0 address_hi 0x0 address_lo 0x9002e07c
[1.440007] FOO-msi 2e080.msi: msi#0 address_hi 0x0 address_lo 0x9002e07c
[1.446972] aer :00:00.0:pcie002: service driver aer loaded
[1.453157] pci :01:00.0: calling quirk_usb_early_handoff+0x0/0x7e0
[1.459913] pci :01:00.0: enabling device (0140 -> 0142)
[1.465709] quirk_usb_handoff_xhci: ioremap(0x5400, 8192)
[1.471589] xhci_find_next_ext_cap: offset=0x500
[1.476325] val = 0x1000401
...
[1.624093] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[1.630675] ehci-pci: EHCI PCI platform driver
[1.635338] xhci_hcd :01:00.0: enabling bus mastering
[1.640789] xhci_hcd :01:00.0: xHCI Host Controller
[1.646071] xhci_hcd :01:00.0: new USB bus registered, assigned bus 
number 1
[1.659065] xhci_find_next_ext_cap: offset=0x500
[1.663714] val = 0x1000401
[1.666526] xhci_find_next_ext_cap: offset=0x510
[1.671171] val = 0x3000502
[1.673984] xhci_find_next_ext_cap: offset=0x510
[1.678632] val = 0x3000502
[1.681433] xhci_find_next_ext_cap: offset=0x524
[1.686079] val = 0x2000702
[1.68] xhci_find_next_ext_cap: offset=0x524
[1.693533] val = 0x2000702
[1.696343] xhci_find_next_ext_cap: offset=0x540
[1.700987] val = 0x4c0
[1.703446] xhci_find_next_ext_cap: offset=0x550
[1.708091] val = 0xa
[1.710382] xhci_find_next_ext_cap: offset=0x510
[1.715028] val = 0x3000502
[1.717837] xhci_find_next_ext_cap: 

Re: Neophyte questions about PCIe

2017-03-10 Thread Mason
On 10/03/2017 00:43, Mason wrote:

> I think I'm making progress [...]

Yes! I was able to plug a USB3 Flash drive, mount it,
and read its contents. A million thanks, my head was
starting to hurt from too much banging.

Time to clean up a million hacks to be able to discuss
the finer points.

Regards.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Neophyte questions about PCIe

2017-03-10 Thread Mason
On 10/03/2017 16:14, David Laight wrote:

> Mason wrote:
> 
>> My RC drops packets not targeting its BAR0.
> 
> I suspect the fpga/cpld logic supports RC and endpoint modes
> and is using much the same names for the registers (and logic
> implementation).

Your guess is spot on.

In the controller's MMIO registers, the so-called core_conf_0
register has the following field:

chip_is_root: 1 means tango is root port, 0 means tango is endpoint.

> If your cpu supports more than 1GB of memory but only part is
> PCIe accessible you'll have to ensure that all the memory
> definitions are set correctly and 'bounce buffers' used for
> some operations.

Do you mean I would have to "fix" something in the XHCI driver?

Hopefully, no customer plans to give Linux more than 1 GB.
(Although the latest systems do support 4 GB... A lot of it is
used for video buffers, handled outside Linux.)

Regards.


Re: Neophyte questions about PCIe

2017-03-10 Thread Mason
On 10/03/2017 15:06, David Laight wrote:

> Robin Murphy wrote:
>
>> On 09/03/17 23:43, Mason wrote:
>>
>>> I think I'm making progress, in that I now have a better
>>> idea of what I don't understand. So I'm able to ask
>>> (hopefully) less vague questions.
>>>
>>> Take the USB3 PCIe adapter I've been testing with. At some
>>> point during init, the XHCI driver requests some memory
>>> (via kmalloc?) in order to exchange data with the host, right?
>>>
>>> On my SoC, the RAM used by Linux lives at physical range
>>> [0x8000_, 0x8800_[ => 128 MB
>>>
>>> How does the XHCI driver make the adapter aware of where
>>> it can scribble data? The XHCI driver has no notion that
>>> the device is behind a bus, does it?
>>>
>>> At some point, the physical addresses must be converted
>>> to PCI bus addresses, right? Is it computed subtracting
>>> the offset defined in the DT?
> 
> The driver should call dma_alloc_coherent() which returns both the
> kernel virtual address and the address the device (xhci controller)
> has to use to access it.
> The cpu physical address is irrelevant (although it might be
> calculated in the middle somewhere).

Thank you for that missing piece of the puzzle.
I see some relevant action in drivers/usb/host/xhci-mem.c

And I now see this log:

[2.499320] xhci_hcd :01:00.0: // Device context base array address = 
0x8e07e000 (DMA), d0855000 (virt)
[2.509156] xhci_hcd :01:00.0: Allocated command ring at cfb04200
[2.515640] xhci_hcd :01:00.0: First segment DMA is 0x8e07f000
[2.521863] xhci_hcd :01:00.0: // Setting command ring address to 0x20
[2.528786] xhci_hcd :01:00.0: // xHC command ring deq ptr low bits + 
flags = @
[2.537188] xhci_hcd :01:00.0: // xHC command ring deq ptr high bits = 
@
[2.545002] xhci_hcd :01:00.0: // Doorbell array is located at offset 
0x800 from cap regs base addr
[2.554455] xhci_hcd :01:00.0: // xHCI capability registers at d0852000:
[2.561550] xhci_hcd :01:00.0: // @d0852000 = 0x120 (CAPLENGTH AND 
HCIVERSION)

I believe 0x8e07e000 is a CPU address, not a PCI bus address.
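
For illustration only, here is a minimal user-space sketch (not kernel code) of a linear CPU-to-bus translation of the kind discussed above. The struct and function names are hypothetical, and the window values a caller passes in are assumptions, not this platform's real configuration. With an identity window the DMA address printed by the driver would be numerically equal to the CPU physical address, which is one possible reason 0x8e07e000 looks like a CPU address.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical description of one inbound window: a span of CPU physical
 * addresses and the PCI bus address that its first byte corresponds to.
 */
struct dma_window {
	uint64_t cpu_base;
	uint64_t bus_base;
	uint64_t size;
};

/*
 * Translate a CPU physical address to a bus address.
 * Returns false if the address lies outside the window, in which case
 * *bus is left untouched.
 */
static bool cpu_to_bus(const struct dma_window *w, uint64_t cpu, uint64_t *bus)
{
	if (cpu < w->cpu_base || cpu - w->cpu_base >= w->size)
		return false;
	*bus = cpu - w->cpu_base + w->bus_base;
	return true;
}
```

A window whose cpu_base and bus_base are equal leaves addresses unchanged; a window with bus_base 0 subtracts the RAM base, which is the "subtract the offset" case asked about earlier in the thread.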


>>> Then suppose the USB3 card wants to write to an address
>>> in RAM. It sends a packet on the PCIe bus, targeting
>>> the PCI bus address of that RAM, right? Is this address
>>> supposed to be in BAR0 of the root complex? I guess not,
>>> since Bjorn said that it was unusual for a RC to have
>>> a BAR at all. So I'll hand-wave, and decree that, by some
>>> protocol magic, the packet arrives at the PCIe controller.
>>> And this controller knows to forward this write request
>>> over the memory bus. Does that look about right?
>>
>> Generally, yes - if an area of memory space *is* claimed by a BAR, then
>> another PCI device accessing that would be treated as peer-to-peer DMA,
>> which may or may not be allowed (or supported at all).
> 
> So PCIe addresses that refer to the host memory addresses are
> just forwarded to the memory subsystem.
> In practise this is almost everything.

My RC drops packets not targeting its BAR0.

> The only other PCIe writes the host will see are likely to be associated
> with MSI and MSI-X interrupt support.

Rev 1 of the PCIe controller is supposed to forward MSI doorbell
writes over the global bus to the PCIe controller's MMIO register.

> Some PCIe root complexes support peer-to-peer writes but not reads.
> Writes are normally 'posted' (so are 'fire and forget'); reads need the
> completion TLP (containing the data) sent back - all hard and difficult.
> 
>> For mem space
>> which isn't claimed by BARs, it's up to the RC to decide what to do. As
>> a concrete example (which might possibly be relevant) the PLDA XR3-AXI
>> IP which we have in the ARM Juno SoC has the ATR_PCIE_WINx registers in
>> its root port configuration block that control what ranges of mem space
>> are mapped to the external AXI master interface and how.
>>
>>> My problem is that, in the current implementation of the
>>> PCIe controller, the USB device that wants to write to
>>> memory is supposed to target BAR0 of the RC.
>>
>> That doesn't sound right at all. If the RC has a BAR, I'd expect it to
>> be for poking the guts of the RC device itself (since this prompted me
>> to go and compare, I see the Juno RC does indeed have its own enigmatic
>> 16KB BAR, which reads as ever-changing random junk; no idea what that's
>> about).
>>
>>> Since my mem space is limited to 256 MB, then BAR0 is
>>> limited to 256 MB (or even 128 MB, since I also need

Re: Neophyte questions about PCIe

2017-03-10 Thread Mason
On 10/03/2017 14:15, Robin Murphy wrote:
> On 09/03/17 23:43, Mason wrote:
>> On 08/03/2017 16:17, Bjorn Helgaas wrote:
>> [snip excellent in-depth overview]
>>
>> I think I'm making progress, in that I now have a better
>> idea of what I don't understand. So I'm able to ask
>> (hopefully) less vague questions.
>>
>> Take the USB3 PCIe adapter I've been testing with. At some
>> point during init, the XHCI driver requests some memory
>> (via kmalloc?) in order to exchange data with the host, right?
>>
>> On my SoC, the RAM used by Linux lives at physical range
>> [0x8000_, 0x8800_[ => 128 MB
>>
>> How does the XHCI driver make the adapter aware of where
>> it can scribble data? The XHCI driver has no notion that
>> the device is behind a bus, does it?
>>
>> At some point, the physical addresses must be converted
>> to PCI bus addresses, right? Is it computed subtracting
>> the offset defined in the DT?
>>
>> Then suppose the USB3 card wants to write to an address
>> in RAM. It sends a packet on the PCIe bus, targeting
>> the PCI bus address of that RAM, right? Is this address
>> supposed to be in BAR0 of the root complex? I guess not,
>> since Bjorn said that it was unusual for a RC to have
>> a BAR at all. So I'll hand-wave, and decree that, by some
>> protocol magic, the packet arrives at the PCIe controller.
>> And this controller knows to forward this write request
>> over the memory bus. Does that look about right?
> 
> Generally, yes - if an area of memory space *is* claimed by a BAR, then
> another PCI device accessing that would be treated as peer-to-peer DMA,
> which may or may not be allowed (or supported at all). For mem space
> which isn't claimed by BARs, it's up to the RC to decide what to do. As
> a concrete example (which might possibly be relevant) the PLDA XR3-AXI
> IP which we have in the ARM Juno SoC has the ATR_PCIE_WINx registers in
> its root port configuration block that control what ranges of mem space
> are mapped to the external AXI master interface and how.

The HW dev told me that the Verilog code for the RC considers
packets not targeted at RC BAR0 an error, and drops them.


>> My problem is that, in the current implementation of the
>> PCIe controller, the USB device that wants to write to
>> memory is supposed to target BAR0 of the RC.
> 
> That doesn't sound right at all. If the RC has a BAR, I'd expect it to
> be for poking the guts of the RC device itself (since this prompted me
> to go and compare, I see the Juno RC does indeed have its own enigmatic
> 16KB BAR, which reads as ever-changing random junk; no idea what that's
> about).

That's not how our RC works. If I want to poke its guts, I have
some MMIO addresses on the global bus. RC BAR0 is strictly used
as a window to the global bus.


>> Since my mem space is limited to 256 MB, then BAR0 is
>> limited to 256 MB (or even 128 MB, since I also need
>> to map the device's BAR into the same mem space).
> 
> Your window into mem space *from the CPU's point of view* is limited to
> 256MB. The relationship between mem space and the system (AXI) memory
> map from the point of view of PCI devices is a separate issue; if it's
> configurable at all, it probably makes sense to have the firmware set an
> outbound window to at least cover DRAM 1:1, then forget about it (this
> is essentially what Juno UEFI does, for example).

The size of RC BAR0 is limited to 1 GB, so best case I can map
1 GB back to the system RAM. Well, actually best case is 896 MB
since 1/8 of the window must map the MSI doorbell region.
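
The 896 MB figure follows from reserving one eighth of the 1 GB window. A tiny sketch of that arithmetic, with a hypothetical helper name:

```c
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/*
 * Bytes of the RC BAR0 window left for system RAM when one eighth of the
 * window is set aside for the MSI doorbell region.
 */
static uint64_t usable_window_bytes(uint64_t bar_size)
{
	return bar_size - bar_size / 8;
}
```

For a 1024 MiB BAR this gives 1024 - 128 = 896 MiB.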

I'll see what I can come up with.

Thanks a lot for your comments.

Regards.


Re: Neophyte questions about PCIe

2017-03-09 Thread Mason
On 08/03/2017 16:17, Bjorn Helgaas wrote:
[snip excellent in-depth overview]

I think I'm making progress, in that I now have a better
idea of what I don't understand. So I'm able to ask
(hopefully) less vague questions.

Take the USB3 PCIe adapter I've been testing with. At some
point during init, the XHCI driver requests some memory
(via kmalloc?) in order to exchange data with the host, right?

On my SoC, the RAM used by Linux lives at physical range
[0x8000_, 0x8800_[ => 128 MB

How does the XHCI driver make the adapter aware of where
it can scribble data? The XHCI driver has no notion that
the device is behind a bus, does it?

At some point, the physical addresses must be converted
to PCI bus addresses, right? Is it computed subtracting
the offset defined in the DT?

Then suppose the USB3 card wants to write to an address
in RAM. It sends a packet on the PCIe bus, targeting
the PCI bus address of that RAM, right? Is this address
supposed to be in BAR0 of the root complex? I guess not,
since Bjorn said that it was unusual for a RC to have
a BAR at all. So I'll hand-wave, and decree that, by some
protocol magic, the packet arrives at the PCIe controller.
And this controller knows to forward this write request
over the memory bus. Does that look about right?

My problem is that, in the current implementation of the
PCIe controller, the USB device that wants to write to
memory is supposed to target BAR0 of the RC.

Since my mem space is limited to 256 MB, then BAR0 is
limited to 256 MB (or even 128 MB, since I also need
to map the device's BAR into the same mem space).

So, if I understand correctly (which, at this point,
is quite unlikely) PCIe will work correctly for me
only if Linux manages 128 MB or less...

How does it work on systems where the RC has no BAR?
I suppose devices are able to access all of RAM...
because the controller forwards everything? (This may
be where an IOMMU comes in handy?)

Is there a way to know, at run-time, where and how big
Linux's dynamic memory pool is? Perhaps the memory pool
itself remains smaller than 128 MB?

I realize that I've asked a million questions. Feel free
to ignore most of them, if you can help with just one,
it would be a tremendous help already.

Regards.


Re: Implementing MSI support on my platform

2017-03-08 Thread Mason
On 07/03/2017 17:47, Mason wrote:

> As suggested by Marc, I'm trying to adapt
>   drivers/pci/host/pcie-altera-msi.c
> to my platform.

For my own reference, I have enabled verbose XHCI debug logs.
I have highlighted suspicious output with *

[0.00] Booting Linux on physical CPU 0x0
[0.00] Linux version 4.9.7-1-rc2 (gcc version 5.3.1 20160113 (Linaro 
GCC 5.3-2016.02) ) #53 SMP PREEMPT Wed Mar 8 10:53:55 CET 2017
[0.00] CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=10c5387d
[0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing 
instruction cache
[0.00] OF: fdt:Machine model: Sigma Designs SMP8758 Vantage-1172 Rev E1
[0.00] debug: ignoring loglevel setting.
[0.00] bootconsole [earlycon0] enabled
[0.00] Memory policy: Data cache writealloc
[0.00] On node 0 totalpages: 65536
[0.00] free_area_init_node: node 0, pgdat c1023980, node_mem_map 
cfdf9000
[0.00]   Normal zone: 512 pages used for memmap
[0.00]   Normal zone: 0 pages reserved
[0.00]   Normal zone: 65536 pages, LIFO batch:15
[0.00] percpu: Embedded 14 pages/cpu @cfdd7000 s24960 r8192 d24192 
u57344
[0.00] pcpu-alloc: s24960 r8192 d24192 u57344 alloc=14*4096
[0.00] pcpu-alloc: [0] 0 [0] 1 
[0.00] Built 1 zonelists in Zone order, mobility grouping on.  Total 
pages: 65024
[0.00] Kernel command line: mem=256M ignore_loglevel 
console=ttyS0,115200 earlyprintk
[0.00] PID hash table entries: 1024 (order: 0, 4096 bytes)
[0.00] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[0.00] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[0.00] Memory: 243976K/262144K available (4096K kernel code, 147K 
rwdata, 900K rodata, 10240K init, 233K bss, 18168K reserved, 0K cma-reserved, 
0K highmem)
[0.00] Virtual kernel memory layout:
[0.00] vector  : 0x - 0x1000   (   4 kB)
[0.00] fixmap  : 0xffc0 - 0xfff0   (3072 kB)
[0.00] vmalloc : 0xd080 - 0xff80   ( 752 MB)
[0.00] lowmem  : 0xc000 - 0xd000   ( 256 MB)
[0.00] pkmap   : 0xbfe0 - 0xc000   (   2 MB)
[0.00] modules : 0xbf00 - 0xbfe0   (  14 MB)
[0.00]   .text : 0xc0008000 - 0xc050   (5088 kB)
[0.00]   .init : 0xc060 - 0xc100   (10240 kB)
[0.00]   .data : 0xc100 - 0xc1024d20   ( 148 kB)
[0.00].bss : 0xc1024d20 - 0xc105f200   ( 234 kB)
[0.00] Preemptible hierarchical RCU implementation.
[0.00]  Build-time adjustment of leaf fanout to 32.
[0.00]  RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
[0.00] RCU: Adjusting geometry for rcu_fanout_leaf=32, nr_cpu_ids=2
[0.00] NR_IRQS:16 nr_irqs:16 16
[0.00] L2C-310 enabling early BRESP for Cortex-A9
[0.00] L2C-310 ID prefetch enabled, offset 4 lines
[0.00] L2C-310 dynamic clock gating enabled, standby mode enabled
[0.00] L2C-310 cache controller enabled, 8 ways, 512 kB
[0.00] L2C-310: CACHE_ID 0x41c8, AUX_CTRL 0x72860401
[0.00] clocksource: tango-xtal: mask: 0x max_cycles: 
0x, max_idle_ns: 70787423951 ns
[0.04] sched_clock: 32 bits at 27MHz, resolution 37ns, wraps every 
79536431085ns
[0.007960] Switching to timer-based delay loop, resolution 37ns
[0.014323] Console: colour dummy device 80x30
[0.018896] Calibrating delay loop (skipped), value calculated using timer 
frequency.. 54.25 BogoMIPS (lpj=9)
[0.029314] pid_max: default: 32768 minimum: 301
[0.034131] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[0.040871] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[0.048514] CPU: Testing write buffer coherency: ok
[0.053725] CPU0: thread -1, cpu 0, socket 0, mpidr 8000
[0.059550] Setting up static identity map for 0x8010 - 0x80100034
[0.120459] CPU1: thread -1, cpu 1, socket 0, mpidr 8001
[0.120523] Brought up 2 CPUs
[0.129374] SMP: Total of 2 processors activated (108.50 BogoMIPS).
[0.135764] CPU: All CPU(s) started in SVC mode.
[0.141062] devtmpfs: initialized
[0.145449] VFP support v0.3: implementor 41 architecture 3 part 30 variant 
9 rev 4
[0.153552] clocksource: jiffies: mask: 0x max_cycles: 0x, 
max_idle_ns: 6370867519511994 ns
[0.163845] NET: Registered protocol family 16
[0.169109] DMA: preallocated 256 KiB pool for atomic coherent allocations
[0.177113] tango_pcie_init
[0.180010] [2e000] = ff00
[0.183187] [2e004] = ff00
[0.186346] [2e008] = ff00
[0.189515] [2e00c] = ff00
[0.192674] [2e010] = ff00
[0.195842] [2e014] = ff00
[0.199000] [2e018] = ff00
[0.202168] [2e01c] = ff00
[0.205326] [2e020] = 4ab099f2
[0.208492] [2e024] =

Implementing MSI support on my platform

2017-03-07 Thread Mason
Hello,

As suggested by Marc, I'm trying to adapt
  drivers/pci/host/pcie-altera-msi.c
to my platform.


Here are my changes to the existing driver:

diff --git a/drivers/pci/host/pcie-altera-msi.c 
b/drivers/pci/host/pcie-altera-msi.c
index 4e5d628e8cd4..914cd26b2a53 100644
--- a/drivers/pci/host/pcie-altera-msi.c
+++ b/drivers/pci/host/pcie-altera-msi.c
@@ -18,6 +18,7 @@
  * this program.  If not, see .
  */
 
+#define DEBUG 42
 #include 
 #include 
 #include 
@@ -31,7 +32,11 @@
 
 #define MSI_STATUS 0x0
 #define MSI_ERROR  0x4
+#if 0
 #define MSI_INTMASK0x8
+#else
+#define MSI_INTMASK0x20
+#endif
 
 #define MAX_MSI_VECTORS32
 
@@ -51,12 +56,19 @@ struct altera_msi {
 static inline void msi_writel(struct altera_msi *msi, const u32 value,
  const u32 reg)
 {
+   printk("%s: reg=%u val=0x%x\n", __func__, reg, value);
writel_relaxed(value, msi->csr_base + reg);
 }
 
 static inline u32 msi_readl(struct altera_msi *msi, const u32 reg)
 {
+#if 0
return readl_relaxed(msi->csr_base + reg);
+#else
+   u32 val = readl_relaxed(msi->csr_base + reg);
+   printk("%s: reg=%u val=0x%x\n", __func__, reg, val);
+   return val;
+#endif
 }
 
 static void altera_msi_isr(struct irq_desc *desc)
@@ -68,14 +80,19 @@ static void altera_msi_isr(struct irq_desc *desc)
u32 bit;
u32 virq;
 
+   printk("%s: ENTER\n", __func__);
chained_irq_enter(chip, desc);
msi = irq_desc_get_handler_data(desc);
num_of_vectors = msi->num_of_vectors;
 
while ((status = msi_readl(msi, MSI_STATUS)) != 0) {
for_each_set_bit(bit, , msi->num_of_vectors) {
+#if 0
/* Dummy read from vector to clear the interrupt */
readl_relaxed(msi->vector_base + (bit * sizeof(u32)));
+#else
+   msi_writel(msi, bit, MSI_STATUS);
+#endif
 
virq = irq_find_mapping(msi->inner_domain, bit);
if (virq)
@@ -103,7 +120,11 @@ static void altera_msi_isr(struct irq_desc *desc)
 static void altera_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 {
struct altera_msi *msi = irq_data_get_irq_chip_data(data);
+#if 0
phys_addr_t addr = msi->vector_phy + (data->hwirq * sizeof(u32));
+#else
+   phys_addr_t addr = 0x800 + 0x2e07c;
+#endif
 
msg->address_lo = lower_32_bits(addr);
msg->address_hi = upper_32_bits(addr);
@@ -247,6 +268,7 @@ static int altera_msi_probe(struct platform_device *pdev)
return PTR_ERR(msi->csr_base);
}
 
+#if 0
res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
   "vector_slave");
msi->vector_base = devm_ioremap_resource(>dev, res);
@@ -256,6 +278,10 @@ static int altera_msi_probe(struct platform_device *pdev)
}
 
msi->vector_phy = res->start;
+#else
+   msi->vector_base = (void *)0xdeedbeef;
+   msi->vector_phy = 0xdeadbeef;
+#endif
 
if (of_property_read_u32(np, "num-vectors", >num_of_vectors)) {
dev_err(>dev, "failed to parse the number of vectors\n");
@@ -276,6 +302,8 @@ static int altera_msi_probe(struct platform_device *pdev)
irq_set_chained_handler_and_data(msi->irq, altera_msi_isr, msi);
platform_set_drvdata(pdev, msi);
 
+   printk("%s: res=%pr irq=%d ret=%d\n", __func__, res, msi->irq, ret);
+
return 0;
 
 err:



Here are the relevant DT nodes:

msi0: msi@2e080 {
compatible = "altr,msi-1.0";
reg = <0x2e080 0x40>;
reg-names = "csr";
interrupt-parent = <>;
interrupts = <55 IRQ_TYPE_LEVEL_HIGH>;
msi-controller;
num-vectors = <32>;
};

pcie@5000 {
compatible = "sigma,smp8759-pcie";
reg = <0x5000 0x800>;
device_type = "pci";
bus-range = <0x0 0x7f>;
#size-cells = <2>;
#address-cells = <3>;
#interrupt-cells = <1>;
/* BUS_ADDRESS(3)  CPU_PHYSICAL(1)  SIZE(2) */
ranges = <0x0200 0x0 0x800  0x5800  0x0 
0x800>;
msi-parent = <>;
};



Here are the relevant boot-time logs I see:

[0.392199] altera_msi_probe: res=[mem 0x0002e080-0x0002e0bf flags 0x200] 
irq=22 ret=0
...
[0.993868] OF: PCI: host bridge /soc/pcie@5000 ranges:
[0.999582] OF: PCI: Parsing ranges property...
[1.004250] OF: PCI:   MEM 0x5800..0x5fff -> 0x0800
[1.011770] pci_tango 5000.pcie: ECAM at [mem 0x5000-0x57ff] for 

Re: Panic in quirk_usb_early_handoff

2017-03-06 Thread Mason
On 06/03/2017 16:27, David Laight wrote:

> Mason wrote:
>>
>>> So the kernel panics in xhci_find_next_ext_cap()
>>> ( drivers/usb/host/xhci-ext-caps.h:122 )
>>> http://lxr.free-electrons.com/source/drivers/usb/host/xhci-ext-caps.h?v=4.9#L122
>>>
>>> Any idea how this can happen?
>>>
>>> base = ioremap_nocache(pci_resource_start(pdev, 0), len);
>>>
>>> Could I be passing garbage to ioremap_nocache?
>>
>> Oh...
>>
>> I have just now understood what Ard wrote a few days ago.
>>
>> The issue was that, on this platform, the PCI configuration space
>> and memory space are multiplexed; in other words they reside at
>> the same physical address, with a bit in MMIO to choose one or
>> the other.
> 
> Time to shoot another hardware engineer.

He's in CC :-)

> Hopefully it isn't an SMP system - but I wouldn't put it past them.

This is a dual- and quad- Cortex A9 MP platform :-(

Regards.


Re: Panic in quirk_usb_early_handoff

2017-03-06 Thread Mason
On 06/03/2017 15:30, Robin Murphy wrote:

> On 06/03/17 12:42, Mason wrote:
>
>> $ arm-linux-gnueabihf-addr2line -i -e vmlinux c039fe44
>> arch/arm/include/asm/io.h:119
>>
>> In other words, readl()
>> Not as helpful as expected...
> 
> I guess your toolchain isn't generating whatever debug info that -i uses
> to show where it was actually inlined, shame.

I used gcc-linaro-5.3.1-2016.05-x86_64_arm-linux-gnueabihf
Is that too old?

Might the issue come from my kernel config?

#
# Compile-time checks and compiler options
#
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_INFO_REDUCED=y
CONFIG_DEBUG_INFO_SPLIT=y
# CONFIG_DEBUG_INFO_DWARF4 is not set
# CONFIG_GDB_SCRIPTS is not set
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=1024
# CONFIG_STRIP_ASM_SYMS is not set
# CONFIG_READABLE_ASM is not set
# CONFIG_UNUSED_SYMBOLS is not set
# CONFIG_PAGE_OWNER is not set
# CONFIG_DEBUG_FS is not set
# CONFIG_HEADERS_CHECK is not set
# CONFIG_DEBUG_SECTION_MISMATCH is not set
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
# CONFIG_MAGIC_SYSRQ is not set
CONFIG_DEBUG_KERNEL=y


> Put together, if I'm skimming unfamiliar XHCI code and docs correctly,
> this would imply that a supposed read of the HCC Parameters register
> claimed that the extended capabilities register was at offset 0x29f8
> into a 0x2000-long BAR. That does suggest that whatever's being accessed
> through that ioremap() isn't actually the contents of BAR 0 at all (said
> field should apparently read as 0x140 representing an offset of 0x500).
> You're not still trying to have your PCI host controller place its
> MEM-space window over the top of system RAM, are you? Otherwise, I'd be
> inclined to double check that your config space accesses and resource
> assignment are producing sane values.
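
To spell out the arithmetic in the paragraph above: the xHCI Extended Capabilities Pointer sits in bits 31:16 of HCCPARAMS and is expressed in 32-bit words, so a field value of 0x140 means byte offset 0x500. The following user-space sketch uses hypothetical helper names; the 0x2000 BAR length and the 0x29f8 offset are the figures quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* xECP: bits 31:16 of HCCPARAMS, counted in 32-bit words. */
static uint32_t xecp_byte_offset(uint32_t hccparams)
{
	return ((hccparams >> 16) & 0xffff) << 2;
}

/* True if a 32-bit register at 'offset' fits entirely inside the BAR. */
static bool offset_in_bar(uint32_t offset, uint32_t bar_len)
{
	return offset < bar_len && bar_len - offset >= sizeof(uint32_t);
}
```

With these helpers, 0x500 is inside a 0x2000-byte BAR while 0x29f8 is not, so a read at that offset lands outside the mapping.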

It looks like the current PCI framework doesn't expect platforms to
multiplex config space and MEM space :-(

[0.994011] OF: PCI: host bridge /soc/pcie@5000 ranges:
[0.999721] OF: PCI: Parsing ranges property...
[1.004386] OF: PCI:   MEM 0x5000..0x5fff -> 0x
[1.010471] pci-host-generic 5000.pcie:
can't claim ECAM area [mem 0x5000-0x5fff]:
address conflict with /soc/pcie@5000 [mem 
0x5000-0x5fff]
[1.025265] pci-host-generic: probe of 5000.pcie failed with error -16

pcie@5000 {
compatible = "pci-host-ecam-generic";
reg = <0x5000 0x1000>;
device_type = "pci";
#size-cells = <2>;
#address-cells = <3>;
#interrupt-cells = <1>;
ranges = <0x0200 0x0 0x0  0x5000  0x0 
0x1000>;
};

Regards.


Re: Panic in quirk_usb_early_handoff

2017-03-06 Thread Mason
On 06/03/2017 13:42, Mason wrote:

> So the kernel panics in xhci_find_next_ext_cap()
> ( drivers/usb/host/xhci-ext-caps.h:122 )
> http://lxr.free-electrons.com/source/drivers/usb/host/xhci-ext-caps.h?v=4.9#L122
> 
> Any idea how this can happen?
> 
>   base = ioremap_nocache(pci_resource_start(pdev, 0), len);
> 
> Could I be passing garbage to ioremap_nocache?

Oh...

I have just now understood what Ard wrote a few days ago.

The issue was that, on this platform, the PCI configuration space
and memory space are multiplexed; in other words they reside at
the same physical address, with a bit in MMIO to choose one or
the other.

I was specifying an arbitrary address for the memory space,
which doesn't make any sense, as Ard pointed out.

So quirk_usb_handoff_xhci would ioremap(0x9100, 8192)
which is the size of the USB device's memory region, but
0x9100 is an address in system RAM. Thus, the readl
was actually picking up random garbage in RAM, which
makes xhci_find_next_ext_cap blow up pretty fast.

[1.265224] xhci_find_next_ext_cap: offset=0xec44
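
As a rough illustration of why garbage makes the walk blow up, here is a user-space sketch of an extended-capability list walk that refuses to leave the mapped region. It is not the kernel helper: the function name, the `len` parameter and the return convention are assumptions made for this sketch. The only xHCI facts relied on are that the capability ID is in bits 7:0 and the next pointer, in 32-bit words, is in bits 15:8.

```c
#include <stddef.h>
#include <stdint.h>

#define CAP_ID(v)   ((v) & 0xff)         /* bits 7:0 */
#define CAP_NEXT(v) (((v) >> 8) & 0xff)  /* bits 15:8, in 32-bit words */

/*
 * regs:  snapshot of the mapped region as 32-bit words
 * len:   size of the region in bytes
 * start: byte offset of the first extended capability
 *
 * Returns the byte offset of the capability with the given id, 0 if the
 * list ends without a match, or -1 if the list points outside the region
 * (or does not terminate within len / 4 steps).
 */
static long find_ext_cap(const uint32_t *regs, size_t len,
			 uint32_t start, uint8_t id)
{
	uint32_t offset = start;
	size_t steps;

	for (steps = 0; steps < len / 4; steps++) {
		uint32_t val, next;

		if (len < 4 || offset % 4 || offset > len - 4)
			return -1;
		val = regs[offset / 4];
		if (CAP_ID(val) == id)
			return (long)offset;
		next = CAP_NEXT(val);
		if (!next)
			return 0;
		offset += next << 2;
	}
	return -1;
}
```

Fed the values seen in a working boot log from this thread (0x1000401 at 0x500, 0x3000502 at 0x510, 0x2000702 at 0x524, 0x4c0 at 0x540, 0xa at 0x550), the walk visits exactly those offsets. Fed a starting offset such as 0xec44 with an 8192-byte region, it stops immediately instead of dereferencing an unmapped address.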

I'm off to fix my blunder.

Regards.


Re: Panic in quirk_usb_early_handoff

2017-03-06 Thread Mason
On 03/03/2017 20:02, Robin Murphy wrote:

> On 03/03/17 17:15, Mason wrote:
> 
>>>> [1.264893] Unable to handle kernel paging request at virtual address 
>>>> d08664f4
> 
> Note that that's a reasonable approximation of a vmalloc address...
> 
>>>> [1.272248] pgd = c0004000
>>>> [1.275060] [d08664f4] *pgd=8f804811, *pte=, *ppte=
>>>> [1.281476] Internal error: Oops: 7 [#1] PREEMPT SMP ARM
>>>> [1.286897] Modules linked in:
>>>> [1.290053] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.7-1-rc2 #151
>>>> [1.296696] Hardware name: Sigma Tango DT
>>>> [1.300808] task: cf82c9c0 task.stack: cf838000
>>>> [1.305446] PC is at quirk_usb_early_handoff+0x3e8/0x790
>>>> [1.310873] LR is at ioremap_page_range+0xf8/0x1a8
>>>> [1.315771] pc : []lr : []psr: 000e0013
>>>> [1.315771] sp : cf839d78  ip :   fp : cf839e38
>>>> [1.327482] r10: c10248a0  r9 :   r8 : d08664f4
>>>> [1.332816] r7 : d084e000  r6 : 2000  r5 : 000c0300  r4 : cfb5f800
>>>> [1.339460] r3 : 000184f4  r2 :   r1 : 91001e13  r0 : d084e000
> 
> ...and that specifically it's r0 + r3...
> 
>>> [1.258926] Unable to handle kernel paging request at virtual address 
>>> d0863f70
>>> [1.266284] pgd = c0004000
>>> [1.269097] [d0863f70] *pgd=8f804811, *pte=, *ppte=
>>> [1.275512] Internal error: Oops: 7 [#1] PREEMPT SMP ARM
>>> [1.280933] Modules linked in:
>>> [1.284089] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.7-1-rc2 #157
>>> [1.290732] Hardware name: Sigma Tango DT
>>> [1.294843] task: cf82c9c0 task.stack: cf838000
>>> [1.299482] PC is at quirk_usb_early_handoff+0x3e8/0x790
>>> [1.304907] LR is at ioremap_page_range+0xf8/0x1a8
>>> [1.309806] pc : []lr : []psr: 000e0013
>>> [1.309806] sp : cf839d78  ip :   fp : cf839e38
>>> [1.321517] r10: c10248a0  r9 :   r8 : d0863f70
>>> [1.326851] r7 : d084e000  r6 : 2000  r5 : 000c0300  r4 : cfb52800
>>> [1.333495] r3 : 00015f70  r2 :   r1 : 91001e13  r0 : d084e000
> 
> ...and again...
> 
>> [1.261813] Unable to handle kernel paging request at virtual address 
>> d08611e4
>> [1.269167] pgd = c0004000
>> [1.271979] [d08611e4] *pgd=8f804811, *pte=, *ppte=
>> [1.278394] Internal error: Oops: 7 [#1] PREEMPT SMP ARM
>> [1.283815] Modules linked in:
>> [1.286970] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.7-1-rc2 #157
>> [1.293614] Hardware name: Sigma Tango DT
>> [1.297726] task: cf82c9c0 task.stack: cf838000
>> [1.302364] PC is at quirk_usb_early_handoff+0x3e8/0x790
>> [1.307790] LR is at ioremap_page_range+0xf8/0x1a8
>> [1.312688] pc : []lr : []psr: 000e0013
>> [1.312688] sp : cf839d78  ip :   fp : cf839e38
>> [1.324399] r10: c10248a0  r9 :   r8 : d08611e4
>> [1.329733] r7 : d084e000  r6 : 2000  r5 : 000c0300  r4 : cfb4e800
>> [1.336377] r3 : 000131e4  r2 :   r1 : 91001e13  r0 : d084e000
> 
> ...and again. And always at the same PC, too. Looking at
> quirk_usb_early_handoff(), it mostly seems to go off poking bridge
> resources, so I'd hazard a guess that it's down to your host driver,
> with something uninitialised (or already freed) being used as an offset
> into some ioremapped resource (which given the consistency of r0 is
> probably allocated pretty early on).
> 
> "addr2line -i -e vmlinux c039fe44", and work backwards from there ;)
> In particular I'd follow the provenance of r3.
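
The "r0 + r3" observation can be checked mechanically against the three oopses quoted above. This is a small user-space sketch with a hypothetical helper name; treating r6 (printed as 2000) as the 0x2000-byte length of the mapping is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if the faulting address equals base + offset and the offset lies
 * at or beyond the end of a mapping of map_len bytes.
 */
static bool fault_is_out_of_range(uint32_t fault, uint32_t base,
				  uint32_t offset, uint32_t map_len)
{
	return fault == base + offset && offset >= map_len;
}
```

For example, base 0xd084e000 plus offset 0x184f4 gives 0xd08664f4, the first faulting address, and 0x184f4 is far beyond 0x2000.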

Starting from a fresh panic:

[1.236243] pcieport :00:00.0: enabling device (0140 -> 0142)
[1.242474] pcieport :00:00.0: enabling bus mastering
[1.248147] pci :01:00.0: calling quirk_usb_early_handoff+0x0/0x790
[1.254904] pci :01:00.0: enabling device (0140 -> 0142)
[1.260719] Unable to handle kernel paging request at virtual address 
d08509f8
[1.268073] pgd = c0004000
[1.270874] [d08509f8] *pgd=8f804811, *pte=, *ppte=
[1.277282] Internal error: Oops: 7 [#1] PREEMPT SMP ARM
[1.282702] Modules linked in:
[1.285858] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.7-1-rc2 #2
[1.292327] Hardware name: Sigma Tango DT
[1.296438] task: cf82c9c0 task.stack: cf838000
[1.301076] PC is at quirk_usb_early_handoff+0x3e8/0x790
[1.306501] LR is at ioremap_page_range+0xf8/0x1a8
[1.311400] pc : []lr : []psr: 000e0013
[ 

Re: Panic in quirk_usb_early_handoff

2017-03-04 Thread Mason
On 04/03/2017 18:16, Ard Biesheuvel wrote:

> After pc, the link register is the most likely to legally point into
> the kernel .text section so it makes sense imo to decode the address
> into a function name plus offset.

Does gcc ever use the link register as a general purpose register?
(In which case, it is very likely to contain "garbage" as far as
function addresses are concerned.)

> Educating people about the architecture's calling convention and
> associated caveats is not the job of the panic handler.

That's a weird statement.

Regards.


Re: Panic in quirk_usb_early_handoff

2017-03-04 Thread Mason
On 04/03/2017 09:07, Ard Biesheuvel wrote:
> On 4 March 2017 at 00:24, Mason wrote:
>> On 03/03/2017 20:02, Robin Murphy wrote:
>>
>>> On 03/03/17 17:15, Mason wrote:
>>>
>>>> [1.261813] Unable to handle kernel paging request at virtual address 
>>>> d08611e4
>>>> [1.269167] pgd = c0004000
>>>> [1.271979] [d08611e4] *pgd=8f804811, *pte=, *ppte=
>>>> [1.278394] Internal error: Oops: 7 [#1] PREEMPT SMP ARM
>>>> [1.283815] Modules linked in:
>>>> [1.286970] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.7-1-rc2 #157
>>>> [1.293614] Hardware name: Sigma Tango DT
>>>> [1.297726] task: cf82c9c0 task.stack: cf838000
>>>> [1.302364] PC is at quirk_usb_early_handoff+0x3e8/0x790
>>>> [1.307790] LR is at ioremap_page_range+0xf8/0x1a8
>>>> [1.312688] pc : []lr : []psr: 000e0013
>>>> [1.312688] sp : cf839d78  ip :   fp : cf839e38
>>>> [1.324399] r10: c10248a0  r9 :   r8 : d08611e4
>>>> [1.329733] r7 : d084e000  r6 : 2000  r5 : 000c0300  r4 : cfb4e800
>>>> [1.336377] r3 : 000131e4  r2 :   r1 : 91001e13  r0 : d084e000
>>>
>>> ...and again. And always at the same PC, too.
>>
>> By the way, isn't LR supposed to point to the caller of the
>> current function? ("LR is at ioremap_page_range")
>>
>> If so, why does it not appear in the back trace?
> 
> lr is supposed to point to the return address at function entry. After
> that, all bets are off, really, since ARM usually pops the return
> address from the stack straight into the pc register. So in this case,
> it looks like it still contains the address that the most recent leaf
> function returned to (or another function that actually restores the
> return address into lr before branching to it). But it could easily
> contain garbage as well.

If there is only a tiny chance that LR contains genuinely useful
information, then what is the rationale for providing the info
at all in the panic message?

I would argue that no info is better than info that is wrong
most of the time.
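
The point about lr made in the quoted reply can be illustrated with a toy model. This is only a sketch: the ToyCpu class and the labels "caller", "faulting_fn" and "helper" are hypothetical, and nothing here is taken from the kernel or from a real ARM simulator. It assumes a non-leaf function saves lr on the stack and returns by popping that value straight into pc, while a leaf function returns by branching to lr.

```python
class ToyCpu:
    """Toy model of ARM-style calls: `bl` writes lr, `pop {pc}` does not."""

    def __init__(self):
        self.lr = None    # link register
        self.stack = []   # saved return addresses

    def bl(self, return_site):
        # Branch-with-link: lr receives the address following the call.
        self.lr = return_site

    def push_lr(self):
        # Non-leaf prologue: save lr, since a nested bl will overwrite it.
        self.stack.append(self.lr)

    def pop_pc(self):
        # Non-leaf epilogue: the saved address goes into pc; lr is untouched.
        return self.stack.pop()

    def bx_lr(self):
        # Leaf return: branch to lr; lr keeps its value.
        return self.lr


def run_scenario():
    cpu = ToyCpu()
    cpu.bl("caller+ret")        # caller invokes faulting_fn
    cpu.push_lr()               # faulting_fn prologue
    cpu.bl("faulting_fn+ret")   # faulting_fn invokes helper
    cpu.push_lr()               # helper prologue
    cpu.bl("helper+ret")        # helper invokes a leaf function
    assert cpu.bx_lr() == "helper+ret"        # leaf returns into helper
    assert cpu.pop_pc() == "faulting_fn+ret"  # helper returns into faulting_fn
    # faulting_fn faults here: report what an oops would see.
    return cpu.lr, list(cpu.stack)


lr, stack = run_scenario()
print("lr at fault:", lr)          # stale address inside helper
print("saved on stack:", stack)    # the real return address into caller
```

In this model lr still points into the helper at the time of the fault, while the return address into the caller is only found on the stack. That would be consistent with "LR is at ioremap_page_range" showing up in the oops without ioremap_page_range appearing in the back trace.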

Regards.


Re: Panic in quirk_usb_early_handoff

2017-03-03 Thread Mason
On 03/03/2017 20:02, Robin Murphy wrote:

> On 03/03/17 17:15, Mason wrote:
> 
>> [1.261813] Unable to handle kernel paging request at virtual address d08611e4
>> [1.269167] pgd = c0004000
>> [1.271979] [d08611e4] *pgd=8f804811, *pte=, *ppte=
>> [1.278394] Internal error: Oops: 7 [#1] PREEMPT SMP ARM
>> [1.283815] Modules linked in:
>> [1.286970] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.7-1-rc2 #157
>> [1.293614] Hardware name: Sigma Tango DT
>> [1.297726] task: cf82c9c0 task.stack: cf838000
>> [1.302364] PC is at quirk_usb_early_handoff+0x3e8/0x790
>> [1.307790] LR is at ioremap_page_range+0xf8/0x1a8
>> [1.312688] pc : []lr : []psr: 000e0013
>> [1.312688] sp : cf839d78  ip :   fp : cf839e38
>> [1.324399] r10: c10248a0  r9 :   r8 : d08611e4
>> [1.329733] r7 : d084e000  r6 : 2000  r5 : 000c0300  r4 : cfb4e800
>> [1.336377] r3 : 000131e4  r2 :   r1 : 91001e13  r0 : d084e000
> 
> ...and again. And always at the same PC, too.

By the way, isn't LR supposed to point to the caller of the
current function? ("LR is at ioremap_page_range")

If so, why does it not appear in the back trace?

[1.541152] [] (quirk_usb_early_handoff) from [] (pci_do_fixups+0xc8/0x158)
[1.549992] [] (pci_do_fixups) from [] (pci_bus_add_device+0x18/0x90)
[1.558301] [] (pci_bus_add_device) from [] (pci_bus_add_devices+0x3c/0x80)
[1.567133] [] (pci_bus_add_devices) from [] (pci_bus_add_devices+0x70/0x80)
[1.576055] [] (pci_bus_add_devices) from [] (pci_host_common_probe+0xfc/0x324)
[1.585243] [] (pci_host_common_probe) from [] (platform_drv_probe+0x34/0x7c)
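
For reading such traces: the kernel prints code locations as symbol+0xOFFSET/0xSIZE, i.e. the offset of the address within the function followed by the total size of the function. A minimal sketch for pulling these fields out of a log line follows. The helper name parse_symbol and the regular expression are illustrative assumptions, not an existing tool.

```python
import re

# Matches e.g. "quirk_usb_early_handoff+0x3e8/0x790"
SYM_RE = re.compile(
    r"(?P<name>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_symbol(text):
    """Return (name, offset, size) for the first symbol+off/size in text."""
    m = SYM_RE.search(text)
    if m is None:
        return None
    return m.group("name"), int(m.group("off"), 16), int(m.group("size"), 16)


for line in (
    "PC is at quirk_usb_early_handoff+0x3e8/0x790",
    "LR is at ioremap_page_range+0xf8/0x1a8",
    "(quirk_usb_early_handoff) from [] (pci_do_fixups+0xc8/0x158)",
):
    name, off, size = parse_symbol(line)
    print(f"{name}: offset {off:#x} of {size:#x} bytes")
```

Subtracting the offset from the absolute address (when the log still has it) gives the start of the function, which is handy when matching against a disassembly.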

Regards.


Re: Panic in quirk_usb_early_handoff

2017-03-03 Thread Mason
On 03/03/2017 20:02, Robin Murphy wrote:

> On 03/03/17 17:15, Mason wrote:
> 
>> [1.264893] Unable to handle kernel paging request at virtual address d08664f4
> 
> Note that that's a reasonable approximation of a vmalloc address...
> 
> ...and that specifically it's r0 + r3...
> 
> ...and again. And always at the same PC, too. Looking at
> quirk_usb_early_handoff(), it mostly seems to go off poking bridge
> resources, so I'd hazard a guess that it's down to your host driver,
> with something uninitialised (or already freed) being used as an offset
> into some ioremapped resource (which given the consistency of r0 is
> probably allocated pretty early on).

When you say "host driver", do you mean the USB driver,
or the PCIe controller driver?

I am currently writing the PCIe controller driver, so I do
expect a large number of bugs there; but the USB driver is
just the generic XHCI driver. Although I now realize that
I wrote no DT node for the USB HW... Would that explain
the random weirdness?

> "addr2line -i -e vmlinux c039fe44", and work backwards from there ;) In
> particular I'd follow the provenance of r3.

I'll definitely take a closer look. Thanks for the disassembly.
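
The "r0 + r3" remark quoted above can be checked against the two register dumps posted in this thread. The sketch below is only arithmetic on the logged values. It assumes the archive dropped leading zeros (so r3 "000184f4" means 0x000184f4) and that BAR 0 of 01:00.0 is a 0x2000-byte window, as the "0x1fff" end address and r6 = 2000 suggest. Neither assumption is confirmed by the posters.

```python
# Register values copied from the two oops reports in this thread.
crashes = [
    {"r0": 0xD084E000, "r3": 0x000184F4, "fault": 0xD08664F4},
    {"r0": 0xD084E000, "r3": 0x000131E4, "fault": 0xD08611E4},
]

# Assumed size of the ioremapped BAR (see the hedge above).
ASSUMED_BAR_SIZE = 0x2000


def check(crash):
    """Return (fault == r0 + r3, r3 lies inside the assumed window)."""
    addr = (crash["r0"] + crash["r3"]) & 0xFFFFFFFF
    return addr == crash["fault"], crash["r3"] < ASSUMED_BAR_SIZE


for c in crashes:
    matches, in_window = check(c)
    print(f"r0+r3={c['r0'] + c['r3']:08x} fault={c['fault']:08x} "
          f"matches={matches} offset_in_window={in_window}")
```

In both reports the faulting address equals r0 + r3, and under the stated assumption the offset in r3 is far larger than the mapped window, which fits the suggestion to follow the provenance of r3.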

Regards.


Re: Panic in quirk_usb_early_handoff

2017-03-03 Thread Mason
On 03/03/2017 18:10, Mason wrote:
> On 03/03/2017 17:18, Mason wrote:
>> Hello,
>>
>> I'm seeing this panic randomly at boot-time, so I want to throw
>> it out there in case someone recognizes the issue off the top of
>> their head.
>>
>> I'm on Linux 4.9, using a USB3 PCIe card. I'm actively working on
>> the PCIe support, so I may be responsible for the crash by virtue
>> of something I did or didn't do (e.g. I haven't set up the IRQs
>> correctly, but I didn't think it would crash the system).
>>
>> [0.987520] OF: PCI: host bridge /soc/pcie@5000 ranges:
>> [0.993236] OF: PCI:   No bus range found for /soc/pcie@5000, using [bus 00-ff]
>> [1.001034] OF: PCI: Parsing ranges property...
>> [1.005693] OF: PCI:   MEM 0x9000..0x9fff -> 0x9000
>> [1.014791] pci-host-generic 5000.pcie: ECAM at [mem 0x5000-0x5fff] for [bus 00-ff]
>> [1.028570] pci-host-generic 5000.pcie: PCI host bridge to bus :00
>> [1.035597] pci_bus :00: root bus resource [bus 00-ff]
>> [1.041212] pci_bus :00: root bus resource [mem 0x9000-0x9fff]
>> [1.048219] pci_bus :00: scanning bus
>> [1.052376] pci :00:00.0: [1105:0024] type 01 class 0x048000
>> [1.058529] pci :00:00.0: calling tango_pcie_fixup_class+0x0/0x10
>> [1.065119] pci :00:00.0: reg 0x10: [mem 0x-0x00ff 64bit]
>> [1.072068] pci :00:00.0: calling pci_fixup_ide_bases+0x0/0x40
>> [1.078415] pci :00:00.0: supports D1 D2
>> [1.082803] pci :00:00.0: PME# supported from D0 D1 D2 D3hot
>> [1.088937] pci :00:00.0: PME# disabled
>> [1.093445] pci_bus :00: fixups for bus
>> [1.097753] PCI: bus0: Fast back to back transfers disabled
>> [1.103453] pci :00:00.0: scanning [bus 00-00] behind bridge, pass 0
>> [1.110286] pci :00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
>> [1.118433] pci :00:00.0: scanning [bus 00-00] behind bridge, pass 1
>> [1.125385] pci_bus :01: scanning bus
>> [1.129557] pci :01:00.0: [1912:0014] type 00 class 0x0c0330
>> [1.135727] pci :01:00.0: reg 0x10: [mem 0x-0x1fff 64bit]
>> [1.142730] pci :01:00.0: calling pci_fixup_ide_bases+0x0/0x40
>> [1.149150] pci :01:00.0: PME# supported from D0 D3hot D3cold
>> [1.155375] pci :01:00.0: PME# disabled
>> [1.159976] pci_bus :01: fixups for bus
>> [1.164305] PCI: bus1: Fast back to back transfers disabled
>> [1.170002] pci_bus :01: bus scan returning with max=01
>> [1.175701] pci_bus :01: busn_res: [bus 01-ff] end is updated to 01
>> [1.182447] pci_bus :00: bus scan returning with max=01
>> [1.188147] pci :00:00.0: fixup irq: got 0
>> [1.192707] pci :00:00.0: assigning IRQ 00
>> [1.197294] pci :01:00.0: fixup irq: got 20
>> [1.201945] pci :01:00.0: assigning IRQ 20
>> [1.206533] pci :00:00.0: BAR 0: assigned [mem 0x9000-0x90ff 64bit]
>> [1.213984] pci :00:00.0: BAR 8: assigned [mem 0x9100-0x910f]
>> [1.220908] pci :01:00.0: BAR 0: assigned [mem 0x9100-0x91001fff 64bit]
>> [1.228363] pci :00:00.0: PCI bridge to [bus 01]
>> [1.233449] pci :00:00.0:   bridge window [mem 0x9100-0x910f]
>> [1.240419] pcieport :00:00.0: enabling device (0140 -> 0142)
>> [1.246648] pcieport :00:00.0: enabling bus mastering
>> [1.252321] pci :01:00.0: calling quirk_usb_early_handoff+0x0/0x790
>> [1.259077] pci :01:00.0: enabling device (0140 -> 0142)
>> [1.264893] Unable to handle kernel paging request at virtual address d08664f4
>> [1.272248] pgd = c0004000
>> [1.275060] [d08664f4] *pgd=8f804811, *pte=, *ppte=
>> [1.281476] Internal error: Oops: 7 [#1] PREEMPT SMP ARM
>> [1.286897] Modules linked in:
>> [1.290053] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.7-1-rc2 #151
>> [1.296696] Hardware name: Sigma Tango DT
>> [1.300808] task: cf82c9c0 task.stack: cf838000
>> [1.305446] PC is at quirk_usb_early_handoff+0x3e8/0x790
>> [1.310873] LR is at ioremap_page_range+0xf8/0x1a8
>> [1.315771] pc : []lr : []psr: 000e0013
>> [1.315771] sp : cf839d78  ip :   fp : cf839e38
>> [1.327482] r10: c10248a0  r9 :   r8 : d08664f4
>> [1.332816] r7 : d084e000  r6 : 2000  r5 : 000c0300  r4 : cfb5f800
>> [1.339460] r3 : 000184f4  r2 : 000

Re: Panic in quirk_usb_early_handoff

2017-03-03 Thread Mason
On 03/03/2017 17:18, Mason wrote:
> Hello,
> 
> I'm seeing this panic randomly at boot-time, so I want to throw
> it out there in case someone recognizes the issue off the top of
> their head.
> 
> I'm on Linux 4.9, using a USB3 PCIe card. I'm actively working on
> the PCIe support, so I may be responsible for the crash by virtue
> of something I did or didn't do (e.g. I haven't set up the IRQs
> correctly, but I didn't think it would crash the system).
> 
> [0.987520] OF: PCI: host bridge /soc/pcie@5000 ranges:
> [0.993236] OF: PCI:   No bus range found for /soc/pcie@5000, using [bus 00-ff]
> [1.001034] OF: PCI: Parsing ranges property...
> [1.005693] OF: PCI:   MEM 0x9000..0x9fff -> 0x9000
> [1.014791] pci-host-generic 5000.pcie: ECAM at [mem 0x5000-0x5fff] for [bus 00-ff]
> [1.028570] pci-host-generic 5000.pcie: PCI host bridge to bus :00
> [1.035597] pci_bus :00: root bus resource [bus 00-ff]
> [1.041212] pci_bus :00: root bus resource [mem 0x9000-0x9fff]
> [1.048219] pci_bus :00: scanning bus
> [1.052376] pci :00:00.0: [1105:0024] type 01 class 0x048000
> [1.058529] pci :00:00.0: calling tango_pcie_fixup_class+0x0/0x10
> [1.065119] pci :00:00.0: reg 0x10: [mem 0x-0x00ff 64bit]
> [1.072068] pci :00:00.0: calling pci_fixup_ide_bases+0x0/0x40
> [1.078415] pci :00:00.0: supports D1 D2
> [1.082803] pci :00:00.0: PME# supported from D0 D1 D2 D3hot
> [1.088937] pci :00:00.0: PME# disabled
> [1.093445] pci_bus :00: fixups for bus
> [1.097753] PCI: bus0: Fast back to back transfers disabled
> [1.103453] pci :00:00.0: scanning [bus 00-00] behind bridge, pass 0
> [1.110286] pci :00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [1.118433] pci :00:00.0: scanning [bus 00-00] behind bridge, pass 1
> [1.125385] pci_bus :01: scanning bus
> [1.129557] pci :01:00.0: [1912:0014] type 00 class 0x0c0330
> [1.135727] pci :01:00.0: reg 0x10: [mem 0x-0x1fff 64bit]
> [1.142730] pci :01:00.0: calling pci_fixup_ide_bases+0x0/0x40
> [1.149150] pci :01:00.0: PME# supported from D0 D3hot D3cold
> [1.155375] pci :01:00.0: PME# disabled
> [1.159976] pci_bus :01: fixups for bus
> [1.164305] PCI: bus1: Fast back to back transfers disabled
> [1.170002] pci_bus :01: bus scan returning with max=01
> [1.175701] pci_bus :01: busn_res: [bus 01-ff] end is updated to 01
> [1.182447] pci_bus :00: bus scan returning with max=01
> [1.188147] pci :00:00.0: fixup irq: got 0
> [1.192707] pci :00:00.0: assigning IRQ 00
> [1.197294] pci :01:00.0: fixup irq: got 20
> [1.201945] pci :01:00.0: assigning IRQ 20
> [1.206533] pci :00:00.0: BAR 0: assigned [mem 0x9000-0x90ff 64bit]
> [1.213984] pci :00:00.0: BAR 8: assigned [mem 0x9100-0x910f]
> [1.220908] pci :01:00.0: BAR 0: assigned [mem 0x9100-0x91001fff 64bit]
> [1.228363] pci :00:00.0: PCI bridge to [bus 01]
> [1.233449] pci :00:00.0:   bridge window [mem 0x9100-0x910f]
> [1.240419] pcieport :00:00.0: enabling device (0140 -> 0142)
> [1.246648] pcieport :00:00.0: enabling bus mastering
> [1.252321] pci :01:00.0: calling quirk_usb_early_handoff+0x0/0x790
> [1.259077] pci :01:00.0: enabling device (0140 -> 0142)
> [1.264893] Unable to handle kernel paging request at virtual address d08664f4
> [1.272248] pgd = c0004000
> [1.275060] [d08664f4] *pgd=8f804811, *pte=, *ppte=
> [1.281476] Internal error: Oops: 7 [#1] PREEMPT SMP ARM
> [1.286897] Modules linked in:
> [1.290053] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.7-1-rc2 #151
> [1.296696] Hardware name: Sigma Tango DT
> [1.300808] task: cf82c9c0 task.stack: cf838000
> [1.305446] PC is at quirk_usb_early_handoff+0x3e8/0x790
> [1.310873] LR is at ioremap_page_range+0xf8/0x1a8
> [1.315771] pc : []lr : []psr: 000e0013
> [1.315771] sp : cf839d78  ip :   fp : cf839e38
> [1.327482] r10: c10248a0  r9 :   r8 : d08664f4
> [1.332816] r7 : d084e000  r6 : 2000  r5 : 000c0300  r4 : cfb5f800
> [1.339460] r3 : 000184f4  r2 :   r1 : 91001e13  r0 : d084e000
> [1.346105] Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [1.353361] Control: 10c5387d  Table: 8fa9c04a  DAC: 0051
> [1.359218] Process swapper/0 (pid: 1, stack limit = 0xcf838210)
> [1.365339] Stack: (0xcf839d78 to 0xcf83a000)
> [1.36980

Panic in quirk_usb_early_handoff

2017-03-03 Thread Mason
Hello,

I'm seeing this panic randomly at boot-time, so I want to throw
it out there in case someone recognizes the issue off the top of
their head.

I'm on Linux 4.9, using a USB3 PCIe card. I'm actively working on
the PCIe support, so I may be responsible for the crash by virtue
of something I did or didn't do (e.g. I haven't set up the IRQs
correctly, but I didn't think it would crash the system).

[0.987520] OF: PCI: host bridge /soc/pcie@5000 ranges:
[0.993236] OF: PCI:   No bus range found for /soc/pcie@5000, using [bus 00-ff]
[1.001034] OF: PCI: Parsing ranges property...
[1.005693] OF: PCI:   MEM 0x9000..0x9fff -> 0x9000
[1.014791] pci-host-generic 5000.pcie: ECAM at [mem 0x5000-0x5fff] for [bus 00-ff]
[1.028570] pci-host-generic 5000.pcie: PCI host bridge to bus :00
[1.035597] pci_bus :00: root bus resource [bus 00-ff]
[1.041212] pci_bus :00: root bus resource [mem 0x9000-0x9fff]
[1.048219] pci_bus :00: scanning bus
[1.052376] pci :00:00.0: [1105:0024] type 01 class 0x048000
[1.058529] pci :00:00.0: calling tango_pcie_fixup_class+0x0/0x10
[1.065119] pci :00:00.0: reg 0x10: [mem 0x-0x00ff 64bit]
[1.072068] pci :00:00.0: calling pci_fixup_ide_bases+0x0/0x40
[1.078415] pci :00:00.0: supports D1 D2
[1.082803] pci :00:00.0: PME# supported from D0 D1 D2 D3hot
[1.088937] pci :00:00.0: PME# disabled
[1.093445] pci_bus :00: fixups for bus
[1.097753] PCI: bus0: Fast back to back transfers disabled
[1.103453] pci :00:00.0: scanning [bus 00-00] behind bridge, pass 0
[1.110286] pci :00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[1.118433] pci :00:00.0: scanning [bus 00-00] behind bridge, pass 1
[1.125385] pci_bus :01: scanning bus
[1.129557] pci :01:00.0: [1912:0014] type 00 class 0x0c0330
[1.135727] pci :01:00.0: reg 0x10: [mem 0x-0x1fff 64bit]
[1.142730] pci :01:00.0: calling pci_fixup_ide_bases+0x0/0x40
[1.149150] pci :01:00.0: PME# supported from D0 D3hot D3cold
[1.155375] pci :01:00.0: PME# disabled
[1.159976] pci_bus :01: fixups for bus
[1.164305] PCI: bus1: Fast back to back transfers disabled
[1.170002] pci_bus :01: bus scan returning with max=01
[1.175701] pci_bus :01: busn_res: [bus 01-ff] end is updated to 01
[1.182447] pci_bus :00: bus scan returning with max=01
[1.188147] pci :00:00.0: fixup irq: got 0
[1.192707] pci :00:00.0: assigning IRQ 00
[1.197294] pci :01:00.0: fixup irq: got 20
[1.201945] pci :01:00.0: assigning IRQ 20
[1.206533] pci :00:00.0: BAR 0: assigned [mem 0x9000-0x90ff 64bit]
[1.213984] pci :00:00.0: BAR 8: assigned [mem 0x9100-0x910f]
[1.220908] pci :01:00.0: BAR 0: assigned [mem 0x9100-0x91001fff 64bit]
[1.228363] pci :00:00.0: PCI bridge to [bus 01]
[1.233449] pci :00:00.0:   bridge window [mem 0x9100-0x910f]
[1.240419] pcieport :00:00.0: enabling device (0140 -> 0142)
[1.246648] pcieport :00:00.0: enabling bus mastering
[1.252321] pci :01:00.0: calling quirk_usb_early_handoff+0x0/0x790
[1.259077] pci :01:00.0: enabling device (0140 -> 0142)
[1.264893] Unable to handle kernel paging request at virtual address d08664f4
[1.272248] pgd = c0004000
[1.275060] [d08664f4] *pgd=8f804811, *pte=, *ppte=
[1.281476] Internal error: Oops: 7 [#1] PREEMPT SMP ARM
[1.286897] Modules linked in:
[1.290053] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.7-1-rc2 #151
[1.296696] Hardware name: Sigma Tango DT
[1.300808] task: cf82c9c0 task.stack: cf838000
[1.305446] PC is at quirk_usb_early_handoff+0x3e8/0x790
[1.310873] LR is at ioremap_page_range+0xf8/0x1a8
[1.315771] pc : []lr : []psr: 000e0013
[1.315771] sp : cf839d78  ip :   fp : cf839e38
[1.327482] r10: c10248a0  r9 :   r8 : d08664f4
[1.332816] r7 : d084e000  r6 : 2000  r5 : 000c0300  r4 : cfb5f800
[1.339460] r3 : 000184f4  r2 :   r1 : 91001e13  r0 : d084e000
[1.346105] Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[1.353361] Control: 10c5387d  Table: 8fa9c04a  DAC: 0051
[1.359218] Process swapper/0 (pid: 1, stack limit = 0xcf838210)
[1.365339] Stack: (0xcf839d78 to 0xcf83a000)
[1.369800] 9d60:   
c058f578 c058b180
[1.378107] 9d80: cfb55240 cf839d98 c0350218 c05adccc cfb5f800 c05adcdc 
cf838000 
[1.386413] 9da0:  c10248a0 cf839e38 c030bfa4 cf923b80 c034e69c 
cf867270 
[1.394719] 9dc0: cfb5f800 cf9ef414 cf9ef400 cf839e30 cf923b80  
cf906010 c02fa484
[1.403025] 9de0: cfb5f800 cf9ef414 cf9ef400 c02fa538 cfb5fc00 cf9ef814 
cf9ef800 c02fa56c
[

Re: [PATCH V2] usb: xhci: add support for performing fake doorbell

2017-02-08 Thread Jon Mason
On Mon, Jan 16, 2017 at 2:32 AM, Rafał Miłecki wrote:
> On 21 November 2016 at 16:31, Mathias Nyman wrote:
>> On 21.11.2016 09:57, Rafał Miłecki wrote:
>>>
>>> Hi Mathias,
>>>
>>> On 17 October 2016 at 22:30, Rafał Miłecki wrote:
>>>>
>>>> From: Rafał Miłecki
>>>>
>>>> Broadcom's Northstar XHCI controllers seem to need a special start
>>>> procedure to work correctly. There isn't any official documentation of
>>>> this, the problem is that controller doesn't detect any connected
>>>> devices with default setup. Moreover connecting USB device to controller
>>>> that doesn't run properly can cause SoC's watchdog issues.
>>>>
>>>> A workaround that was successfully tested on multiple devices is to
>>>> perform a fake doorbell. This patch adds code for doing this and enables
>>>> it on BCM4708 family.
>>>>
>>>> Signed-off-by: Rafał Miłecki
>>>> ---
>>>> V2: Enable quirk for brcm,bcm4708 machines instead of adding separated
>>>> binding for it. Thanks Rob for your comment on this.
>>>
>>>
>>> Do you think you can pick & push this one? V2 follows Rob's suggestion
>>> and he has some DT knowledge for sure, so I guess it should be OK.
>>> --
>>
>>
>> Is there some more background information on this?
>>
>> I don't have any contacts to Broadcom myself, adding the BCM Kernel
>> Feedback list to CC. Maybe someone over there has an errata,
>> documentation or just general feedback.
>>
>> How was this workaround even figured out? Ringing the doorbell for the
>> first device doesn't seem like something found by trial and error,
>> especially when xhci specs state that:
>>
>> "Software shall not write the Doorbell of an endpoint until after it has
>> issued a Configure Endpoint Command for the endpoint and received a
>> successful Command Completion Event."
>>
>> The whole workaround is a bit intrusive: allocating a fake device,
>> ringing a doorbell for a fake device in the wrong state, clearing off HSE
>> (host system error) which should only be set when things really go bad,
>> some random usleeps, and possibly calling xhci_start() twice.
>>
>> I can't take this as is without some more info.
>
> Hi (ping) Broadcom guys, could you help us with this USB workaround, please?

Investigating internally.  I'll let you know if I can find anything.

Thanks,
Jon

>
> --
> Rafał


Re: XHCI controller does not detect USB key insertion

2016-12-05 Thread Mason
On 05/12/2016 09:26, Neil Armstrong wrote:

> On 12/02/2016 07:00 PM, Mason wrote:
>
>> On 02/12/2016 14:46, Neil Armstrong wrote:
>>
>>> On 12/02/2016 11:24 AM, Mason wrote:
>>>
>>>> (Sad face) All the documentation I have is in front of me, and nothing
>>>> is ringing a bell. This is a Sigma Designs SoC, with a Pravega XHCI
>>>> controller + Synopsys PHY.
>>>>
>>>> The documentation I have:
>>>>
>>>> Pravega_Dual_Mode_Datasheet_v10c.pdf (documents IP signals)
>>>> Pravega_Dual_Mode_Controller_Programmers_Reference_manual_v1.pdf (documents IP registers)
>>>> PHY databook (very low-level stuff)
>>>> SoC register mapping (for how the SoC maps the IP signals to registers)
>>>
>>> You should have all the necessary bits to enable and configure the
>>> Embedded Synopsys PHY!
>>>
>>> You should have some register mapping of the PHY signals, or at least
>>> a way to write those registers.
>>>
>>> You should have a reset, clock gate and eventually a power regulator
>>> to enable in order to have the PHY running.
>>
>> I'll dump all the non-0 non-standard registers. Maybe someone
>> more experienced than me will spot an obvious mistake.
>>
>> host_usb30_0_config: 0x2e800
>>  - host_usb30_0_fladj                         0x20
>>  - host_usb30_0_usb30_controller_cg_disable   0x0
>>  - host_usb30_0_mode_select                   0x1
>>  - host_usb30_0_device_reset_mode             0x0
>>
>> host_usb30_0_control: 0x2e804
>>  - host_usb30_0_app_lfps_u3_wp                0x0
>>  - host_usb30_0_link_up                       0x1
>>  - host_usb30_0_msi_msg_sent                  0x0
>>  - host_usb30_0_usb3_p0_over_current          0x0
>>  - host_usb30_0_usb2_p0_over_current          0x0
>>
>> host_usb30_0_test: 0x2e808
>>  - host_usb30_0_test_powerdown_hsp            0x0
>>  - host_usb30_0_test_powerdown_ssp            0x0
>>  - host_usb30_0_test_burnin                   0x0
>>  - host_usb30_0_acjt_level                    0x14
>>  - host_usb30_0_lane0_tx2rx_loopbk            0x0
>>  - host_usb30_0_rtune_req                     0x0
>>
>> host_usb30_0_status: 0x2e80c
>>  - host_usb30_0_phystatus                     0x0
>>  - host_usb30_0_usb2_p0_pp                    0x1
>>  - host_usb30_0_usb3_p0_pp                    0x1
>>  - host_usb30_0_usb3_sleep                    0x0
>>  - host_usb30_0_rtune_ack                     0x0
>>
>> host_usb30_0_clk_rst_0: 0x2e810
>>  - host_usb30_0_commononn                     0x1
>>  - host_usb30_0_portreset                     0x0
>>  - host_usb30_0_refclksel                     0x2
>>  - host_usb30_0_teneable                      0x1
>>  - host_usb30_0_fsel                          0x27
>>  - host_usb30_0_mpll_multiplier               0x19
>>  - host_usb30_0_ref_clkdiv2                   0x0
>>  - host_usb30_0_ref_ssp_en                    0x1
>>  - host_usb30_0_ref_use_pad                   0x0
>>  - host_usb30_0_ssc_en                        0x1
>>  - host_usb30_0_ssc_range                     0x0
>>
>> host_usb30_0_clk_rst_1: 0x2e814
>>  - host_usb30_0_ssc_ref_clk_sel               0x88
>>  - host_usb30_0_sleepm                        0x1
>>  - host_usb30_0_vbusvldext                    0x1
>>
>> host_usb30_0_param_0: 0x2e818
>>  - host_usb30_0_compdistune                   0x4
>>  - host_usb30_0_otgtune                       0x4
>>  - host_usb30_0_sqrxtune                      0x3
>>  - host_usb30_0_txfsltune                     0x3
>>  - host_usb30_0_txhsxvtune                    0x3
>>  - host_usb30_0_txpreempltune                 0x0
>>  - host_usb30_0_txpreemppulsetune             0x0
>>  - host_usb30_0_txrestune                     0x1
>>  - host_usb30_0_txrisetune                    0x2
>>  - host_usb30_0_txvreftune                    0x4
>>
>> host_usb30_0_param_1: 0x2e81c
>>  - host_usb30_0_los_bias  

Re: XHCI controller does not detect USB key insertion

2016-12-02 Thread Mason
[ Fix incorrect address for Felipe ]

On 02/12/2016 14:46, Neil Armstrong wrote:

> On 12/02/2016 11:24 AM, Mason wrote:
>
>> (Sad face) All the documentation I have is in front of me, and nothing
>> is ringing a bell. This is a Sigma Designs SoC, with a Pravega XHCI
>> controller + Synopsys PHY.
>>
>> The documentation I have:
>>
>> Pravega_Dual_Mode_Datasheet_v10c.pdf (documents IP signals)
>> Pravega_Dual_Mode_Controller_Programmers_Reference_manual_v1.pdf (documents IP registers)
>> PHY databook (very low-level stuff)
>> SoC register mapping (for how the SoC maps the IP signals to registers)
> 
> You should have all the necessary bits to enable and configure the
> Embedded Synopsys PHY!
>
> You should have some register mapping of the PHY signals, or at least
> a way to write those registers.
>
> You should have a reset, clock gate and eventually a power regulator
> to enable in order to have the PHY running.

I'll dump all the non-0 non-standard registers. Maybe someone
more experienced than me will spot an obvious mistake.

host_usb30_0_config: 0x2e800
- host_usb30_0_fladj                         0x20
- host_usb30_0_usb30_controller_cg_disable   0x0
- host_usb30_0_mode_select                   0x1
- host_usb30_0_device_reset_mode             0x0

host_usb30_0_control: 0x2e804
- host_usb30_0_app_lfps_u3_wp                0x0
- host_usb30_0_link_up                       0x1
- host_usb30_0_msi_msg_sent                  0x0
- host_usb30_0_usb3_p0_over_current          0x0
- host_usb30_0_usb2_p0_over_current          0x0

host_usb30_0_test: 0x2e808
- host_usb30_0_test_powerdown_hsp            0x0
- host_usb30_0_test_powerdown_ssp            0x0
- host_usb30_0_test_burnin                   0x0
- host_usb30_0_acjt_level                    0x14
- host_usb30_0_lane0_tx2rx_loopbk            0x0
- host_usb30_0_rtune_req                     0x0

host_usb30_0_status: 0x2e80c
- host_usb30_0_phystatus                     0x0
- host_usb30_0_usb2_p0_pp                    0x1
- host_usb30_0_usb3_p0_pp                    0x1
- host_usb30_0_usb3_sleep                    0x0
- host_usb30_0_rtune_ack                     0x0

host_usb30_0_clk_rst_0: 0x2e810
- host_usb30_0_commononn                     0x1
- host_usb30_0_portreset                     0x0
- host_usb30_0_refclksel                     0x2
- host_usb30_0_teneable                      0x1
- host_usb30_0_fsel                          0x27
- host_usb30_0_mpll_multiplier               0x19
- host_usb30_0_ref_clkdiv2                   0x0
- host_usb30_0_ref_ssp_en                    0x1
- host_usb30_0_ref_use_pad                   0x0
- host_usb30_0_ssc_en                        0x1
- host_usb30_0_ssc_range                     0x0

host_usb30_0_clk_rst_1: 0x2e814
- host_usb30_0_ssc_ref_clk_sel               0x88
- host_usb30_0_sleepm                        0x1
- host_usb30_0_vbusvldext                    0x1

host_usb30_0_param_0: 0x2e818
- host_usb30_0_compdistune                   0x4
- host_usb30_0_otgtune                       0x4
- host_usb30_0_sqrxtune                      0x3
- host_usb30_0_txfsltune                     0x3
- host_usb30_0_txhsxvtune                    0x3
- host_usb30_0_txpreempltune                 0x0
- host_usb30_0_txpreemppulsetune             0x0
- host_usb30_0_txrestune                     0x1
- host_usb30_0_txrisetune                    0x2
- host_usb30_0_txvreftune                    0x4

host_usb30_0_param_1: 0x2e81c
- host_usb30_0_los_bias                      0x5
- host_usb30_0_los_level                     0xc
- host_usb30_0_pcs_rx_los_mask_val           0xf0
- host_usb30_0_pcs_tx_deemph_3p5db           0x18
- host_usb30_0_pcs_tx_deemph_6db             0x21

host_usb30_0_param_2: 0x2e820
- host_usb30_0_pcs_tx_swing_full             0x73
- host_usb30_0_lane0_tx_term_offset          0x0
- host_usb30_0_tx_vboost_lvl                 0x4

host_usb30_0_SNPS_CR_ADD: 0x2e880
- host_usb30_0_snps_cr_add                   0xe03c

DEVICE_AND_PORT_000: 0x7005
- sw_reset

Re: XHCI controller does not detect USB key insertion

2016-12-02 Thread Mason
On 02/12/2016 14:46, Neil Armstrong wrote:

> On 12/02/2016 11:24 AM, Mason wrote:
>
>> (Sad face) All the documentation I have is in front of me, and nothing
>> is ringing a bell. This is a Sigma Designs SoC, with a Pravega XHCI
>> controller + Synopsys PHY.
>>
>> The documentation I have:
>>
>> Pravega_Dual_Mode_Datasheet_v10c.pdf (documents IP signals)
>> Pravega_Dual_Mode_Controller_Programmers_Reference_manual_v1.pdf (documents IP registers)
>> PHY databook (very low-level stuff)
>> SoC register mapping (for how the SoC maps the IP signals to registers)
> 
> You should have all the necessary bits to enable and configure the
> Embedded Synopsys PHY!
>
> You should have some register mapping of the PHY signals, or at least
> a way to write those registers.
>
> You should have a reset, clock gate and eventually a power regulator
> to enable in order to have the PHY running.

I'll dump all the non-0 non-standard registers. Maybe someone
more experienced than me will spot an obvious mistake.

host_usb30_0_config: 0x2e800
- host_usb30_0_fladj                         0x20
- host_usb30_0_usb30_controller_cg_disable   0x0
- host_usb30_0_mode_select                   0x1
- host_usb30_0_device_reset_mode             0x0

host_usb30_0_control: 0x2e804
- host_usb30_0_app_lfps_u3_wp                0x0
- host_usb30_0_link_up                       0x1
- host_usb30_0_msi_msg_sent                  0x0
- host_usb30_0_usb3_p0_over_current          0x0
- host_usb30_0_usb2_p0_over_current          0x0

host_usb30_0_test: 0x2e808
- host_usb30_0_test_powerdown_hsp            0x0
- host_usb30_0_test_powerdown_ssp            0x0
- host_usb30_0_test_burnin                   0x0
- host_usb30_0_acjt_level                    0x14
- host_usb30_0_lane0_tx2rx_loopbk            0x0
- host_usb30_0_rtune_req                     0x0

host_usb30_0_status: 0x2e80c
- host_usb30_0_phystatus                     0x0
- host_usb30_0_usb2_p0_pp                    0x1
- host_usb30_0_usb3_p0_pp                    0x1
- host_usb30_0_usb3_sleep                    0x0
- host_usb30_0_rtune_ack                     0x0

host_usb30_0_clk_rst_0: 0x2e810
- host_usb30_0_commononn                     0x1
- host_usb30_0_portreset                     0x0
- host_usb30_0_refclksel                     0x2
- host_usb30_0_teneable                      0x1
- host_usb30_0_fsel                          0x27
- host_usb30_0_mpll_multiplier               0x19
- host_usb30_0_ref_clkdiv2                   0x0
- host_usb30_0_ref_ssp_en                    0x1
- host_usb30_0_ref_use_pad                   0x0
- host_usb30_0_ssc_en                        0x1
- host_usb30_0_ssc_range                     0x0

host_usb30_0_clk_rst_1: 0x2e814
- host_usb30_0_ssc_ref_clk_sel               0x88
- host_usb30_0_sleepm                        0x1
- host_usb30_0_vbusvldext                    0x1

host_usb30_0_param_0: 0x2e818
- host_usb30_0_compdistune                   0x4
- host_usb30_0_otgtune                       0x4
- host_usb30_0_sqrxtune                      0x3
- host_usb30_0_txfsltune                     0x3
- host_usb30_0_txhsxvtune                    0x3
- host_usb30_0_txpreempltune                 0x0
- host_usb30_0_txpreemppulsetune             0x0
- host_usb30_0_txrestune                     0x1
- host_usb30_0_txrisetune                    0x2
- host_usb30_0_txvreftune                    0x4

host_usb30_0_param_1: 0x2e81c
- host_usb30_0_los_bias                      0x5
- host_usb30_0_los_level                     0xc
- host_usb30_0_pcs_rx_los_mask_val           0xf0
- host_usb30_0_pcs_tx_deemph_3p5db           0x18
- host_usb30_0_pcs_tx_deemph_6db             0x21

host_usb30_0_param_2: 0x2e820
- host_usb30_0_pcs_tx_swing_full             0x73
- host_usb30_0_lane0_tx_term_offset          0x0
- host_usb30_0_tx_vboost_lvl                 0x4

host_usb30_0_SNPS_CR_ADD: 0x2e880
- host_usb30_0_snps_cr_add                   0xe03c

DEVICE_AND_PORT_000: 0x7005
- sw_reset 

Re: XHCI controller does not detect USB key insertion

2016-12-02 Thread Mason
On 02/12/2016 11:42, Greg KH wrote:

> On Fri, Dec 02, 2016 at 11:24:05AM +0100, Mason wrote:
> 
>> # lsusb -v
>> Bus 001 Device 001: ID 1d6b:0002
>> Bus 002 Device 001: ID 1d6b:0003
>>
>> Isn't lsusb verbose supposed to print much more than that?
> 
> Yes, if you are using the usbutils version of 'lsusb', odds are this is
> busybox, right?

Right. (You win a digital cookie.)

cd buildroot && make menuconfig
Drop BR2_PACKAGE_USBMOUNT (maybe it causes unexpected issues)
 Add BR2_PACKAGE_USBUTILS (I want the real deal)

> And these are just the root hubs, that the USB controller driver creates
> as "virtual" USB devices, they are not "real" USB devices on your bus.

# lsusb --version
lsusb (usbutils) 007
I see there's a 008 version.
Am I missing out on important diagnostics?

# lsusb -v

Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         3
  bMaxPacketSize0         9
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0003 3.0 root hub
  bcdDevice            4.07
  iManufacturer           3 Linux 4.7.0-rc6 xhci-hcd
  iProduct                2 xHCI Host Controller
  iSerial                 1 3004.usb3
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           31
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12
        bMaxBurst               0
Hub Descriptor:
  bLength              12
  bDescriptorType      42
  nNbrPorts             1
  wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
  bPwrOn2PwrGood       10 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  bHubDecLat          0.0 micro seconds
  wHubDelay             0 nano seconds
  DeviceRemovable    0x00
 Hub Port Status:
   Port 1: .02a0 5Gbps power Rx.Detect
Binary Object Store Descriptor:
  bLength                 5
  bDescriptorType        15
  wTotalLength           15
  bNumDeviceCaps          1
  SuperSpeed USB Device Capability:
    bLength                10
    bDescriptorType        16
    bDevCapabilityType      3
    bmAttributes         0x00
    wSpeedsSupported   0x0008
      Device can operate at SuperSpeed (5Gbps)
    bFunctionalitySupport   3
      Lowest fully-functional device speed is SuperSpeed (5Gbps)
    bU1DevExitLat           0 micro seconds
    bU2DevExitLat           0 micro seconds
Device Status:     0x0001
  Self Powered
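
The "Hub Port Status" line for the 3.0 root hub is the interesting part of this dump. A minimal sketch of decoding the low 16 bits (02a0) follows. The bit masks are the ones I believe are defined in the Linux header include/linux/usb/ch11.h, and the function name decode_ss_port_status is made up for illustration.

```python
# SuperSpeed hub port status bits (as in include/linux/usb/ch11.h,
# to the best of my knowledge).
USB_PORT_STAT_CONNECTION = 0x0001
USB_PORT_STAT_ENABLE = 0x0002
USB_PORT_STAT_LINK_STATE = 0x01E0
USB_SS_PORT_STAT_POWER = 0x0200

LINK_STATES = {
    0x0: "U0", 0x1: "U1", 0x2: "U2", 0x3: "U3",
    0x4: "SS.Disabled", 0x5: "Rx.Detect", 0x6: "SS.Inactive",
    0x7: "Polling", 0x8: "Recovery", 0x9: "Hot Reset",
    0xA: "Compliance", 0xB: "Loopback",
}


def decode_ss_port_status(status):
    """Decode a USB 3.0 hub wPortStatus value into a small dict."""
    link = (status & USB_PORT_STAT_LINK_STATE) >> 5
    return {
        "connected": bool(status & USB_PORT_STAT_CONNECTION),
        "enabled": bool(status & USB_PORT_STAT_ENABLE),
        "powered": bool(status & USB_SS_PORT_STAT_POWER),
        "link_state": LINK_STATES.get(link, "unknown"),
    }


print(decode_ss_port_status(0x02A0))
```

For 0x02a0 this reports a powered port whose link sits in Rx.Detect with the connection bit clear, which matches what lsusb prints and agrees with the symptom in the subject line: the port is waiting for a device but does not see one.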

Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0002 2.0 root hub
  bcdDevice            4.07
  iManufacturer           3 Linux 4.7.0-rc6 xhci-hcd
  iProduct                2 xHCI Host Controller
  iSerial                 1 3004.usb3
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           25
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
 

Re: XHCI controller does not detect USB key insertion

2016-12-02 Thread Mason
On 02/12/2016 10:03, Felipe Balbi wrote:

> Mason wrote:
> 
>> I'm trying out a SoC with a brand new USB controller, which is (supposedly)
>> a standard XHCI controller. In theory, I would just need to build the right
>> driver, and everything would auto-magically work, right?
> 
> perhaps, but there might be needed initialization of other resources
> like PHYs and stuff like that.

Let me dive into additional details...

First of all, there is a register aptly called "USB3_RESET" which
is used to release several USB3-related blocks from reset.
Of course, that's the first register I tweaked :-)

There are *3* address ranges with USB3-related registers.

1) one called host_usb30_xhcl (I believe "xhcl" is a typo for "xhci")
This is the address I passed to the Linux driver. The first register
is CAPLENGTH_VERSION. I assume these are the standard XHCI registers.
(Last register is XHCL_EXTENDED_CAP3_USB3 at offset 0xc008)

2) one called host_usb30_port
This contains "Device and Port Specific Registers".
Is it standard?
How is Linux supposed to know where to find it?
Contains registers such as
Device Transaction Status
Device UTMI command and status for USB2
Set ISOC Delay
USB3 Function Notification
Rx DMA BD Start Address for Control Endpoint
EP Burst Size
Tx DMA BD Start Address Control Endpoint
EP $N IN/OUT
Device Notification Register
EP_Isochronous Timestamp

Are registers named LTSSM_TIMER_REGISTER{1,2,3} standard?
They have fields such as reg_12_ms_timeout (and other numbers like 2, 6, 100, 300).

3) one called host_usb30
This contains lower-level stuff
0x2e800 CONFIG
0x2e804 CONTROL
0x2e808 TEST
0x2e80c STATUS
0x2e810 CLK_RST_0
0x2e814 CLK_RST_1
0x2e818 PARAM_0
0x2e81c PARAM_1
0x2e820 PARAM_2
0x2e880 SNPS_CR_ADD
0x2e884 SNPS_CR_DATA
0x2e8c0 RESET_CTRL

I haven't touched any of these so far.
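
For reference, a small sketch relating the absolute addresses in list 3) to block-relative offsets. It assumes the block base is the address of the first register (0x2e800), which is my guess rather than something the SoC documentation states. Under that assumption the result lines up with the relative offsets (0x0 through 0xC0) quoted in the "Support for Pravega USB3 controller" thread.

```python
# Absolute addresses exactly as listed for the host_usb30 range.
HOST_USB30_REGS = {
    "CONFIG": 0x2E800,
    "CONTROL": 0x2E804,
    "TEST": 0x2E808,
    "STATUS": 0x2E80C,
    "CLK_RST_0": 0x2E810,
    "CLK_RST_1": 0x2E814,
    "PARAM_0": 0x2E818,
    "PARAM_1": 0x2E81C,
    "PARAM_2": 0x2E820,
    "SNPS_CR_ADD": 0x2E880,
    "SNPS_CR_DATA": 0x2E884,
    "RESET_CTRL": 0x2E8C0,
}

# Assumption: the first register marks the base of the block.
BLOCK_BASE = 0x2E800


def relative_offsets(regs, base):
    """Map each register name to its offset from the block base."""
    return {name: addr - base for name, addr in regs.items()}


offsets = relative_offsets(HOST_USB30_REGS, BLOCK_BASE)
for name, off in offsets.items():
    print(f"{off:#06x} {name}")
```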


>> # lsusb -v
>> Bus 001 Device 001: ID 1d6b:0002
>> Bus 002 Device 001: ID 1d6b:0003

Isn't lsusb verbose supposed to print much more than that?


>> I'd like to hear suggestions about what I can tweak to fix the problem.
> 
> go to your documentation and see if you have initialized
> everything. Which SoC is this?

(Sad face) All the documentation I have is in front of me, and nothing
is ringing a bell. This is a Sigma Designs SoC, with a Pravega XHCI
controller + Synopsys PHY.

The documentation I have:

Pravega_Dual_Mode_Datasheet_v10c.pdf (documents IP signals)
Pravega_Dual_Mode_Controller_Programmers_Reference_manual_v1.pdf (documents IP registers)
PHY databook (very low-level stuff)
SoC register mapping (for how the SoC maps the IP signals to registers)

So far, I'm stumped :-(

Regards.

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


XHCI controller does not detect USB key insertion

2016-12-02 Thread Mason
Hello everyone,

I'm trying out a SoC with a brand new USB controller, which is (supposedly)
a standard XHCI controller. In theory, I would just need to build the right
driver, and everything would auto-magically work, right?


So my defconfig contains:

CONFIG_USB=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y
CONFIG_USB_XHCI_HCD=y
CONFIG_USB_XHCI_PLATFORM=y
CONFIG_USB_STORAGE=y
CONFIG_USB_STORAGE_DEBUG=y


And my device tree contains:

usb3@3004 {
compatible = "generic-xhci";
reg = <0x3004 0x1>;
interrupts = ;
};


The boot messages I get:

[1.618214] xhci-hcd 3004.usb3: xHCI Host Controller
[1.623611] xhci-hcd 3004.usb3: new USB bus registered, assigned bus number 1
[1.631181] reset function is xhci_plat_setup
[1.635588] xhci_plat_setup from usb_add_hcd
[1.640109] xhci-hcd 3004.usb3: hcc params 0x30003192 hci version 0x100 quirks 0x00010010
[1.648766] xhci-hcd 3004.usb3: irq 22, io mem 0x3004
[1.654572] xhci_plat_start from usb_add_hcd
[1.659086] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[1.665943] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[1.673228] usb usb1: Product: xHCI Host Controller
[1.678154] usb usb1: Manufacturer: Linux 4.7.0-rc6 xhci-hcd
[1.683865] usb usb1: SerialNumber: 3004.usb3
[1.689391] hub 1-0:1.0: USB hub found
[1.693227] hub 1-0:1.0: 1 port detected
[1.697601] xhci-hcd 3004.usb3: xHCI Host Controller
[1.702983] xhci-hcd 3004.usb3: new USB bus registered, assigned bus number 2
[1.710545] reset function is xhci_plat_setup
[1.714950] xhci_plat_setup from usb_add_hcd
[1.719265] xhci_plat_start from usb_add_hcd
[1.723653] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[1.731956] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003
[1.738814] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[1.746100] usb usb2: Product: xHCI Host Controller
[1.751025] usb usb2: Manufacturer: Linux 4.7.0-rc6 xhci-hcd
[1.756736] usb usb2: SerialNumber: 3004.usb3
[1.762195] hub 2-0:1.0: USB hub found
[1.766027] hub 2-0:1.0: 1 port detected
[1.770661] usbcore: registered new interface driver usb-storage
[1.784584] usbcore: registered new interface driver usbhid
[1.790213] usbhid: USB HID core driver

Which looks encouraging, right?
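
The "hcc params" value in the log can be unpacked by hand. Below is a minimal sketch that decodes it, assuming the bit layout of the xHCI HCCPARAMS1 capability register (AC64 in bit 0, PPC in bit 3, MaxPSASize in bits 15:12, xECP in bits 31:16, counted in 32-bit words). The function and key names are illustrative, not taken from the driver.

```python
def decode_hccparams(value):
    """Decode the xHCI HCCPARAMS1 capability register into named fields."""
    xecp_dwords = (value >> 16) & 0xFFFF  # extended caps pointer, in dwords
    return {
        "ac64": bool(value & (1 << 0)),                   # 64-bit addressing
        "bandwidth_negotiation": bool(value & (1 << 1)),
        "context_size_64": bool(value & (1 << 2)),
        "port_power_control": bool(value & (1 << 3)),
        "port_indicators": bool(value & (1 << 4)),
        "light_reset": bool(value & (1 << 5)),
        "latency_tolerance": bool(value & (1 << 6)),
        "no_secondary_sid": bool(value & (1 << 7)),
        "max_psa_size": (value >> 12) & 0xF,
        "xecp_byte_offset": xecp_dwords * 4,
    }


# Value taken from the boot log above.
caps = decode_hccparams(0x30003192)
for field, val in caps.items():
    print(field, val)
```

With this value, the extended capability list would start at byte offset 0xc000 from the register base, 32-bit addressing only, and no port power control (which agrees with the root hub reporting no power switching).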


Am I supposed to have had USB interrupts at that point?

# cat /proc/interrupts 
   CPU0   CPU1   CPU2   CPU3   
  20:609365393356 GIC-0  29 Edge  twd
  21:101  0  0  0  INTC   1 Level serial
  22:  0  0  0  0  INTC  67 Level xhci-hcd:usb1
IPI0:  0  0  0  0  CPU wakeup interrupts
IPI1:  0  0  0  0  Timer broadcast interrupts
IPI2:794620   1223   1045  Rescheduling interrupts
IPI3:  0 37 37 37  Function call interrupts
IPI4:  0  0  0  0  CPU stop interrupts
IPI5:  0  0  0  0  IRQ work interrupts
IPI6:  0  0  0  0  completion interrupts
 Err:  0
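
To check this repeatedly (for example before and after plugging the key), a small sketch that pulls the per-CPU counts for one interrupt out of /proc/interrupts-style text. The sample below is a hypothetical, trimmed two-CPU table with made-up counts, not output from my board, and `irq_counts` is just an illustrative helper name.

```python
def irq_counts(text, name):
    """Return per-CPU counts for the line whose last field is `name`,
    or None if no such line exists."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        fields = line.split()
        if len(fields) > ncpus and fields[-1] == name:
            return [int(n) for n in fields[1:1 + ncpus]]
    return None


# Hypothetical, well-formed sample (counts are illustrative only).
SAMPLE = """\
           CPU0       CPU1
 21:        101          0      INTC   1 Level     serial
 22:          0          0      INTC  67 Level     xhci-hcd:usb1
"""

counts = irq_counts(SAMPLE, "xhci-hcd:usb1")
print(counts, "never fired" if sum(counts) == 0 else "fired")
```

On a live system the text would come from reading /proc/interrupts; a count that stays at zero across a plug event suggests the controller never raised its interrupt.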


When I insert a USB key, nothing happens :-(

# lsusb -v
Bus 001 Device 001: ID 1d6b:0002
Bus 002 Device 001: ID 1d6b:0003


I'd like to hear suggestions about what I can tweak to fix the problem.

Regards.


Re: Support for Pravega USB3 controller

2016-05-30 Thread Mason
Hello Felipe,

On 30/05/2016 08:58, Felipe Balbi wrote:

> Mason writes:
> 
>> I'm working on a SoC which embeds an IP block from GDA Technologies
>> labeled "Pravega USB3 SuperSpeed Controller" (data-sheet is v0.99r
>> dated 2014-01-29). A cursory search returns:
>>
>> http://www.sourcing.co.jp/prod_ip.htm
>> http://www.sourcing.co.jp/prod_ip/usb_host_pb.pdf
>>
>> In the compliance section, the data-sheet lists:
>>
>> - USB 3.0 Revision 1.0 and all associated ECNs
>> - Inter-Chip Supplement to the USB Revision 3.0 Specification, Revision 1.02
>> - Backward compatible with USB2.0 Revision 2.0 and all associated ECNs
>> - High Speed Inter Chip Specification, Rev 1.0 and associated ECNs
>> - USB 2.0 Link Power Management Addendum and associated Erratas
>> - xHCI specification version 1.0 (in host mode)
>>
>> My question is: should I be able to use the generic XHCI driver to
>> drive this controller?
> 
> yes, you should. BUT... is this controller dual-role? If it is, then
> it'd be nice to support both roles.

Does dual-role mean OTG, or just static host/device?
AFAIU, my version of the controller only supports host-mode.

>> (The Makefile builds xhci-plat-hcd.o but I don't see xhci-plat-hcd.c
>> Is it a generated file?)
> 
> The makefile is smart enough to figure it out. Don't worry. Kernel's
> build system knows that it needs xhci-plat-hcd.c in order to build
> xhci-plat-hcd.o

Oh, I wasn't worried. I was curious how/where that file is generated.

>> One more thing: AFAIU, the USB PHY is made by Synopsys.
>> Do I need a driver for that too? (I should examine 7b8ef22ea547)
> 
> IIRC there are no registers to be configured on Synopsys PHY, so no.

I looked more closely at the SoC documentation, which lists registers
such as

0x0   CONFIG
0x4   CONTROL
0x8   TEST
0xC   STATUS
0x10  CLK_RST_0
0x14  CLK_RST_1
0x18  PARAM_0
0x1C  PARAM_1
0x20  PARAM_2
0x80  SNPS_CR_ADD
0x84  SNPS_CR_DATA
0xC0  RESET_CTRL

For example, PARAM_1 contains a field "TX De-emphasis at 6 dB".
That's PHY stuff, right?

Are registers such as PHY_TIMER_REGISTER or LTSSM_* standard in USB3/XHCI,
or is that PHY stuff too?

Regards.


Support for Pravega USB3 controller

2016-05-27 Thread Mason
Hello everyone,

I'm working on a SoC which embeds an IP block from GDA Technologies
labeled "Pravega USB3 SuperSpeed Controller" (data-sheet is v0.99r
dated 2014-01-29). A cursory search returns:

http://www.sourcing.co.jp/prod_ip.htm
http://www.sourcing.co.jp/prod_ip/usb_host_pb.pdf

In the compliance section, the data-sheet lists:

- USB 3.0 Revision 1.0 and all associated ECNs
- Inter-Chip Supplement to the USB Revision 3.0 Specification, Revision 1.02
- Backward compatible with USB2.0 Revision 2.0 and all associated ECNs
- High Speed Inter Chip Specification, Rev 1.0 and associated ECNs
- USB 2.0 Link Power Management Addendum and associated Erratas
- xHCI specification version 1.0 (in host mode)

My question is: should I be able to use the generic XHCI driver to
drive this controller?

I'd select USB_XHCI_HCD and USB_XHCI_PLATFORM, create the appropriate
device tree node, and everything would auto-magically work?

(The Makefile builds xhci-plat-hcd.o but I don't see xhci-plat-hcd.c
Is it a generated file?)

I'm confused though: if XHCI is standard, why are there different XHCI
drivers for Mediatek, Marvell, Renesas, Tegra? Do they add additional
features *on top of* the XHCI baseline?

One more thing: AFAIU, the USB PHY is made by Synopsys.
Do I need a driver for that too? (I should examine 7b8ef22ea547)

Regards.


List etiquette (was: USB_OTG unmet direct dependencies)

2015-12-21 Thread Mason
On 21/12/2015 10:19, Ajay Khandelwal wrote:
> Hello Peter/Balbi,
> 
> Any framework for Type-c connector available in Linux.
> 
> Thanks and Regards,
> Ajay

Ajay,

Please follow the standard list etiquette, specifically:

Start a new thread with a proper and relevant subject, don't hijack
a loosely-related thread.

When replying, don't top-post; reply inline after snipping irrelevant
material.

Disable HTML formatting when posting to the Linux lists.

Regards.



USB_OTG unmet direct dependencies

2015-11-25 Thread Mason
Hello,

I was about to post this, and I noticed it has been discussed recently.

http://thread.gmane.org/gmane.linux.kernel/2087677/
[PATCH] USB: USB_OTG does not depend on PM

My SoC provides a Chipidea dual-port OTG USB 2.0 controller.

My .config contains:

CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y
...
CONFIG_USB_OTG=y
# CONFIG_USB_OTG_WHITELIST is not set
# CONFIG_USB_OTG_BLACKLIST_HUB is not set
CONFIG_USB_OTG_FSM=y
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
CONFIG_USB_EHCI_TT_NEWSCHED=y
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_CHIPIDEA=y
CONFIG_USB_CHIPIDEA_OF=y
CONFIG_USB_CHIPIDEA_UDC=y
CONFIG_USB_CHIPIDEA_HOST=y
CONFIG_USB_CHIPIDEA_DEBUG=y

When I build, I get this warning twice:

  (USB_OTG_FSM && FSL_USB2_OTG && USB_MV_OTG) selects USB_OTG
  which has unmet direct dependencies (USB_SUPPORT && USB && PM)
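
To make the warning concrete, here is a rough sketch that checks an ANDed dependency list against .config-style text and reports which symbols are not enabled. It only handles plain `A && B && C` lists, not full Kconfig expressions, and the excerpt assumes CONFIG_PM is unset, which is my reading of the warning rather than something shown in the .config above. The helper names are illustrative.

```python
def enabled_symbols(config_text):
    """Collect symbols set to y or m in .config-style text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            if value in ("y", "m"):
                enabled.add(name[len("CONFIG_"):])
    return enabled


def unmet(dependencies, config_text):
    """Return the dependencies (all ANDed together) that are not enabled."""
    on = enabled_symbols(config_text)
    return [dep for dep in dependencies if dep not in on]


# Excerpt of the .config; "CONFIG_PM is not set" is an assumption.
CONFIG = """\
CONFIG_USB_SUPPORT=y
CONFIG_USB=y
CONFIG_USB_OTG=y
# CONFIG_PM is not set
"""

missing = unmet(["USB_SUPPORT", "USB", "PM"], CONFIG)
print(missing)
```

Under that assumption, PM is the only unmet dependency, which is why the `select` of USB_OTG triggers the warning.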

I have a few questions:

Is PM required for correct OTG functionality?

Is USB_OTG_FSM required for correct OTG functionality?
If so, shouldn't USB_CHIPIDEA_UDC depend on it? or select it?

Also I'm confused by the help message for USB_CHIPIDEA:

help
  Say Y here if your system has a dual role high speed USB
  controller based on ChipIdea silicon IP. Currently, only the
  peripheral mode is supported.

If I set the dr_mode to "host", the OS correctly sees a mass storage device.
(Which I took to mean host mode is supported.)

Does it fail to work in "otg" mode? (I've ordered OTG cables to test this.)

(I tested on v4.2 and I'm now working with v4.1)

Regards.