Re: [Nouveau] pcieport 0000:00:01.0: PME: Spurious native interrupt (nvidia with nouveau and thunderbolt on thinkpad P73)

2020-09-19 Thread Marc MERLIN
On Sun, Sep 13, 2020 at 01:15:45PM -0700, Marc MERLIN wrote:
> On Mon, Sep 07, 2020 at 05:29:35PM -0700, Marc MERLIN wrote:
> > On Tue, Sep 08, 2020 at 01:51:19AM +0200, Karol Herbst wrote:
> > > oh, I somehow missed that "disp ctor failed" message. I think that
> > > might explain why things are a bit hanging. Off the top of my head I
> > > am not sure if that's something known or something new. But just in
> > > case I CCed Lyude and Ben. And I think booting with
> > > nouveau.debug=disp=trace could already show something relevant.
> > 
> > Thanks.
> > I've added that to my boot for next time I reboot.
> > 
> > I'm moving some folks to Bcc now, and let's remove the lists other than
> > nouveau on followups (lkml and pci). I'm just putting a warning here
> > so that it shows up in other list archives and anyone finding this
> > later knows that they should look in the nouveau archives for further
> > updates/resolution.
> 
> Hi, I didn't hear back on this issue. Did you need the nouveau.debug=disp=trace
> or are you already working on the "disp ctor failed" issue?

I rebooted with the option you asked for:
BOOT_IMAGE=/vmlinuz-5.8.5-amd64-preempt-sysrq-20190817
root=/dev/mapper/cryptroot ro rootflags=subvol=root
cryptopts=source=/dev/nvme0n1p7,keyscript=/sbin/cryptgetpw
usbcore.autosuspend=1 pcie_aspm=force resume=/dev/dm-1
acpi_backlight=vendor nouveau.debug=disp=trace

[    8.371448] nouveau: detected PR support, will not use DSM
[    8.371458] nouveau 0000:01:00.0: runtime IRQ mapping not provided by arch
[    8.371463] nouveau 0000:01:00.0: enabling device (0000 -> 0003)
[    8.371510] Console: switching to colour dummy device 80x25
[    8.371542] i915 0000:00:02.0: vgaarb: deactivate vga console
[    8.371574] nouveau 0000:01:00.0: NVIDIA TU104 (164000a1)
[    8.373522] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    8.374215] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=mem
[    8.377328] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
[    8.472037] nouveau 0000:01:00.0: bios: version 90.04.4d.00.2c

Note that I still get a 3-minute hang at boot here:

[  188.334912] nouveau 0000:01:00.0: disp: destroy running...
[  188.341741] nouveau 0000:01:00.0: disp: destroy completed in 1us
[  188.344559] nouveau 0000:01:00.0: disp ctor failed, -12
[  188.347708] nouveau: probe of 0000:01:00.0 failed with error -12
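
For anyone decoding the failure: the -12 in these two lines is a negated
Linux errno value. The following is a minimal sketch (the helper name
describe_kernel_error is made up for illustration) that maps such a code to
its symbolic name with Python's standard errno module. It only names the
error; it says nothing about which allocation in the display constructor
failed.

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Return 'NAME (message)' for a negated errno as printed by the kernel."""
    number = abs(code)
    # errno.errorcode maps numeric values to names such as 'ENOMEM'.
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name} ({os.strerror(number)})"


if __name__ == "__main__":
    # The value reported by "disp ctor failed, -12" in the log.
    print(describe_kernel_error(-12))
```

On Linux, errno 12 is ENOMEM, so the probe failure is reported as an
out-of-memory style error even though the machine itself is not out of RAM.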

As a reminder:
sauron:~# lspci |grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation TU104GLM [Quadro RTX 4000 Mobile / Max-Q] (rev a1)
01:00.1 Audio device: NVIDIA Corporation TU104 HD Audio Controller (rev a1)
01:00.2 USB controller: NVIDIA Corporation TU104 USB 3.1 Host Controller (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C UCSI Controller (rev a1)

The full boot still looks like this:
[    9.812614] Console: switching to colour frame buffer device 240x67
[    9.844351] i915 0000:00:02.0: fb0: i915drmfb frame buffer device

16 seconds here? Why?

[   25.107472] thunderbolt 0000:06:00.0: saving config space at offset 0x0 (reading 0x15eb8086)
[   25.107503] thunderbolt 0000:06:00.0: saving config space at offset 0x4 (reading 0x100406)
[   25.107509] thunderbolt 0000:06:00.0: saving config space at offset 0x8 (reading 0x886)
[   25.107514] thunderbolt 0000:06:00.0: saving config space at offset 0xc (reading 0x20)
[   25.107520] thunderbolt 0000:06:00.0: saving config space at offset 0x10 (reading 0xcc10)
[   25.107525] thunderbolt 0000:06:00.0: saving config space at offset 0x14 (reading 0xcc14)
[   25.107530] thunderbolt 0000:06:00.0: saving config space at offset 0x18 (reading 0x0)
[   25.107535] thunderbolt 0000:06:00.0: saving config space at offset 0x1c (reading 0x0)
[   25.107540] thunderbolt 0000:06:00.0: saving config space at offset 0x20 (reading 0x0)
[   25.107545] thunderbolt 0000:06:00.0: saving config space at offset 0x24 (reading 0x0)
[   25.107550] thunderbolt 0000:06:00.0: saving config space at offset 0x28 (reading 0x0)
[   25.107556] thunderbolt 0000:06:00.0: saving config space at offset 0x2c (reading 0x229b17aa)
[   25.107561] thunderbolt 0000:06:00.0: saving config space at offset 0x30 (reading 0x0)
[   25.107566] thunderbolt 0000:06:00.0: saving config space at offset 0x34 (reading 0x80)
[   25.107571] thunderbolt 0000:06:00.0: saving config space at offset 0x38 (reading 0x0)
[   25.107576] thunderbolt 0000:06:00.0: saving config space at offset 0x3c (reading 0x1ff)
[   25.107661] thunderbolt 0000:06:00.0: PME# enabled
[   25.125418] pcieport 0000:05:00.0: saving config space at offset 0x0 (reading 0x15ea8086)
[   25.125448] pcieport 0000:05:00.0: saving config space at offset 0x4 (reading 0x100407)
[   25.125454] pcieport 0000:05:00.0: saving config space at offset 0x8 (reading 0x6040006)
[   25.125459] pcieport 0000:05:00.0: saving config space at offset 0xc (reading 0x10020)
[   25.125464] 
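
Pauses like the one between the i915 framebuffer line and the first
thunderbolt line can be located mechanically. A minimal sketch, assuming
dmesg-style lines that start with a bracketed timestamp in seconds; the
function name find_gaps and the 5-second threshold are arbitrary choices
for illustration:

```python
import re

# Matches a leading "[   25.107472]" style timestamp, with or without padding.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def find_gaps(lines, threshold=5.0):
    """Return (previous_ts, current_ts, line) for each jump above threshold seconds."""
    gaps = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match is None:
            continue  # wrapped continuation lines carry no timestamp
        current = float(match.group(1))
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current, line.rstrip()))
        previous = current
    return gaps
```

Passing an open file with saved dmesg output (or any list of lines) to
find_gaps returns each line that follows a long silence, together with the
timestamps on either side of it.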

Re: [Nouveau] pcieport 0000:00:01.0: PME: Spurious native interrupt (nvidia with nouveau and thunderbolt on thinkpad P73)

2020-09-13 Thread Marc MERLIN
On Mon, Sep 07, 2020 at 05:29:35PM -0700, Marc MERLIN wrote:
> On Tue, Sep 08, 2020 at 01:51:19AM +0200, Karol Herbst wrote:
> > oh, I somehow missed that "disp ctor failed" message. I think that
> > might explain why things are a bit hanging. Off the top of my head I
> > am not sure if that's something known or something new. But just in
> > case I CCed Lyude and Ben. And I think booting with
> > nouveau.debug=disp=trace could already show something relevant.
> 
> Thanks.
> I've added that to my boot for next time I reboot.
> 
> I'm moving some folks to Bcc now, and let's remove the lists other than
> nouveau on followups (lkml and pci). I'm just putting a warning here
> so that it shows up in other list archives and anyone finding this
> later knows that they should look in the nouveau archives for further
> updates/resolution.

Hi, I didn't hear back on this issue. Did you need the nouveau.debug=disp=trace
or are you already working on the "disp ctor failed" issue?

Thanks
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
 
Home page: http://marc.merlins.org/   | PGP 7F55D5F27AAF9D08
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] pcieport 0000:00:01.0: PME: Spurious native interrupt (nvidia with nouveau and thunderbolt on thinkpad P73)

2020-09-07 Thread Marc MERLIN
On Tue, Sep 08, 2020 at 01:51:19AM +0200, Karol Herbst wrote:
> oh, I somehow missed that "disp ctor failed" message. I think that
> might explain why things are a bit hanging. Off the top of my head I
> am not sure if that's something known or something new. But just in
> case I CCed Lyude and Ben. And I think booting with
> nouveau.debug=disp=trace could already show something relevant.

Thanks.
I've added that to my boot for next time I reboot.

I'm moving some folks to Bcc now, and let's remove the lists other than
nouveau on followups (lkml and pci). I'm just putting a warning here
so that it shows up in other list archives and anyone finding this
later knows that they should look in the nouveau archives for further
updates/resolution.

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
 
Home page: http://marc.merlins.org/   | PGP 7F55D5F27AAF9D08


Re: [Nouveau] pcieport 0000:00:01.0: PME: Spurious native interrupt (nvidia with nouveau and thunderbolt on thinkpad P73)

2020-09-07 Thread Karol Herbst
On Mon, Sep 7, 2020 at 10:58 PM Marc MERLIN wrote:
>
> On Mon, Sep 07, 2020 at 09:14:03PM +0200, Karol Herbst wrote:
> > > - changes in the nouveau driver. Mika told me the PCIe regression
> > >   "pcieport :00:01.0: PME: Spurious native interrupt!" is supposed
> > >   to be fixed in 5.8, but I still get a 4mn hang or so during boot and
> > >   with 5.8, removing the USB key, didn't help make the boot faster
> >
> > that's the root port the GPU is attached to, no? I saw that message on
> > the Thinkpad P1G2 when runtime resuming the Nvidia GPU, but it does
> > seem to come from the root port.
>
> Hi Karol, thanks for your answer.
>
> 00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core 
> Processor PCIe Controller (x16) (rev 0d)
> 01:00.0 VGA compatible controller: NVIDIA Corporation TU104GLM [Quadro RTX 
> 4000 Mobile / Max-Q] (rev a1)
>
> > Well, you'd also need it when attaching external displays.
>
> Indeed. I just don't need that on this laptop, but I'm familiar with the not
> so seamless procedure to turn on both GPUs, and mirror the intel one into
> the nvidia one for external output.
>
> > > [   11.262985] nvidia-gpu :01:00.3: PME# enabled
> > > [   11.303060] nvidia-gpu :01:00.3: PME# disabled
> >
> > mhh, interesting. I heard some random comments that the Nvidia
> > USB-C/UCSI driver is a bit broken and can cause various issues. Mind
> > blacklisting i2c-nvidia-gpu and typec_nvidia (and verify they don't
> > get loaded) and see if that helps?
>
> Right, this one:
> 01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C 
> UCSI Controller (rev a1)
> Sure, I'll blacklist it. Ok, just did that, removed from initrd,
> rebooted, and it was no better.
>
> From initrd (before root gets mounted), I have this:
> nouveau              1961984  0
> mxm_wmi                16384  1 nouveau
> hwmon                  32768  1 nouveau
> ttm                   102400  1 nouveau
> wmi                    32768  2 nouveau,mxm_wmi
>
> I still got a 2mn hang, and a nouveau probe error:
> [  189.124530] nouveau: probe of 0000:01:00.0 failed with error -12
>
>
> Here's what it looks like:
> [9.693230] hid: raw HID events driver (C) Jiri Kosina
> [9.694988] usbcore: registered new interface driver usbhid
> [9.694989] usbhid: USB HID core driver
> [9.696700] hid-generic 0003:1050:0200.0001: hiddev0,hidraw0: USB HID 
> v1.00 Device [Yubico Yubico Gnubby (gnubby1)] on usb-:00:14.0-2/input0
> [9.784456] Console: switching to colour frame buffer device 240x67
> [9.816297] i915 :00:02.0: fb0: i915drmfb frame buffer device
> [   25.087400] thunderbolt :06:00.0: saving config space at offset 0x0 
> (reading 0x15eb8086)
> [   25.087414] thunderbolt :06:00.0: saving config space at offset 0x4 
> (reading 0x100406)
> [   25.087419] thunderbolt :06:00.0: saving config space at offset 0x8 
> (reading 0x886)
> [   25.087424] thunderbolt :06:00.0: saving config space at offset 0xc 
> (reading 0x20)
> [   25.087430] thunderbolt :06:00.0: saving config space at offset 0x10 
> (reading 0xcc10)
> [   25.087435] thunderbolt :06:00.0: saving config space at offset 0x14 
> (reading 0xcc14)
> [   25.087440] thunderbolt :06:00.0: saving config space at offset 0x18 
> (reading 0x0)
> [   25.087445] thunderbolt :06:00.0: saving config space at offset 0x1c 
> (reading 0x0)
> [   25.087450] thunderbolt :06:00.0: saving config space at offset 0x20 
> (reading 0x0)
> [   25.087455] thunderbolt :06:00.0: saving config space at offset 0x24 
> (reading 0x0)
> [   25.087460] thunderbolt :06:00.0: saving config space at offset 0x28 
> (reading 0x0)
> [   25.087466] thunderbolt :06:00.0: saving config space at offset 0x2c 
> (reading 0x229b17aa)
> [   25.087471] thunderbolt :06:00.0: saving config space at offset 0x30 
> (reading 0x0)
> [   25.087476] thunderbolt :06:00.0: saving config space at offset 0x34 
> (reading 0x80)
> [   25.087481] thunderbolt :06:00.0: saving config space at offset 0x38 
> (reading 0x0)
> [   25.087486] thunderbolt :06:00.0: saving config space at offset 0x3c 
> (reading 0x1ff)
> [   25.087571] thunderbolt :06:00.0: PME# enabled
> [   25.105353] pcieport :05:00.0: saving config space at offset 0x0 
> (reading 0x15ea8086)
> [   25.105364] pcieport :05:00.0: saving config space at offset 0x4 
> (reading 0x100407)
> [   25.105370] pcieport :05:00.0: saving config space at offset 0x8 
> (reading 0x6040006)
> [   25.105375] pcieport :05:00.0: saving config space at offset 0xc 
> (reading 0x10020)
> [   25.105380] pcieport :05:00.0: saving config space at offset 0x10 
> (reading 0x0)
> [   25.105384] pcieport :05:00.0: saving config space at offset 0x14 
> (reading 0x0)
> [   25.105389] pcieport :05:00.0: saving config space at offset 0x18 
> (reading 0x60605)
> [   25.105394] pcieport :05:00.0: saving config space at offset 0x1c 
> (reading 

Re: [Nouveau] pcieport 0000:00:01.0: PME: Spurious native interrupt (nvidia with nouveau and thunderbolt on thinkpad P73)

2020-09-07 Thread Marc MERLIN
On Mon, Sep 07, 2020 at 09:14:03PM +0200, Karol Herbst wrote:
> > - changes in the nouveau driver. Mika told me the PCIe regression
> >   "pcieport :00:01.0: PME: Spurious native interrupt!" is supposed
> >   to be fixed in 5.8, but I still get a 4mn hang or so during boot and
> >   with 5.8, removing the USB key, didn't help make the boot faster
> 
> that's the root port the GPU is attached to, no? I saw that message on
> the Thinkpad P1G2 when runtime resuming the Nvidia GPU, but it does
> seem to come from the root port.

Hi Karol, thanks for your answer.
 
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 0d)
01:00.0 VGA compatible controller: NVIDIA Corporation TU104GLM [Quadro RTX 4000 Mobile / Max-Q] (rev a1)

> Well, you'd also need it when attaching external displays.
 
Indeed. I just don't need that on this laptop, but I'm familiar with the not
so seamless procedure to turn on both GPUs, and mirror the intel one into
the nvidia one for external output.

> > [   11.262985] nvidia-gpu :01:00.3: PME# enabled
> > [   11.303060] nvidia-gpu :01:00.3: PME# disabled
> 
> mhh, interesting. I heard some random comments that the Nvidia
> USB-C/UCSI driver is a bit broken and can cause various issues. Mind
> blacklisting i2c-nvidia-gpu and typec_nvidia (and verify they don't
> get loaded) and see if that helps?

Right, this one:
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C UCSI Controller (rev a1)
Sure, I'll blacklist it. Ok, just did that, removed from initrd,
rebooted, and it was no better.

From initrd (before root gets mounted), I have this:
nouveau              1961984  0
mxm_wmi                16384  1 nouveau
hwmon                  32768  1 nouveau
ttm                   102400  1 nouveau
wmi                    32768  2 nouveau,mxm_wmi

I still got a 2mn hang, and a nouveau probe error:
[  189.124530] nouveau: probe of 0000:01:00.0 failed with error -12


Here's what it looks like:
[    9.693230] hid: raw HID events driver (C) Jiri Kosina
[    9.694988] usbcore: registered new interface driver usbhid
[    9.694989] usbhid: USB HID core driver
[    9.696700] hid-generic 0003:1050:0200.0001: hiddev0,hidraw0: USB HID v1.00 Device [Yubico Yubico Gnubby (gnubby1)] on usb-0000:00:14.0-2/input0
[    9.784456] Console: switching to colour frame buffer device 240x67
[    9.816297] i915 0000:00:02.0: fb0: i915drmfb frame buffer device
[   25.087400] thunderbolt 0000:06:00.0: saving config space at offset 0x0 (reading 0x15eb8086)
[   25.087414] thunderbolt 0000:06:00.0: saving config space at offset 0x4 (reading 0x100406)
[   25.087419] thunderbolt 0000:06:00.0: saving config space at offset 0x8 (reading 0x886)
[   25.087424] thunderbolt 0000:06:00.0: saving config space at offset 0xc (reading 0x20)
[   25.087430] thunderbolt 0000:06:00.0: saving config space at offset 0x10 (reading 0xcc10)
[   25.087435] thunderbolt 0000:06:00.0: saving config space at offset 0x14 (reading 0xcc14)
[   25.087440] thunderbolt 0000:06:00.0: saving config space at offset 0x18 (reading 0x0)
[   25.087445] thunderbolt 0000:06:00.0: saving config space at offset 0x1c (reading 0x0)
[   25.087450] thunderbolt 0000:06:00.0: saving config space at offset 0x20 (reading 0x0)
[   25.087455] thunderbolt 0000:06:00.0: saving config space at offset 0x24 (reading 0x0)
[   25.087460] thunderbolt 0000:06:00.0: saving config space at offset 0x28 (reading 0x0)
[   25.087466] thunderbolt 0000:06:00.0: saving config space at offset 0x2c (reading 0x229b17aa)
[   25.087471] thunderbolt 0000:06:00.0: saving config space at offset 0x30 (reading 0x0)
[   25.087476] thunderbolt 0000:06:00.0: saving config space at offset 0x34 (reading 0x80)
[   25.087481] thunderbolt 0000:06:00.0: saving config space at offset 0x38 (reading 0x0)
[   25.087486] thunderbolt 0000:06:00.0: saving config space at offset 0x3c (reading 0x1ff)
[   25.087571] thunderbolt 0000:06:00.0: PME# enabled
[   25.105353] pcieport 0000:05:00.0: saving config space at offset 0x0 (reading 0x15ea8086)
[   25.105364] pcieport 0000:05:00.0: saving config space at offset 0x4 (reading 0x100407)
[   25.105370] pcieport 0000:05:00.0: saving config space at offset 0x8 (reading 0x6040006)
[   25.105375] pcieport 0000:05:00.0: saving config space at offset 0xc (reading 0x10020)
[   25.105380] pcieport 0000:05:00.0: saving config space at offset 0x10 (reading 0x0)
[   25.105384] pcieport 0000:05:00.0: saving config space at offset 0x14 (reading 0x0)
[   25.105389] pcieport 0000:05:00.0: saving config space at offset 0x18 (reading 0x60605)
[   25.105394] pcieport 0000:05:00.0: saving config space at offset 0x1c (reading 0x1f1)
[   25.105399] pcieport 0000:05:00.0: saving config space at offset 0x20 (reading 0xcc10cc10)
[   25.105404] pcieport 0000:05:00.0: saving config space at offset 0x24 (reading 0x1fff1)
[   25.105409] pcieport :05:00.0: saving config 

Re: [Nouveau] pcieport 0000:00:01.0: PME: Spurious native interrupt (nvidia with nouveau and thunderbolt on thinkpad P73)

2020-09-07 Thread Karol Herbst
On Sun, Sep 6, 2020 at 8:52 PM Marc MERLIN wrote:
>
> Ok, I have an update to this problem. I added the nouveau list because
> I can't quite tell if the issue is:
> - the PCIe changes that went in 5.6 I think (or 5.5?), referenced below
>
> - a new issue with thunderbolt on thinkpad P73, that seems to be
>   triggered if I have a USB-C yubikey in the port. With 5.7, my issues
>   went away if I removed the USB key during boot, showing an interaction
>   between nouveau and thunderbolt
>
> - changes in the nouveau driver. Mika told me the PCIe regression
>   "pcieport :00:01.0: PME: Spurious native interrupt!" is supposed
>   to be fixed in 5.8, but I still get a 4mn hang or so during boot and
>   with 5.8, removing the USB key, didn't help make the boot faster
>

that's the root port the GPU is attached to, no? I saw that message on
the Thinkpad P1G2 when runtime resuming the Nvidia GPU, but it does
seem to come from the root port.

> I don't otherwise use the nvidia chip I so wish I didn't have, I only
> use intel graphics on that laptop, but I must apparently use the nouveau
> driver to manage the nvidia chip so that it's turned off and not
> burning 60W doing nothing.
>

Well, you'd also need it when attaching external displays.

> lspci is in the quoted message below, I won't copy it here again, but
> here's the nvidia bit:
> 01:00.0 VGA compatible controller: NVIDIA Corporation TU104GLM [Quadro RTX 
> 4000 Mobile / Max-Q] (rev a1)
> 01:00.1 Audio device: NVIDIA Corporation TU104 HD Audio Controller (rev a1)
> 01:00.2 USB controller: NVIDIA Corporation TU104 USB 3.1 Host Controller (rev 
> a1)
> 01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C 
> UCSI Controller (rev a1)
>
> Here are 5 boots, 4 on 5.8.5:
>
> dmesg.1_hang_but_no_warning.txt https://pastebin.com/Y5NaH08n
> Boot hung for quite a while, but no clear output
>
> dmesg.2_pme_spurious.txt https://pastebin.com/dX19aCpj
> [    8.185808] nvidia-gpu 0000:01:00.3: runtime IRQ mapping not provided by arch
> [    8.185989] nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
> [    8.188986] nvidia-gpu 0000:01:00.3: enabling bus mastering
> [   11.936507] nvidia-gpu 0000:01:00.3: PME# enabled
> [   11.975985] nvidia-gpu 0000:01:00.3: PME# disabled
> [   11.976011] pcieport 0000:00:01.0: PME: Spurious native interrupt!
>
> dmesg.3_usb_key_yanked.txt https://pastebin.com/m7QLnCZt
> I yanked the USB key during boot, that seemed to help unlock things with
> 5.7, but did not with 5.8. It's hung on a loop of:
> [   11.262854] nvidia-gpu :01:00.3: saving config space at offset 0x0 
> (reading 0x1ad910de)
> [   11.262863] nvidia-gpu :01:00.3: saving config space at offset 0x4 
> (reading 0x100406)
> [   11.262869] nvidia-gpu :01:00.3: saving config space at offset 0x8 
> (reading 0xc8000a1)
> [   11.262874] nvidia-gpu :01:00.3: saving config space at offset 0xc 
> (reading 0x80)
> [   11.262880] nvidia-gpu :01:00.3: saving config space at offset 0x10 
> (reading 0xce054000)
> [   11.262885] nvidia-gpu :01:00.3: saving config space at offset 0x14 
> (reading 0x0)
> [   11.262890] nvidia-gpu :01:00.3: saving config space at offset 0x18 
> (reading 0x0)
> [   11.262895] nvidia-gpu :01:00.3: saving config space at offset 0x1c 
> (reading 0x0)
> [   11.262900] nvidia-gpu :01:00.3: saving config space at offset 0x20 
> (reading 0x0)
> [   11.262906] nvidia-gpu :01:00.3: saving config space at offset 0x24 
> (reading 0x0)
> [   11.262911] nvidia-gpu :01:00.3: saving config space at offset 0x28 
> (reading 0x0)
> [   11.262916] nvidia-gpu :01:00.3: saving config space at offset 0x2c 
> (reading 0x229b17aa)
> [   11.262921] nvidia-gpu :01:00.3: saving config space at offset 0x30 
> (reading 0x0)
> [   11.262926] nvidia-gpu :01:00.3: saving config space at offset 0x34 
> (reading 0x68)
> [   11.262931] nvidia-gpu :01:00.3: saving config space at offset 0x38 
> (reading 0x0)
> [   11.262937] nvidia-gpu :01:00.3: saving config space at offset 0x3c 
> (reading 0x4ff)
> [   11.262985] nvidia-gpu :01:00.3: PME# enabled
> [   11.303060] nvidia-gpu :01:00.3: PME# disabled
>

mhh, interesting. I heard some random comments that the Nvidia
USB-C/UCSI driver is a bit broken and can cause various issues. Mind
blacklisting i2c-nvidia-gpu and typec_nvidia (and verify they don't
get loaded) and see if that helps?
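
One way to do the "verify they don't get loaded" part is to look for the
names in /proc/modules. A minimal sketch, assuming the usual /proc/modules
layout with the module name as the first field on each line; the function
names are made up for illustration. Module names are listed there with
underscores, so i2c-nvidia-gpu is looked up as i2c_nvidia_gpu.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def still_loaded(proc_modules_text, names):
    """Return the requested modules that are loaded, hyphens normalised to underscores."""
    loaded = loaded_modules(proc_modules_text)
    wanted = [name.replace("-", "_") for name in names]
    return sorted(name for name in wanted if name in loaded)


# Example call on a live system (not executed here):
#   with open("/proc/modules") as handle:
#       print(still_loaded(handle.read(), ["i2c-nvidia-gpu", "typec_nvidia"]))
```

An empty list means neither module is currently loaded; it should be
checked both from the initrd and after the root filesystem is mounted.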

> dmesg.4_5.5_boot_fine.txt https://pastebin.com/WXgQTUYP
> reference boot with 5.5, it works fine, no issues
>
> dmesg.5_no_key_still_hang.txt https://pastebin.com/kcT8Ras0
> Unfortunately, booting without the USB-C key in the thunderbolt port did
> not make this boot any faster. It looks different though:
> [6.723454] pcieport :00:01.0: runtime IRQ mapping not provided by arch
> [6.723598] pcieport :00:01.0: PME: Signaling with IRQ 122
> [6.724011] pcieport :00:01.0: saving config space at offset 0x0 
> (reading 0x19018086)
> [6.724016] 

Re: [Nouveau] pcieport 0000:00:01.0: PME: Spurious native interrupt (nvidia with nouveau and thunderbolt on thinkpad P73)

2020-09-06 Thread Marc MERLIN
Ok, I have an update to this problem. I added the nouveau list because
I can't quite tell if the issue is:
- the PCIe changes that went in 5.6 I think (or 5.5?), referenced below

- a new issue with thunderbolt on thinkpad P73, that seems to be
  triggered if I have a USB-C yubikey in the port. With 5.7, my issues
  went away if I removed the USB key during boot, showing an interaction
  between nouveau and thunderbolt

- changes in the nouveau driver. Mika told me the PCIe regression
  "pcieport 0000:00:01.0: PME: Spurious native interrupt!" is supposed
  to be fixed in 5.8, but I still get a 4mn hang or so during boot, and
  with 5.8, removing the USB key didn't help make the boot faster

I don't otherwise use the nvidia chip I so wish I didn't have, I only
use intel graphics on that laptop, but I must apparently use the nouveau
driver to manage the nvidia chip so that it's turned off and not
burning 60W doing nothing.

lspci is in the quoted message below, I won't copy it here again, but
here's the nvidia bit:
01:00.0 VGA compatible controller: NVIDIA Corporation TU104GLM [Quadro RTX 4000 Mobile / Max-Q] (rev a1)
01:00.1 Audio device: NVIDIA Corporation TU104 HD Audio Controller (rev a1)
01:00.2 USB controller: NVIDIA Corporation TU104 USB 3.1 Host Controller (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C UCSI Controller (rev a1)

Here are 5 boots, 4 on 5.8.5:

dmesg.1_hang_but_no_warning.txt https://pastebin.com/Y5NaH08n
Boot hung for quite a while, but no clear output

dmesg.2_pme_spurious.txt https://pastebin.com/dX19aCpj
[    8.185808] nvidia-gpu 0000:01:00.3: runtime IRQ mapping not provided by arch
[    8.185989] nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
[    8.188986] nvidia-gpu 0000:01:00.3: enabling bus mastering
[   11.936507] nvidia-gpu 0000:01:00.3: PME# enabled
[   11.975985] nvidia-gpu 0000:01:00.3: PME# disabled
[   11.976011] pcieport 0000:00:01.0: PME: Spurious native interrupt!

dmesg.3_usb_key_yanked.txt https://pastebin.com/m7QLnCZt
I yanked the USB key during boot. That seemed to help unlock things with
5.7, but did not with 5.8. It hangs on a loop of:
[   11.262854] nvidia-gpu 0000:01:00.3: saving config space at offset 0x0 (reading 0x1ad910de)
[   11.262863] nvidia-gpu 0000:01:00.3: saving config space at offset 0x4 (reading 0x100406)
[   11.262869] nvidia-gpu 0000:01:00.3: saving config space at offset 0x8 (reading 0xc8000a1)
[   11.262874] nvidia-gpu 0000:01:00.3: saving config space at offset 0xc (reading 0x80)
[   11.262880] nvidia-gpu 0000:01:00.3: saving config space at offset 0x10 (reading 0xce054000)
[   11.262885] nvidia-gpu 0000:01:00.3: saving config space at offset 0x14 (reading 0x0)
[   11.262890] nvidia-gpu 0000:01:00.3: saving config space at offset 0x18 (reading 0x0)
[   11.262895] nvidia-gpu 0000:01:00.3: saving config space at offset 0x1c (reading 0x0)
[   11.262900] nvidia-gpu 0000:01:00.3: saving config space at offset 0x20 (reading 0x0)
[   11.262906] nvidia-gpu 0000:01:00.3: saving config space at offset 0x24 (reading 0x0)
[   11.262911] nvidia-gpu 0000:01:00.3: saving config space at offset 0x28 (reading 0x0)
[   11.262916] nvidia-gpu 0000:01:00.3: saving config space at offset 0x2c (reading 0x229b17aa)
[   11.262921] nvidia-gpu 0000:01:00.3: saving config space at offset 0x30 (reading 0x0)
[   11.262926] nvidia-gpu 0000:01:00.3: saving config space at offset 0x34 (reading 0x68)
[   11.262931] nvidia-gpu 0000:01:00.3: saving config space at offset 0x38 (reading 0x0)
[   11.262937] nvidia-gpu 0000:01:00.3: saving config space at offset 0x3c (reading 0x4ff)
[   11.262985] nvidia-gpu 0000:01:00.3: PME# enabled
[   11.303060] nvidia-gpu 0000:01:00.3: PME# disabled
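
The "reading" values in the dump above are raw 32-bit config-space dwords.
A minimal sketch of how the dwords at offsets 0x0 and 0x8 split into the
fields lspci shows, following the standard PCI header layout; the function
name decode_header is made up for illustration:

```python
def decode_header(dword_00, dword_08):
    """Split the config dwords at offsets 0x0 and 0x8 into their standard fields."""
    return {
        "vendor_id": dword_00 & 0xFFFF,            # low 16 bits of offset 0x0
        "device_id": (dword_00 >> 16) & 0xFFFF,    # high 16 bits of offset 0x0
        "revision": dword_08 & 0xFF,               # low 8 bits of offset 0x8
        "class_code": (dword_08 >> 8) & 0xFFFFFF,  # base class, subclass, prog-if
    }


if __name__ == "__main__":
    # Values taken from the nvidia-gpu 01:00.3 lines in the log.
    fields = decode_header(0x1AD910DE, 0x0C8000A1)
    for name, value in fields.items():
        print(f"{name}: {value:#x}")
```

For 01:00.3 this gives vendor 0x10de (NVIDIA), device 0x1ad9, revision 0xa1
and class code 0x0c8000, which lines up with the "[0c80]" and "(rev a1)" in
the lspci output.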

dmesg.4_5.5_boot_fine.txt https://pastebin.com/WXgQTUYP
reference boot with 5.5, it works fine, no issues

dmesg.5_no_key_still_hang.txt https://pastebin.com/kcT8Ras0
Unfortunately, booting without the USB-C key in the thunderbolt port did
not make this boot any faster. It looks different though:
[    6.723454] pcieport 0000:00:01.0: runtime IRQ mapping not provided by arch
[    6.723598] pcieport 0000:00:01.0: PME: Signaling with IRQ 122
[    6.724011] pcieport 0000:00:01.0: saving config space at offset 0x0 (reading 0x19018086)
[    6.724016] pcieport 0000:00:01.0: saving config space at offset 0x4 (reading 0x100407)
[    6.724021] pcieport 0000:00:01.0: saving config space at offset 0x8 (reading 0x604000d)
[    6.724025] pcieport 0000:00:01.0: saving config space at offset 0xc (reading 0x81)
[    6.724029] pcieport 0000:00:01.0: saving config space at offset 0x10 (reading 0x0)
[    6.724033] pcieport 0000:00:01.0: saving config space at offset 0x14 (reading 0x0)
[    6.724037] pcieport 0000:00:01.0: saving config space at offset 0x18 (reading 0x10100)
[    6.724041] pcieport 0000:00:01.0: saving config space at offset 0x1c (reading 0x20002020)
[6.724046] pcieport :00:01.0: saving config space at offset 0x20