Hi, any pointers on how to debug this?

[   19.741005] nouveau 0000:01:00.0: enabling device (0541 -> 0543)
[   19.741095] nouveau 0000:01:00.0: Using 32-bit DMA via iommu
[   19.741165] nouveau 0000:01:00.0: NVIDIA GP108 (138000a1)
[   19.752562] tg3 0004:01:00.0 enP4p1s0f0: renamed from eth0
[   19.832879] [drm] Initialized ast 0.1.0 20120228 for 0005:02:00.0 on minor 0
[   19.856391] nouveau 0000:01:00.0: bios: version 86.08.13.00.12
[   19.857574] nouveau 0000:01:00.0: Using 32-bit DMA via iommu
[   19.857812] nouveau 0000:01:00.0: fb: 2048 MiB GDDR5
[   22.401204] random: fast init done
[   23.064311] nouveau 0000:01:00.0: DRM: VRAM: 2048 MiB
[   23.064326] nouveau 0000:01:00.0: DRM: GART: 536870912 MiB
[   23.064341] nouveau 0000:01:00.0: DRM: BIT table 'A' not found
[   23.064356] nouveau 0000:01:00.0: DRM: BIT table 'L' not found
[   23.064371] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
[   23.064386] nouveau 0000:01:00.0: DRM: DCB version 4.1
[   23.064399] nouveau 0000:01:00.0: DRM: DCB outp 00: 01800346 04600010
[   23.064416] nouveau 0000:01:00.0: DRM: DCB outp 01: 01000342 00020010
[   23.064432] nouveau 0000:01:00.0: DRM: DCB outp 02: 01011352 00020020
[   23.064448] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001046
[   23.064463] nouveau 0000:01:00.0: DRM: DCB conn 01: 00002161
[   23.065303] nouveau 0000:01:00.0: disp: chid 0 mthd 0000 data 00000000 00001000 00000001
[   23.065323] nouveau 0000:01:00.0: disp: chid 1 mthd 0000 data 00000000 00001000 00000001
[   23.086649] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   23.087500] [drm] Driver supports precise vblank timestamp query.
[   23.088876] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
[   23.354442] nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 409800 [ TIMEOUT ]
[   25.354017] ------------[ cut here ]------------
[   25.355515] nouveau 0000:01:00.0: timeout
[   25.357105] WARNING: CPU: 0 PID: 586 at drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c:1524 gf100_gr_init_ctxctl_ext+0x798/0xa50 [nouveau]
[   25.358654] Modules linked in: nouveau(+) ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm aacraid tg3 vmx_crypto drm_panel_orientation_quirks i2c_core crc32c_vpmsum
[   25.360265] CPU: 0 PID: 586 Comm: kworker/0:3 Not tainted 4.20.0+ #4
[   25.361865] Workqueue: events work_for_cpu_fn
[   25.363471] NIP:  c00800000dbfae40 LR: c00800000dbfae3c CTR: c000000000c4f870
[   25.365096] REGS: c00000000a416fa0 TRAP: 0700   Not tainted  (4.20.0+)
[   25.366737] MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 48008482  XER: 00000000
[   25.368402] CFAR: c000000000119834 IRQMASK: 0 
               GPR00: c00800000dbfae3c c00000000a417230 c00800000dd38c00 000000000000001d
               GPR04: 0000000000000001 0000000000000000 0000000000000293 0000000000000000
               GPR08: 0000000000000007 0000000000000007 0000000000000001 c00800001c64d0a0
               GPR12: 0000000000008000 c0000000017c3000 c00000000014a9d8 c00000000abd3b00
               GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
               GPR20: 0000000000000000 0000000000000000 fffffffffffffef7 0000000000000000
               GPR24: c00000798484fab8 00000000000000ff 0000000000000020 0000000000000000
               GPR28: c000000009f5d000 0000000000000000 c000000005020000 c000000009f5b000
[   25.383286] NIP [c00800000dbfae40] gf100_gr_init_ctxctl_ext+0x798/0xa50 [nouveau]
[   25.384940] LR [c00800000dbfae3c] gf100_gr_init_ctxctl_ext+0x794/0xa50 [nouveau]
[   25.386539] Call Trace:
[   25.388174] [c00000000a417230] [c00800000dbfae3c] gf100_gr_init_ctxctl_ext+0x794/0xa50 [nouveau] (unreliable)
[   25.389831] [c00000000a4172e0] [c00800000dbfc404] gf100_gr_init_ctxctl+0x3c/0x3e0 [nouveau]
[   25.391476] [c00000000a417390] [c00800000dbf9f94] gf100_gr_init_+0xac/0xd0 [nouveau]
[   25.393117] [c00000000a4173c0] [c00800000dbe703c] nvkm_gr_init+0x34/0x50 [nouveau]
[   25.394740] [c00000000a4173e0] [c00800000db0f4d8] nvkm_engine_init+0x190/0x2e0 [nouveau]
[   25.396354] [c00000000a417470] [c00800000db16ed4] nvkm_subdev_init+0x11c/0x320 [nouveau]
[   25.397965] [c00000000a4174f0] [c00800000db0f6ac] nvkm_engine_ref.part.0+0x84/0xd0 [nouveau]
[   25.399578] [c00000000a417530] [c00800000db11ef4] nvkm_ioctl_new+0x1cc/0x3c0 [nouveau]
[   25.401173] [c00000000a417660] [c00800000db123b4] nvkm_ioctl+0x10c/0x370 [nouveau]
[   25.402769] [c00000000a417700] [c00800000dc26818] nvkm_client_ioctl+0x20/0x40 [nouveau]
[   25.404360] [c00000000a417720] [c00800000db0b07c] nvif_object_ioctl+0x74/0xa0 [nouveau]
[   25.405945] [c00000000a417740] [c00800000db0bb90] nvif_object_init+0xe8/0x1a0 [nouveau]
[   25.407539] [c00000000a4177b0] [c00800000dc3f888] nvc0_fbcon_accel_init+0x70/0xaa0 [nouveau]
[   25.409119] [c00000000a417810] [c00800000dc3b4f4] nouveau_fbcon_create+0x58c/0x5b0 [nouveau]
[   25.410653] [c00000000a417950] [c00800001c317eb0] __drm_fb_helper_initial_config_and_unlock+0x2d8/0x5d0 [drm_kms_helper]
[   25.412209] [c00000000a417a00] [c00800000dc3c1a8] nouveau_fbcon_init+0x210/0x280 [nouveau]
[   25.413708] [c00000000a417a50] [c00800000dc23e88] nouveau_drm_device_init+0x5d0/0x9c0 [nouveau]
[   25.415162] [c00000000a417b60] [c00800000dc24624] nouveau_drm_probe+0x2bc/0x340 [nouveau]
[   25.416494] [c00000000a417bb0] [c0000000006ed36c] local_pci_probe+0x6c/0x140
[   25.417766] [c00000000a417c40] [c00000000013c748] work_for_cpu_fn+0x38/0x60
[   25.418984] [c00000000a417c70] [c0000000001417d0] process_one_work+0x250/0x500
[   25.420210] [c00000000a417d10] [c000000000141cf0] worker_thread+0x270/0x5b0
[   25.421434] [c00000000a417db0] [c00000000014ab7c] kthread+0x1ac/0x1c0
[   25.422655] [c00000000a417e20] [c00000000000bdd4] ret_from_kernel_thread+0x5c/0x68
[   25.423866] Instruction dump:
[   25.425068] e8410018 e9210060 7c641b78 e9290010 e9290010 e8a90050 2fa50000 419e0114
[   25.426303] 3c620000 e863c7a0 4807a2f9 e8410018 <0fe00000> 3ba0fff0 4bfffc14 60000000
[   25.427555] ---[ end trace 11a5d40b65319c36 ]---
[   25.428806] nouveau 0000:01:00.0: gr: init failed, -16
[   25.480400] nouveau 0000:01:00.0: DRM: allocated 3840x2160 fb: 0x200000, bo (____ptrval____)
[   25.746336] nouveau 0000:01:00.0: i2c: aux 0004: begin idle timeout bad00100
[   25.791511] nouveau 0000:01:00.0: fb1: nouveaufb frame buffer device
[   25.871488] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 1
[   26.367628] nouveau 0000:01:00.0: i2c: aux 0004: begin idle timeout bad00100


This is ppc64 running 5.0-rc1 with 4k pages. Maybe it is an IOMMU issue,
such as something not being mapped properly for the GPU.
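
To pull just the suspicious lines out of a saved dmesg capture the next time
the machine is reachable, something like the following Python sketch could be
used. It is only an illustration: the function name `nouveau_problems`, the
keyword list, and the regular expression are assumptions based on the lines
visible in the log above, not anything provided by nouveau itself.

```python
import re

# Matches lines such as "bus: MMIO read of 00000000 FAULT at 409800" from the log.
# The exact wording is taken from this one log; other kernels may print it differently.
MMIO_FAULT = re.compile(
    r"bus: MMIO (?P<op>read|write) of (?P<value>[0-9a-f]+) FAULT at (?P<addr>[0-9a-f]+)"
)

# Assumption: these are simply the strings that look suspicious in this log.
KEYWORDS = ("FAULT", "timeout", "init failed")


def nouveau_problems(log_text, device="0000:01:00.0"):
    """Return (suspicious_lines, mmio_faults) for one nouveau PCI device.

    suspicious_lines: log lines for the device that contain one of KEYWORDS.
    mmio_faults: list of (operation, value, address) tuples with integer fields.
    """
    prefix = "nouveau " + device + ":"
    suspicious = []
    faults = []
    for line in log_text.splitlines():
        if prefix not in line:
            continue  # not a message from the device of interest
        if any(keyword in line for keyword in KEYWORDS):
            suspicious.append(line.strip())
        match = MMIO_FAULT.search(line)
        if match:
            faults.append(
                (match.group("op"),
                 int(match.group("value"), 16),
                 int(match.group("addr"), 16))
            )
    return suspicious, faults
```

Called as `nouveau_problems(open("dmesg.txt").read())`, where `dmesg.txt` is a
hypothetical file holding the log above, this would report the MMIO read fault
at 0x409800 along with the timeout line, the `gr: init failed` line, and the
two i2c aux idle timeouts.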

lspci:

0000:01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device 8c98
        Device tree node: /sys/firmware/devicetree/base/pciex@600c3c0000000/pci@0/vga@0
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 21
        NUMA node: 0
        Region 0: Memory at 600c000000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: [virtual] Memory at 6000000000000 (64-bit, prefetchable) [size=256M]
        Region 3: Memory at 6000010000000 (64-bit, prefetchable) [size=32M]
        Region 5: I/O ports at 0000
        [virtual] Expansion ROM at 600c001000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 1000000000000000  Data: 0000
        Capabilities: [78] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s (downgraded), Width x4 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR+, OBFF Via message
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+, LTR-, OBFF Disabled
                         AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
                         EqualizationPhase2+, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
                        Status: NegoPending- InProgress-
        Capabilities: [250 v1] Latency Tolerance Reporting
                Max snoop latency: 0ns
                Max no snoop latency: 0ns
        Capabilities: [128 v1] Power Budgeting <?>
        Capabilities: [420 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn+ ECRCChkCap- ECRCChkEn+
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900 v1] Secondary PCI Express <?>
        Kernel driver in use: nouveau
        Kernel modules: nouveau

0000:01:00.1 Audio device: NVIDIA Corporation GP108 High Definition Audio Controller (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device 8c98
        Device tree node: /sys/firmware/devicetree/base/pciex@600c3c0000000/pci@0/multimedia-device@0,1
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin B routed to IRQ 511
        NUMA node: 0
        Region 0: Memory at 600c001080000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [60] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [78] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s (downgraded), Width x4 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR+, OBFF Via message
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+, LTR-, OBFF Disabled
                         AtomicOpsCtl: ReqEn-
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn+ ECRCChkCap- ECRCChkEn+
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel


I only have intermittent access to that system :(

Cheers,
Jérôme