Re: radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
On Sat, Jan 12, 2019 at 09:50:51PM +0100, Borislav Petkov wrote:
> Hi guys,
>
> my odyssey with the GPU continues. This time it didn't reset itself
> but started spewing a single line about the hardware locking up.
>
> The machine was responsive to sysrq so I was able to write out
> /var/log/messages and reboot.
>
> This is still with 4.20-rc7 but I'm building 5.0-rc1 to see if there's a
> difference.

Well, not really. This time the reset succeeded and the machine is still alive:

[111333.620619] radeon :1d:00.0: ring 0 stalled for more than 10360msec
[111333.620626] radeon :1d:00.0: GPU lockup (current fence id 0x0080f31d last fence id 0x0080f416 on ring 0)
[111334.132277] radeon :1d:00.0: ring 0 stalled for more than 10872msec
[111334.132283] radeon :1d:00.0: GPU lockup (current fence id 0x0080f31d last fence id 0x0080f418 on ring 0)
[111334.199083] radeon :1d:00.0: failed to get a new IB (-35)
[111334.199107] [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib !
[111334.206116] radeon :1d:00.0: Saved 8121 dwords of commands on ring 0.
[111334.206127] radeon :1d:00.0: GPU softreset: 0x0008
[111334.206130] radeon :1d:00.0: R_008010_GRBM_STATUS = 0xA0001030
[111334.206132] radeon :1d:00.0: R_008014_GRBM_STATUS2 = 0x0003
[111334.206135] radeon :1d:00.0: R_000E50_SRBM_STATUS = 0x20C0
[111334.206137] radeon :1d:00.0: R_008674_CP_STALLED_STAT1 = 0x
[111334.206139] radeon :1d:00.0: R_008678_CP_STALLED_STAT2 = 0x
[111334.206141] radeon :1d:00.0: R_00867C_CP_BUSY_STAT = 0x00020182
[111334.206144] radeon :1d:00.0: R_008680_CP_STAT = 0x80028645
[111334.206146] radeon :1d:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[111334.272194] radeon :1d:00.0: R_008020_GRBM_SOFT_RESET=0x4001
[111334.272247] radeon :1d:00.0: SRBM_SOFT_RESET=0x0100
[111334.274336] radeon :1d:00.0: R_008010_GRBM_STATUS = 0xA0003030
[111334.274338] radeon :1d:00.0: R_008014_GRBM_STATUS2 = 0x0003
[111334.274339] radeon :1d:00.0: R_000E50_SRBM_STATUS = 0x200080C0
[111334.274341] radeon :1d:00.0: R_008674_CP_STALLED_STAT1 = 0x
[111334.274342] radeon :1d:00.0: R_008678_CP_STALLED_STAT2 = 0x
[111334.274344] radeon :1d:00.0: R_00867C_CP_BUSY_STAT = 0x
[111334.274345] radeon :1d:00.0: R_008680_CP_STAT = 0x8010
[111334.274347] radeon :1d:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[111334.274354] radeon :1d:00.0: GPU reset succeeded, trying to resume
[111334.290030] [drm] PCIE gen 2 link speeds already enabled
[111334.292121] [drm] PCIE GART of 512M enabled (table at 0x00142000).
[111334.292135] radeon :1d:00.0: WB enabled
[111334.292137] radeon :1d:00.0: fence driver on ring 0 use gpu addr 0x2c00 and cpu addr 0xfb2c042c
[111334.292325] radeon :1d:00.0: fence driver on ring 5 use gpu addr 0x000521d0 and cpu addr 0x14f22c80
[111334.323193] [drm] ring test on 0 succeeded in 0 usecs
[111334.497890] [drm] ring test on 5 succeeded in 1 usecs
[111334.497896] [drm] UVD initialized successfully.
[111334.724316] [drm] ib test on ring 0 succeeded in 0 usecs
[111335.380416] [drm] ib test on ring 5 succeeded

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
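For anyone comparing lockup reports like the ones in this thread, here is a minimal Python sketch that pulls the two fence ids out of each "GPU lockup" line and prints the gap between them. It assumes "current" is the last fence the GPU signaled and "last" is the last fence the driver emitted, so the gap would be the number of fences still outstanding on the ring; that reading is an assumption, not something stated in the thread. The names `LOCKUP_RE` and `pending_fences` are made up for illustration, and the sample lines are copied from the log above.

```python
import re

# Matches the radeon "GPU lockup" message as it appears in the logs in this thread.
LOCKUP_RE = re.compile(
    r"GPU lockup \(current fence id (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)


def pending_fences(log_text):
    """Yield (ring, current, last, last - current) for every lockup line.

    Assumption: "current" is the last signaled fence and "last" is the last
    emitted fence, so the difference approximates the outstanding work.
    """
    for m in LOCKUP_RE.finditer(log_text):
        current = int(m.group(1), 16)
        last = int(m.group(2), 16)
        ring = int(m.group(3))
        yield ring, current, last, last - current


# Two lines copied from the log above.
sample = """\
[111333.620626] radeon :1d:00.0: GPU lockup (current fence id 0x0080f31d last fence id 0x0080f416 on ring 0)
[111334.132283] radeon :1d:00.0: GPU lockup (current fence id 0x0080f31d last fence id 0x0080f418 on ring 0)
"""

for ring, current, last, gap in pending_fences(sample):
    print(f"ring {ring}: current={current:#x} last={last:#x} gap={gap}")
```

A "current" value that stays fixed across lines while "last" keeps creeping up would be consistent with a ring that has stopped making progress while new work is still being queued.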
radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000017a66bf last fence id 0x00000000017a67a1 on ring 0)
Hi guys,

my odyssey with the GPU continues. This time it didn't reset itself
but started spewing a single line about the hardware locking up.

The machine was responsive to sysrq so I was able to write out
/var/log/messages and reboot.

This is still with 4.20-rc7 but I'm building 5.0-rc1 to see if there's a
difference.

Any and all ideas what to do here are greatly appreciated.

Thx.

Jan 12 21:38:16 zn vmunix: [257393.853806] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a0 on ring 0)
Jan 12 21:38:16 zn vmunix: [257394.369780] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a0 on ring 0)
Jan 12 21:38:17 zn vmunix: [257394.877795] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:17 zn vmunix: [257395.389789] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:18 zn vmunix: [257395.901786] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:18 zn vmunix: [257396.413787] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:19 zn vmunix: [257396.925790] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:19 zn vmunix: [257397.437787] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:20 zn vmunix: [257397.949788] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:20 zn vmunix: [257398.461786] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:21 zn vmunix: [257398.973824] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:21 zn vmunix: [257399.485828] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:22 zn vmunix: [257399.997842] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:22 zn vmunix: [257400.509852] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:23 zn vmunix: [257401.021845] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:23 zn vmunix: [257401.533826] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:24 zn vmunix: [257402.045822] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:24 zn vmunix: [257402.557826] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:25 zn vmunix: [257403.069833] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:25 zn vmunix: [257403.581826] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:26 zn vmunix: [257404.093828] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:26 zn vmunix: [257404.605827] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:27 zn vmunix: [257405.117821] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:27 zn vmunix: [257405.629883] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:28 zn vmunix: [257406.141842] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:28 zn vmunix: [257406.653828] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:29 zn vmunix: [257407.165827] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:29 zn vmunix: [257407.677827] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:30 zn vmunix: [257408.189823] radeon :1d:00.0: GPU lockup (current fence id 0x017a66bf last fence id 0x017a67a1 on ring 0)
Jan 12 21:38:30 zn vmunix: [257408.701827] radeon