Reproduced on Fedora (with kernel 4.18.16-300.fc29.x86_64).

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1782716

Title:
  [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Running the 4.17.0-5-generic kernel on a ppc64le machine with a Radeon
  R9 Fury GPU results in a gfx ring timeout, followed by an EEH freeze of
  the PHB the card is attached to.

  
  0033:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Fiji [Radeon R9 FURY / NANO Series] (rev ff)

  [ 2361.958847] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=8777, last emitted seq=8778
  [ 2362.080397] EEH: Frozen PHB#33-PE#0 detected
  [ 2362.080470] EEH: PE location: CPU2 Slot1 (16x), PHB location: N/A
  [ 2362.080568] CPU: 53 PID: 874 Comm: kworker/53:1 Not tainted 4.17.0-5-generic #6-Ubuntu
  [ 2362.080575] Workqueue: events drm_sched_job_timedout [gpu_sched]
  [ 2362.080577] Call Trace:
  [ 2362.080584] [c0000000fb7078f0] [c000000000d275ac] dump_stack+0xb0/0xf4 (unreliable)
  [ 2362.080590] [c0000000fb707930] [c00000000003ba0c] eeh_dev_check_failure+0x5bc/0x5e0
  [ 2362.080593] [c0000000fb7079e0] [c00000000003babc] eeh_check_failure+0x8c/0xd0
  [ 2362.080628] [c0000000fb707a20] [c00800000cfa1b88] amdgpu_mm_rreg+0x280/0x2a0 [amdgpu]
  [ 2362.080676] [c0000000fb707a70] [c00800000d04cf68] gmc_v8_0_check_soft_reset+0x30/0xe0 [amdgpu]
  [ 2362.080711] [c0000000fb707aa0] [c00800000cfa1194] amdgpu_device_ip_check_soft_reset.part.1+0x8c/0x140 [amdgpu]
  [ 2362.080745] [c0000000fb707b30] [c00800000cfa649c] amdgpu_device_gpu_recover+0x854/0xa40 [amdgpu]
  [ 2362.080799] [c0000000fb707c00] [c00800000d0b97a4] amdgpu_job_timedout+0x5c/0x80 [amdgpu]
  [ 2362.080805] [c0000000fb707c70] [c00800000c8f0040] drm_sched_job_timedout+0x38/0x60 [gpu_sched]
  [ 2362.080810] [c0000000fb707c90] [c000000000137928] process_one_work+0x298/0x580
  [ 2362.080813] [c0000000fb707d20] [c000000000137c98] worker_thread+0x88/0x610
  [ 2362.080817] [c0000000fb707dc0] [c000000000140958] kthread+0x1a8/0x1b0
  [ 2362.080822] [c0000000fb707e30] [c00000000000b658] ret_from_kernel_thread+0x5c/0x84
  [ 2362.080827] [drm] IP block:gmc_v8_0 is hung!
  [ 2362.080832] [drm] IP block:tonga_ih is hung!
  [ 2362.080843] [drm] IP block:gfx_v8_0 is hung!
  [ 2362.080845] EEH: Detected PCI bus error on PHB#33-PE#0
  [ 2362.080847] EEH: This PCI device has failed 1 times in the last hour
  [ 2362.080849] EEH: Notify device drivers to shutdown
  [ 2362.080850] [drm] IP block:sdma_v3_0 is hung!
  [ 2362.080856] [drm] IP block:uvd_v6_0 is hung!
  [ 2362.080858] EEH: Collect temporary log
  [ 2362.080866] [drm] IP block:vce_v3_0 is hung!
  [ 2362.080867] [drm] GPU recovery disabled.
  [ 2362.080903] EEH: of node=0033:01:00.1
  [ 2362.080905] EEH: PCI device/vendor: ffffffff
  [ 2362.080907] EEH: PCI cmd/status register: ffffffff
  [ 2362.080908] EEH: PCI-E capabilities and status follow:
  [ 2362.080915] EEH: PCI-E 00: ffffffff ffffffff ffffffff ffffffff 
  [ 2362.080920] EEH: PCI-E 10: ffffffff ffffffff ffffffff ffffffff 
  [ 2362.080921] EEH: PCI-E 20: ffffffff 
  [ 2362.080922] EEH: PCI-E AER capability register set follows:
  [ 2362.080928] EEH: PCI-E AER 00: ffffffff ffffffff ffffffff ffffffff 
  [ 2362.080933] EEH: PCI-E AER 10: ffffffff ffffffff ffffffff ffffffff 
  [ 2362.080938] EEH: PCI-E AER 20: ffffffff ffffffff ffffffff ffffffff 
  [ 2362.080940] EEH: PCI-E AER 30: ffffffff ffffffff 
  [ 2362.080941] EEH: of node=0033:01:00.0
  [ 2362.080943] EEH: PCI device/vendor: ffffffff
  [ 2362.080945] EEH: PCI cmd/status register: ffffffff
  [ 2362.080945] EEH: PCI-E capabilities and status follow:
  [ 2362.080951] EEH: PCI-E 00: ffffffff ffffffff ffffffff ffffffff 
  [ 2362.080956] EEH: PCI-E 10: ffffffff ffffffff ffffffff ffffffff 
  [ 2362.080957] EEH: PCI-E 20: ffffffff 
  [ 2362.080958] EEH: PCI-E AER capability register set follows:
  [ 2362.080964] EEH: PCI-E AER 00: ffffffff ffffffff ffffffff ffffffff 
  [ 2362.080969] EEH: PCI-E AER 10: ffffffff ffffffff ffffffff ffffffff 
  [ 2362.080974] EEH: PCI-E AER 20: ffffffff ffffffff ffffffff ffffffff 
  [ 2362.080975] EEH: PCI-E AER 30: ffffffff ffffffff 
  [ 2362.080977] PHB4 PHB#51 Diag-data (Version: 1)
  [ 2362.080978] brdgCtl:    00000002
  [ 2362.080979] RootSts:    00060020 00402000 c1010008 00100107 00000000
  [ 2362.080980] RootErrSts: 00000000 00000020 00000000
  [ 2362.080981] PhbSts:     0000001c00000000 0000001c00000000
  [ 2362.080982] Lem:        0000000100000000 0000000000000000 0000000100000000
  [ 2362.080983] PhbErr:     000000c000000000 0000008000000000 2148000098000240 a008400000000000
  [ 2362.080984] RegbErr:    0090000000000000 0010000000000000 4800003c00000000 0000000000000200
  [ 2362.080985] PE[000] A/B: 8000000000000000 8000000000000000
  [ 2362.080987] PE[..1fe] A/B: as above
  [ 2362.080988] PE[1ff] A/B: b740002a01000000 8000000000000000
  [ 2362.080988] EEH: Reset with hotplug activity
  [ 2362.579139] iommu: Removing device 0033:01:00.1 from group 3
  [ 2362.579206] pci 0033:01:00.1: Dropping the link to 0033:01:00.0
  [ 2362.579665] [drm] amdgpu: finishing device.
  [ 2363.495059] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, last signaled seq=8052, last emitted seq=8054
  [ 2363.495192] [drm] IP block:gmc_v8_0 is hung!
  [ 2363.495197] [drm] IP block:tonga_ih is hung!
  [ 2363.495208] [drm] IP block:gfx_v8_0 is hung!
  [ 2363.495212] [drm] IP block:sdma_v3_0 is hung!
  [ 2363.495217] [drm] IP block:uvd_v6_0 is hung!
  [ 2363.495225] [drm] IP block:vce_v3_0 is hung!
  [ 2363.495226] [drm] GPU recovery disabled.
  [ 2372.712463] [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:43:crtc-0] hw_done or flip_done timed out

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1782716/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
