[Bug 1787904] Re: [radeon] machine crash after GPU reset
[Expired for linux (Ubuntu) because there has been no activity for 60 days.] ** Changed in: linux (Ubuntu) Status: Incomplete => Expired -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1787904 Title: [radeon] machine crash after GPU reset To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1787904/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1787904] Re: [radeon] machine crash after GPU reset
** Tags added: cscc
[Bug 1787904] Re: [radeon] machine crash after GPU reset
I experienced crashes under kernel 4.15.0-51 after resuming from suspend on a Dell Latitude E6540 running BIOS version A12, with the following graphics hardware:

Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06)
Advanced Micro Devices, Inc. [AMD/ATI] Mars XTX [Radeon HD 8790M]

After upgrading to kernel 5.0.0-16, the crash no longer seems to recur. From searching online, the problem possibly relates to the issue fixed by this patch: https://patchwork.kernel.org/patch/10813959/

Interestingly, I don't seem to see that patch included in kernel 5.0.0-16...

Here are some kernel logs showing the issue:

Jun 13 22:41:54 *** kernel: [38511.640938] acpi device:41: Cannot transition to power state D3hot for parent in (unknown)
Jun 13 22:41:54 *** kernel: [38511.691197] wlp3s0: deauthenticating from 9c:d6:43:ce:eb:7a by local choice (Reason: 3=DEAUTH_LEAVING)
Jun 13 22:41:54 *** kernel: [38511.710803] radeon :01:00.0: Refused to change power state, currently in D3
Jun 13 22:41:54 *** kernel: [38511.792418] radeon :01:00.0: Refused to change power state, currently in D3
Jun 13 22:41:54 *** kernel: [38511.812417] radeon :01:00.0: Refused to change power state, currently in D3
Jun 13 22:41:59 *** kernel: [38516.721783] PM: suspend entry (deep)
[System suspended]
Jun 13 23:07:54 *** kernel: [38516.721784] PM: Syncing filesystems ... done.
Jun 13 23:07:54 *** kernel: [38516.769515] Freezing user space processes ...
Jun 13 23:07:54 *** kernel: [38516.816504] [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
Jun 13 23:07:54 *** kernel: [38516.816525] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing D1B2 (len 62, WS 0, PS 0) @ 0xD1CE
Jun 13 23:07:54 *** kernel: [38516.816534] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing B914 (len 236, WS 4, PS 0) @ 0xB9E1
Jun 13 23:07:54 *** kernel: [38516.816543] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing B872 (len 74, WS 0, PS 8) @ 0xB87A
Jun 13 23:07:54 *** kernel: [38516.817400] [drm:si_dpm_enable [radeon]] *ERROR* si_init_smc_table failed
Jun 13 23:07:54 *** kernel: [38516.817413] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
Jun 13 23:07:54 *** kernel: [38516.817417] [drm] probing gen 2 caps for device 8086:c01 = 261ad03/e
Jun 13 23:07:54 *** kernel: [38516.817417] [drm] enabling PCIE gen 3 link speeds, disable with radeon.pcie_gen2=0
Jun 13 23:07:54 *** kernel: [38517.376157] radeon :01:00.0: Wait for MC idle timedout !
Jun 13 23:07:54 *** kernel: [38517.484936] radeon :01:00.0: Wait for MC idle timedout !
Jun 13 23:07:54 *** kernel: [38517.493371] [drm] PCIE GART of 2048M enabled (table at 0x001D6000).
Jun 13 23:07:54 *** kernel: [38517.493468] radeon :01:00.0: WB enabled
Jun 13 23:07:54 *** kernel: [38517.493470] radeon :01:00.0: fence driver on ring 0 use gpu addr 0x8c00 and cpu addr 0xa0d314c4
Jun 13 23:07:54 *** kernel: [38517.493471] radeon :01:00.0: fence driver on ring 1 use gpu addr 0x8c04 and cpu addr 0x58fe2292
Jun 13 23:07:54 *** kernel: [38517.493472] radeon :01:00.0: fence driver on ring 2 use gpu addr 0x8c08 and cpu addr 0x95a12eba
Jun 13 23:07:54 *** kernel: [38517.493473] radeon :01:00.0: fence driver on ring 3 use gpu addr 0x8c0c and cpu addr 0x965b6556
Jun 13 23:07:54 *** kernel: [38517.493474] radeon :01:00.0: fence driver on ring 4 use gpu addr 0x8c10 and cpu addr 0x1907e776
Jun 13 23:07:54 *** kernel: [38517.493671] radeon :01:00.0: fence driver on ring 5 use gpu addr 0x00075a18 and cpu addr 0xcf9b1760
Jun 13 23:07:54 *** kernel: [38517.602966] radeon :01:00.0: failed VCE resume (-110).
Jun 13 23:07:54 *** kernel: [38518.585588] [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0x)
Jun 13 23:07:54 *** kernel: [38518.585601] [drm:si_resume [radeon]] *ERROR* si startup failed on resume
Jun 13 23:07:54 *** kernel: [38518.586403] [drm:si_dpm_enable [radeon]] *ERROR* si_init_smc_table failed
Jun 13 23:07:54 *** kernel: [38518.586415] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
Jun 13 23:07:54 *** kernel: [38518.617080] (elapsed 1.847 seconds) done.
Jun 13 23:07:54 *** kernel: [38518.617083] OOM killer disabled.
Jun 13 23:07:54 *** kernel: [38518.617083] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Jun 13 23:07:54 *** kernel: [38518.618463] Suspending console(s) (use no_console_suspend to debug)
[...]
Jun 13 23:07:54 *** kernel: [38519.445085] ACPI: EC: interrupt blocked
Jun 13 23:07:54 *** kernel: [38519.466649] ACPI: Preparing to enter system sleep state S3
Jun 13 23:07:54 *** kernel: [38519.470503] ACPI: EC: event blocked
Jun 13 23:07:54 *** kernel: [38519.470504] ACPI: EC: EC stopped
[...]
Jun 13 23:07:54 ***
[Bug 1787904] Re: [radeon] machine crash after GPU reset
[Fri Sep 28 00:06:28 2018] radeon :01:00.0: ring 0 stalled for more than 10248msec
[Fri Sep 28 00:06:28 2018] radeon :01:00.0: GPU lockup (current fence id 0x00404e50 last fence id 0x00404edc on ring 0)
[Fri Sep 28 00:06:28 2018] radeon :01:00.0: ring 3 stalled for more than 10204msec
[Fri Sep 28 00:06:28 2018] radeon :01:00.0: GPU lockup (current fence id 0x00b9d1c3 last fence id 0x00b9d21e on ring 3)
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: Saved 8932 dwords of commands on ring 0.
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: GPU softreset: 0x004C
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: GRBM_STATUS = 0xA0003028
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: GRBM_STATUS_SE0 = 0x0006
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: GRBM_STATUS_SE1 = 0x0006
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: SRBM_STATUS = 0x2AC0
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: SRBM_STATUS2 = 0x
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_008674_CP_STALLED_STAT1 = 0x2000
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_008678_CP_STALLED_STAT2 = 0x
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_00867C_CP_BUSY_STAT = 0x0042
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_008680_CP_STAT = 0x8C028647
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83146
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: GRBM_SOFT_RESET=0xDDFF
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: SRBM_SOFT_RESET=0x00100100
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: GRBM_STATUS = 0x3028
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: GRBM_STATUS_SE0 = 0x0006
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: GRBM_STATUS_SE1 = 0x0006
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: SRBM_STATUS = 0x20C0
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: SRBM_STATUS2 = 0x
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_008674_CP_STALLED_STAT1 = 0x
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_008678_CP_STALLED_STAT2 = 0x
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_00867C_CP_BUSY_STAT = 0x
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_008680_CP_STAT = 0x
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: GPU reset succeeded, trying to resume
[Fri Sep 28 00:06:29 2018] [drm] probing gen 2 caps for device 8086:1901 = 261ad03/e
[Fri Sep 28 00:06:29 2018] [drm] PCIE gen 3 link speeds already enabled
[Fri Sep 28 00:06:29 2018] [drm] PCIE GART of 2048M enabled (table at 0x001D6000).
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: WB enabled
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: fence driver on ring 0 use gpu addr 0x8c00 and cpu addr 0x70214f68
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: fence driver on ring 1 use gpu addr 0x8c04 and cpu addr 0xb5b3fa4b
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: fence driver on ring 2 use gpu addr 0x8c08 and cpu addr 0x552c5fb0
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: fence driver on ring 3 use gpu addr 0x8c0c and cpu addr 0xec35d8ef
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: fence driver on ring 4 use gpu addr 0x8c10 and cpu addr 0x2cca1ee6
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: fence driver on ring 5 use gpu addr 0x00075a18 and cpu addr 0xf6c849cc
[Fri Sep 28 00:06:29 2018] radeon :01:00.0: failed VCE resume (-22).
[Fri Sep 28 00:06:30 2018] [drm] ring test on 0 succeeded in 2 usecs
[Fri Sep 28 00:06:30 2018] [drm] ring test on 1 succeeded in 1 usecs
[Fri Sep 28 00:06:30 2018] [drm] ring test on 2 succeeded in 1 usecs
[Fri Sep 28 00:06:30 2018] [drm] ring test on 3 succeeded in 9 usecs
[Fri Sep 28 00:06:30 2018] [drm] ring test on 4 succeeded in 3 usecs
[Fri Sep 28 00:06:30 2018] [drm] ring test on 5 succeeded in 2 usecs
[Fri Sep 28 00:06:30 2018] [drm] UVD initialized successfully.
[Fri Sep 28 00:06:30 2018] [drm:si_dpm_set_power_state [radeon]] *ERROR* si_set_sw_state failed
[Fri Sep 28 00:06:31 2018] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait timed out.
[Fri Sep 28 00:06:31 2018] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB
[Bug 1787904] Re: [radeon] machine crash after GPU reset
I can reproduce the problem! If I watch YouTube for a short time, the graphics freeze with the following kernel messages:

Sep 5 15:51:36 winters kernel: [92333.375465] DMAR: [INTR-REMAP] Request device [00:00.1] fault index 1e [fault reason 38] Blocked an interrupt request due to source-id verification failure
.
.
.
Sep 5 15:51:46 winters kernel: [92343.376134] radeon :03:00.0: scheduling IB failed (-35).
Sep 5 15:51:46 winters kernel: [92343.376231] [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
[Bug 1787904] Re: [radeon] machine crash after GPU reset
Today's freeze has a new message:

Aug 31 14:30:38 winters kernel: [160881.141603] DMAR: DRHD: handling fault status reg 2
Aug 31 14:30:38 winters kernel: [160881.141614] DMAR: [INTR-REMAP] Request device [00:00.1] fault index 1e [fault reason 38] Blocked an interrupt request due to source-id verification failure
Aug 31 14:30:38 winters kernel: [160881.141620] DMAR: [INTR-REMAP] Request device [00:00.0] fault index 1a [fault reason 38] Blocked an interrupt request due to source-id verification failure
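For anyone triaging similar logs, here is a minimal sketch in Python that pulls the INTR-REMAP fault entries out of kern.log text, so the faulting device and fault index can be compared across freezes. The function name `find_dmar_faults` and the regular expression are illustrative assumptions based only on the message format quoted in this report; they are not part of any tool mentioned here.

```python
import re

# Assumed pattern, derived from the INTR-REMAP lines quoted in this report.
DMAR_FAULT = re.compile(
    r"DMAR: \[INTR-REMAP\] Request device \[(?P<device>[0-9a-fA-F:.]+)\] "
    r"fault index (?P<index>[0-9a-fA-F]+) \[fault reason (?P<reason>\d+)\]"
)


def find_dmar_faults(log_text):
    """Return a list of (device, fault_index, fault_reason) tuples.

    Hypothetical helper: it scans free-form log text, so it also works
    when several log entries have been flattened onto one line.
    """
    return [
        (m.group("device"), m.group("index"), int(m.group("reason")))
        for m in DMAR_FAULT.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "Aug 31 14:30:38 winters kernel: [160881.141614] DMAR: [INTR-REMAP] "
        "Request device [00:00.1] fault index 1e [fault reason 38] "
        "Blocked an interrupt request due to source-id verification failure\n"
    )
    for fault in find_dmar_faults(sample):
        print(fault)
```

Running it over a whole kern.log (read with `open(path).read()`) would list every such fault; whether these faults are actually related to the radeon lockups is not established in this report.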
[Bug 1787904] Re: [radeon] machine crash after GPU reset
I am in sore need of help, as this problem is rendering work very difficult. I'm averaging a freeze a day.
[Bug 1787904] Re: [radeon] machine crash after GPU reset
I was just using the browser for email and had the graphics freeze up on me again. This is getting in the way of work :(...

Aug 28 11:18:00 winters kernel: [337756.570481] DMAR: DRHD: handling fault status reg 2
Aug 28 11:18:00 winters kernel: [337756.570492] DMAR: [INTR-REMAP] Request device [00:00.0] fault index 1a [fault reason 38] Blocked an interrupt request due to source-id verification failure
Aug 28 11:18:10 winters kernel: [337766.640678] radeon :03:00.0: ring 0 stalled for more than 10116msec
Aug 28 11:18:10 winters kernel: [337766.640686] radeon :03:00.0: GPU lockup (current fence id 0x000ce8c8 last fence id 0x000ce8f2 on ring 0)
Aug 28 11:18:11 winters kernel: [337767.152724] radeon :03:00.0: ring 0 stalled for more than 10628msec
Aug 28 11:18:11 winters kernel: [337767.152728] radeon :03:00.0: ring 3 stalled for more than 10240msec
Aug 28 11:18:11 winters kernel: [337767.152734] radeon :03:00.0: GPU lockup (current fence id 0x0060036d last fence id 0x00600536 on ring 3)
Aug 28 11:18:11 winters kernel: [337767.152741] radeon :03:00.0: GPU lockup (current fence id 0x000ce8c8 last fence id 0x000ce8f2 on ring 0)
Aug 28 11:18:11 winters kernel: [337767.664697] radeon :03:00.0: ring 3 stalled for more than 10752msec
Aug 28 11:18:11 winters kernel: [337767.664705] radeon :03:00.0: GPU lockup (current fence id 0x0060036d last fence id 0x00600536 on ring 3)
Aug 28 11:18:11 winters kernel: [337767.664732] radeon :03:00.0: ring 0 stalled for more than 11140msec
Aug 28 11:18:11 winters kernel: [337767.664741] radeon :03:00.0: GPU lockup (current fence id 0x000ce8c8 last fence id 0x000ce8f2 on ring 0)
Aug 28 11:18:12 winters kernel: [337768.176728] radeon :03:00.0: ring 3 stalled for more than 11264msec
Aug 28 11:18:12 winters kernel: [337768.176732] radeon :03:00.0: ring 0 stalled for more than 11652msec
Aug 28 11:18:12 winters kernel: [337768.176738] radeon :03:00.0: GPU lockup (current fence id 0x000ce8c8 last fence id 0x000ce8f4 on ring 0)
Aug 28 11:18:12 winters kernel: [337768.176744] radeon :03:00.0: GPU lockup (current fence id 0x0060036d last fence id 0x00600542 on ring 3)
Aug 28 11:18:12 winters kernel: [337768.688694] radeon :03:00.0: ring 0 stalled for more than 12164msec
Aug 28 11:18:12 winters kernel: [337768.688702] radeon :03:00.0: GPU lockup (current fence id 0x000ce8c8 last fence id 0x000ce8f4 on ring 0)
Aug 28 11:18:12 winters kernel: [337768.688729] radeon :03:00.0: ring 3 stalled for more than 11776msec
Aug 28 11:18:12 winters kernel: [337768.688737] radeon :03:00.0: GPU lockup (current fence id 0x0060036d last fence id 0x00600542 on ring 3)
Aug 28 11:18:13 winters kernel: [337769.200726] radeon :03:00.0: ring 3 stalled for more than 12288msec
Aug 28 11:18:13 winters kernel: [337769.200730] radeon :03:00.0: ring 0 stalled for more than 12676msec
Aug 28 11:18:13 winters kernel: [337769.200736] radeon :03:00.0: GPU lockup (current fence id 0x000ce8c8 last fence id 0x000ce8f4 on ring 0)
Aug 28 11:18:13 winters kernel: [337769.200742] radeon :03:00.0: GPU lockup (current fence id 0x0060036d last fence id 0x00600542 on ring 3)
Aug 28 11:18:13 winters kernel: [337769.712718] radeon :03:00.0: ring 0 stalled for more than 13188msec
Aug 28 11:18:13 winters kernel: [337769.712722] radeon :03:00.0: ring 3 stalled for more than 12800msec
Aug 28 11:18:13 winters kernel: [337769.712727] radeon :03:00.0: GPU lockup (current fence id 0x0060036d last fence id 0x00600542 on ring 3)
Aug 28 11:18:13 winters kernel: [337769.712734] radeon :03:00.0: GPU lockup (current fence id 0x000ce8c8 last fence id 0x000ce8f4 on ring 0)
Aug 28 11:18:14 winters kernel: [337770.224720] radeon :03:00.0: ring 0 stalled for more than 13700msec
Aug 28 11:18:14 winters kernel: [337770.224724] radeon :03:00.0: ring 3 stalled for more than 13312msec
Aug 28 11:18:14 winters kernel: [337770.224730] radeon :03:00.0: GPU lockup (current fence id 0x0060036d last fence id 0x00600542 on ring 3)
Aug 28 11:18:14 winters kernel: [337770.224737] radeon :03:00.0: GPU lockup (current fence id 0x000ce8c8 last fence id 0x000ce8f4 on ring 0)
Aug 28 11:18:14 winters kernel: [337770.736698] radeon :03:00.0: ring 0 stalled for more than 14212msec
Aug 28 11:18:14 winters kernel: [337770.736707] radeon :03:00.0: GPU lockup (current fence id 0x000ce8c8 last fence id 0x000ce8f4 on ring 0)
Aug 28 11:18:14 winters kernel: [337770.736732] radeon :03:00.0: ring 3 stalled for more than 13824msec
Aug 28 11:18:14 winters kernel: [337770.736739] radeon :03:00.0: GPU lockup (current fence id 0x0060036d last fence id
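Since the stall messages repeat roughly every half second with a growing duration, it can help to reduce them to one number per ring. The following is a minimal sketch in Python under that assumption; `longest_stall_per_ring` is a hypothetical helper name, and the pattern is taken only from the "ring N stalled for more than Mmsec" wording in the log above.

```python
import re
from collections import defaultdict

# Assumed pattern, based on the radeon stall messages quoted in this report.
STALL = re.compile(r"ring (?P<ring>\d+) stalled for more than (?P<msec>\d+)msec")


def longest_stall_per_ring(log_text):
    """Map each ring number to the longest stall (in msec) seen in log_text.

    Hypothetical helper for summarising a long run of repeated stall lines.
    """
    longest = defaultdict(int)
    for m in STALL.finditer(log_text):
        ring = int(m.group("ring"))
        longest[ring] = max(longest[ring], int(m.group("msec")))
    return dict(longest)


if __name__ == "__main__":
    sample = (
        "radeon :03:00.0: ring 0 stalled for more than 10116msec\n"
        "radeon :03:00.0: ring 0 stalled for more than 10628msec\n"
        "radeon :03:00.0: ring 3 stalled for more than 10240msec\n"
    )
    for ring, msec in sorted(longest_stall_per_ring(sample).items()):
        print(f"ring {ring}: longest stall {msec} msec")
```

This only summarises what the kernel already logged; it does not say why the rings stalled.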
[Bug 1787904] Re: [radeon] machine crash after GPU reset
Something strange just happened. I clicked on the GNOME "Show Applications" button (which I never do; I was fidgeting while discussing something) and everything but the pointer froze. Here is the kern.log for the event:

Aug 23 15:32:13 winters kernel: [278586.296029] pool[604]: segfault at 28 ip 7f8cc25ac1e7 sp 7f8b97feddf0 error 4
Aug 23 15:32:55 winters kernel: [278628.753949] rfkill: input handler enabled
Aug 23 15:32:56 winters kernel: [278629.916462] radeon_dp_aux_transfer_native: 242 callbacks suppressed
Aug 23 15:32:56 winters kernel: [278629.989692] rfkill: input handler disabled
Aug 23 15:33:29 winters kernel: [278662.760412] rfkill: input handler enabled
Aug 23 15:33:29 winters kernel: [278662.796389] radeon_dp_aux_transfer_native: 32 callbacks suppressed
Aug 23 15:34:25 winters kernel: [278718.716359] radeon_dp_aux_transfer_native: 32 callbacks suppressed
Aug 23 15:34:25 winters kernel: [278718.805826] rfkill: input handler disabled
Aug 23 15:34:47 winters kernel: [278740.173708] rfkill: input handler enabled
Aug 23 15:35:24 winters kernel: [278778.052300] radeon_dp_aux_transfer_native: 32 callbacks suppressed
Re: [Bug 1787904] Re: [radeon] machine crash after GPU reset
It seems to occur intermittently... I'm never sure what the conditions are, but it happens once every few weeks. I can be in the middle of work and the machine will crash. Unfortunately, I can't make it happen at will. Is there something you'd like me to do?

On Wed, Aug 22, 2018 at 6:01 PM Joseph Salisbury <joseph.salisb...@canonical.com> wrote:

> Do you have a way to reproduce this issue, or was it a one time event?
>
> ** Changed in: linux (Ubuntu)
>    Importance: Undecided => Medium
>
> ** Changed in: linux (Ubuntu)
>    Status: Confirmed => Incomplete
>
> --
> You received this bug notification because you are subscribed to the bug report.
> https://bugs.launchpad.net/bugs/1787904
>
> Title:
>   [radeon] machine crash after GPU reset
>
> Status in linux package in Ubuntu:
>   Incomplete
>
> Bug description:
> I showed up to work today (after being away for a couple of weeks) and my machine had crashed.
> I'm not sure why it happened this morning at 7am. My only guess is that a janitor came in and touched the mouse which woke the screen up after a couple of weeks of sleep.
> (I have been using the machine remotely for computations, which had finished yesterday, but the building was closed so I'm guessing this is the first time the screen has come on in a while).
>
> The first unusual entries in my kern.log are:
>
> Aug 20 07:02:43 winters kernel: [2218007.668454] radeon :03:00.0: ring 0 stalled for more than 10248msec
> Aug 20 07:02:43 winters kernel: [2218007.668465] radeon :03:00.0: GPU lockup (current fence id 0x000799d4 last fence id 0x000799d6 on ring 0)
>
> They are plentiful, and are mixed with other messages (see attached log file) such as:
>
> Aug 20 07:03:00 winters kernel: [2218025.438173] radeon :03:00.0: failed VCE resume (-110).
> Aug 20 07:03:01 winters kernel: [2218025.721068] [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
> Aug 20 07:03:01 winters kernel: [2218025.721085] [drm:si_resume [radeon]] *ERROR* si startup failed on resume
> Aug 20 07:03:01 winters kernel: [2218025.722887] WARNING: CPU: 18 PID: 3715 at /build/linux-60XibS/linux-4.15.0/drivers/gpu/drm/radeon/radeon_object.c:84 radeon_ttm_bo_destroy+0xfb/0x100 [radeon]
> .
> .
> .
> Aug 20 07:03:38 winters kernel: [2218062.876285] [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
> Aug 20 07:03:38 winters kernel: [2218062.876298] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing BBC8 (len 237, WS 0, PS 4) @ 0xBBD6
> Aug 20 07:03:38 winters kernel: [2218062.876309] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing B3EE (len 78, WS 12, PS 8) @ 0xB427
>
> The last message before death appears to be:
>
> Aug 20 07:10:02 winters kernel: [2218447.006362] radeon :03:00.0: GPU reset succeeded, trying to resume
>
> ProblemType: Bug
> DistroRelease: Ubuntu 18.04
> Package: xserver-xorg-video-radeon 1:18.0.1-1
> ProcVersionSignature: Ubuntu 4.15.0-32.35-generic 4.15.18
> Uname: Linux 4.15.0-32-generic x86_64
> NonfreeKernelModules: wl
> .tmp.unity_support_test.0:
>
> ApportVersion: 2.20.9-0ubuntu7.2
> Architecture: amd64
> CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
> CompositorRunning: None
> CurrentDesktop: ubuntu:GNOME
> Date: Mon Aug 20 10:18:42 2018
> DistUpgraded: 2018-05-02 12:54:58,714 DEBUG icon theme changed, re-reading
> DistroCodename: bionic
> DistroVariant: ubuntu
> DkmsStatus:
>  bcmwl, 6.30.223.271+bdcom, 4.15.0-29-generic, x86_64: installed
>  bcmwl, 6.30.223.271+bdcom, 4.15.0-30-generic, x86_64: installed
>  bcmwl, 6.30.223.271+bdcom, 4.15.0-32-generic, x86_64: installed
> ExtraDebuggingInterest: Yes
> GraphicsCard:
>  Advanced Micro Devices, Inc. [AMD/ATI] Oland GL [FirePro W2100] [1002:6608] (prog-if 00 [VGA controller])
>    Subsystem: Dell Oland GL [FirePro W2100] [1028:2120]
> InstallationDate: Installed on 2018-01-05 (226 days ago)
> InstallationMedia: Ubuntu 17.04 "Zesty Zapus" - Release amd64 (20170412)
> MachineType: Dell Inc. Precision Tower 7810
> ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-32-generic root=UUID=90d43be7-a2f7-4500-8b88-9bd7a549d96d ro quiet splash vt.handoff=1
> SourcePackage: xserver-xorg-video-ati
> UpgradeStatus: Upgraded to bionic on 2018-05-02 (109 days ago)
> dmi.bios.date: 06/25/2018
> dmi.bios.vendor: Dell Inc.
> dmi.bios.version: A27
> dmi.board.name: 0KJCC5
> dmi.board.vendor: Dell Inc.
> dmi.board.version: A00
> dmi.chassis.type: 7
> dmi.chassis.vendor: Dell Inc.
> dmi.modalias: dmi:bvnDellInc.:bvrA27:bd06/25/2018:svnDellInc.:pnPrecisionTower7810:pvr:rvnDellInc.:rn0KJCC5:rvrA00:cvnDellInc.:ct7:cvr:
> dmi.product.name: Precision Tower 7810
> dmi.sys.vendor: Dell Inc.
> version.compiz:
[Bug 1787904] Re: [radeon] machine crash after GPU reset
Do you have a way to reproduce this issue, or was it a one time event?

** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu)
   Status: Confirmed => Incomplete