[OSADL QA 3.18.9-rt4 #1] Radeon driver hangs
On 23.03.2015 07:14, Carsten Emde wrote: > Hi Michel, > > [..] > The most striking problem of kernel 3.18.9-rt4 affects all systems > that > are equipped with Radeon graphics (irrespective whether PCIe cards or > APUs with on-chip graphics). They suffer from a hanging radeon driver. > The block occurs when accelerated graphics load is created by > x11perf or > gltestperf. Sometimes only the graphics are frozen while ssh login > still > is possible, somtimes the entire box is no longer accessible at > all. In > any case, a reboot is needed to recover from this situation. > > Here is a selection of kernel messages: [...] The commits from http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=f957063fee6392bb9365370db6db74dc0b2dce0a to http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=cffefd9bb31cd35ab745d3b49005d10616d25bdc and http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=b6610101718d4ab90d793c482625e98eb1262cad might help for this. >>> >>> Thanks a lot. I have applied these patches to a number of systems: >>> # quilt applied | tail -7 >>> patches/drm-radeon-do-a-posting-read-in-r100_set_irq.patch >>> patches/drm-radeon-do-a-posting-read-in-rs600_set_irq.patch >>> patches/drm-radeon-do-a-posting-read-in-r600_set_irq.patch >>> patches/drm-radeon-do-a-posting-read-in-evergreen_set_irq.patch >>> patches/drm-radeon-do-a-posting-read-in-si_set_irq.patch >>> patches/drm-radeon-do-a-posting-read-in-cik_set_irq.patch >>> patches/drm-radeon-fix-wait-to-actually-occur-after-the-signaling-callback.patch >>> >>> >>> >>> The graphic boards still crash and freeze the screen, but in contrast >>> to the earlier situation the systems remain accessible, and the X >>> Window server can be restarted after the offensive programs are >>> removed. 
>>> The crashes were reliably triggered by
>>> - gltestperf
>>> or
>>> - x11perf -repeat 3 -subs 25 -time 2 -rect10
> This is not entirely correct, since gltestperf does not reliably crash
> the graphics controller. However, "x11perf -repeat 3 -subs 25 -time 2
> -rect10" always does a reliable job to trigger the crash.
>
>>> but the crashes also occur several times per day during normal work
>>> such as browsing the Internet or writing a text document. If you wish
>>> me to provide additional diagnostic information such as running test
>>> programs while the graphic boards are unresponsive, I certainly can do
>>> that.
>>
>> Does it also happen with a kernel built from a current drm-fixes tree?
>> http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-fixes
> No. Apparently, you need full preemption to expose the problem.
>
> The following list contains the results whether the command "x11perf
> -repeat 3 -subs 25 -time 2 -rect10" freezes the Radeon board under test
> (Radeon HD 7970 XFS / R9 280X) or not:
> linux-3.12.33-rt47              no
> linux-3.14.34-rt32              no
> linux-3.14.34-drm-3.16.7-rt32*  no
> linux-3.18.7-rt1                YES
> linux-3.18.9-rt4                YES
> linux-3.18.9-rt5                YES
> linux-3.18.9-drm-3.16.7-rt5**   no
> linux-4.0.0-rc4                 no
> linux-drm-fixes                 no
> *DRM subsystem backported from linux-3.16.7 to linux-3.14.34-rt32.
> **DRM subsystem ported from linux-3.16.7 to linux-3.18.9-rt5.

Can you test a non-rt 3.18.y kernel? There were some intermittent issues
around 3.18 fixed by the patches I referenced above. Maybe I missed some
other fixes, though. Maarten, do you remember any other fixes offhand
that might help?

> More observations:
> If full function tracing is enabled (which makes the system about five
> times slower), the graphics controller no longer freezes. With partial
> function tracing such as "echo *drm* >set_ftrace_filter", the
> controller still freezes. The trace then contains vblank interrupt
> processing only, ioctls are no longer executed.
> > This is the location where the driver hangs: > [25104.509258] INFO: task Xorg.bin:16591 blocked for more than 120 seconds. > [25104.516322] Not tainted 3.18.9-rt5 #2 > [25104.520715] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [25104.528853] Xorg.binD 8171ed90 0 16591 16239 > 0x10400080 > [25104.536102] 8800ba0bb8d8 0002 8800ba0bbfd8 > 0006 > [25104.536103] dc08 880626d0dc08 8800ba0bbfd8 > dc08 > [25104.536104] 88061b2cdcd0 880616d3a940 880035c1 > 880616d3a940 > [25104.559274] Call Trace: > [25104.561844] [] schedule+0x34/0xa0 > [25104.561846] [] schedule_timeout+0x23c/0x2a0 > [25104.561870] [] ? radeon_fence_process+0x16/0x40 > [radeon] > [25104.561879] [] ? > radeon_fence_any_seq_signaled+0x44/0x90 [radeon] > [25104.561887] [] > radeon_fence_wait_seq_timeout.constprop.8+0x327/0x380 [radeon] > [25104.561889] [] ?
[OSADL QA 3.18.9-rt4 #1] Radeon driver hangs
Hi Michel, [..] The most striking problem of kernel 3.18.9-rt4 affects all systems that are equipped with Radeon graphics (irrespective whether PCIe cards or APUs with on-chip graphics). They suffer from a hanging radeon driver. The block occurs when accelerated graphics load is created by x11perf or gltestperf. Sometimes only the graphics are frozen while ssh login still is possible, somtimes the entire box is no longer accessible at all. In any case, a reboot is needed to recover from this situation. Here is a selection of kernel messages: >>> [...] >>> The commits from >>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=f957063fee6392bb9365370db6db74dc0b2dce0a >>> >>> to >>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=cffefd9bb31cd35ab745d3b49005d10616d25bdc >>> >>> and >>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=b6610101718d4ab90d793c482625e98eb1262cad >>> >>> might help for this. >> >> Thanks a lot. I have applied these patches to a number of systems: >> # quilt applied | tail -7 >> patches/drm-radeon-do-a-posting-read-in-r100_set_irq.patch >> patches/drm-radeon-do-a-posting-read-in-rs600_set_irq.patch >> patches/drm-radeon-do-a-posting-read-in-r600_set_irq.patch >> patches/drm-radeon-do-a-posting-read-in-evergreen_set_irq.patch >> patches/drm-radeon-do-a-posting-read-in-si_set_irq.patch >> patches/drm-radeon-do-a-posting-read-in-cik_set_irq.patch >> patches/drm-radeon-fix-wait-to-actually-occur-after-the-signaling-callback.patch >> >> >> The graphic boards still crash and freeze the screen, but in contrast >> to the earlier situation the systems remain accessible, and the X >> Window server can be restarted after the offensive programs are >> removed. The crashes were reliably triggered by >> - gltestperf >>or >> - x11perf -repeat 3 -subs 25 -time 2 -rect10 This is not entirely correct, since gltestperf does not reliably crash the graphics controller. 
However, "x11perf -repeat 3 -subs 25 -time 2 -rect10" always does a
reliable job to trigger the crash.

>> but the crashes also occur several times per day during normal work
>> such as browsing the Internet or writing a text document. If you wish
>> me to provide additional diagnostic information such as running test
>> programs while the graphic boards are unresponsive, I certainly can do
>> that.
>
> Does it also happen with a kernel built from a current drm-fixes tree?
> http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-fixes

No. Apparently, you need full preemption to expose the problem.

The following list contains the results whether the command "x11perf
-repeat 3 -subs 25 -time 2 -rect10" freezes the Radeon board under test
(Radeon HD 7970 XFS / R9 280X) or not:

linux-3.12.33-rt47              no
linux-3.14.34-rt32              no
linux-3.14.34-drm-3.16.7-rt32*  no
linux-3.18.7-rt1                YES
linux-3.18.9-rt4                YES
linux-3.18.9-rt5                YES
linux-3.18.9-drm-3.16.7-rt5**   no
linux-4.0.0-rc4                 no
linux-drm-fixes                 no

*DRM subsystem backported from linux-3.16.7 to linux-3.14.34-rt32.
**DRM subsystem ported from linux-3.16.7 to linux-3.18.9-rt5.

More observations:
If full function tracing is enabled (which makes the system about five
times slower), the graphics controller no longer freezes. With partial
function tracing such as "echo *drm* >set_ftrace_filter", the
controller still freezes. The trace then contains vblank interrupt
processing only, ioctls are no longer executed.

This is the location where the driver hangs:
[25104.509258] INFO: task Xorg.bin:16591 blocked for more than 120 seconds.
[25104.516322] Not tainted 3.18.9-rt5 #2
[25104.520715] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25104.528853] Xorg.binD 8171ed90 0 16591 16239 0x10400080 [25104.536102] 8800ba0bb8d8 0002 8800ba0bbfd8 0006 [25104.536103] dc08 880626d0dc08 8800ba0bbfd8 dc08 [25104.536104] 88061b2cdcd0 880616d3a940 880035c1 880616d3a940 [25104.559274] Call Trace: [25104.561844] [] schedule+0x34/0xa0 [25104.561846] [] schedule_timeout+0x23c/0x2a0 [25104.561870] [] ? radeon_fence_process+0x16/0x40 [radeon] [25104.561879] [] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon] [25104.561887] [] radeon_fence_wait_seq_timeout.constprop.8+0x327/0x380 [radeon] [25104.561889] [] ? __wake_up_sync+0x20/0x20 [25104.561898] [] radeon_fence_wait_any+0x57/0x70 [radeon] [25104.561914] [] radeon_sa_bo_new+0x2af/0x4b0 [radeon] [25104.561916] [] ? debug_smp_processor_id+0x17/0x20 [25104.561918] [] ? __kmalloc+0x8a/0x300 [25104.561932] [] radeon_ib_get+0x37/0xe0 [radeon] [25104.561943] [] radeon_cs_ioctl+0x22e/0x860 [radeon] [25104.561952] [] drm_ioctl+0x197/0x670 [drm] [25104.561954] [] ? debug_smp_processor_id+0x17/0x20 [25104.561956] [] ?
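
The tracing experiment described in this message (full function tracing versus a "*drm*" filter) could be scripted roughly as follows. This is only a sketch: it assumes the ftrace control files are reachable under /sys/kernel/debug/tracing and that it runs as root, and the function name `configure_function_tracer` and its parameters are invented for illustration. The control files `current_tracer`, `set_ftrace_filter` and `tracing_on` are the standard ftrace ones.

```python
from pathlib import Path

# Assumed location of the ftrace control files (debugfs mounted in the usual place).
DEFAULT_TRACING_DIR = Path("/sys/kernel/debug/tracing")


def configure_function_tracer(filter_glob="", tracing_dir=DEFAULT_TRACING_DIR):
    """Enable the ftrace function tracer.

    filter_glob="" clears the filter (full function tracing);
    filter_glob="*drm*" mirrors "echo *drm* >set_ftrace_filter".
    """
    tracing_dir = Path(tracing_dir)
    # Pause tracing while the configuration is changed.
    (tracing_dir / "tracing_on").write_text("0\n")
    # Writing an empty line resets the filter, like "echo > set_ftrace_filter".
    (tracing_dir / "set_ftrace_filter").write_text(filter_glob + "\n")
    (tracing_dir / "current_tracer").write_text("function\n")
    (tracing_dir / "tracing_on").write_text("1\n")
```

Calling `configure_function_tracer()` would correspond to the full-tracing case in which the freeze disappeared, and `configure_function_tracer("*drm*")` to the filtered case in which it persisted.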
[OSADL QA 3.18.9-rt4 #1] Radeon driver hangs
On 16.03.2015 23:52, Carsten Emde wrote: > Hi Michel, > >>> [..] >>> The most striking problem of kernel 3.18.9-rt4 affects all systems that >>> are equipped with Radeon graphics (irrespective whether PCIe cards or >>> APUs with on-chip graphics). They suffer from a hanging radeon driver. >>> The block occurs when accelerated graphics load is created by x11perf or >>> gltestperf. Sometimes only the graphics are frozen while ssh login still >>> is possible, somtimes the entire box is no longer accessible at all. In >>> any case, a reboot is needed to recover from this situation. >>> >>> Here is a selection of kernel messages: >> [...] >> The commits from >> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=f957063fee6392bb9365370db6db74dc0b2dce0a >> >> to >> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=cffefd9bb31cd35ab745d3b49005d10616d25bdc >> >> and >> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=b6610101718d4ab90d793c482625e98eb1262cad >> >> might help for this. > > Thanks a lot. I have applied these patches to a number of systems: > # quilt applied | tail -7 > patches/drm-radeon-do-a-posting-read-in-r100_set_irq.patch > patches/drm-radeon-do-a-posting-read-in-rs600_set_irq.patch > patches/drm-radeon-do-a-posting-read-in-r600_set_irq.patch > patches/drm-radeon-do-a-posting-read-in-evergreen_set_irq.patch > patches/drm-radeon-do-a-posting-read-in-si_set_irq.patch > patches/drm-radeon-do-a-posting-read-in-cik_set_irq.patch > patches/drm-radeon-fix-wait-to-actually-occur-after-the-signaling-callback.patch > > > The graphic boards still crash and freeze the screen, but in contrast > to the earlier situation the systems remain accessible, and the X > Window server can be restarted after the offensive programs are > removed. 
The crashes were reliably triggered by > - gltestperf > or > - x11perf -repeat 3 -subs 25 -time 2 -rect10 > but the crashes also occur several times per day during normal work > such as browsing the Internet or writing a text document. If you wish > me to provide additional diagnostic information such as running test > programs while the graphic boards are unresponsive, I certainly can do > that. Does it also happen with a kernel built from a current drm-fixes tree? http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-fixes I might have missed other needed fixes. > Rack #0/Slot #3 [AMD/ATI] RV730 XT [Radeon HD 4670]: > > [21001.244036] INFO: task kworker/u24:6:267 blocked for more than 120 seconds. > [21001.257773] Not tainted 3.18.9-rt4 #27 > [21001.266284] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [21001.281911] kworker/u24:6 D 88081ed8b340 0 267 2 > 0x1000 > [21001.281937] Workqueue: radeon-crtc radeon_flip_work_func [radeon] > [21001.281940] 880805d2fbe8 0046 88081ed0c700 > > [21001.281941] 9000 c920 8808112fb420 > 880035254e30 > [21001.281943] c280 0100c280 0003 > 880035254e30 > [21001.281945] Call Trace: > [21001.281950] [] schedule+0x34/0xa0 > [21001.281953] [] schedule_timeout+0x22c/0x2d0 > [21001.281962] [] ? radeon_fence_process+0x16/0x40 [radeon] > [21001.281971] [] ? > radeon_fence_any_seq_signaled+0x44/0x90 [radeon] > [21001.281979] [] > radeon_fence_wait_seq_timeout.constprop.8+0x2e7/0x340 [radeon] > [21001.281982] [] ? __wake_up_sync+0x20/0x20 > [21001.281991] [] radeon_fence_wait+0x86/0xc0 [radeon] > [21001.282000] [] radeon_flip_work_func+0x15c/0x190 > [radeon] > [21001.282003] [] process_one_work+0x154/0x450 > [21001.282004] [] worker_thread+0x6b/0x4d0 > [21001.282006] [] ? rescuer_thread+0x290/0x290 > [21001.282007] [] ? rescuer_thread+0x290/0x290 > [21001.282009] [] kthread+0xcd/0xf0 > [21001.282010] [] ? kthread_worker_fn+0x1d0/0x1d0 > [21001.282013] [] ret_from_fork+0x7c/0xb0 > [21001.282014] [] ? 
kthread_worker_fn+0x1d0/0x1d0 > > > Rack #0/Slot #7 [AMD/ATI] Cayman XT [Radeon HD 6970] > > [ 481.091132] INFO: task Xorg:3459 blocked for more than 120 seconds. > [ 481.103594] Not tainted 3.18.9-rt4 #28 > [ 481.112101] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 481.127746] XorgD 88041e68ab40 0 3459 3452 > 0x1044 > [ 481.141882] 880413da38e8 0002 88041e60c460 > 8800c3ea3380 > [ 481.141882] 880413da38d8 8108603f c5a8 > c5c8 > [ 481.141883] 81c19460 8800c3ea3380 000c > 8800c3ea3380 > [ 481.186228] Call Trace: > [ 481.191114] [] ? queue_delayed_work_on+0xff/0x110 > [ 481.191118] [] schedule+0x34/0xa0 > [ 481.191119] [] schedule_timeout+0x204/0x270 > [ 481.191148] [] ? radeon_fence_process+0x16/0x40 [radeon] > [ 481.191157] [] ? > radeon_fence_any_seq_signaled+0x44/0x90 [radeon] > [ 481.191165] [] >
[OSADL QA 3.18.9-rt4 #1] Radeon driver hangs
Hi Michel, >> [..] >> The most striking problem of kernel 3.18.9-rt4 affects all systems that >> are equipped with Radeon graphics (irrespective whether PCIe cards or >> APUs with on-chip graphics). They suffer from a hanging radeon driver. >> The block occurs when accelerated graphics load is created by x11perf or >> gltestperf. Sometimes only the graphics are frozen while ssh login still >> is possible, somtimes the entire box is no longer accessible at all. In >> any case, a reboot is needed to recover from this situation. >> >> Here is a selection of kernel messages: > [...] > The commits from > http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=f957063fee6392bb9365370db6db74dc0b2dce0a > to > http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=cffefd9bb31cd35ab745d3b49005d10616d25bdc > and > http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=b6610101718d4ab90d793c482625e98eb1262cad > might help for this. Thanks a lot. I have applied these patches to a number of systems: # quilt applied | tail -7 patches/drm-radeon-do-a-posting-read-in-r100_set_irq.patch patches/drm-radeon-do-a-posting-read-in-rs600_set_irq.patch patches/drm-radeon-do-a-posting-read-in-r600_set_irq.patch patches/drm-radeon-do-a-posting-read-in-evergreen_set_irq.patch patches/drm-radeon-do-a-posting-read-in-si_set_irq.patch patches/drm-radeon-do-a-posting-read-in-cik_set_irq.patch patches/drm-radeon-fix-wait-to-actually-occur-after-the-signaling-callback.patch The graphic boards still crash and freeze the screen, but in contrast to the earlier situation the systems remain accessible, and the X Window server can be restarted after the offensive programs are removed. The crashes were reliably triggered by - gltestperf or - x11perf -repeat 3 -subs 25 -time 2 -rect10 but the crashes also occur several times per day during normal work such as browsing the Internet or writing a text document. 
If you wish me to provide additional diagnostic information such as running test programs while the graphic boards are unresponsive, I certainly can do that. Below are the related kernel messages. Thanks, -Carsten. Rack #0/Slot #3 [AMD/ATI] RV730 XT [Radeon HD 4670]: [21001.244036] INFO: task kworker/u24:6:267 blocked for more than 120 seconds. [21001.257773] Not tainted 3.18.9-rt4 #27 [21001.266284] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [21001.281911] kworker/u24:6 D 88081ed8b340 0 267 2 0x1000 [21001.281937] Workqueue: radeon-crtc radeon_flip_work_func [radeon] [21001.281940] 880805d2fbe8 0046 88081ed0c700 [21001.281941] 9000 c920 8808112fb420 880035254e30 [21001.281943] c280 0100c280 0003 880035254e30 [21001.281945] Call Trace: [21001.281950] [] schedule+0x34/0xa0 [21001.281953] [] schedule_timeout+0x22c/0x2d0 [21001.281962] [] ? radeon_fence_process+0x16/0x40 [radeon] [21001.281971] [] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon] [21001.281979] [] radeon_fence_wait_seq_timeout.constprop.8+0x2e7/0x340 [radeon] [21001.281982] [] ? __wake_up_sync+0x20/0x20 [21001.281991] [] radeon_fence_wait+0x86/0xc0 [radeon] [21001.282000] [] radeon_flip_work_func+0x15c/0x190 [radeon] [21001.282003] [] process_one_work+0x154/0x450 [21001.282004] [] worker_thread+0x6b/0x4d0 [21001.282006] [] ? rescuer_thread+0x290/0x290 [21001.282007] [] ? rescuer_thread+0x290/0x290 [21001.282009] [] kthread+0xcd/0xf0 [21001.282010] [] ? kthread_worker_fn+0x1d0/0x1d0 [21001.282013] [] ret_from_fork+0x7c/0xb0 [21001.282014] [] ? kthread_worker_fn+0x1d0/0x1d0 Rack #0/Slot #7 [AMD/ATI] Cayman XT [Radeon HD 6970] [ 481.091132] INFO: task Xorg:3459 blocked for more than 120 seconds. [ 481.103594] Not tainted 3.18.9-rt4 #28 [ 481.112101] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 481.127746] XorgD 88041e68ab40 0 3459 3452 0x1044 [ 481.141882] 880413da38e8 0002 88041e60c460 8800c3ea3380 [ 481.141882] 880413da38d8 8108603f c5a8 c5c8 [ 481.141883] 81c19460 8800c3ea3380 000c 8800c3ea3380 [ 481.186228] Call Trace: [ 481.191114] [] ? queue_delayed_work_on+0xff/0x110 [ 481.191118] [] schedule+0x34/0xa0 [ 481.191119] [] schedule_timeout+0x204/0x270 [ 481.191148] [] ? radeon_fence_process+0x16/0x40 [radeon] [ 481.191157] [] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon] [ 481.191165] [] radeon_fence_wait_seq_timeout.constprop.7+0x227/0x330 [radeon] [ 481.191167] [] ? prepare_to_wait_event+0x110/0x110 [ 481.191175] [] radeon_fence_wait_any+0x57/0x70 [radeon] [ 481.191191] [] radeon_sa_bo_new+0x2cf/0x4e0 [radeon] [ 481.191194] [] ? debug_smp_processor_id+0x17/0x20 [ 481.191207] [] radeon_ib_get+0x37/0xf0 [radeon] [ 481.191218] []
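
For repeated test runs it may help to check the kernel log automatically after each trigger attempt. The sketch below is not part of the OSADL test setup; the helper names `trigger_argv` and `find_hung_tasks` are hypothetical. It only relies on the x11perf command line and the "blocked for more than N seconds" message format quoted in this thread.

```python
import re
import shlex

# Command line quoted in the thread as the reliable trigger.
X11PERF_CMD = "x11perf -repeat 3 -subs 25 -time 2 -rect10"

# Hung-task header as printed in the reports above, e.g.
# "INFO: task kworker/u24:6:267 blocked for more than 120 seconds."
# Assumption: the task name contains no whitespace.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<task>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def trigger_argv():
    """Return the trigger command as an argument list for subprocess.run()."""
    return shlex.split(X11PERF_CMD)


def find_hung_tasks(kernel_log):
    """Return (task, pid, seconds) for every hung-task report in kernel_log."""
    return [
        (m["task"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK_RE.finditer(kernel_log)
    ]
```

A test loop could run `subprocess.run(trigger_argv())`, then pass the output of `dmesg` to `find_hung_tasks()` and stop as soon as the returned list is non-empty.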
[OSADL QA 3.18.9-rt4 #1] Radeon driver hangs
On 03/13/2015 03:23 AM, Michel Dänzer wrote:
> The commits from
> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=f957063fee6392bb9365370db6db74dc0b2dce0a
> to
> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=cffefd9bb31cd35ab745d3b49005d10616d25bdc
> and
> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=b6610101718d4ab90d793c482625e98eb1262cad
> might help for this.

Thanks. I can't reproduce this myself but I pulled in the commits you
mentioned and "drm/radeon: only enable kv/kb dpm interrupts once v3" to
avoid a reject. The box runs, glxgears and so on seem to do something,
can't look at the screen :)
All of those commits (and a ton more) are marked stable so I will
probably get them anyway…

Sebastian
[OSADL QA 3.18.9-rt4 #1] Radeon driver hangs
On 13.03.2015 08:23, Carsten Emde wrote:
> (About 30 OSADL QA Farm systems are now running 3.18.9-rt4. BTW: To
> check out what kernels are under test you may sort the kernel list
> (https://www.osadl.org/?id=933) by kernel version
> (https://www.osadl.org/?id=1001) and scroll down the page.)
>
> The most striking problem of kernel 3.18.9-rt4 affects all systems that
> are equipped with Radeon graphics (irrespective of whether PCIe cards or
> APUs with on-chip graphics). They suffer from a hanging radeon driver.
> The block occurs when accelerated graphics load is created by x11perf or
> gltestperf. Sometimes only the graphics are frozen while ssh login still
> is possible, sometimes the entire box is no longer accessible at all. In
> any case, a reboot is needed to recover from this situation.
>
> Here is a selection of kernel messages:
[...]

The commits from
http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=f957063fee6392bb9365370db6db74dc0b2dce0a
to
http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=cffefd9bb31cd35ab745d3b49005d10616d25bdc
and
http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes=b6610101718d4ab90d793c482625e98eb1262cad
might help for this.

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
[OSADL QA 3.18.9-rt4 #1] Radeon driver hangs
(About 30 OSADL QA Farm systems are now running 3.18.9-rt4. BTW: To
check out what kernels are under test you may sort the kernel list
(https://www.osadl.org/?id=933) by kernel version
(https://www.osadl.org/?id=1001) and scroll down the page.)

The most striking problem of kernel 3.18.9-rt4 affects all systems that
are equipped with Radeon graphics (irrespective of whether PCIe cards or
APUs with on-chip graphics). They suffer from a hanging radeon driver.
The block occurs when accelerated graphics load is created by x11perf or
gltestperf. Sometimes only the graphics are frozen while ssh login still
is possible, sometimes the entire box is no longer accessible at all. In
any case, a reboot is needed to recover from this situation.

Here is a selection of kernel messages:

Rack #0/Slot #3 [AMD/ATI] RV730 XT [Radeon HD 4670]:

[16081.272035] INFO: task kworker/u24:4:268 blocked for more than 120 seconds.
[16081.285776] Not tainted 3.18.9-rt4 #26
[16081.294286] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[16081.309901] kworker/u24:4 D 88081ed8b340 0 268 2 0x1000
[16081.309938] Workqueue: radeon-crtc radeon_flip_work_func [radeon]
[16081.309960] 880805ccfbe8 0046 88081ed0c700
[16081.309962] 9000 c920 8808112fb420 880805cc1a10
[16081.309963] 880805ccfbf8 01008108a0da 880805ccfc98 880805cc1a10
[16081.309966] Call Trace:
[16081.309972] [] schedule+0x34/0xa0
[16081.309974] [] schedule_timeout+0x22c/0x2d0
[16081.309984] [] ? radeon_fence_process+0x16/0x40 [radeon]
[16081.309993] [] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
[16081.310001] [] radeon_fence_wait_seq_timeout.constprop.8+0x2e7/0x340 [radeon]
[16081.310004] [] ? __wake_up_sync+0x20/0x20
[16081.310013] [] radeon_fence_wait+0x86/0xc0 [radeon]
[16081.310023] [] radeon_flip_work_func+0x15c/0x190 [radeon]
[16081.310025] [] process_one_work+0x154/0x450
[16081.310026] [] worker_thread+0x6b/0x4d0
[16081.310028] [] ? rescuer_thread+0x290/0x290
[16081.310029] [] kthread+0xcd/0xf0
[16081.310031] [] ? kthread_worker_fn+0x1d0/0x1d0
[16081.310034] [] ret_from_fork+0x7c/0xb0
[16081.310035] [] ? kthread_worker_fn+0x1d0/0x1d0

Rack #0/Slot #7 [AMD/ATI] Cayman XT [Radeon HD 6970]:

INFO: task Xorg:10038 blocked for more than 120 seconds.
Not tainted 3.18.9-rt4 #25
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Xorg D 816b7f88 0 10038 10032 0x1044
8800c5ad78e8 0002 88041e80c460 c5c8
88041e80c5c8 0002 c5a8 c5c8
880417728000 88041401 000c 88041401
Call Trace:
[] schedule+0x34/0xa0
[] schedule_timeout+0x204/0x270
[] ? radeon_fence_process+0x16/0x40 [radeon]
[] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
[] radeon_fence_wait_seq_timeout.constprop.7+0x227/0x330 [radeon]
[] ? prepare_to_wait_event+0x110/0x110
[] radeon_fence_wait_any+0x57/0x70 [radeon]
[] radeon_sa_bo_new+0x2cf/0x4e0 [radeon]
[] ? debug_smp_processor_id+0x17/0x20
[] radeon_ib_get+0x37/0xf0 [radeon]
[] radeon_cs_ioctl+0x22d/0x820 [radeon]
[] drm_ioctl+0x1a4/0x630 [drm]
[] ? debug_smp_processor_id+0x17/0x20
[] ? unpin_current_cpu+0x1a/0x70
[] ? migrate_enable+0xb0/0x1b0
[] radeon_drm_ioctl+0x4b/0x80 [radeon]
[] do_vfs_ioctl+0x2e0/0x4d0
[] ? __fget+0x72/0xa0
[] SyS_ioctl+0x81/0xa0
[] tracesys_phase2+0xd4/0xd9

Rack #4/Slot #1 Chipset: "KAVERI" (ChipID = 0x130c)

[ 600.266245] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 600.281856] Xorg D 0002 0 3821 3812 0x00400080
[ 600.281865] 880223ddf908 0082 c1c0 c328
[ 600.281867] 88023720c328 0002 c308 c328
[ 600.281869] 81c1b480 880036cfcb60 000c 880036cfcb60
[ 600.281873] Call Trace:
[ 600.281882] [] schedule+0x34/0xa0
[ 600.281885] [] schedule_timeout+0x204/0x270
[ 600.281929] [] ? radeon_fence_process+0x16/0x40 [radeon]
[ 600.281949] [] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
[ 600.281968] [] radeon_fence_wait_seq_timeout.constprop.7+0x227/0x330 [radeon]
[ 600.281972] [] ? prepare_to_wait_event+0x110/0x110
[ 600.281992] [] radeon_fence_wait_any+0x57/0x70 [radeon]
[ 600.282023] [] radeon_sa_bo_new+0x2cf/0x4e0 [radeon]
[ 600.282027] [] ? dequeue_task_fair+0x43e/0x650
[ 600.282055] [] radeon_ib_get+0x37/0xf0 [radeon]
[ 600.282078] [] radeon_cs_ioctl+0x22d/0x820 [radeon]
[ 600.282098] [] drm_ioctl+0x1a4/0x630 [drm]
[ 600.282104] [] ? do_futex+0x109/0xb20
[ 600.282106] [] ? put_prev_entity+0x96/0x3f0
[ 600.282122] [] radeon_drm_ioctl+0xe/0x10 [radeon]
[ 600.282125] [] do_vfs_ioctl+0x2e0/0x4d0
[ 600.282128] [] ? __fget+0x72/0xa0
[ 600.282131] [] SyS_ioctl+0x81/0xa0
[ 600.282134] [] ?
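
The three reports above all block in the radeon fence wait path. To compare such traces mechanically, something like the following Python sketch could be used. It is an illustration only: the names `parse_frames` and `reliable_module_frames` are hypothetical, and the pattern assumes the "symbol+0xoffset/0xsize [module]" layout seen in these dumps, where a leading "?" marks an entry the kernel's stack unwinder does not consider reliable.

```python
import re

# One call-trace entry: optional "? " marker, symbol, offset/size,
# and an optional "[module]" suffix for code that lives in a module.
FRAME_RE = re.compile(
    r"(?P<unreliable>\? )?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_frames(trace_text):
    """Return (symbol, module_or_None, is_reliable) for each trace entry."""
    return [
        (m["symbol"], m["module"], m["unreliable"] is None)
        for m in FRAME_RE.finditer(trace_text)
    ]


def reliable_module_frames(trace_text, module="radeon"):
    """Return the reliable frames of one module, innermost first."""
    return [
        symbol
        for symbol, mod, reliable in parse_frames(trace_text)
        if reliable and mod == module
    ]
```

Feeding each report to `reliable_module_frames()` gives the list of radeon functions on the blocked path, which makes it easy to see that the page-flip worker and the Xorg ioctl path end up in the same fence wait.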