Just a bit of follow-up: on the first system where I noticed the error,
which I fixed by swapping in an older Radeon X1300/X1550 card, it seems
I ended up doing in hardware what the fix below (implemented on the
second system where I had the error) did in software. The Xorg.0.log
on the first system shows that it's using exa rather than glamor for
acceleration, and that DRI3 is disabled. Looks like I'll be deploying this
20-ati-radeon.conf file to all my Radeon 3000-based SL7 systems, at
least until I see an ati driver update (which SL7.6 doesn't provide).
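For anyone wanting to make the same check on their own machines, here is a minimal sketch of a log scan. It only prints the lines of the X server log that mention glamor, exa, DRI2 or DRI3 as whole words; it doesn't assume any particular message format from the radeon driver. The log path /var/log/Xorg.0.log and the helper name accel_lines are assumptions for illustration, so adjust them to your setup.

```python
import re
from pathlib import Path

# Keywords taken from the discussion; matched case-insensitively as whole words,
# so "example" does not count as a mention of "exa".
KEYWORDS = ("glamor", "exa", "dri2", "dri3")
PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def accel_lines(log_text):
    """Return the log lines that mention any of the keywords."""
    return [line for line in log_text.splitlines() if PATTERN.search(line)]


if __name__ == "__main__":
    log_path = Path("/var/log/Xorg.0.log")  # assumed location of the server log
    try:
        text = log_path.read_text(errors="replace")
    except OSError as exc:
        print(f"could not read {log_path}: {exc}")
    else:
        for line in accel_lines(text):
            print(line)
```

Reading the matching lines by eye is still needed to tell which acceleration method was actually selected; the script only narrows the log down.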
On 2018-12-03 14:20, Gilles Detillieux wrote:
Thanks, Pat.
Yes, I've done more digging since speculating on the cause of this,
and that confirms that it's something deeper than that. It seems the
problem is the ati driver hasn't kept up with changes to the Xorg 1.20
server. There is an upstream fix to the driver
(https://cgit.freedesktop.org/xorg/driver/xf86-video-ati/commit/?id=3c4c0213c11d623cba7adbc28dde652694f2f758)
that hasn't made its way down through Red Hat yet, but it updates
glamor/gbm handling for the new server version.
Two workarounds seem to have been suggested. One is to remove the ati
driver altogether and let the kernel or Xorg server pick a different
one (https://bugs.archlinux.org/task/50397). This was for Arch Linux.
I don't know if RHEL/SL will handle this seemingly drastic fix, but I
haven't been able to try it yet. A less drastic fix seems to be to
disable glamor and use exa acceleration instead, and drop DRI to level
2
(https://www.linuxquestions.org/questions/slackware-14/xorg-server-1-20-0-starts-with-last-shutdown-screenshot-4175632139/).
I tried this second fix by creating a
/etc/X11/xorg.conf.d/20-ati-radeon.conf file with the following:
Section "Device"
    Identifier "Radeon"
    Driver "radeon"
    # Option "AccelMethod" "glamor"
    Option "AccelMethod" "exa"
    Option "DRI" "2"
EndSection
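If the same file has to go onto several machines, something like the following sketch could generate it and install it. It renders the Device section above from its three settings and writes the file only when the content would change. The function names render_device_section and install are made up for illustration, and writing under /etc/X11/xorg.conf.d would of course need root.

```python
from pathlib import Path


def render_device_section(identifier, driver, options):
    """Build the text of an xorg.conf "Device" section from its settings."""
    lines = [
        'Section "Device"',
        f'    Identifier "{identifier}"',
        f'    Driver "{driver}"',
    ]
    for name, value in options.items():
        lines.append(f'    Option "{name}" "{value}"')
    lines.append("EndSection")
    return "\n".join(lines) + "\n"


def install(conf_dir, text, filename="20-ati-radeon.conf"):
    """Write the file only if its content would change; return True if written."""
    target = Path(conf_dir) / filename
    if target.exists() and target.read_text() == text:
        return False
    target.write_text(text)
    return True


# Same settings as the workaround above: exa acceleration, DRI level 2.
RADEON_CONF = render_device_section(
    "Radeon", "radeon", {"AccelMethod": "exa", "DRI": "2"}
)
```

Calling install("/etc/X11/xorg.conf.d", RADEON_CONF) on each machine would then be a no-op wherever the file is already in place.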
The system rebooted fine and so far the kernel logs are clean, but I'm
not on-campus today (working remotely) so I won't know until I test
further, tomorrow, if the fix worked. Andreas, you may want to give
this a try too and let us know how it goes for you.
Gilles
On 2018-12-03 13:41, Pat Riehecky wrote:
I'd be surprised if the patch[1] made any difference on Radeon
systems. The code there is really only related to udev probing.
There was a large jump in the ati driver from 7.5 to 7.6. My initial
thoughts are in that direction...
Pat
[1]
https://gitlab.freedesktop.org/xorg/xserver/commit/0816e8fca6194dfb4cc94c3a7fcb2c7f2a921386
On 12/3/18 11:29 AM, Gilles Detillieux wrote:
Thanks for the feedback, Andreas. That saves me some testing time.
It looks like the security bug
(https://access.redhat.com/security/cve/cve-2018-14665)
needs physical access to the console to exploit. Fortunately that
shouldn't be a problem in our environment, where the users are more
interested in hacking neurons.
I noticed in the Errata message for the xorg-x11-server update, it
says "The SL Team added a fix for upstream bug 1650634". I'm
wondering if this bug fix
(https://bugzilla.redhat.com/show_bug.cgi?id=1650634)
broke things for the Radeon drivers. Seems a more likely cause than
the CVE fix for an argument handling issue. That fix was to support
the SL 7.6 upgrade on nVidia hardware. I'll have to see if the 7.6
upgrade (now available) makes the problem go away for the radeon
3000. If not, I'll likely downgrade Xorg to 1.19 on our systems till
I hear of a fix for this.
Thanks again,
Gilles
On 2018-12-03 10:52, Andreas Nowack wrote:
I would exclude the kernel as the source of the problem, since
downgrading to Xorg 1.19 removes all the error messages and problems.
(But due to security bugs, Xorg 1.19 is not a real option.)
Best regards,
Andreas
--
Gilles R. Detillieux E-mail: <[email protected]>
Spinal Cord Research Centre WWW:
http://www.scrc.umanitoba.ca/
Dept. of Physiology and Pathophysiology, Faculty of Health Sciences,
Univ. of Manitoba Winnipeg, MB R3E 0J9 (Canada)