[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #10 from Christian König ---
(In reply to Darren Salt from comment #9)
> No crash now.

Good, thanks for testing.

> However, the kernel log is spammed with
> [drm:drm_crtc_helper_set_config] *ERROR* failed to set mode on [CRTC:14]

Well, X is sending total nonsense to the kernel. It's actually good that the logs get spammed, so that somebody notices that something is really going wrong here.

--
You are receiving this mail because:
You are watching the assignee of the bug.
[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #9 from Darren Salt ---
No crash now. However, the kernel log is spammed with

[drm:drm_crtc_helper_set_config] *ERROR* failed to set mode on [CRTC:14]

I did notice what looked like some X spam too, but that's been lost over an X restart; fairly sure that it was "invalid argument" (which would correspond to that -EINVAL). I had to kill it (telling it that it had segfaulted worked nicely) and there's no evidence of that in Xorg.0.log.old.

I'd say that that DRM error needs to be rate-limited and/or X needs to give up, or at least do something different than what it currently does, if it can't set the mode.

(I may be able to get more information about what X is doing if it's needed, but it seemed more important to report these findings first; and that information belongs elsewhere anyway.)
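The rate-limiting suggested in comment #9 can be illustrated with a small userspace model. This is only a sketch of the general interval/burst idea, not the DRM code or any patch from this bug; the names `struct ratelimit` and `ratelimit_allow` are hypothetical. In the kernel itself one would more likely reach for an existing facility such as printk_ratelimited().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of an interval/burst log rate limiter. */
struct ratelimit {
	uint64_t interval_ms;     /* length of one window */
	unsigned burst;           /* messages allowed per window */
	uint64_t window_start_ms; /* start time of the current window */
	unsigned emitted;         /* messages let through in this window */
	unsigned suppressed;      /* messages dropped in this window */
};

/* Returns true if a message arriving at now_ms may be logged. */
bool ratelimit_allow(struct ratelimit *rl, uint64_t now_ms)
{
	/* Open a fresh window once the previous one has expired. */
	if (now_ms - rl->window_start_ms >= rl->interval_ms) {
		rl->window_start_ms = now_ms;
		rl->emitted = 0;
		rl->suppressed = 0;
	}

	if (rl->emitted < rl->burst) {
		rl->emitted++;
		return true;
	}

	/* Over budget: count the drop so it can be reported later. */
	rl->suppressed++;
	return false;
}
```

With `interval_ms = 5000` and `burst = 2`, a caller that fails to set the mode in a tight loop would get two log lines per five seconds, and the `suppressed` counter keeps the information that more failures occurred.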
[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #8 from Christian König ---
Created attachment 134701
  --> https://bugzilla.kernel.org/attachment.cgi?id=134701=edit
Possible fix.

Please try to reproduce the issue with the attached patch applied.

It would still not work correctly (specifying no clock can't work correctly), but it shouldn't crash any more.
[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #7 from Darren Salt ---
Can't do that (radeon module is built in, and no debug symbols - should switch that on), but ksymoops output (below) is clear enough for me to determine that the marked line is where the bug makes itself known:

	if (pll->flags & RADEON_PLL_USE_FRAC_FB_DIV) {
		vco_min *= 10;
		vco_max *= 10;
	}

/* here */ post_div_min = vco_min / target_clock;
	if ((target_clock * post_div_min) < vco_min)
		++post_div_min;
	if (post_div_min < pll->min_post_div)
		post_div_min = pll->min_post_div;

Which means that target_clock is zero, which means that freq is 0 or freq/10 is 0.

>>RIP; 812513d0 <=
>>RAX; 000927c0 <__per_cpu_end+804c0/fedd00>
>>RBX; 880236dc8210
>>RCX; 00124f80 <__per_cpu_end+112c80/fedd00>
>>RSI; 000927c0 <__per_cpu_end+804c0/fedd00>
>>RBP; 8800bd6d2c00
>>R08; 88021b08fb08
>>R09; 88021b08fb00
>>R12; 880236dc8000
>>R14; 4ff6

Trace; 81227da9
Trace; 8123ed95
Trace; 812d6725
Trace; 812d68ed
Trace; 81213ccd
Trace; 812145a5
Trace; 81250650
Trace; 81222183
Trace; 81224843
Trace; 81250dac
Trace; 81219f06
Trace; 81224470
Trace; 81468fcc <_raw_spin_unlock_irqrestore+f/21>
Trace; 81234864
Trace; 810b10a7
Trace; 810b8747 <__fget+64/6c>
Trace; 810b11a4
Trace; 81469ca6

Code; 812513a5 <_RIP>:
Code; 812513a5     0: 09 8b 6b 08 89 6c   or    %ecx,0x6c89086b(%rbx)
Code; 812513ab     6: 24 20               and   $0x20,%al
Code; 812513ad     8: eb 5f               jmp   69 <_RIP+0x69>
Code; 812513af     a: 80 e5 20            and   $0x20,%ch
Code; 812513b2     d: 74 08               je    17 <_RIP+0x17>
Code; 812513b4     f: 8b 73 1c            mov   0x1c(%rbx),%esi
Code; 812513b7    12: 8b 4b 20            mov   0x20(%rbx),%ecx
Code; 812513ba    15: eb 06               jmp   1d <_RIP+0x1d>
Code; 812513bc    17: 8b 73 14            mov   0x14(%rbx),%esi
Code; 812513bf    1a: 8b 4b 18            mov   0x18(%rbx),%ecx
Code; 812513c2    1d: 85 ff               test  %edi,%edi
Code; 812513c4    1f: 74 06               je    27 <_RIP+0x27>
Code; 812513c6    21: 6b f6 0a            imul  $0xa,%esi,%esi
Code; 812513c9    24: 6b c9 0a            imul  $0xa,%ecx,%ecx
Code; 812513cc    27: 31 d2               xor   %edx,%edx
Code; 812513ce    29: 89 f0               mov   %esi,%eax
Code; 812513d0 <= 2b: 41 f7 f5            div   %r13d   <=
Code; 812513d3    2e: 89 c2               mov   %eax,%edx
Code; 812513d5    30: 8d 68 01            lea   0x1(%rax),%ebp
Code; 812513d8    33: 41 0f af d5         imul  %r13d,%edx
Code; 812513dc    37: 39 f2               cmp   %esi,%edx
Code; 812513de    39: 8b 53 30            mov   0x30(%rbx),%edx
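The arithmetic in the snippet quoted in comment #7 amounts to a rounded-up division followed by a clamp. The following standalone sketch restates it so the numbers can be checked by hand; `post_div_min_for` is a hypothetical name, and the function deliberately keeps the unguarded division, so it must only be called with a non-zero `target_clock`. In the dump above, RAX and RSI both hold 0x927c0 (600000), which is consistent with vco_min having been copied from %esi to %eax just before the `div`.

```c
#include <stdint.h>

/*
 * Hypothetical restatement of the quoted snippet: the smallest post divider
 * whose product with target_clock reaches vco_min, that is
 * ceil(vco_min / target_clock), clamped to at least min_post_div.
 *
 * Precondition: target_clock != 0. With target_clock == 0 the division
 * below is the one that faults in the report.
 */
uint32_t post_div_min_for(uint32_t vco_min, uint32_t target_clock,
			  uint32_t min_post_div)
{
	uint32_t post_div_min = vco_min / target_clock;

	/* Integer division rounds down; bump by one if we fell short. */
	if (target_clock * post_div_min < vco_min)
		++post_div_min;

	/* Never go below the hardware's minimum post divider. */
	if (post_div_min < min_post_div)
		post_div_min = min_post_div;

	return post_div_min;
}
```

For example, with vco_min = 600000 and an illustrative target_clock of 14850, the truncated quotient is 40, 40 * 14850 = 594000 is still below 600000, and the result is therefore 41.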
[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #6 from Christian König ---
And please provide the output of:

gdb /lib/modules/$(uname -r)/kernel/drivers/gpu/drm/radeon/radeon.ko -ex 'list *(radeon_compute_pll_avivo+0xb6)' -ex q

Thanks in advance,
Christian.
[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #5 from Christian König ---
(In reply to Darren Salt from comment #4)
> It appears that you do actually need mismatched builds of xserver git and
> xf86-video-ati git, the latter built against (ideally) 1.15.99.902, to
> trigger this kernel bug... I wonder why it's gone apparently unnoticed until
> now.

I've reworked that function for 3.15. It's likely that with a mismatched DDX and X you pass some parameters as zero into the kernel and so trigger a divide-by-zero error.

Already digging into this, just give me a day or two to catch up with the bugs.
[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #4 from Darren Salt ---
It appears that you do actually need mismatched builds of xserver git and xf86-video-ati git, the latter built against (ideally) 1.15.99.902, to trigger this kernel bug... I wonder why it's gone apparently unnoticed until now.
[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #3 from Darren Salt ---
I didn't rebuild the DDX. (May be worth doing just to see if the kernel bug is still triggered.)
[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)
https://bugzilla.kernel.org/show_bug.cgi?id=75211

Christian König changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |deathsimple at vodafone.de

--- Comment #2 from Christian König ---
Apart from the X problem, the kernel shouldn't run into a divide-by-zero error; it should return -EINVAL instead when some of the parameters are invalid.

Going to take a look in that direction as well.
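The behaviour asked for in comment #2, rejecting bad parameters with -EINVAL before any division happens, might look roughly like the sketch below. It is not the attached patch and not the driver's code: `checked_target_clock` is a hypothetical helper, and the rule that the target clock is `freq` with fractional feedback dividers and `freq / 10` otherwise is an assumption inferred from the remark in this thread that "freq is 0 or freq/10 is 0".

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive the target clock from freq and refuse
 * requests that would leave it at zero, so that a later
 * vco_min / target_clock can never divide by zero.
 *
 * Assumption (not confirmed by the thread): target clock is freq when
 * fractional feedback dividers are used, and freq / 10 otherwise.
 *
 * Returns 0 and stores the clock on success, -EINVAL on a zero clock.
 */
int checked_target_clock(uint32_t freq, bool use_frac_fb_div,
			 uint32_t *target_clock)
{
	uint32_t clock = use_frac_fb_div ? freq : freq / 10;

	if (clock == 0)
		return -EINVAL; /* caller passed no usable clock */

	*target_clock = clock;
	return 0;
}
```

Note that a small non-zero `freq` such as 9 is rejected as well when the divide-by-ten path is taken, which matches the second case in that remark.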
[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #1 from Michel Dänzer ---
(In reply to Darren Salt from comment #0)
> I'm seeing a divide error during X startup, causing X to hang (requiring
> reboot to clear). The trigger is a commit in xserver git:
>
> commit 4c3932620c29c91dfbbc8eb09c84efcaa7ec873e
> Author: Keith Packard
> Date: Fri Apr 25 08:22:15 2014 -0700
>
> hw/xfree86: Restore API compatibility for cursor loading functions

That commit broke the Xorg video driver ABI. Did you rebuild xf86-video-ati Git[0] when crossing this xserver Git commit?