[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)

2014-05-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #10 from Christian König  ---
(In reply to Darren Salt from comment #9)
> No crash now.

Good, thanks for testing.

> However, the kernel log is spammed with
>   [drm:drm_crtc_helper_set_config] *ERROR* failed to set mode on [CRTC:14]

Well, X is sending total nonsense to the kernel. It's actually good that the
logs get spammed so somebody notices that something is really going wrong here.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)

2014-05-02 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #9 from Darren Salt  ---
No crash now.

However, the kernel log is spammed with
  [drm:drm_crtc_helper_set_config] *ERROR* failed to set mode on [CRTC:14]

I did notice what looked like some X spam too, but that's been lost over an X
restart; fairly sure that it was "invalid argument" (which would correspond to
that -EINVAL). I had to kill it (telling it that it had segfaulted worked
nicely) and there's no evidence of that in Xorg.0.log.old.

I'd say that that DRM error needs to be rate-limited and/or X needs to give up,
or at least do something different than what it currently does, if it can't set
the mode. (I may be able to get more information about what X is doing if it's
needed, but it seemed more important to report these findings first; and that
information belongs elsewhere anyway.)
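
As an illustration of the rate-limiting idea above, here is a minimal user-space sketch in C. It is not the kernel's own rate-limit machinery (the kernel provides helpers such as printk_ratelimited() for this); the struct, the function name ratelimit_ok and the window/burst parameters are all hypothetical and chosen only for this example. The caller passes in the current time so the logic is easy to follow and test.

```c
#include <stdbool.h>

/* Hypothetical user-space sketch, not the kernel's rate-limit API. */
struct ratelimit {
    long interval;  /* window length, in seconds */
    int  burst;     /* messages allowed per window */
    long begin;     /* start time of the current window */
    int  printed;   /* messages emitted in the current window */
    int  missed;    /* messages suppressed in the current window */
};

/* Returns true if a message may be emitted at time `now` (seconds). */
bool ratelimit_ok(struct ratelimit *rl, long now)
{
    /* Start a fresh window once the old one has expired. */
    if (now - rl->begin >= rl->interval) {
        rl->begin = now;
        rl->printed = 0;
        rl->missed = 0;
    }

    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }

    /* Over budget: suppress, but keep count of what was dropped. */
    rl->missed++;
    return false;
}
```

With interval = 5 and burst = 2, the first two failures in any five-second window would be logged and the rest counted as missed, so the error stays visible without flooding the log.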



[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)

2014-05-02 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #8 from Christian König  ---
Created attachment 134701
  --> https://bugzilla.kernel.org/attachment.cgi?id=134701&action=edit
Possible fix.

Please try to reproduce the issue with the attached patch applied.

It would still not work correctly (specifying no clock can't work correctly),
but it shouldn't crash any more.



[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)

2014-05-01 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #7 from Darren Salt  ---
Can't do that (radeon module is built in, and no debug symbols - should switch
that on), but ksymoops output (below) is clear enough for me to determine that
the marked line is where the bug makes itself known:

            if (pll->flags & RADEON_PLL_USE_FRAC_FB_DIV) {
                    vco_min *= 10;
                    vco_max *= 10;
            }

/* here */  post_div_min = vco_min / target_clock;
            if ((target_clock * post_div_min) < vco_min)
                    ++post_div_min;
            if (post_div_min < pll->min_post_div)
                    post_div_min = pll->min_post_div;

Which means that target_clock is zero, which means that freq is 0 or freq/10 is
0.
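
To make the failure mode concrete, here is a minimal stand-alone sketch in C of the same computation with a guard in front of the division. It is not the radeon driver code and not the attached patch: the function name compute_post_div_min, its parameter list and the choice of -EINVAL as the error value are assumptions made for illustration only.

```c
#include <errno.h>

/*
 * Hypothetical helper, not the actual radeon code: computes the minimum
 * post divider as in the snippet above, but rejects a zero target clock
 * instead of dividing by it.
 */
int compute_post_div_min(unsigned vco_min, unsigned target_clock,
                         unsigned min_post_div, unsigned *out)
{
    unsigned post_div_min;

    /* A zero divisor would raise a divide error, so bail out early. */
    if (target_clock == 0)
        return -EINVAL;

    post_div_min = vco_min / target_clock;
    /* Round up when the division was not exact. */
    if (target_clock * post_div_min < vco_min)
        ++post_div_min;
    /* Clamp to the smallest divider the PLL supports. */
    if (post_div_min < min_post_div)
        post_div_min = min_post_div;

    *out = post_div_min;
    return 0;
}
```

With illustrative values vco_min = 600000 and target_clock = 148500, the sketch yields a divider of 5; with target_clock = 0 it returns -EINVAL rather than faulting.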


>>RIP; 812513d0    <=

>>RAX; 000927c0 <__per_cpu_end+804c0/fedd00>
>>RBX; 880236dc8210 
>>RCX; 00124f80 <__per_cpu_end+112c80/fedd00>
>>RSI; 000927c0 <__per_cpu_end+804c0/fedd00>
>>RBP; 8800bd6d2c00 
>>R08; 88021b08fb08 
>>R09; 88021b08fb00 
>>R12; 880236dc8000 
>>R14; 4ff6 

Trace; 81227da9 
Trace; 8123ed95 
Trace; 812d6725 
Trace; 812d68ed 
Trace; 81213ccd 
Trace; 812145a5 
Trace; 81250650 
Trace; 81222183 
Trace; 81224843 
Trace; 81250dac 
Trace; 81219f06 
Trace; 81224470 
Trace; 81468fcc <_raw_spin_unlock_irqrestore+f/21>
Trace; 81234864 
Trace; 810b10a7 
Trace; 810b8747 <__fget+64/6c>
Trace; 810b11a4 
Trace; 81469ca6 

Code;  812513a5
 <_RIP>:
Code;  812513a5
   0:   09 8b 6b 08 89 6c       or     %ecx,0x6c89086b(%rbx)
Code;  812513ab
   6:   24 20                   and    $0x20,%al
Code;  812513ad
   8:   eb 5f                   jmp    69 <_RIP+0x69>
Code;  812513af
   a:   80 e5 20                and    $0x20,%ch
Code;  812513b2
   d:   74 08                   je     17 <_RIP+0x17>
Code;  812513b4
   f:   8b 73 1c                mov    0x1c(%rbx),%esi
Code;  812513b7
  12:   8b 4b 20                mov    0x20(%rbx),%ecx
Code;  812513ba
  15:   eb 06                   jmp    1d <_RIP+0x1d>
Code;  812513bc
  17:   8b 73 14                mov    0x14(%rbx),%esi
Code;  812513bf
  1a:   8b 4b 18                mov    0x18(%rbx),%ecx
Code;  812513c2
  1d:   85 ff                   test   %edi,%edi
Code;  812513c4
  1f:   74 06                   je     27 <_RIP+0x27>
Code;  812513c6
  21:   6b f6 0a                imul   $0xa,%esi,%esi
Code;  812513c9
  24:   6b c9 0a                imul   $0xa,%ecx,%ecx
Code;  812513cc
  27:   31 d2                   xor    %edx,%edx
Code;  812513ce
  29:   89 f0                   mov    %esi,%eax
Code;  812513d0    <=
  2b:   41 f7 f5                div    %r13d   <=
Code;  812513d3
  2e:   89 c2                   mov    %eax,%edx
Code;  812513d5
  30:   8d 68 01                lea    0x1(%rax),%ebp
Code;  812513d8
  33:   41 0f af d5             imul   %r13d,%edx
Code;  812513dc
  37:   39 f2                   cmp    %esi,%edx
Code;  812513de
  39:   8b 53 30                mov    0x30(%rbx),%edx

[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)

2014-05-01 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #6 from Christian König  ---
And please provide the output of:

gdb /lib/modules/$(uname -r)/kernel/drivers/gpu/drm/radeon/radeon.ko \
    -ex 'list *(radeon_compute_pll_avivo+0xb6)' -ex q

Thanks in advance,
Christian.



[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)

2014-05-01 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #5 from Christian König  ---
(In reply to Darren Salt from comment #4)
> It appears that you do actually need mismatched builds of xserver git and
> xf86-video-ati git, the latter built against (ideally) 1.15.99.902, to
> trigger this kernel bug... I wonder why it's gone apparently unnoticed until
> now.

I've reworked that function for 3.15. It's likely that with mismatched DDX and
X you pass some parameters as zero into the kernel and so trigger a divide by
zero error.

Already digging into this, just give me a day or two to catch up with the
bugs.



[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)

2014-05-01 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #4 from Darren Salt  ---
It appears that you do actually need mismatched builds of xserver git and
xf86-video-ati git, the latter built against (ideally) 1.15.99.902, to trigger
this kernel bug... I wonder why it's gone apparently unnoticed until now.



[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)

2014-05-01 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #3 from Darren Salt  ---
I didn't rebuild the DDX. (May be worth doing just to see if the kernel bug is
still triggered.)



[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)

2014-05-01 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=75211

Christian König  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |deathsimple at vodafone.de

--- Comment #2 from Christian König  ---
Apart from the X problem, the kernel shouldn't run into a divide-by-zero error;
it should return -EINVAL instead when some of the parameters are invalid.

I'm going to look into that as well.



[Bug 75211] Division error in radeon_compute_pll_avivo (X hang)

2014-05-01 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=75211

--- Comment #1 from Michel Dänzer  ---
(In reply to Darren Salt from comment #0)
> I'm seeing a divide error during X startup, causing X to hang (requiring
> reboot to clear). The trigger is a commit in xserver git:
> 
> commit 4c3932620c29c91dfbbc8eb09c84efcaa7ec873e
> Author: Keith Packard 
> Date:   Fri Apr 25 08:22:15 2014 -0700
> 
> hw/xfree86: Restore API compatibility for cursor loading functions

That commit broke the Xorg video driver ABI. Did you rebuild xf86-video-ati
Git[0] when crossing this xserver Git commit?
