On 01/18/2016 11:49 AM, Vlastimil Babka wrote:
> On 01/16/2016 05:24 AM, Mario Kleiner wrote:
>>
>>
>> On 01/15/2016 01:26 PM, Ville Syrjälä wrote:
>>> On Fri, Jan 15, 2016 at 11:34:08AM +0100, Vlastimil Babka wrote:
>>
>> I'm currently running...
>>
>> while xinit /usr/bin/ksplashqml --test -- :1 ; do echo yay; done
>>
>> ... in an endless loop on Linux 4.4 SMP PREEMPT on HD-5770  and so far i
>> can't trigger a hang after hundreds of runs.
>>
>> Does this also hang for you?
>
> No, test mode seems to be fine.
>
>> I think a drm.debug=0x21 setting and grep'ping the syslog for "vblank"
>> should probably give useful info around the time of the hang.
>
> Attached. Captured by having kdm running, switching to console, running
> "dmesg -C ; dmesg -w > /tmp/dmesg", switch to kdm, enter password, see
> frozen splashscreen, switch back, terminate dmesg. So somewhere around
> the middle there should be where ksplashscreen starts...
>
>> Maybe also check XOrg.0.log for (WW) warnings related to flip.
>
> No such warnings there.
>
>> thanks,
>> -mario
>>
>>
>>>> Thanks,
>>>> Vlastimil
>>>
>

Thanks. So the problem is that AMDs hardware frame counters reset to 
zero during a modeset. The old DRM code dealt with drivers doing that by 
keeping vblank irqs enabled during modesets and incrementing vblank 
count by one during each vblank irq, i think that's what 
drm_vblank_pre_modeset() and drm_vblank_post_modeset() were meant for.

The new code in drm_update_vblank_count() breaks this. The reset of the 
counter to zero is treated as counter wraparound, so our software vblank 
counter jumps forward by up to 2^24 counts in response (in case of AMD's 
24 bit hw counters), and then the vblank event handling code in 
drm_handle_vblank_events() and other places detects the counter being 
more than 2^23 counts ahead of queued vblank events and as part of its 
own wraparound handling for the 32-Bit software counter doesn't deliver 
these queued events for a long time -> no vblank swap trigger event -> 
no swap -> client hangs waiting for swap completion.

I think i remember seeing the ksplash progress screen occasionally 
blanking half way through login, i guess that's when kwin triggers a 
modeset in parallel to ksplash doing its OpenGL animations. So depending 
on the hw vblank count at the time of login ksplash would or wouldn't 
hang, apparently i got "lucky" with my counts at login.

-mario

Reply via email to