On 01/18/2016 11:49 AM, Vlastimil Babka wrote:
> On 01/16/2016 05:24 AM, Mario Kleiner wrote:
>>
>> On 01/15/2016 01:26 PM, Ville Syrjälä wrote:
>>> On Fri, Jan 15, 2016 at 11:34:08AM +0100, Vlastimil Babka wrote:
>>
>> I'm currently running...
>>
>> while xinit /usr/bin/ksplashqml --test -- :1 ; do echo yay; done
>>
>> ... in an endless loop on Linux 4.4 SMP PREEMPT on HD-5770 and so far i
>> can't trigger a hang after hundreds of runs.
>>
>> Does this also hang for you?
>
> No, test mode seems to be fine.
>
>> I think a drm.debug=0x21 setting and grep'ping the syslog for "vblank"
>> should probably give useful info around the time of the hang.
>
> Attached. Captured by having kdm running, switching to console, running
> "dmesg -C ; dmesg -w > /tmp/dmesg", switch to kdm, enter password, see
> frozen splashscreen, switch back, terminate dmesg. So somewhere around
> the middle there should be where ksplashscreen starts...
>
>> Maybe also check XOrg.0.log for (WW) warnings related to flip.
>
> No such warnings there.
>
>> thanks,
>> -mario
>>
>>>> Thanks,
>>>> Vlastimil
>>>
>
Thanks. So the problem is that AMD's hardware frame counters reset to zero during a modeset. The old DRM code dealt with drivers doing that by keeping vblank irqs enabled during modesets and incrementing the vblank count by one on each vblank irq. I think that's what drm_vblank_pre_modeset() and drm_vblank_post_modeset() were meant for.

The new code in drm_update_vblank_count() breaks this. The reset of the counter to zero is treated as a counter wraparound, so our software vblank counter jumps forward by up to 2^24 counts in response (in the case of AMD's 24-bit hw counters). Then the vblank event handling code in drm_handle_vblank_events() and other places detects that the counter is more than 2^23 counts ahead of the queued vblank events and, as part of its own wraparound handling for the 32-bit software counter, doesn't deliver these queued events for a long time -> no vblank swap trigger event -> no swap -> client hangs waiting for swap completion.

I think I remember seeing the ksplash progress screen occasionally blanking halfway through login. I guess that's when kwin triggers a modeset in parallel to ksplash doing its OpenGL animations. So depending on the hw vblank count at the time of login, ksplash would or wouldn't hang. Apparently I got "lucky" with my counts at login.

-mario
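
To illustrate the arithmetic described above, here is a minimal Python model of the two wraparound checks. It is only a sketch and not the kernel code: the function names, the assumption that the software and hardware counts start in sync, and the choice of one elapsed vblank after the reset are all hypothetical. Only the 24-bit hardware counter width, the 32-bit software counter, and the 2^23 threshold come from the description above.

```python
HW_MASK = (1 << 24) - 1      # 24-bit hardware frame counter (AMD case described above)
U32_MASK = (1 << 32) - 1     # 32-bit software vblank counter
SKIP_THRESHOLD = 1 << 23     # counter further ahead of an event than this: not delivered


def update_sw_count(sw_count, last_hw, cur_hw):
    """Advance the software count by the masked hardware delta.

    A hardware counter that was reset to zero looks like a wraparound here,
    so the delta becomes close to 2^24 minus the last hardware count.
    """
    diff = (cur_hw - last_hw) & HW_MASK
    return (sw_count + diff) & U32_MASK


def event_delivered(sw_count, event_seq):
    """Model of the 32-bit wraparound check applied to a queued event."""
    return ((sw_count - event_seq) & U32_MASK) <= SKIP_THRESHOLD


def simulate_modeset(last_hw):
    """Queue an event for the next vblank, then reset the hardware counter.

    Assumption (hypothetical): software and hardware counts start in sync,
    and exactly one vblank elapses after the reset to zero.
    """
    sw_count = last_hw
    event_seq = (sw_count + 1) & U32_MASK
    cur_hw = 1  # counter was reset to 0 by the modeset, then one vblank passed
    sw_count = update_sw_count(sw_count, last_hw, cur_hw)
    return sw_count, event_delivered(sw_count, event_seq)


if __name__ == "__main__":
    for last_hw in (1000, (1 << 24) - 1000):
        sw_count, delivered = simulate_modeset(last_hw)
        print(f"last_hw={last_hw} sw_count={sw_count} delivered={delivered}")
```

In this model a low hardware count before the reset produces a jump of almost 2^24, which puts the software counter more than 2^23 ahead of the queued event, so the event is skipped. A hardware count close to 2^24 produces only a small jump and the event is still delivered, which would match the "lucky" case.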