Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.

2020-05-27 Thread Daniel Vetter
On Tue, May 26, 2020 at 01:11:28PM +0200, Daniel Vetter wrote:
> On Tue, May 26, 2020 at 6:39 AM Tetsuo Handa
>  wrote:
> >
> > On 2020/05/26 13:18, Tetsuo Handa wrote:
> > > due to mode->crtc_clock <= 0. Thus, somehow initializing mode->crtc_clock 
> > > > 0 might be able
> > > to solve this problem.
> >
> > Well, I came to think that vkms_enable_vblank() should return an error to 
> > the caller
> > when drm_calc_timestamping_constants() failed...
> 
> If my memory is right we shouldn't even get there. crtc->mode being
> all zeros sounds like the simulated output isn't on (no surprise,
> syzbot doesn't enable it and fbcon is probably on card0), so something
> higher up should reject this. I'll see whether I can figure out what
> vkms isn't doing right (or whether there's a higher level bug in
> drm_vblank.c code), yesterday my machine died and already evening
> anyway.
> 
> Thanks for digging into this a bit more meanwhile.

Ok good news, I think I have a fix. Bad news it's code used by like 50
drivers, so this will take some time to get reviewed and merged (and I
think almost all these other drivers are buggy too, which is why I decided
to fix the shared code).

Patch below, can you pls confirm this fixes it?

Thanks, Daniel

commit b882894ac0fce412b67db79a5c92f2b599ec5069
Author: Daniel Vetter 
Date:   Tue May 26 14:29:00 2020 +0200

drm/atomic-helper: reset vblank on crtc reset

Only when vblanks are supported ofc.

Some drivers do this already, but most unfortunately missed it. This
opens up bugs after driver load, before the crtc is enabled for the
first time. syzbot spotted this when loading vkms as a secondary
output. Given how many drivers are buggy it's best to solve this once
and for all in shared helper code.

Aside from moving the few existing calls to drm_crtc_vblank_reset into
helpers (i915 doesn't use helpers, so keeps its own) I think the
regression risk is minimal: atomic helpers already rely on drivers
calling drm_crtc_vblank_on/off correctly in their hooks when they
support vblanks. And driver that's failing to handle vblanks after
this is missing those calls already, and vblanks could only work by
accident when enabling a CRTC for the first time right after boot.

FIXME: Fix up drivers and audit them all.

Big thanks to Tetsuo for helping track down what's going wrong here.

Link: 
https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
Reported-by: Tetsuo Handa 
Reported-by: syzbot+0871b14ca2e2fb64f...@syzkaller.appspotmail.com
Cc: sta...@vger.kernel.org
Cc: Tetsuo Handa 
Signed-off-by: Daniel Vetter 

diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c 
b/drivers/gpu/drm/drm_atomic_state_helper.c
index 8fce6a115dfe..843964252239 100644
--- a/drivers/gpu/drm/drm_atomic_state_helper.c
+++ b/drivers/gpu/drm/drm_atomic_state_helper.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -93,6 +94,9 @@ __drm_atomic_helper_crtc_reset(struct drm_crtc *crtc,
if (crtc_state)
__drm_atomic_helper_crtc_state_reset(crtc_state, crtc);
 
+   if (crtc->dev->num_crtcs)
+   drm_crtc_vblank_reset(crtc);
+
crtc->state = crtc_state;
 }
 EXPORT_SYMBOL(__drm_atomic_helper_crtc_reset);
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.

2020-05-26 Thread Daniel Vetter
On Tue, May 26, 2020 at 6:39 AM Tetsuo Handa
 wrote:
>
> On 2020/05/26 13:18, Tetsuo Handa wrote:
> > due to mode->crtc_clock <= 0. Thus, somehow initializing mode->crtc_clock > 
> > 0 might be able
> > to solve this problem.
>
> Well, I came to think that vkms_enable_vblank() should return an error to the 
> caller
> when drm_calc_timestamping_constants() failed...

If my memory is right we shouldn't even get there. crtc->mode being
all zeros sounds like the simulated output isn't on (no surprise,
syzbot doesn't enable it and fbcon is probably on card0), so something
higher up should reject this. I'll see whether I can figure out what
vkms isn't doing right (or whether there's a higher level bug in
drm_vblank.c code), yesterday my machine died and already evening
anyway.

Thanks for digging into this a bit more meanwhile.

Cheers, Daniel

>
> static int vkms_enable_vblank(struct drm_crtc *crtc)
> {
> struct drm_device *dev = crtc->dev;
> unsigned int pipe = drm_crtc_index(crtc);
> struct drm_vblank_crtc *vblank = >vblank[pipe];
> struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
>
> drm_calc_timestamping_constants(crtc, >mode);
>
> hrtimer_init(>vblank_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> out->vblank_hrtimer.function = _vblank_simulate;
> out->period_ns = ktime_set(0, vblank->framedur_ns);
> hrtimer_start(>vblank_hrtimer, out->period_ns, HRTIMER_MODE_REL);
>
> return 0;
> }



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.

2020-05-26 Thread Tetsuo Handa
On 2020/05/26 2:00, Daniel Vetter wrote:
> Forgot to add: I did run this quickly with vkms as secondary output.
> Good fireworks show, but there's an entire army of additional splats
> after the first one. The WARN_ON you're removing is only the
> messenger, the real bug is probably one of the later backtraces. Or at
> least points more clearly at the real bug.
> -Daniel
> 
>>
>> Debug printk() patch shows that hrtimer_forward_now() can return larger than 
>> 1.
>> What is the reason you are expecting hrtimer_forward_now() to always return 
>> 1 ?
>>
>>>
>>> If the vkms driver is loaded, and there's nothing else going on, then what
>>> I expect to happen is that this virtual hw is entirely off. And in that
>>> case, the vblank ioctl should be rejected outright. So there's definitely
>>> something fishy going on to begin with.
>>>
>>> If otoh the virtual hw is somehow on (maybe fbcon gets loaded, no idea),
>>> then the vblank wait shouldn't just blow up like this.

I see. Another debug printk() patch ( 
https://syzkaller.appspot.com/text?tag=Patch=135f254a10 ) found that

  [   86.613767][ T7837] [drm:drm_stub_open] 
  [   86.621592][ T7837] [drm:drm_open] pid = 7837, minor = 1
  [   86.651632][ T7837] [drm:drm_ioctl] pid=7837, dev=0xe201, auth=1, 
DRM_IOCTL_WAIT_VBLANK
  [   86.663391][ T7837] [drm:drm_calc_timestamping_constants] *ERROR* crtc 32: 
Can't calculate constants, dotclock = 0!
  [   86.679151][ T7837] [drm:drm_calc_timestamping_constants] crtc 32: hwmode: 
htotal 0, vtotal 0, vdisplay 0
  [   86.688865][ T7837] [drm:drm_calc_timestamping_constants] crtc 32: clock 0 
kHz framedur 0 linedur 0
  [   86.688899][ T7837] Starting out->vblank_hrtimer : out->period_ns=0 
vblank->framedur_ns=0
  [   86.689348][ T7837] [drm:drm_vblank_enable] enabling vblank on crtc 0, 
ret: 0
  [   86.706883][ T7837] [drm:drm_update_vblank_count] updating vblank count on 
crtc 0: current=0, diff=0, hw=0 hw_last=0
  [   86.706962][C1] now=86714983667 interval=0 ret_overrun=17693375

hrtimer_start(>vblank_hrtimer, 0, HRTIMER_MODE_REL) from 
vkms_enable_vblank() is
causing immediate vkms_vblank_simulate() callback. The reason of 
vblank->framedur_ns == 0
is that drm_calc_timestamping_constants() fails to calculate framedur_ns (which 
is 0 by default)
due to mode->crtc_clock <= 0. Thus, somehow initializing mode->crtc_clock > 0 
might be able
to solve this problem.

By the way, if you have an environment that can reproduce this problem, you can
test with debug printk() patch above applied.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.

2020-05-26 Thread Tetsuo Handa
On 2020/05/26 13:18, Tetsuo Handa wrote:
> due to mode->crtc_clock <= 0. Thus, somehow initializing mode->crtc_clock > 0 
> might be able
> to solve this problem.

Well, I came to think that vkms_enable_vblank() should return an error to the 
caller
when drm_calc_timestamping_constants() failed...

static int vkms_enable_vblank(struct drm_crtc *crtc)
{
struct drm_device *dev = crtc->dev;
unsigned int pipe = drm_crtc_index(crtc);
struct drm_vblank_crtc *vblank = >vblank[pipe];
struct vkms_output *out = drm_crtc_to_vkms_output(crtc);

drm_calc_timestamping_constants(crtc, >mode);

hrtimer_init(>vblank_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
out->vblank_hrtimer.function = _vblank_simulate;
out->period_ns = ktime_set(0, vblank->framedur_ns);
hrtimer_start(>vblank_hrtimer, out->period_ns, HRTIMER_MODE_REL);

return 0;
}
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.

2020-05-26 Thread Tetsuo Handa
On 2020/05/26 0:21, Daniel Vetter wrote:
> On Mon, May 25, 2020 at 11:38:49PM +0900, Tetsuo Handa wrote:
>> Commit 3a0709928b172a41 ("drm/vkms: Add vblank events simulated by
>> hrtimers") introduced ret_overrun variable. And that variable was an
>> unused-but-set-variable until commit 09ef09b4ab95dc40 ("drm/vkms:
>> WARN when hrtimer_forward_now fails") added WARN_ON(ret_overrun != 1).
>>
>> Now, syzbot is hitting this WARN_ON() using a simple reproducer that
>> does open("/dev/dri/card1") followed by ioctl(DRM_IOCTL_WAIT_VBLANK),
>> and a debug printk() patch says that syzbot is getting
>>
>>output->vblank_hrtimer.base->get_time()=93531904774 (which is uptime)
>>output->period_ns=0
>>ret_overrun=216994
>>
>> . I can't understand what "verify the hrtimer_forward_now return" in
>> that commit wants to say. hrtimer_forward_now() must return, and the
>> return value of hrtimer_forward_now() is not a boolean. Why comparing
>> with 1 ? Anyway, this failure is not something that worth crashing the
>> system. Let's remove the ret_overrun variable and WARN_ON() test.
> 
> Uh we're not crashing the system, it's a warning backtrace.

syzbot uses panic_on_warn=1, and this bug is currently the 8th top crasher.

> 
> And we've spent a few months hunting the races here, so just removing that
> check isn't really a good idea. The correct thing to do is figure out why
> we're hitting this. It could be that we're having a missing check
> somewhere, or missing initialization, and that's what syzbot is hitting.
> Removing this check here just papers over the bug.

Here is a reproducer which syzbot is using.

--
#include 
#include 
#include 
#include 
#include 

int main(int argc, char *argv[])
{
union drm_wait_vblank arg;
int fd = open("/dev/dri/card1", O_RDONLY);

arg.request.type = 0;
arg.request.sequence = 0x;
arg.request.signal = 0x21;
ioctl(fd, DRM_IOCTL_WAIT_VBLANK, );
return 0;
}
--

Debug printk() patch shows that hrtimer_forward_now() can return larger than 1.
What is the reason you are expecting hrtimer_forward_now() to always return 1 ?

> 
> If the vkms driver is loaded, and there's nothing else going on, then what
> I expect to happen is that this virtual hw is entirely off. And in that
> case, the vblank ioctl should be rejected outright. So there's definitely
> something fishy going on to begin with.
> 
> If otoh the virtual hw is somehow on (maybe fbcon gets loaded, no idea),
> then the vblank wait shouldn't just blow up like this.
> -Daniel
> 
>>
>> Link: 
>> https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
>> Signed-off-by: Tetsuo Handa 
>> Reported-by: syzbot+0871b14ca2e2fb64f...@syzkaller.appspotmail.com
>> Fixes: 09ef09b4ab95dc40 ("drm/vkms: WARN when hrtimer_forward_now fails")
>> ---
>>  drivers/gpu/drm/vkms/vkms_crtc.c | 5 +
>>  1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c 
>> b/drivers/gpu/drm/vkms/vkms_crtc.c
>> index ac85e17428f8..cc1811ce6092 100644
>> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
>> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
>> @@ -13,12 +13,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct 
>> hrtimer *timer)
>>vblank_hrtimer);
>>  struct drm_crtc *crtc = >crtc;
>>  struct vkms_crtc_state *state;
>> -u64 ret_overrun;
>>  bool ret;
>>  
>> -ret_overrun = hrtimer_forward_now(>vblank_hrtimer,
>> -  output->period_ns);
>> -WARN_ON(ret_overrun != 1);
>> +hrtimer_forward_now(>vblank_hrtimer, output->period_ns);
>>  
>>  spin_lock(>lock);
>>  ret = drm_crtc_handle_vblank(crtc);
>> -- 
>> 2.18.2
>>
> 

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.

2020-05-25 Thread Daniel Vetter
On Mon, May 25, 2020 at 5:34 PM Tetsuo Handa
 wrote:
>
> On 2020/05/26 0:21, Daniel Vetter wrote:
> > On Mon, May 25, 2020 at 11:38:49PM +0900, Tetsuo Handa wrote:
> >> Commit 3a0709928b172a41 ("drm/vkms: Add vblank events simulated by
> >> hrtimers") introduced ret_overrun variable. And that variable was an
> >> unused-but-set-variable until commit 09ef09b4ab95dc40 ("drm/vkms:
> >> WARN when hrtimer_forward_now fails") added WARN_ON(ret_overrun != 1).
> >>
> >> Now, syzbot is hitting this WARN_ON() using a simple reproducer that
> >> does open("/dev/dri/card1") followed by ioctl(DRM_IOCTL_WAIT_VBLANK),
> >> and a debug printk() patch says that syzbot is getting
> >>
> >>output->vblank_hrtimer.base->get_time()=93531904774 (which is uptime)
> >>output->period_ns=0
> >>ret_overrun=216994
> >>
> >> . I can't understand what "verify the hrtimer_forward_now return" in
> >> that commit wants to say. hrtimer_forward_now() must return, and the
> >> return value of hrtimer_forward_now() is not a boolean. Why comparing
> >> with 1 ? Anyway, this failure is not something that worth crashing the
> >> system. Let's remove the ret_overrun variable and WARN_ON() test.
> >
> > Uh we're not crashing the system, it's a warning backtrace.
>
> syzbot uses panic_on_warn=1, and this bug is currently the 8th top crasher.
>
> >
> > And we've spent a few months hunting the races here, so just removing that
> > check isn't really a good idea. The correct thing to do is figure out why
> > we're hitting this. It could be that we're having a missing check
> > somewhere, or missing initialization, and that's what syzbot is hitting.
> > Removing this check here just papers over the bug.
>
> Here is a reproducer which syzbot is using.
>
> --
> #include 
> #include 
> #include 
> #include 
> #include 
>
> int main(int argc, char *argv[])
> {
> union drm_wait_vblank arg;
> int fd = open("/dev/dri/card1", O_RDONLY);
>
> arg.request.type = 0;
> arg.request.sequence = 0x;
> arg.request.signal = 0x21;
> ioctl(fd, DRM_IOCTL_WAIT_VBLANK, );
> return 0;
> }
> --

Forgot to add: I did run this quickly with vkms as secondary output.
Good fireworks show, but there's an entire army of additional splats
after the first one. The WARN_ON you're removing is only the
messenger, the real bug is probably one of the later backtraces. Or at
least points more clearly at the real bug.
-Daniel

>
> Debug printk() patch shows that hrtimer_forward_now() can return larger than 
> 1.
> What is the reason you are expecting hrtimer_forward_now() to always return 1 
> ?
>
> >
> > If the vkms driver is loaded, and there's nothing else going on, then what
> > I expect to happen is that this virtual hw is entirely off. And in that
> > case, the vblank ioctl should be rejected outright. So there's definitely
> > something fishy going on to begin with.
> >
> > If otoh the virtual hw is somehow on (maybe fbcon gets loaded, no idea),
> > then the vblank wait shouldn't just blow up like this.
> > -Daniel
> >
> >>
> >> Link: 
> >> https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
> >> Signed-off-by: Tetsuo Handa 
> >> Reported-by: syzbot+0871b14ca2e2fb64f...@syzkaller.appspotmail.com
> >> Fixes: 09ef09b4ab95dc40 ("drm/vkms: WARN when hrtimer_forward_now fails")
> >> ---
> >>  drivers/gpu/drm/vkms/vkms_crtc.c | 5 +
> >>  1 file changed, 1 insertion(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c 
> >> b/drivers/gpu/drm/vkms/vkms_crtc.c
> >> index ac85e17428f8..cc1811ce6092 100644
> >> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> >> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> >> @@ -13,12 +13,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct 
> >> hrtimer *timer)
> >>vblank_hrtimer);
> >>  struct drm_crtc *crtc = >crtc;
> >>  struct vkms_crtc_state *state;
> >> -u64 ret_overrun;
> >>  bool ret;
> >>
> >> -ret_overrun = hrtimer_forward_now(>vblank_hrtimer,
> >> -  output->period_ns);
> >> -WARN_ON(ret_overrun != 1);
> >> +hrtimer_forward_now(>vblank_hrtimer, output->period_ns);
> >>
> >>  spin_lock(>lock);
> >>  ret = drm_crtc_handle_vblank(crtc);
> >> --
> >> 2.18.2
> >>
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.

2020-05-25 Thread Daniel Vetter
On Mon, May 25, 2020 at 5:34 PM Tetsuo Handa
 wrote:
>
> On 2020/05/26 0:21, Daniel Vetter wrote:
> > On Mon, May 25, 2020 at 11:38:49PM +0900, Tetsuo Handa wrote:
> >> Commit 3a0709928b172a41 ("drm/vkms: Add vblank events simulated by
> >> hrtimers") introduced ret_overrun variable. And that variable was an
> >> unused-but-set-variable until commit 09ef09b4ab95dc40 ("drm/vkms:
> >> WARN when hrtimer_forward_now fails") added WARN_ON(ret_overrun != 1).
> >>
> >> Now, syzbot is hitting this WARN_ON() using a simple reproducer that
> >> does open("/dev/dri/card1") followed by ioctl(DRM_IOCTL_WAIT_VBLANK),
> >> and a debug printk() patch says that syzbot is getting
> >>
> >>output->vblank_hrtimer.base->get_time()=93531904774 (which is uptime)
> >>output->period_ns=0
> >>ret_overrun=216994
> >>
> >> . I can't understand what "verify the hrtimer_forward_now return" in
> >> that commit wants to say. hrtimer_forward_now() must return, and the
> >> return value of hrtimer_forward_now() is not a boolean. Why comparing
> >> with 1 ? Anyway, this failure is not something that worth crashing the
> >> system. Let's remove the ret_overrun variable and WARN_ON() test.
> >
> > Uh we're not crashing the system, it's a warning backtrace.
>
> syzbot uses panic_on_warn=1, and this bug is currently the 8th top crasher.
>
> >
> > And we've spent a few months hunting the races here, so just removing that
> > check isn't really a good idea. The correct thing to do is figure out why
> > we're hitting this. It could be that we're having a missing check
> > somewhere, or missing initialization, and that's what syzbot is hitting.
> > Removing this check here just papers over the bug.
>
> Here is a reproducer which syzbot is using.
>
> --
> #include 
> #include 
> #include 
> #include 
> #include 
>
> int main(int argc, char *argv[])
> {
> union drm_wait_vblank arg;
> int fd = open("/dev/dri/card1", O_RDONLY);
>
> arg.request.type = 0;
> arg.request.sequence = 0x;
> arg.request.signal = 0x21;
> ioctl(fd, DRM_IOCTL_WAIT_VBLANK, );
> return 0;
> }
> --
>
> Debug printk() patch shows that hrtimer_forward_now() can return larger than 
> 1.
> What is the reason you are expecting hrtimer_forward_now() to always return 1 
> ?

The timer simulates display refresh running at 60Hz. If we miss one,
the simulation is not accurate against hardware. Missing a few
thousands either means there's a bug somewhere, or the machine has
missed a few seconds of worth of timers (which is equally bad).
-Daniel

>
> >
> > If the vkms driver is loaded, and there's nothing else going on, then what
> > I expect to happen is that this virtual hw is entirely off. And in that
> > case, the vblank ioctl should be rejected outright. So there's definitely
> > something fishy going on to begin with.
> >
> > If otoh the virtual hw is somehow on (maybe fbcon gets loaded, no idea),
> > then the vblank wait shouldn't just blow up like this.
> > -Daniel
> >
> >>
> >> Link: 
> >> https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
> >> Signed-off-by: Tetsuo Handa 
> >> Reported-by: syzbot+0871b14ca2e2fb64f...@syzkaller.appspotmail.com
> >> Fixes: 09ef09b4ab95dc40 ("drm/vkms: WARN when hrtimer_forward_now fails")
> >> ---
> >>  drivers/gpu/drm/vkms/vkms_crtc.c | 5 +
> >>  1 file changed, 1 insertion(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c 
> >> b/drivers/gpu/drm/vkms/vkms_crtc.c
> >> index ac85e17428f8..cc1811ce6092 100644
> >> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> >> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> >> @@ -13,12 +13,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct 
> >> hrtimer *timer)
> >>vblank_hrtimer);
> >>  struct drm_crtc *crtc = >crtc;
> >>  struct vkms_crtc_state *state;
> >> -u64 ret_overrun;
> >>  bool ret;
> >>
> >> -ret_overrun = hrtimer_forward_now(>vblank_hrtimer,
> >> -  output->period_ns);
> >> -WARN_ON(ret_overrun != 1);
> >> +hrtimer_forward_now(>vblank_hrtimer, output->period_ns);
> >>
> >>  spin_lock(>lock);
> >>  ret = drm_crtc_handle_vblank(crtc);
> >> --
> >> 2.18.2
> >>
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.

2020-05-25 Thread Daniel Vetter
On Mon, May 25, 2020 at 11:38:49PM +0900, Tetsuo Handa wrote:
> Commit 3a0709928b172a41 ("drm/vkms: Add vblank events simulated by
> hrtimers") introduced ret_overrun variable. And that variable was an
> unused-but-set-variable until commit 09ef09b4ab95dc40 ("drm/vkms:
> WARN when hrtimer_forward_now fails") added WARN_ON(ret_overrun != 1).
> 
> Now, syzbot is hitting this WARN_ON() using a simple reproducer that
> does open("/dev/dri/card1") followed by ioctl(DRM_IOCTL_WAIT_VBLANK),
> and a debug printk() patch says that syzbot is getting
> 
>output->vblank_hrtimer.base->get_time()=93531904774 (which is uptime)
>output->period_ns=0
>ret_overrun=216994
> 
> . I can't understand what "verify the hrtimer_forward_now return" in
> that commit wants to say. hrtimer_forward_now() must return, and the
> return value of hrtimer_forward_now() is not a boolean. Why comparing
> with 1 ? Anyway, this failure is not something that worth crashing the
> system. Let's remove the ret_overrun variable and WARN_ON() test.

Uh we're not crashing the system, it's a warning backtrace.

And we've spent a few months hunting the races here, so just removing that
check isn't really a good idea. The correct thing to do is figure out why
we're hitting this. It could be that we're having a missing check
somewhere, or missing initialization, and that's what syzbot is hitting.
Removing this check here just papers over the bug.

If the vkms driver is loaded, and there's nothing else going on, then what
I expect to happen is that this virtual hw is entirely off. And in that
case, the vblank ioctl should be rejected outright. So there's definitely
something fishy going on to begin with.

If otoh the virtual hw is somehow on (maybe fbcon gets loaded, no idea),
then the vblank wait shouldn't just blow up like this.
-Daniel

> 
> Link: 
> https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
> Signed-off-by: Tetsuo Handa 
> Reported-by: syzbot+0871b14ca2e2fb64f...@syzkaller.appspotmail.com
> Fixes: 09ef09b4ab95dc40 ("drm/vkms: WARN when hrtimer_forward_now fails")
> ---
>  drivers/gpu/drm/vkms/vkms_crtc.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c 
> b/drivers/gpu/drm/vkms/vkms_crtc.c
> index ac85e17428f8..cc1811ce6092 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -13,12 +13,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct 
> hrtimer *timer)
> vblank_hrtimer);
>   struct drm_crtc *crtc = >crtc;
>   struct vkms_crtc_state *state;
> - u64 ret_overrun;
>   bool ret;
>  
> - ret_overrun = hrtimer_forward_now(>vblank_hrtimer,
> -   output->period_ns);
> - WARN_ON(ret_overrun != 1);
> + hrtimer_forward_now(>vblank_hrtimer, output->period_ns);
>  
>   spin_lock(>lock);
>   ret = drm_crtc_handle_vblank(crtc);
> -- 
> 2.18.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel