> Date: Thu, 6 May 2021 21:59:12 +1000
> From: Jonathan Gray <[email protected]>
> 
> On Wed, May 05, 2021 at 04:47:50PM -0500, Scott Cheloha wrote:
> > Hi,
> > 
> > On a hunch I added additional parameter checks to task_add(9) and
> > task_del(9) and caught intel(4) doing something strange.
> > 
> > The patch is straightforward: check that the taskq pointer tq is not
> > NULL.  In the current code we return early if a flag is set or cleared
> > in the task w, in which case we don't catch bogus taskq inputs, which
> > is why the machine boots fine without the extra checks.
> > 
> > The patch:
> > 
> > Index: kern_task.c
> > ===================================================================
> > RCS file: /cvs/src/sys/kern/kern_task.c,v
> > retrieving revision 1.31
> > diff -u -p -r1.31 kern_task.c
> > --- kern_task.c     1 Aug 2020 08:40:20 -0000       1.31
> > +++ kern_task.c     5 May 2021 21:29:08 -0000
> > @@ -354,6 +354,9 @@ task_add(struct taskq *tq, struct task *
> >  {
> >     int rv = 0;
> >  
> > +   if (tq == NULL)
> > +           panic("%s: NULL taskq", __func__);
> > +
> >     if (ISSET(w->t_flags, TASK_ONQUEUE))
> >             return (0);
> >  
> > @@ -378,6 +381,9 @@ int
> >  task_del(struct taskq *tq, struct task *w)
> >  {
> >     int rv = 0;
> > +
> > +   if (tq == NULL)
> > +           panic("%s: NULL taskq", __func__);
> >  
> >     if (!ISSET(w->t_flags, TASK_ONQUEUE))
> >             return (0);
> > 
> > And here is the panic on my machine.  I had to reconstruct it from
> > OCR, the machine has no serial port, sorry if there are typos.
> 
> boot crash can be helpful for such machines
> 
> > 
> > panic: task_del: NULL taskq
> > Stopped at db_enter+0xa: popq %rbp
> >     TID        PID  UID     PRFLAGS    PFLAGS  CPU  COMMAND
> >  513524  44883    0     Ox14000     0x200    3  update
> >  352928  82402    0     0x14000     0x200    2  cleaner
> >  382195  66035    0     Ox14000     0x200    1  reaper
> > ...
> > db_enter() at db_enter+Oxa
> > panic(ffffffff81db24fb) at panic+0x12f
> > task del(0,ffff8000010633e0) at task_del+Oxa8
> > edp_panel vdd_on(ffff800001063128) at edp_panel_vdd_on+0x6a
> > intel_dp_aux_xfer(ffff800001063128, ffffffff82512a20,4, 
> > ffffffff82512400,2,0) at intel_dp_aux_xfer+0x18b
> > intel_dp_aux_transfer(ffff8000010631e8, ffffffff82512a88) at 
> > intel_dp_aux_transfer+0x183
> > drm_dp_dpcd_access(ffff8000010631e8,9,0,ffff80000106313a, 1) at 
> > drm_dp_dpcd_access+Oxa9
> > drm_dp_dpcd_read(ffff8000010631e8,0,ffff80000106313a, f) at 
> > drm_dp_dpcd_read+0x61
> > intel_dp_read_dpcd(ffff800001063128) at intel_dp_read_dpcd+0x45
> > intel_dp_init_connector(ffff800001063000, ffff800001064000) at 
> > intel_dp_init_connector+0x988
> > intel_ddi_init(ffff800000272000,0) at intel_ddi_init+0x454
> > intel_modeset_init(ffff800000272000) at intel_modeset_init+0x1c9f
> > i915_driver_probe(ffff800000272000, ffffffff82052f98) at 
> > i915_driver_probe+0x7df
> > inteldrm_attachhook(ffff800000272000) at inteldrm_attachhook+0x46
> > end trace frame: Oxffffffff82512700, count: 0
> > 
> > >From the backtrace, I gather the following:
> > 
> > edp_panel_vdd_on() calls clear_delayed_work() which is just a macro
> 
> cancel_delayed_work()
> 
> > that calls task_del().  And for whatever reason the taskq passed to
> > task_del() is NULL.  Maybe there is a missing INIT_DELAYED_WORK() call
> > somewhere prior to this point?
> 
> the call to that is in
> INIT_DELAYED_WORK(&intel_dp->panel_vdd_work, edp_panel_vdd_work);
> but tq doesn't get set until work is scheduled as there are interfaces
> to pick a tq when scheduling work.
> 
> So perhaps you want something like this to catch the cases where work is
> cancelled before it is scheduled.

Yeah, I think that makes sense.

ok kettenis@

> Index: sys/dev/pci/drm/include/linux/workqueue.h
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/drm/include/linux/workqueue.h,v
> retrieving revision 1.4
> diff -u -p -r1.4 workqueue.h
> --- sys/dev/pci/drm/include/linux/workqueue.h 14 Feb 2021 03:42:55 -0000      
> 1.4
> +++ sys/dev/pci/drm/include/linux/workqueue.h 6 May 2021 11:51:14 -0000
> @@ -95,7 +95,8 @@ queue_work(struct workqueue_struct *wq, 
>  static inline void
>  cancel_work_sync(struct work_struct *work)
>  {
> -     task_del(work->tq, &work->task);
> +     if (work->tq != NULL)
> +             task_del(work->tq, &work->task);
>  }
>  
>  #define work_pending(work)   task_pending(&(work)->task)
> @@ -169,6 +170,8 @@ mod_delayed_work(struct workqueue_struct
>  static inline bool
>  cancel_delayed_work(struct delayed_work *dwork)
>  {
> +     if (dwork->tq == NULL)
> +             return false;
>       if (timeout_del(&dwork->to))
>               return true;
>       return task_del(dwork->tq, &dwork->work.task);
> @@ -177,6 +180,8 @@ cancel_delayed_work(struct delayed_work 
>  static inline bool
>  cancel_delayed_work_sync(struct delayed_work *dwork)
>  {
> +     if (dwork->tq == NULL)
> +             return false;
>       if (timeout_del(&dwork->to))
>               return true;
>       return task_del(dwork->tq, &dwork->work.task);
> 
> 

Reply via email to