On Mon, 2013-05-13 at 18:00 -0400, Jörn Engel wrote:
> On Mon, 13 May 2013 16:00:03 -0700, Nicholas A. Bellinger wrote:
> > On Mon, 2013-05-13 at 16:30 -0400, Joern Engel wrote:
> > > The second parameter was always 0, leading to effectively dead code.  It
> > > called list_del() and se_cmd->se_tfo->release_cmd(), and had to set a
> > > flag to prevent target_release_cmd_kref() from doing the same.
> > 
> > Look again.  The call to transport_wait_for_tasks() was dead code, but
> > the wait_for_completion(&se_cmd->cmd_wait_comp) most certainly is not.
> 
> See "totally wrong" below.
> 
> > > But most
> > > of all, it iterated the list without taking se_sess->sess_cmd_lock,
> > > leading to races against ABORT and LUN_RESET.
> > 
> > Ugh.  You'll recall that target_wait_for_sess_cmds() originally did not
> > have to take the lock because the list was spliced into
> > sess_wait_list.  
> > 
> > When Roland removed sess_wait_list in commit 1c7b13fe, no one re-added
> > the lock here.
> 
> Interesting point.  That seems to imply that reverting 1c7b13fe would
> be an alternative approach.
> 
> > > Since the whole point of the function is to wait for the list to drain,
> > > and potentially print a bit of debug information in case that never
> > > happens, I've replaced the wait_for_completion() with 100ms sleep.  The
> > > only callpath that would get delayed by this is rmmod, afaics, so I
> > > didn't want the overhead of a waitqueue.
> > 
> > This seems totally wrong..
> 
> The wait_for_completion() was not dead code, but it was just one
> possible implementation of "wait for the list to drain".  I dislike
> that particular implementation because you have to drop the spinlock
> before waiting and at the same time wait for a specific command.
> Since you no longer hold any locks, the command can say *poof* and
> disappear from under you at any point.

So I'd rather re-instate the list splice within
target_sess_cmd_list_set_waiting(), keep target_wait_for_sess_cmds()
lock-less so that it only does wait_for_completion() on each se_cmd's
cmd_wait_comp, and keep the existing cmd_wait_set assignment + check
in place.
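
To illustrate the shape of that approach, here is a rough userspace
sketch using pthreads.  It is not the target core code: struct cmd,
struct session and every function name below are hypothetical stand-ins
for se_cmd, se_session, target_sess_cmd_list_set_waiting(),
target_release_cmd_kref() and target_wait_for_sess_cmds(), and a
mutex + condvar pair stands in for struct completion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct cmd {
	struct cmd *next;
	int wait_set;              /* analogue of cmd_wait_set */
	int done;                  /* analogue of cmd_wait_comp */
	pthread_mutex_t done_lock;
	pthread_cond_t done_cv;
};

struct session {
	pthread_mutex_t lock;      /* analogue of sess_cmd_lock */
	struct cmd *cmd_list;      /* analogue of sess_cmd_list */
	struct cmd *wait_list;     /* analogue of sess_wait_list */
};

static void session_init(struct session *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->cmd_list = NULL;
	s->wait_list = NULL;
}

static struct cmd *session_add_cmd(struct session *s)
{
	struct cmd *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	pthread_mutex_init(&c->done_lock, NULL);
	pthread_cond_init(&c->done_cv, NULL);
	pthread_mutex_lock(&s->lock);
	c->next = s->cmd_list;
	s->cmd_list = c;
	pthread_mutex_unlock(&s->lock);
	return c;
}

static void cmd_free(struct cmd *c)
{
	pthread_cond_destroy(&c->done_cv);
	pthread_mutex_destroy(&c->done_lock);
	free(c);
}

/* Splice the active list onto a private wait list and flag each entry. */
static void session_set_waiting(struct session *s)
{
	pthread_mutex_lock(&s->lock);
	s->wait_list = s->cmd_list;
	s->cmd_list = NULL;
	for (struct cmd *c = s->wait_list; c; c = c->next)
		c->wait_set = 1;
	pthread_mutex_unlock(&s->lock);
}

/* Final release: free the command unless a waiter has claimed it. */
static void cmd_release(struct session *s, struct cmd *c)
{
	pthread_mutex_lock(&s->lock);
	if (c->wait_set) {
		/* The waiter owns the entry now; signal it instead of freeing. */
		pthread_mutex_unlock(&s->lock);
		pthread_mutex_lock(&c->done_lock);
		c->done = 1;
		pthread_cond_signal(&c->done_cv);
		pthread_mutex_unlock(&c->done_lock);
		return;
	}
	for (struct cmd **pp = &s->cmd_list; *pp; pp = &(*pp)->next) {
		if (*pp == c) {
			*pp = c->next;
			break;
		}
	}
	pthread_mutex_unlock(&s->lock);
	cmd_free(c);
}

/* Walk the private list without the session lock; return commands waited on. */
static int session_wait_for_cmds(struct session *s)
{
	struct cmd *c = s->wait_list;
	int waited = 0;

	s->wait_list = NULL;
	while (c) {
		struct cmd *next = c->next;

		assert(c->wait_set);
		pthread_mutex_lock(&c->done_lock);
		while (!c->done)
			pthread_cond_wait(&c->done_cv, &c->done_lock);
		pthread_mutex_unlock(&c->done_lock);
		cmd_free(c);
		waited++;
		c = next;
	}
	return waited;
}
```

In this sketch the waiter can walk its list without the lock because,
once the splice has happened, nothing else unlinks or frees those
entries: the release path sees wait_set and only signals.  In real use
cmd_release() would be called from other threads while
session_wait_for_cmds() blocks.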


> Indeed it has to.  So maybe
> you could take a refcount while waiting for this command to prevent
> that, which implies you have to check for this special refcount
> elsewhere and...

I don't think that will be necessary..

> 
> At this point most readers should shudder in disgust and look for some
> alternate implementation.  I don't say mine is perfect, but at least
> it does not care about any particular command.
> 
> > > -         if (!rc) {
> > > -                 wait_for_completion(&se_cmd->cmd_wait_comp);
> > > -                 pr_debug("After cmd_wait_comp: se_cmd: %p t_state: %d"
> > > -                         " fabric state: %d\n", se_cmd, se_cmd->t_state,
> > > -                         se_cmd->se_tfo->get_cmd_state(se_cmd));
> > > + spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
> > > + while (!list_empty(&se_sess->sess_cmd_list)) {
> > > +         se_cmd = list_entry(se_sess->sess_cmd_list.next, struct se_cmd,
> > > +                 se_cmd_list);
> > > +         if (se_cmd != last_cmd) { /* print this only once per command */
> > > +                 pr_debug("Waiting for se_cmd: %p t_state: %d, fabric state: %d\n",
> > > +                                 se_cmd, se_cmd->t_state,
> > > +                                 se_cmd->se_tfo->get_cmd_state(se_cmd));
> > > +                 last_cmd = se_cmd;
> > >           }
> > > -
> > > -         se_cmd->se_tfo->release_cmd(se_cmd);
> > > +         spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
> > > +         msleep_interruptible(100);
> > > +         spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
> > >   }
> > > + spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags);
> > >  }
> > 
> > So what happens when the backend se_cmd I/O does not complete before the
> > msleep finishes..?
> 
> You take the lock, check list_empty() and go to sleep again.  Repeat
> until the backend I/O does complete or you get a hung task, whichever
> comes first.
> 
> Which is exactly the same behaviour you had before.
> 
> > It seems totally wrong to drop the initial cmd_wait_set =1 assignment,
> > target_release_cmd_kref() completion for cmd_wait_comp, and wait on
> > cmd_wait_comp to allow se_cmd to finish processing here.
> > 
> > Who cares about waitqueue overhead in a shutdown slow path when the
> > replacement doesn't address long outstanding commands..?
> 
> I agree that the overhead doesn't matter.  The msleep(100) spells this
> out rather explicitly.  What does matter is that a) the patch retains
> old behaviour with much simpler code and b) it fixes a race that kills
> the machine.  I can live without a, but very much want to keep b. ;)
> 

Fucking around with ->sess_cmd_lock during each loop of ->sess_cmd_list
in target_wait_for_sess_cmds is not simpler code..

Please re-spin a patch that re-instates the list splice that commit
1c7b13fe6 removed, and only drops the wait_for_tasks case check in
target_wait_for_sess_cmds().

--nab
