Re: [linux-usb-devel] Re: usb_clear_halt <-> disconnect race

Alan Stern Wed, 23 Oct 2002 10:33:07 -0700

On Wed, 23 Oct 2002, Oliver Neukum wrote:

> > I'm going to guess that you will object to the race condition that exists
> > between the time the flag is tested and the URB is submitted.  Yes,
> > there's a race.  But it is unavoidable, unless you want to surround every
> > call to the usb core with a spinlock_irq or the equivalent.  And in any
>
> That's exactly what's needed. But why irq? Disconnect() runs from task
> context.


Yes it does.  But completion routines get called at interrupt time.
However, this is not really relevant to the problem at hand.

> > case it doesn't matter, since we only want to guarantee that no URB will
> > be submitted after disconnect() _returns_.  We don't need to (and in fact
> > cannot) guarantee that no URB will be submitted after disconnect() _is
> > called_.
>
> Right. But why should this matter? If there's a race there's a race.
> Unless you can guarantee that disconnect() setting a flag for disconnection
> and an urb being submitted or a synchronous helper running don't race
> it's a bug.

That's not necessarily true.  There are plenty of examples where races are
not bugs.  (There's even one in usb-storage.)  It's just a question of
writing the code so that no matter who wins the race, it does the right
thing.

> > Perhaps the real question is: Do we need to guarantee that no URBs will
> > remain outstanding after disconnect() returns?  This seems to be what you
>
>       if (!(dev = urb->dev) || !dev->bus || dev->devnum <= 0)
>               return -ENODEV;
>       if (!(op = dev->bus->op) || !op->submit_urb)
>               return -ENODEV;
>
> from usb_submit_urb(). This is deadly. Unlinking after disconnect() is
> deadly as well.
> But for an indeterminate amount of time after usb_submit_urb is called,
> it's dangerous. Shortly, it's safe only if they are dead.

I don't understand what you have written here.  Regardless of that, I
gather from what you say elsewhere that we need to guarantee that after
disconnect() returns, there are no URBs still in progress and no new URBs
are submitted.

If that's so, then there is nothing to worry about as far as usb-storage
is concerned.  There already is a semaphore that protects against exactly
this sort of thing.  You didn't see it in usb_stor_clear_halt and
usb_stor_reset_common because it is tested higher up in the code chain.
For instance, look at device_reset() in scsiglue.c; you'll see a line that
does

        down(&(us->dev_semaphore));

before us->transport_reset() is called.  You can find the same semaphore
used in storage_disconnect() in usb.c.

> > I'm afraid you're right about that.  Maybe the synchronous call routines
> > should be eliminated entirely.
>
> Now, a week before freeze?

That shouldn't pose a problem.  Greg KH can correct me if I'm wrong, but
the freeze means only that no new features are to be added.  Existing ones
can still be fixed or removed.

> > > The solution you suggest only works if you can ensure that the
> > > synchronous call has given you a handle to unlink the urb before
> > > disconnect() runs. To do so requires the synchronous helper to be called
> > > with a lock against disconnect() already held.
> >
> > Well no, I only need to insure that the synchronous call has given me a
> > handle before disconnect() _returns_.  But I agree that it's a problem.
>
> You cannot do that as well. Furthermore you must not call a synchronous
> call after disconnect() has returned. The device pointer you pass is
> dangerous.

Agreed.

> > My feeling is that we need either to rethink the synchronous calls or get
> > rid of them altogether.  There's another aspect of usb-storage that runs
>
> It's a week till feature freeze.

As I said, that shouldn't matter.

> > into difficulties with regard to synchronous calls.  The problem is that
> > most of the things that usb-storage does come about because of calls from
> > the scsi layer, and most of these calls are protected by a timeout and
> > abort handler.  But not all of them are, and also there are a few things
> > usb-storage does on its own behalf (initialization, mostly) that also do
> > not have the protection of an abort handler.
> >
> > As a result, we have to be able to abort some calls on demand but not
> > others, and we have to use timeouts for some calls but not others.  It's a
> > real mess, and we haven't gotten it straightened out yet.  My vote goes
> > for making things as uniform as possible, but Matt Dharm doesn't agree.
> > He has said that the driver should use different code paths for the
> > different situations.  Maybe this discussion about the problems with
> > synchronous calls will make us rethink the whole question.
>
> I am afraid I don't know the storage driver well enough to comment on that.

Actually, I was sort of hoping to drag Matt Dharm into this discussion.
He certainly understands the issues involved and the storage driver,
probably better than anyone.

Alan Stern



-------------------------------------------------------
This sf.net email is sponsored by: Influence the future 
of Java(TM) technology. Join the Java Community 
Process(SM) (JCP(SM)) program now. 
http://ads.sourceforge.net/cgi-bin/redirect.pl?sunm0002en

_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Re: [linux-usb-devel] Re: usb_clear_halt <-> disconnect race

Reply via email to