Re: [linux-usb-devel] [RFC] Eliminate usb_unlink_urb()'s return code

Alan Stern Thu, 14 Jun 2007 15:20:06 -0700

On Thu, 14 Jun 2007, David Brownell wrote:

> > The unlink status doesn't have to override hardware status reports, not 
> > even ones which haven't yet been collected.  If there is a relevant IRQ 
> > pending, the giveback must not occur until the IRQ has been serviced -- 
> > in which case the status from the hardware would be available in time 
> > to override the unlink status.
> 
> That implies you're removing unlink() return codes, yes?  Because
> if it doesn't override status that's not yet been reported, then
> you would be wrongly reporting success for the unlink() ... since
> the URB wouldn't report itself as unlinked in complete(), but the
> unlink would have reported success.


I should have been more clear about this...  Yes, there is a
possibility that an unlink call can apparently "succeed" and yet the
hardware could complete the URB normally before the "accelerated"  
completion kicks in.  In the spirit of giving the driver all the
information available from the hardware, I think that the URB's final
status should reflect the normal completion (i.e., should be 0 or a
hardware/protocol error code, not -ECONNRESET or -ENOENT).

At unlink time there's no way to know if this will happen.  Whether
reporting success for the unlink would be "wrong" is a matter of
judgment; after all, the unlink did what it was supposed to and the URB
completed promptly.  I don't like the idea of returning an error status
for an URB that completely successfully at the hardware level.

Neither alternative is very satisfying...  Which reminds me, ehci-hcd
doesn't try very hard (if at all!) to accelerate the completion of
unlinked Iso transfers.  So shouldn't it return an error code for the 
unlink attempt?


> > The reason for using the "both fail" approach is that it's much 
> > simpler.  A single test can fail the submit right at the beginning, 
> > with no need for a spurious code path which simulates a submit 
> > followed by an immediate giveback.  There's less to go wrong and it 
> > runs more quickly.
> 
> I guess I don't follow that argument.  The issue here is that
> the queue is partly managed by usbcore, so there's a window
> between usbcore marking the URB as queued and then the HCD
> sticking it onto the hardware queue.

Exactly.  That's what I referred to earlier as the somewhat artificial 
distinction between an URB being linked and enqueued.

> So long as that window exists, unlink requests can sneak into
> that window.  There's no "simulation" of a submit; the code
> path is the normal one, not a "spurious" one.

(Nitpicking: No, there really is a separate code path in many of the 
HCDs.  For example, in sl811h_urb_enqueue() there is:

        /* in case of unlink-during-submit */
        spin_lock(&urb->lock);
        if (urb->status != -EINPROGRESS) {
                spin_unlock(&urb->lock);
                finish_request(sl811, ep, urb, 0);
                retval = 0;
                goto fail;
        }

I don't know very much about that driver, but at first glance it seems
possible that this path doesn't stop the endpoint queue -- which would
be a bug.)

> At this point it might be possible to refactor things though...
> 
> With the CRIS HCD gone, the remaining legacy HCD infrastructure
> can finally be removed.

Hah!  You're talking about stuff from before I started working on 
usbcore.  What parts of the HCD infrastructure count as "legacy" 
nowadays?

(And speaking of bizarre HCDs, what's the story on u132-hcd.c?  Is it 
likely ever to be cleaned up?)

>  We *could* move responsibility for
> the per-endpoint queue entirely into HCDs, except for the head
> of the unlink() code path ... exposing the per-HCD spinlock
> used to protect that queue (and related stuff).

Yes.  Note that hcd_data_lock isn't per-HCD, even though it ought to
be.  If it were then we wouldn't have to expose anything.

>  And that's
> all without giving up the advantages of knowing where each
> endpoint's queue is located, so usbcore can look at it etc.
> 
> That would let us get rid of case #1 entirely, including its
> messiness for unlinking ... also get rid of corresponding
> oddness on the giveback() paths.  A net cleanup, which would
> unfortunately affect every HCD (including ones which have
> not yet merged upstream, sigh).

Well, the stuff I'm working on affects all the HCDs anyway.  This would 
just be more of the same.  On the whole I think it's a good thing to 
do.  It would remove cases 1 and 4, rendering much of our discussion 
moot.


> > So usbtest in particular needs to know whether an URB really did 
> > undergo "accelerated" completion.  Can't it tell from the final URB 
> > status?  Case #2 will be the only way for usb->status to be set to 
> > -ECONNRESET.
> 
> What matters here is testability of the interface.  What are you
> proposing here, exactly, for the test?
> 
> Remember that the reason for that loop resubmitting-in-completion
> is that it's hard to trigger unlinks.  And those unlink paths are
> in serious need of testing, since they don't happen all that often,
> and historically are *very* bug-prone parts of HCDs.

You're talking about unlink1() in usbtest, right?  The values it tests
for are -EBUSY and -EIDRM -- slightly odd since -EINPROGRESS is the
only documented value for usb_unlink_urb.

I would make it keep calling usb_unlink_urb() repeatedly (with
msleep(1) in between iterations) until the completion routine told it
that either a resubmit failed or urb->status == -ECONNRESET.  Put a cap
on the number of iterations (e.g., 5) to tell when the test fails.

> > If I take my preferred approach to #1 it would get its own code: race
> > with submission, submission preempted.  I'm still not sure whether that
> > should be considered a success, a fault, or an error; since submission
> > wasn't complete it ought to be the same as #0, but since the submission
> > was preempted the unlink wasn't a complete failure.  However I guess it
> > doesn't really matter what we call it.
> 
> It would be a second "fault due to race lost" case.  Arguably
> some test case should make sure each of the non-error paths
> happens.

Hmm.  It's very hard to provoke and test races without adding
artificial delays...


> > If you still think a return code is 
> > needed, my inclination is to change it to be as follows:
> > 
> > -EIRDM:       Case 0/5: error, URB was idle.
> > -ECANCELED:   Case 1: race with submit, submission was preempted.
> >  0:           Case 2: success, accelerated completion is underway.
> > -EALREADY:    Case 3/4: fault due to race with hardware completion.
> > 
> > What do you think?  Is the return code really needed?
> 
> Well, I'm not keen on adding the -ECANCELED (not -EPERM for symmetry?)
> case, but if you want to add it, then go ahead.  And EIDRM is odd;
> that seems more like an EINVAL kind of thing.

We won't need -ECANCELED if we take your suggestion regarding the HCD
infrastructure.  And as discussed above, the -EALREADY cases aren't
reliably detectable at unlink time.

As for -EIDRM -- I looked through the lists of error codes, and there
didn't seem to be anything along the lines of -ENOTBUSY!  But -EINVAL
will do just as well.

> This all gets down to testability, in my book.  It's clear to me
> how we can test the interface if unlink() returns fault codes.
> But if it doesn't, it's not at all clear.  And I'm not keen on
> giving up testability for a primitive that has caused so much
> trouble (oopsing etc) over time, and which is already tricky to
> get right inside the HCDs.

To be really thorough, the test would have to correlate the code from 
unlink() with the value of urb->status.  With proper synchronization 
between the completion handler and the main routine, this should be 
possible.

That still leaves the question of how to test all the various race 
outcomes.  I can't think of any good way to do it.

So what do you think?  Go ahead and move the endpoint-queue stuff into 
the HCDs?

Alan Stern


-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Re: [linux-usb-devel] [RFC] Eliminate usb_unlink_urb()'s return code

Reply via email to