On Fri, 15 Jun 2007, David Brownell wrote: > > I should have been more clear about this... Yes, there is a > > possibility that an unlink call can apparently "succeed" and yet the > > hardware could complete the URB normally before the "accelerated" > > completion kicks in. > > That's the race I pointed out. Resolving it in favor of > providing the hardware status reports would change the > unlink() call semantics ... successful unlink would no > longer be visible in urb status during complete().
But would it in fact be a successful unlink? The _only_ difference in behavior (at both the driver level and the hardware level) is in the URB's completion status. And if we resolve the race in favor of the hardware status then even that distinction would be gone. I know I'm going to regret this, but I propose that the API be changed so that an URB's final status be set to -ECONNRESET or -ENOENT if and only if it was terminated early by an unlink request; in other words, only if it really was cancelled at the hardware level. As a corollary of this change, the return value from usb_unlink_urb would indicate only whether the HCD accepted the unlink request, not whether the request caused the URB to terminate with an error status. This leads to the possibility of an apparently successful unlink followed by a successful URB completion; I think this is acceptable. > That's the change in semantics... usb_unlink_urb() is now > defined to work differently (i.e. as I described it). Yes, this is a change. We can afford to make it because no drivers pay close attention to usb_unlink_urb's return value except for usbtest. In fact, only about five drivers pay any attention to it at all. > > At unlink time there's no way to know if this will happen. > > Not true. It's trivial: only return success from the unlink() > if the urb status could be changed from -EINPROGRESS. And no > code in the HCD changes status that's already -EINPROGRESS. A misunderstanding -- I meant that at unlink time there's no way to know if the URB will complete normally before the HCD can take it away from the hardware. > That's how all code should currently be working. I know it's > how OHCI and EHCI have worked for many years ... and I just > checked the usbcore unlink glue, and I see it's doing its part > of that: returning -EBUSY if the HCD is in the process of > returning the URB, but hasn't yet called giveback(). I think that _is_ how all the code currently works -- and I think all the code should be changed. > > Whether > > reporting success for the unlink would be "wrong" is a matter of > > judgment; after all, the unlink did what it was supposed to and the URB > > completed promptly. > > Not "judgement"; specification. If unlink succeeds, the urb status > will always reflect that. (Otherwise it gets hardware status). > > Unlinking isn't only about completing ASAP; it's also about knowing > that particular completion path was triggered. I'd agree with you if any drivers besides usbtest actually used that information. In my experience, async unlinking _is_ only about completing ASAP. That was part of the reason for writing usb_kill_urb. > > Neither alternative is very satisfying... Which reminds me, ehci-hcd > > doesn't try very hard (if at all!) to accelerate the completion of > > unlinked Iso transfers. So shouldn't it return an error code for the > > unlink attempt? > > It returns as fast as it can. How is that not accelerated? :) > > I don't see much point to returning a failure code there. The > issue is that cleaning up in those cases isn't as simple as for > transfers that have a QH. Every ISO TD would need to be disabled > and then the schedule would need to be cleaned up. It's just that > nobody has implemented such stuff. It's easy to believe that cleaning up would be complex. uhci-hcd does it, however. (Not that I intend to write analogous code for ehci-hcd!) > > /* in case of unlink-during-submit */ > > spin_lock(&urb->lock); > > if (urb->status != -EINPROGRESS) { > > spin_unlock(&urb->lock); > > finish_request(sl811, ep, urb, 0); > > retval = 0; > > goto fail; > > } > > > > I don't know very much about that driver, but at first glance it seems > > possible that this path doesn't stop the endpoint queue -- which would > > be a bug.) > > That driver is prototypical in that the only endpoint queue is the one > maintained by usbcore. There's nothing like a QH for the hardware to > maintain ... other controller drivers work that way, but they've not > gotten to the main kernel tree yet. That's correct as written. I had to think about this... The driver is relying on the fact that interrupts are disabled during giveback, right? So the endpoint queue _can't_ advance. But consider this pathological case. An URB is submitted to an empty queue and unlinked before the submission is complete. The completion handler then submits a second URB. Won't that URB start executing on the hardware before the completion handler returns? > > (And speaking of bizarre HCDs, what's the story on u132-hcd.c? Is it > > likely ever to be cleaned up?) > > Dammifino. It doesn't seem to be maintained, despite being > quite new. It's an amazing source file, full of style violations, repeated code sequences, unprotected data accesses, and probably much more... > > Well, the stuff I'm working on affects all the HCDs anyway. This would > > just be more of the same. On the whole I think it's a good thing to > > do. It would remove cases 1 and 4, rendering much of our discussion > > moot. > > It'd be a lot more work too. ;) Not that much more. It would boil down to adding a couple of subroutine calls to each enqueue routine and a subroutine call to the finish_request analog. (Except for u132, which calls giveback_urb in multiple places...) > > I would make it keep calling usb_unlink_urb() repeatedly (with > > msleep(1) in between iterations) until the completion routine told it > > that either a resubmit failed or urb->status == -ECONNRESET. Put a cap > > on the number of iterations (e.g., 5) to tell when the test fails. > > But hitting that cap wouldn't indicate failure, it would > just indicate lack of patience ... that would produce loads > of false failures. Figure the odds. How long does it take the test URB to complete compared to the amount of time needed for resubmitting? If the test URB transfers enough data (say more than 1 ms worth) then we can be nearly certain of hitting a time when the URB is still in progress. If 5 iterations isn't enough use 10 instead. > Why bother with the msleep()? For SMP systems; it's not needed on UP. Without it you'd be in a busy-loop, trying over and over again to unlink the URB. If you just happened to hit the wrong time then you'd rack up 5 or 10 unlink attempts very quickly without giving the resubmit pathway enough time to run. > > To be really thorough, the test would have to correlate the code from > > unlink() with the value of urb->status. With proper synchronization > > between the completion handler and the main routine, this should be > > possible. > > You mean, like the code now found in the unlink1() test, which > verifies that the async unlink always returns -ECONNRESET? Not exactly. The current code can do multiple unlinks and multiple submits; it doesn't check that -ECONNRESET is the status of the URB that was in flight at the time of the successful unlink. It only knows that _one_ of the unlinks succeeded and _one_ of the URBs got -ECONNRESET. > > > > If you still think a return code is > > > > needed, my inclination is to change it to be as follows: > > > > > > > > -EIRDM: Case 0/5: error, URB was idle. > > > > -ECANCELED: Case 1: race with submit, submission was preempted. > > > > 0: Case 2: success, accelerated completion is underway. > > > > -EALREADY: Case 3/4: fault due to race with hardware completion. > > > > > > > > What do you think? Is the return code really needed? > > > > > > Well, I'm not keen on adding the -ECANCELED (not -EPERM for symmetry?) > > > case, but if you want to add it, then go ahead. And EIDRM is odd; > > > that seems more like an EINVAL kind of thing. > > > > We won't need -ECANCELED if we take your suggestion regarding the HCD > > infrastructure. > > Ergo said fiendish suggestion. :) Come to think of it, there is one situation where -ECANCELED might be appropriate: if the URB was already unlinked. You'd probably call that -EBUSY, though. But see below... > > And as discussed above, the -EALREADY cases aren't > > reliably detectable at unlink time. > > Today they'd return -EBUSY ... see hcd_unlink_urb(). > Once the HCD maps hardware status into URB status, > the unlink code refuses to report success, since the > URB is already in the process of being unlinked. After spending a fair amount of time on reducing the reliance on urb->status, I realized that we ought to be able to eliminate it almost entirely. Here's what I mean: In the absence of an unlink, HCDs should not need to store anything in urb->status until they call giveback_urb. Right now they _do_ store error codes there, but AFAIK there's no reason why an URB shouldn't be given back at the time the error is discovered -- which means the HCD would be able to keep the error status in a local variable rather than storing it in the URB. (Iso URBs may present a minor problem, since errors don't force them to complete early.) Doing this would require some reorganization of the HCDs. The ones which could benefit from it the most are the ones which don't allocate an urb_priv structure -- which unfortunately are the ones I'm least familiar with. For example, if I read the isp116x code correctly, it determines the statuses of all completed URBs for an endpoint before giving any of them back. But there's no apparent reason why it couldn't give them back one by one, as the status is determined. Now this would not allow us to do away with urb->status entirely. It is still needed for storing unlink error codes, since URBs cannot be given back immediately when they are unlinked. It's also needed by usbcore, because endpoint_disable has to know which URBs have already been cancelled. Still, it's better to have a field just for storing unlink codes than one for overall status. In addition we won't need urb->lock any more, since there won't be any contention between usbcore and the HCD for setting the status. It's an overall win -- but I don't know enough about the drivers to do it. What's your advice? > > So what do you think? Go ahead and move the endpoint-queue stuff into > > the HCDs? > > If you're going to volunteer to do that, I think that would be > a Fine Thing. :) Good. Alan Stern ------------------------------------------------------------------------- This SF.net email is sponsored by DB2 Express Download DB2 Express C - the FREE version of DB2 express and take control of your XML. No limits. Just data. Click to get it now. http://sourceforge.net/powerbar/db2/ _______________________________________________ linux-usb-devel@lists.sourceforge.net To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel