Re: [linux-usb-devel] ehci-sched.c questions

2006-05-24 Thread David Brownell
On Tuesday 23 May 2006 12:27 pm, Christopher Montgomery wrote:
 Oh, a quick question that's been bugging me...
 
 I assume that intr_submit() in ehci-sched.c also handles high-speed
 endpoints (not just fs/ls endpoints).

Yes ...

 If so, why does 
 check_intr_schedule() not only blindly always assign a cmask, but also
 blindly check that cmask against TT scheduling slots?

For high speed, c_mask will be all-zeroes, right?  So such checks
will be NOPs.

- Dave
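
A minimal sketch of why an all-zero c_mask turns the TT check into a
no-op.  The function name and the 8-bit busy-mask layout are hypothetical,
chosen only for illustration; this is not the code in ehci-sched.c.

```python
def tt_mask_conflicts(tt_busy_mask: int, c_mask: int) -> bool:
    """Return True if any uframe selected by c_mask is already busy.

    Both arguments are 8-bit masks, one bit per uframe (assumed layout).
    """
    return (tt_busy_mask & c_mask & 0xFF) != 0


# A full/low-speed endpoint issues complete-splits in some uframes.
fs_c_mask = 0b00011100
# A high-speed endpoint needs no complete-splits, so its mask is empty.
hs_c_mask = 0b00000000

busy = 0b00001000
print(tt_mask_conflicts(busy, fs_c_mask))   # uframe 3 collides
print(tt_mask_conflicts(busy, hs_c_mask))   # nothing to collide with
print(tt_mask_conflicts(0xFF, hs_c_mask))   # still no conflict
```

With c_mask == 0 the AND is zero whatever the TT has booked, so the check
can never reject a high-speed request.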


---
All the advantages of Linux Managed Hosting--Without the Cost and Risk!
Fully trained technicians. The highest number of Red Hat certifications in
the hosting industry. Fanatical Support. Click to learn more
http://sel.as-us.falkag.net/sel?cmd=lnkkid=107521bid=248729dat=121642
___
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] ehci-sched.c questions

2006-05-23 Thread Christopher Montgomery

Oh, a quick question that's been bugging me...

I assume that intr_submit() in ehci-sched.c also handles high-speed
endpoints (not just fs/ls endpoints).  If so, why does
check_intr_schedule() not only blindly always assign a cmask, but also
blindly check that cmask against TT scheduling slots?  That would seem
to end up erroneously rejecting quite a number of high speed
scheduling requests.

Monty




Re: [linux-usb-devel] ehci-sched.c questions

2006-05-16 Thread Christopher Montgomery

On 5/13/06, David Brownell [EMAIL PROTECTED] wrote:

On Thursday 11 May 2006 8:49 am, Christopher Montgomery wrote:
 I also forgot the example of OHCI, which appears to do some/all of the
 actual intraframe scheduling in hardware.  Is it just sending out
 transaction packets in the order they're queued?

Starting 10% into the frame, yes.  Recall that no more than 90% of the
frame's bandwidth may be reserved for periodic transfers.


OK.  The diagrams I've seen have INTR then ISO (adding up to no more
than 90%) then async.  Meh, anyway, same basic concept.


 The ohci driver balances the initial
 transaction scheduling amongst frames, but there's no rebalancing code
 (indeed if OHCI is doing frame scheduling in hardware, there'd be no
 need).

Not true.  We don't need rebalancing because we've never had users
reporting they ran into the bandwidth limitations.  The most demanding
standard use case is that mouse tree case.


Illustration: a periodic frame with an INTR, an ISO and another ISO.
The ISO in the middle is unlinked.  The ISO after simply 'collapses
into the slot', that is, it's scheduled immediately after the INTR.
Any new ISO or QH is appended, prepended, whatever; there's no hole.
That's the 'no need for rebalancing' I'm speaking of.  [I'm aware this
might more accurately be referred to as 'rescheduling a frame', but
the Intel ehci docs use the term 'rebalancing' for both rescheduling
within a uFrame as well as moving transactions to other periodic
slots.]
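
The illustration above can be sketched as follows, assuming the frame is
simply played out in list order.  The names and the usec durations are
made up for the example.

```python
def start_offsets(queue):
    """Map each entry name to its start offset (usecs) in list order."""
    offsets = {}
    t = 0
    for name, usecs in queue:
        offsets[name] = t
        t += usecs
    return offsets


# INTR, then two ISO transfers (hypothetical durations in usecs).
queue = [("intr", 50), ("iso_a", 300), ("iso_b", 200)]
before = start_offsets(queue)

# Unlink the ISO in the middle; nothing else has to be touched.
queue_after = [entry for entry in queue if entry[0] != "iso_a"]
after = start_offsets(queue_after)

print(before)
print(after)
```

Because offsets are derived from list order rather than stored, the
second ISO moves up directly behind the INTR and no hole is left.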


But it's easy to create a schedule where software rebalancing is needed.
Consider a schedule with two 400-usec transfers at period=2 ... they'd go
into alternate frames by default.  Now try to add a single transfer of
800-usec, also period=2.  There's enough bandwidth, if the schedule got
rebalanced so both 400-usec transfers went into the same frame.  The
way to get to that schedule from the initial one is to rebalance.


Yes.  We were referring to slightly different things, but your case is
obviously correct.

I'm more worried about bandwidth fragmentation within a given frame
than balancing problems across the schedule (as that's the case
currently preventing my hardware from working), but both obviously
must be considered.

Eventually I'll get all the terminology squared; we've been talking
past each other for much of this (but I greatly appreciate my
questions being answered).

Monty




Re: [linux-usb-devel] ehci-sched.c questions

2006-05-16 Thread Christopher Montgomery

On 5/12/06, David Brownell [EMAIL PROTECTED] wrote:


Let me repeat that:  TT scheduling and highspeed scheduling interfere
with each other.  You can't perform one in ignorance of the other.


I believe we are vehemently agreeing with one another :-)


 Correct, but the reason we need to shift in the FS/LS case is a
 consequence of how the TTs are designed; ie, the FS frame is segmented
 when viewed through the TT, a situation that's not true in uhci or
 ohci or any 1.1 HC I know of.  They just run their full-frame queues
 in order and don't worry about resource contention.  They're not
 trying to explicitly schedule large multipart transfers into a
 segmented frame.

Right, which makes them an inadequate model to use with drivers that
DO need to worry about such resource contention ... like all high
speed host controller drivers.


Inadequate as they're implemented for FS/LS only, yes... not
fundamentally incorrect for the task of scheduling for a TT, simply
inadequate.  I wasn't suggesting just stuffing ohci code into ehci.  I
can't even imagine how one would try :-)

Anyway, I'm not going to disclaim further, I agree with everything you said.

Monty




Re: [linux-usb-devel] ehci-sched.c questions

2006-05-16 Thread Christopher Montgomery

On 5/11/06, David Brownell [EMAIL PROTECTED] wrote:

On Thursday 11 May 2006 8:49 am, Christopher Montgomery wrote:
 (If so, the isos are occurring in totally random places in a given frame).

Nope.  Periodic transfers take up the second 90% of the frame (at most).
And this doesn't have any scheduling conflicts with high speed transfers.


I didn't mean that in practice they occur 'in totally random places'
but that the spec allows this, and it is not a violation of USB 1.1
for the iso transfer to end up in completely different parts of a
FS frame from frame to frame, eg the beginning of one and the end of
another.  Again, I'm concerned with the FS frame first and fitting the
splits into the uFrame scheduling second, because things will nearly
always fit into the uFrame.




   The EHCI FS/LS scheduling should probably be aping the OHCI
 behavior as it's known to work well.

No; ohci-hcd does some best fit logic, which happens to be cheap for
that hardware.  Whereas ehci-hcd uses first fit, because it's costly
to do highspeed scheduling.


I don't mean we should emulate how the OHCI hardware is driven; I mean
that the end result, the final FS frame produced by OHCI, is
efficient and that we should aspire to that level of efficiency in our
end result-- the FS frame produced by the TT.


The similarity is basic:  see how much space is available, try the next
schedule point if this one doesn't have enough (in all schedule slots).
And that's already coded.


Except that the way it's broken up now, I can't get a 1Mbps iso in
transfer to work through the TT, when ohci can push/pull 9Mbps to this
specific device without hiccups.  The current scheduler is putting at
least one and up to two full microframe-slots worth of holes between
every FS transaction, and when you have to fit three transactions into
a FS frame, you can see that it just won't fit. (to set up the iso in,
this device also needs two intr transactions to other endpoints).

For this to work, the holes have to go.  Every hole is immovably
wasted bandwidth.
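
To make the cost of the holes concrete, here is a rough sketch comparing
a densely packed FS frame with one where every transaction after the
first is pushed to a uframe boundary and preceded by one empty 125 usec
slot.  The three durations are made-up numbers, not measurements from the
device discussed, and the gap model is a simplification rather than the
actual ehci-sched.c behavior.  The 900 usec figure is the 90% periodic
limit mentioned elsewhere in this thread.

```python
UFRAME_USECS = 125            # one uframe's share of the 1 ms FS frame
PERIODIC_BUDGET_USECS = 900   # 90% of the frame


def dense_end(durations):
    """End time (usecs) when transactions are packed back to back."""
    return sum(durations)


def gapped_end(durations, hole_slots=1):
    """End time when each later transaction waits for a uframe boundary
    and then skips `hole_slots` whole uframe slots."""
    t = 0
    for i, usecs in enumerate(durations):
        if i > 0:
            # Round up to the next uframe boundary, then add the holes.
            t = -(-t // UFRAME_USECS) * UFRAME_USECS
            t += hole_slots * UFRAME_USECS
        t += usecs
    return t


durations = [60, 60, 600]     # two intr, one iso (hypothetical)
print(dense_end(durations), dense_end(durations) <= PERIODIC_BUDGET_USECS)
print(gapped_end(durations), gapped_end(durations) <= PERIODIC_BUDGET_USECS)
```

With these numbers the dense layout ends at 720 usec and fits, while the
gapped layout runs to 1100 usec and does not.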

Monty




Re: [linux-usb-devel] ehci-sched.c questions

2006-05-16 Thread David Brownell
On Thursday 11 May 2006 8:49 am, Christopher Montgomery wrote:
 I also forgot the example of OHCI, which appears to do some/all of the
 actual intraframe scheduling in hardware.  Is it just sending out
 transaction packets in the order they're queued?

Starting 10% into the frame, yes.  Recall that no more than 90% of the
frame's bandwidth may be reserved for periodic transfers.

 (If so, the isos are 
 occurring in totally random places in a given frame).

Not totally random.  No periodic transfers in the first 100 usec, ever.
And then the periodic schedule is scanned, and one packet is transferred
according to each QH.

Now, it happens that the Linux OHCI driver uses a sparse scheduling
tree, so the only randomness is that if a lower frequency transfer
is scheduled, it will happen first and thus shift all the higher
frequency transfers to a later slot.

That can't matter much for interrupt transfers, except that lowspeed
transfers take a long time.  Which ISTR is one of the reasons to use
the best fit scheduling algorithm ... even out those slow transfers,
and make it possible to pass standard lowspeed mouse tree tests and
still be able to include a bunch of normal fullspeed transfers.

It can however matter for isochronous transfers, which can have much
larger packets ... but which are normally at period 1, and thus sitting
at the end of the schedule, behind almost any interrupt transfer.
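
A small sketch of that ordering effect, assuming the list walked in a
given frame runs from the longest period to the shortest and that
periodic traffic starts 100 usec into the frame, as described above.  The
Endpoint tuple, the phase field and the durations are hypothetical.

```python
from collections import namedtuple

Endpoint = namedtuple("Endpoint", "name period phase usecs")
PERIODIC_START_USECS = 100   # no periodic transfers before this point


def frame_layout(endpoints, frame):
    """Return [(name, start_usecs)] for the endpoints active in `frame`."""
    active = [e for e in endpoints if frame % e.period == e.phase]
    # Lower-frequency (longer period) entries are visited first.
    active.sort(key=lambda e: e.period, reverse=True)
    layout = []
    t = PERIODIC_START_USECS
    for e in active:
        layout.append((e.name, t))
        t += e.usecs
    return layout


endpoints = [
    Endpoint("iso", period=1, phase=0, usecs=200),
    Endpoint("mouse", period=8, phase=0, usecs=50),
]
print(frame_layout(endpoints, 0))   # mouse runs first, iso is shifted
print(frame_layout(endpoints, 1))   # iso alone, at the earliest point
```

The period-1 iso transfer starts 50 usec later in the frames where the
period-8 transfer is also scheduled.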


 The ohci driver balances the initial 
 transaction scheduling amongst frames, but there's no rebalancing code
 (indeed if OHCI is doing frame scheduling in hardware, there'd be no
 need).

Not true.  We don't need rebalancing because we've never had users
reporting they ran into the bandwidth limitations.  The most demanding
standard use case is that mouse tree case.

But it's easy to create a schedule where software rebalancing is needed.
Consider a schedule with two 400-usec transfers at period=2 ... they'd go
into alternate frames by default.  Now try to add a single transfer of
800-usec, also period=2.  There's enough bandwidth, if the schedule got
rebalanced so both 400-usec transfers went into the same frame.  The
way to get to that schedule from the initial one is to rebalance.
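
The example works out as follows, taking the 90% periodic limit on a
1 ms frame as a 900 usec budget.  The helper name is hypothetical; the two
list entries stand for the even and odd frames of a period=2 schedule.

```python
PERIODIC_BUDGET_USECS = 900   # 90% of a 1 ms frame


def can_add(frame_loads, usecs):
    """True if some frame phase still has room for `usecs`."""
    return any(load + usecs <= PERIODIC_BUDGET_USECS for load in frame_loads)


default = [400, 400]      # one 400-usec transfer in each alternate frame
rebalanced = [800, 0]     # both 400-usec transfers moved into one frame

print(can_add(default, 800))      # 400 + 800 exceeds the budget
print(can_add(rebalanced, 800))   # the empty frame takes the new transfer
```

Total demand is the same in both layouts (1600 usec over two frames
against 1800 usec available); only the rebalanced one can accept the
800-usec transfer.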






Re: [linux-usb-devel] ehci-sched.c questions

2006-05-16 Thread Christopher Montgomery

On 5/12/06, David Brownell [EMAIL PROTECTED] wrote:


That is, your beef is with the weak TT scheduling, which isn't something
that OHCI (or UHCI) does, and which accordingly can't be borrowed from
either of those drivers.


Vehement agreement.  I'm looking to the other approaches and other
scheduling strategies and outcomes, not the code to any one driver.
I've been trying to illustrate several points and ask several
unrelated questions to understand the hardware in as much detail as
possible.

I never suggested at any point ohci code be borrowed... Merely the
ohci results looked to as an achievable goal.  We *can* pack FS frames
just as well in ehci, it merely requires more work.  For my use cases,
I'll have to.

I mentioned isoc transfers ending up in different parts of a frame
only because that allows substantial additional flexibility in
scheduling.

Monty




Re: [linux-usb-devel] ehci-sched.c questions

2006-05-13 Thread David Brownell
On Friday 12 May 2006 12:22 pm, Christopher Montgomery wrote:

 I'm looking at things from the view of the FS/LS bus and the frame on
 that bus.  In the use cases I'm currently concerned about, the high
 speed bus (and any one uFrame) is never going to run out of bandwidth.

I'm more concerned with the big picture, which absolutely includes use
cases where bandwidth is an issue.  For example, the high bandwidth
periodic transfer modes can run at up to 24 KB/frame ... bandwidth
does start getting scarce.  Especially when folk want to run more than
one such high bandwidth transfer at a time, e.g. multiple high speed
webcams (and folk are starting to use those with Linux).
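
For reference, the 24 KB/frame figure follows from the high-bandwidth
endpoint limits: up to 3 transactions per uframe, up to 1024 bytes each,
8 uframes per frame.  The raw-capacity comparison below is a rough
estimate that ignores bit stuffing and protocol overhead.

```python
MAX_PACKET_BYTES = 1024       # largest high-speed periodic packet
TRANSACTIONS_PER_UFRAME = 3   # high-bandwidth endpoint
UFRAMES_PER_FRAME = 8

bytes_per_frame = MAX_PACKET_BYTES * TRANSACTIONS_PER_UFRAME * UFRAMES_PER_FRAME

# 480 Mbit/s signalling rate, expressed as raw bytes per 1 ms frame.
raw_bytes_per_frame = 480_000_000 // 8 // 1000

print(bytes_per_frame)                         # payload bytes per frame
print(bytes_per_frame / raw_bytes_per_frame)   # share of the raw bus
```

One such endpoint already takes roughly 41% of the raw bus, so a second
one leaves little room for anything else.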

Regardless, the FS/LS scheduling does need to account for cases where
e.g. a given uframe may be fully scheduled with highspeed transfers,
and thus can't be used for any FS/LS transfers at all.  And vice versa;
scheduling an FS/LS transfer in some uframe may make it impossible to
put a given highspeed transfer there.

Let me repeat that:  TT scheduling and highspeed scheduling interfere
with each other.  You can't perform one in ignorance of the other.
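
A minimal sketch of that interference, assuming a single per-uframe
periodic budget of 100 usec (80% of a 125 usec uframe) that high-speed
transfers and FS/LS split transactions both draw from.  The class and its
method are hypothetical, not the driver's data structures.

```python
UFRAME_PERIODIC_MAX_USECS = 100   # assumed: 80% of a 125 usec uframe


class UframeBudget:
    """Tracks periodic usecs already claimed in each of the 8 uframes."""

    def __init__(self):
        self.used = [0] * 8

    def claim(self, uframe, usecs):
        """Reserve `usecs` in `uframe`; return False if it does not fit."""
        if self.used[uframe] + usecs > UFRAME_PERIODIC_MAX_USECS:
            return False
        self.used[uframe] += usecs
        return True


budget = UframeBudget()
hs_ok = budget.claim(2, 100)       # a high-speed transfer fills uframe 2
split_in_2 = budget.claim(2, 10)   # a split can no longer go there
split_in_3 = budget.claim(3, 10)   # but fits in the next uframe
print(hs_ok, split_in_2, split_in_3)
```

The same rejection happens in the other direction: once splits occupy
part of a uframe, a large high-speed transfer may no longer fit there.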


  However, if I schedule a split in uFrame Y0 that uses only a fraction
 of that uFrame's worth of the full-speed frame, idle the full speed
 bus, then schedule another split that runs in the next uFrame (and
 thus in the Y1 section of the FS frame), there ends up being a hole in
 the FS frame that I'll likely not be able to schedule anything into,
 and places additional pressure on the scheduler being able to fit
 large isos. 

That's an entirely different issue though, one that's at least partially
addressed by Dan's patches (which I'll repost soon).  That issue is how
fully the TT schedule will be packed ... not the one we were discussing,
which is the bandwidth scheduling policy.


  The scheduling reduces how much shifting is going on.   Plus, don't
  forget that high speed scheduling concurrently uses many of the same
  resources, and that never gets shifted.
 
 Correct, but the reason we need to shift in the FS/LS case is a
 consequence of how the TTs are designed; ie, the FS frame is segmented
 when viewed through the TT, a situation that's not true in uhci or
 ohci or any 1.1 HC I know of.  They just run their full-frame queues
 in order and don't worry about resource contention.  They're not
 trying to explicitly schedule large multipart transfers into a
 segmented frame.

Right, which makes them an inadequate model to use with drivers that
DO need to worry about such resource contention ... like all high
speed host controller drivers.

- Dave





Re: [linux-usb-devel] ehci-sched.c questions

2006-05-12 Thread Christopher Montgomery

On 5/11/06, David Brownell [EMAIL PROTECTED] wrote:

On Wednesday 10 May 2006 9:38 pm, Christopher Montgomery wrote:
 On 5/10/06, David Brownell [EMAIL PROTECTED] wrote:
  On Wednesday 10 May 2006 10:35 am, Christopher Montgomery wrote:
   In some
   ways it's easier to just stack requests into the full-speed frame and
   then plop the splits into whatever uFrame that happens to work out to,
   even if that means that, eg, an interval 1 iso request might be
   serviced in substantially different portions of the actual full-speed
   frame.
 
  So with interval 1 (one packet per frame), two consecutive packets
  might go into uframes 7 then 0?  That'd be more like interval 1/8, and
  the next might be 15/8, and the next ... the notion was avoiding such
  relatively non-isochronous behaviors.

 Yes.  The problem is that scheduling the high-speed frame with large
 random gaps (so that isos that exist in all of 256 or 512 or 1024, etc,
 frames always start in the same uframe) places additional pressure on
 the scheduler, and complicates counting how many FS usecs are already
 budgeted.

I guess I don't quite see that.  The budget is per-uframe ...


I'm looking at things from the view of the FS/LS bus and the frame on
that bus.  In the use cases I'm currently concerned about, the high
speed bus (and any one uFrame) is never going to run out of bandwidth.
However, if I schedule a split in uFrame Y0 that uses only a fraction
of that uFrame's worth of the full-speed frame, idle the full speed
bus, then schedule another split that runs in the next uFrame (and
thus in the Y1 section of the FS frame), there ends up being a hole in
the FS frame that I'll likely not be able to schedule anything into,
and places additional pressure on the scheduler being able to fit
large isos.  In practice, I'm often seeing usbaudio.c schedule the
largest iso request last, so this isn't just theoretical.


The scheduling reduces how much shifting is going on.   Plus, don't
forget that high speed scheduling concurrently uses many of the same
resources, and that never gets shifted.


Correct, but the reason we need to shift in the FS/LS case is a
consequence of how the TTs are designed; ie, the FS frame is segmented
when viewed through the TT, a situation that's not true in uhci or
ohci or any 1.1 HC I know of.  They just run their full-frame queues
in order and don't worry about resource contention.  They're not
trying to explicitly schedule large multipart transfers into a
segmented frame.

Monty




Re: [linux-usb-devel] ehci-sched.c questions

2006-05-12 Thread David Brownell
 The EHCI FS/LS scheduling should probably be aping the OHCI
   behavior as it's known to work well.
 
  No; ohci-hcd does some best fit logic, which happens to be cheap for
  that hardware.  Whereas ehci-hcd uses first fit, because it's costly
  to do highspeed scheduling.
 
 I don't mean we should emulate how the OHCI hardware is driven; I mean
 that the end result, the final FS frame produced by OHCI, is
 efficient and that we should aspire to that level of efficiency in our
 end result-- the FS frame produced by the TT.

So that's a different issue than you had been discussing, and one that I
wouldn't initially describe using the word efficiency.  OHCI acts that
way because it doesn't have a TT scheduler.  The current TT scheduler
doesn't really understand TT bandwidth; the schedules it produces are
not dense, and your issue is that the schedules aren't dense.

For efficiency, one basic measure is CPU cost.  It's trivial to prove that
a first fit takes fewer CPU cycles than best fit, since it can stop
after finding the first valid solution.  Another measure is the electrical
cost to execute the resulting schedule.  I think that executing all schedules
with the same number and size of transfers is going to consume the same
wattage for its USB signaling, and Linux has DMA scheduling, so that's the
only hardware cost that's apparent.
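
The CPU-cost point can be sketched with the textbook forms of the two
strategies, which may differ in detail from what either driver does.  The
slot list and the probe counting are illustrative assumptions.

```python
def first_fit(free, need):
    """Return (slot index, slots examined); stops at the first match."""
    probes = 0
    for i, space in enumerate(free):
        probes += 1
        if space >= need:
            return i, probes
    return None, probes


def best_fit(free, need):
    """Return (slot index, slots examined); must look at every slot."""
    best = None
    for i, space in enumerate(free):
        if space >= need and (best is None or space < free[best]):
            best = i
    return best, len(free)


free = [30, 80, 50, 90]   # free usecs per schedule slot (hypothetical)
print(first_fit(free, 40))
print(best_fit(free, 40))
```

Here first fit settles on slot 1 after two probes, while best fit finds
the tighter slot 2 only after examining all four.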


  The similarity is basic:  see how much space is available, try the next
  schedule point if this one doesn't have enough (in all schedule slots).
  And that's already coded.
 
 Except that the way it's broken up now, I can't get a 1Mbps iso in
 transfer to work through the TT, when ohci can push/pull 9Mbps to this
 specific device without hiccups. 

And that's not related to the policy you were asking about -- whether to
schedule a given ISO transfer always starting in the same uframe.  It's an
entirely different issue ... whether the TT scheduler does dense schedules
or not.  It's been getting better at doing them, but only slowly since not
many folk need them.  (I'm glad you seem to have test cases that let you
construct and stress test dense periodic schedules.  That's a huge help.)

That is, your beef is with the weak TT scheduling, which isn't something
that OHCI (or UHCI) does, and which accordingly can't be borrowed from
either of those drivers.

- Dave





Re: [linux-usb-devel] ehci-sched.c questions

2006-05-11 Thread Christopher Montgomery

On 5/11/06, Christopher Montgomery [EMAIL PROTECTED] wrote:

On 5/10/06, David Brownell [EMAIL PROTECTED] wrote:
 On Wednesday 10 May 2006 10:35 am, Christopher Montgomery wrote:
  In some
  ways it's easier to just stack requests into the full-speed frame and
  then plop the splits into whatever uFrame that happens to work out to,
  even if that means that, eg, an interval 1 iso request might be
  serviced in substantially different portions of the actual full-speed
  frame.

 So with interval 1 (one packet per frame), two consecutive packets
 might go into uframes 7 then 0?  That'd be more like interval 1/8, and
 the next might be 15/8, and the next ... the notion was avoiding such
 relatively non-isochronous behaviors.

[...]

Even if we restrict all the sitds of a given iso stream to
be scheduled in the same uframe, they're not guaranteed to actually
start in the same part of the uframe.. or even that uframe for that
matter, the TT is free to delay actually starting the transfer for up
to two additional uframes anyway.


I also forgot the example of OHCI, which appears to do some/all of the
actual intraframe scheduling in hardware.  Is it just sending out
transaction packets in the order they're queued?  (If so, the isos are
occurring in totally random places in a given frame). Or something
more complicated/opaque?  The ohci driver balances the initial
transaction scheduling amongst frames, but there's no rebalancing code
(indeed if OHCI is doing frame scheduling in hardware, there'd be no
need).  The EHCI FS/LS scheduling should probably be aping the OHCI
behavior as it's known to work well.  I'll investigate this... unless
someone would like to jump in with a simple answer that will save me
the time :-)

Monty




Re: [linux-usb-devel] ehci-sched.c questions

2006-05-11 Thread David Brownell
On Thursday 11 May 2006 8:49 am, Christopher Montgomery wrote:
 On 5/11/06, Christopher Montgomery [EMAIL PROTECTED] wrote:
  On 5/10/06, David Brownell [EMAIL PROTECTED] wrote:
   On Wednesday 10 May 2006 10:35 am, Christopher Montgomery wrote:
    In some
    ways it's easier to just stack requests into the full-speed frame and
    then plop the splits into whatever uFrame that happens to work out to,
    even if that means that, eg, an interval 1 iso request might be
    serviced in substantially different portions of the actual full-speed
    frame.
  
   So with interval 1 (one packet per frame), two consecutive packets
   might go into uframes 7 then 0?  That'd be more like interval 1/8, and
   the next might be 15/8, and the next ... the notion was avoiding such
   relatively non-isochronous behaviors.
 [...]
  Even if we restrict all the sitds of a given iso stream to
  be scheduled in the same uframe, they're not guaranteed to actually
  start in the same part of the uframe.. or even that uframe for that
  matter, the TT is free to delay actually starting the transfer for up
  to two additional uframes anyway.
 
 I also forgot the example of OHCI, which appears to do some/all of the
 actual intraframe scheduling in hardware.  Is it just sending out
 transaction packets in the order they're queued? 

Yes, but ...

 (If so, the isos are occurring in totally random places in a given frame). 

Nope.  Periodic transfers take up the second 90% of the frame (at most).
And this doesn't have any scheduling conflicts with high speed transfers.


   The EHCI FS/LS scheduling should probably be aping the OHCI
 behavior as it's known to work well.

No; ohci-hcd does some best fit logic, which happens to be cheap for
that hardware.  Whereas ehci-hcd uses first fit, because it's costly
to do highspeed scheduling.

The similarity is basic:  see how much space is available, try the next
schedule point if this one doesn't have enough (in all schedule slots).
And that's already coded.





Re: [linux-usb-devel] ehci-sched.c questions

2006-05-11 Thread David Brownell
On Wednesday 10 May 2006 9:38 pm, Christopher Montgomery wrote:
 On 5/10/06, David Brownell [EMAIL PROTECTED] wrote:
  On Wednesday 10 May 2006 10:35 am, Christopher Montgomery wrote:
   In some
   ways it's easier to just stack requests into the full-speed frame and
   then plop the splits into whatever uFrame that happens to work out to,
   even if that means that, eg, an interval 1 iso request might be
   serviced in substantially different portions of the actual full-speed
   frame.
 
  So with interval 1 (one packet per frame), two consecutive packets
  might go into uframes 7 then 0?  That'd be more like interval 1/8, and
  the next might be 15/8, and the next ... the notion was avoiding such
  relatively non-isochronous behaviors.
 
 Yes.  The problem is that scheduling the high-speed frame with large
 random gaps (so that isos that exist in all of 256 or 512 or 1024, etc,
 frames always start in the same uframe) places additional pressure on
 the scheduler, and complicates counting how many FS usecs are already
 budgeted.

I guess I don't quite see that.  The budget is per-uframe ...


 Even if we restrict all the sitds of a given iso stream to 
 be scheduled in the same uframe, they're not guaranteed to actually
 start in the same part of the uframe.. or even that uframe for that
 matter, the TT is free to delay actually starting the transfer for up
 to two additional uframes anyway.

The scheduling reduces how much shifting is going on.   Plus, don't
forget that high speed scheduling concurrently uses many of the same
resources, and that never gets shifted.







Re: [linux-usb-devel] ehci-sched.c questions

2006-05-10 Thread Christopher Montgomery

One new question; hopefully this one is just rote arcana to one of you.

The current scheduler strategy always tries to place a given periodic
request (be it iso or intr) into the same uFrame slot in all scheduled
HFrames.  Is this a requirement of the spec (I don't recall seeing
such a restriction) or merely convenient to the current code?  In some
ways it's easier to just stack requests into the full-speed frame and
then plop the splits into whatever uFrame that happens to work out to,
even if that means that, eg, an interval 1 iso request might be
serviced in substantially different portions of the actual full-speed
frame.

Perhaps the latest iteration of rereading the specs again will yield
yet another new layer of understanding ;-)  The corner cases are
looking less and less scary with each pass; the logic all feels
intuitive at this point.

Monty




Re: [linux-usb-devel] ehci-sched.c questions

2006-05-10 Thread David Brownell
On Wednesday 10 May 2006 10:35 am, Christopher Montgomery wrote:
 
 The current scheduler strategy always tries to place a given periodic
 request (be it iso or intr) into the same uFrame slot in all scheduled
 HFrames.  Is this a requirement of the spec (I don't recall seeing
 such a restriction) or merely convenient to the current code?

Neither.


 In some 
 ways it's easier to just stack requests into the full-speed frame and
 then plop the splits into whatever uFrame that happens to work out to,
 even if that means that, eg, an interval 1 iso request might be
 serviced in substantially different portions of the actual full-speed
 frame.

So with interval 1 (one packet per frame), two consecutive packets
might go into uframes 7 then 0?  That'd be more like interval 1/8, and
the next might be 15/8, and the next ... the notion was avoiding such
relatively non-isochronous behaviors.
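
The 1/8 and 15/8 figures fall out of measuring start times in frames,
with each uframe worth 1/8 of a frame.  A short worked check, assuming
the packets land in uframe 7, then uframe 0, then uframe 7 of three
consecutive frames (the helper name is made up):

```python
from fractions import Fraction


def start_time(frame, uframe):
    """Start time in units of frames; a uframe is 1/8 of a frame."""
    return frame + Fraction(uframe, 8)


# (frame, uframe) placements for three consecutive interval-1 packets.
placements = [(0, 7), (1, 0), (2, 7)]
times = [start_time(f, u) for f, u in placements]
gaps = [later - earlier for earlier, later in zip(times, times[1:])]
print(gaps)
```

The gaps come out as 1/8 and 15/8 of a frame, against the nominal
interval of 1.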

Another concern:  such irregular schedules would facilitate hitting
configurations which need rebalancing to reclaim space wasted by misshapen
holes.  We have no rebalancing code (yet?); not that we seem to have hit
bandwidth limitations other than sub-optimal hardware, but it'd seem to be
good to avoid such issues by design.


By the way, one of the next frontiers for high speed scheduling will
be making it more independent of EHCI.  There's highspeed silicon that
doesn't use EHCI, and has an even stronger need for software scheduling
because the host talks to hardware at the level of a USB FIFO.  EHCI
effectively does fifo scheduling in hardware, but sometimes it must
be done in software.

So that means high speed scheduling is multi-level:  fifo, uframe, TT.
I'm glad to hear it's sounding simpler to you.  ;)

- Dave





Re: [linux-usb-devel] ehci-sched.c questions

2006-05-10 Thread Christopher Montgomery

On 5/10/06, David Brownell [EMAIL PROTECTED] wrote:

On Wednesday 10 May 2006 10:35 am, Christopher Montgomery wrote:
 In some
 ways it's easier to just stack requests into the full-speed frame and
 then plop the splits into whatever uFrame that happens to work out to,
 even if that means that, eg, an interval 1 iso request might be
 serviced in substantially different portions of the actual full-speed
 frame.

So with interval 1 (one packet per frame), two consecutive packets
might go into uframes 7 then 0?  That'd be more like interval 1/8, and
the next might be 15/8, and the next ... the notion was avoiding such
relatively non-isochronous behaviors.


Yes.  The problem is that scheduling the high-speed frame with large
random gaps (so that isos that exist in all of 256 or 512 or 1024, etc,
frames always start in the same uframe) places additional pressure on
the scheduler, and complicates counting how many FS usecs are already
budgeted.  Even if we restrict all the sitds of a given iso stream to
be scheduled in the same uframe, they're not guaranteed to actually
start in the same part of the uframe... or even that uframe for that
matter; the TT is free to delay actually starting the transfer for up
to two additional uframes anyway.  Given the already substantial
shifting going on, I had to wonder if eliminating the 'always the same
uframe' restriction was really all that much of a practical
difference... or what the spec actually has to say on the matter.  It
seems odd there'd be no official suggestion or mandated behavior.


Another concern:  such irregular schedules would facilitate hitting
configurations which need rebalancing to reclaim space wasted by misshapen
holes.  We have no rebalancing code (yet?); not that we seem to have hit
bandwidth limitations other than sub-optimal hardware, but it'd seem to be
good to avoid such issues by design.


I had thought of this and had assumed rebalancing code was in place.
Given that it is not, it looks like I have to deal with holes
anyway or also write rebalancing code :-(


By the way, one of the next frontiers for high speed scheduling will
be making it more independent of EHCI.  There's highspeed silicon that
doesn't use EHCI, and has an even stronger need for software scheduling
because the host talks to hardware at the level of a USB FIFO.  EHCI
effectively does fifo scheduling in hardware, but sometimes it must
be done in software.


Well, so far what I'm doing is all at a level above the silicon so the
conceptual implementation is abstract.

Monty




Re: [linux-usb-devel] ehci-sched.c questions

2006-05-09 Thread David Brownell
On Wednesday 03 May 2006 1:03 pm, Christopher Montgomery wrote:
 I've done some of the requisite homework myself for getting full speed
 audio devices to work properly through ehci and TTs on a 2.0 hub
 (playback *and* capture).  At this point it all works although my
 changes are not production-ready.

Good; the ISO stuff has always been a bit iffy, in that some things
have worked and others haven't.  Capture has been iffiest.


 I have three specific questions for now.
 
 The primary reason DIR_IN iso modes didn't work comes down to schedule
 mask bugs. However, there's also an explicit test to see that IN
 csplits are not wrapped into a following frame; if they would have to
 be, the schedule attempt is rejected.  I see no reason for this
 restriction anywhere in the spec docs (and indeed, ISO IN can't work
 properly unless the restriction is removed; the 'start capture' OUT
 plus the 'get capture data' IN don't fit in one frame for stereo
 capture).  AFAICT, the restriction was there to hide/prevent a c_mask
 overflow.

And the question is .. why?  Probably because what you see is the
very first implementation for the split iso support, and the goal
was to have something work at all (for the test devices at hand,
mostly USB speakers) and defer the more complex scheduling issues
until later when they were better understood.

So that restriction is a simplification.  And you're wrong about
the "can't work"; the start split plus N* complete split messages
can certainly fit into a single frame in cases like SSPLIT in uframe 0
and needing only a couple CSPLITs.
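
A minimal sketch of the mask arithmetic behind that point, not the driver's actual code.  The struct and function names are hypothetical, and it assumes the complete-splits begin two uframes after the start-split and occupy consecutive uframes:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container; bit N of each mask means "uframe N". */
struct split_masks {
	uint8_t  smask;  /* uframe carrying the start-split */
	uint32_t cmask;  /* complete-splits; bits 8 and up spill into the next frame */
};

/*
 * Assumption: complete-splits begin two uframes after the start-split
 * and occupy n_csplits consecutive uframes.
 */
struct split_masks build_in_masks(unsigned ss_uframe, unsigned n_csplits)
{
	struct split_masks m;

	assert(ss_uframe < 8);
	assert(n_csplits >= 1 && n_csplits <= 8);

	m.smask = (uint8_t)(1u << ss_uframe);
	m.cmask = ((1u << n_csplits) - 1u) << (ss_uframe + 2);
	return m;
}

/* Nonzero if any complete-split lands past uframe 7 of the same frame. */
int csplits_wrap(struct split_masks m)
{
	return (m.cmask & ~0xffu) != 0;
}
```

An SSPLIT in uframe 0 with two CSPLITs yields a cmask covering uframes 2 and 3 and stays inside the frame.  An SSPLIT in uframe 5 with three CSPLITs needs uframes 7, 8 and 9, so it wraps, and a scheduler with the no-wrap restriction would reject it.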


 Second question: The official USB 2.0 spec shows no examples that do
 this, but the Intel EHCI spec examples all pipeline transfers heavily
 by overlapping the final c-split of the current transfer and the
 initial s-split of the following transfer into the same uframe.  I've
 implemented the same strategy myself and it works as the Intel docs
 suggest... but is there a reason the preexisting scheduler does not do
 this?  Is there a contention issue I'm not seeing (eg, is this only
 working because my final csplit and overlapping initial ssplit are
 small enough to fit in one uframe, but it's possible that wouldn't
 work in the future?  Or is the final c-split always padding anyway?)

That sort of pipelining was always on the to do list, but the
initial scheduling for periodic split transfers (iso and interrupt)
was simplistic, in order to get something working.

I have some patches from Dan Streetman that I need to go back and
re-review.  As you implied, the USB 2.0 spec and the EHCI spec are
not quite in lockstep ... not that there's necessarily any kind of
conflict, but split iso scheduling is not at all simple, and time
has been lacking.  (I'll forward his patches to you, with some
comments about one issue I need to make time to investigate.)


 Third: In ehci-sched.c:check_intr_sched():
 
 if (qh->c_usecs && uframe >= 6){   /* FSTN territory? */
   goto done;
 }
 
 Huh?  Reject any csplit that falls into uframe 6 or 7 for an intr? 
 That doesn't make any sense (yes, a s-split can't fall into uframe 6,
 but that's something different that, ironically, the code doesn't
 check for).

At this point I don't recall what was going on there.  Likely it's
the same sort of "take some shortcuts to get something working"
thing I mentioned already.  That might even predate the existence
of the interrupt schedule tree.

- Dave





Re: [linux-usb-devel] ehci-sched.c questions

2006-05-09 Thread Christopher Montgomery

Hi David,

I thought my post got lost; I never saw it appear and hadn't seen
replies.  Scratch that, I was *hoping* it got lost, as more code
research later showed each question was at least partially
nonsensical.

At this point, the actual ISO transfer machinery all looks perfectly
solid.  The problem is entirely the scheduler.  It's not that the
scheduler is buggy, it's that it burns burns burns bandwidth as if
there's an infinite number of uFrames to stick requests into.  Once
it's faced with, eg, full duplex audio using two endpoints each with a
data interval of one, it can't usually fit both requests into the
Hframe cmask/ssmask despite the fact they'd fit comfortably in half of
one full-speed frame.

But this we know.
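
As a rough illustration of why pure bitmask scheduling runs out of room, here is a sketch that assumes each endpoint exclusively owns every uframe named in its start-split or complete-split mask.  That is a simplification of what the scheduler checks, and the struct and function names are hypothetical:

```c
#include <assert.h>

/* Hypothetical per-endpoint masks; bit N means "uframe N of the frame". */
struct ep_masks {
	unsigned smask;  /* uframes holding start-splits */
	unsigned cmask;  /* uframes holding complete-splits */
};

/*
 * Under exclusive ownership, two endpoints behind the same TT collide
 * as soon as they name any uframe in common, no matter how little bus
 * time either one actually uses inside that uframe.
 */
int masks_collide(struct ep_masks a, struct ep_masks b)
{
	/* Every split endpoint needs at least one start-split. */
	assert(a.smask != 0 && b.smask != 0);

	return ((a.smask | a.cmask) & (b.smask | b.cmask)) != 0;
}
```

With only eight bits per frame and several uframes consumed per split transaction, two interval-1 endpoints collide unless their masks are completely disjoint, however short the transfers themselves are.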


 The primary reason DIR_IN iso modes didn't work comes down to schedule
 mask bugs.

And the question is .. why?


Actually, I retract this.  The schedule mask code is not buggy, it's
just, as you say, an initial design.


That sort of pipelining was always on the to do list, but the
initial scheduling for periodic split transfers (iso and interrupt)
was simplistic, in order to get something working.

I have some patches from Dan Streetman that I need to go back and
re-review.  As you implied, the USB 2.0 spec and the EHCI spec are
not quite in lockstep ... not that there's necessarily any kind of
conflict, but split iso scheduling is not at all simple, and time
has been lacking.  (I'll forward his patches to you, with some
comments about one issue I need to make time to investigate.)


I've also been reviewing the schedulers for other OSes to help with
spec understanding.  They primarily schedule by counting microseconds
used in uFrames and full speed frames.  SSPLITS and CSPLITS overlap
freely (so long as each endpoint only sees one per frame and the other
restrictions are obeyed).
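
A minimal sketch of that microsecond-counting approach, not taken from any particular OS.  The names are hypothetical, and the 100 us ceiling assumes the usual limit of 80% of a 125 us uframe for periodic traffic:

```c
#include <assert.h>

#define UFRAMES 8
#define UFRAME_PERIODIC_MAX_USECS 100  /* assumed: 80% of a 125 us uframe */

/* Hypothetical per-frame ledger of periodic bus time already budgeted. */
struct uframe_budget {
	unsigned used_usecs[UFRAMES];
};

/*
 * Try to budget `usecs` of bus time in `uframe`.  Returns 1 and records
 * the claim if it fits, 0 (leaving the ledger untouched) if it does not.
 * Several transactions may share a uframe as long as the total fits.
 */
int budget_claim(struct uframe_budget *b, unsigned uframe, unsigned usecs)
{
	assert(uframe < UFRAMES);

	if (b->used_usecs[uframe] + usecs > UFRAME_PERIODIC_MAX_USECS)
		return 0;
	b->used_usecs[uframe] += usecs;
	return 1;
}
```

Unlike an exclusive bitmask, this lets an SSPLIT for one endpoint and a CSPLIT for another land in the same uframe, and a request is rejected only when the time actually runs out.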


 Third: In ehci-sched.c:check_intr_sched():

 if (qh->c_usecs && uframe >= 6){   /* FSTN territory? */
   goto done;
 }

 Huh?  Reject any csplit that falls into uframe 6 or 7 for an intr?
 That doesn't make any sense (yes, a s-split can't fall into uframe 6,
 but that's something different that, ironically, the code doesn't
 check for).

At this point I don't recall what was going on there.  Likely it's
the same sort of "take some shortcuts to get something working"
thing I mentioned already.  That might even predate the existence
of the interrupt schedule tree.


[After more meditation] the code is for avoiding the need to
potentially schedule an FSTN, but I'm still not quite sure that just
checking uFrame number against '6' was sufficient... I'd have thought
'4' was correct, or you'd have to look at the highest bit in the mask,
or count microseconds.

Popping back up in the conversation, I have a device that will never
fit all it needs to do in one Hframe using the bitmask scheduling
method.  It needs to 'overlap' too tightly with two interval 1
endpoints and two interval 32 endpoints all in the periodic schedule. 
The timing fits (obviously, it works fine under UHCI), but not if you
can't overlap transactions.

So, send me that patch and I'll read up on the spec sections detailing
budgeting timing.  I need this bugger to work, or I need to buy a
MacMini ;-)

Monty

