Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-23 Thread Patrick McHardy
jamal wrote:
> On Tue, 2006-20-06 at 18:51 +0200, Patrick McHardy wrote:
> 
>> [..]
>>
>>contrary to a local link that would be best managed
>>in work-conserving mode. And I think for better accuracy it is
>>necessary to manage effective throughput, especially if you're
>>interested in guaranteed delays.
>>
> 
> 
> Indeed - but "fixing" the scheduler to achieve such management is not
> the first choice (would be fine if it is generic and non-intrusive)

I have a patch that introduces "sizetables" similar to ratetables;
it performs the mapping once and stores the calculated size in the
cb. The schedulers take the size from the cb. It's not very large and
has only minimal overhead. I got distracted during testing by
inaccuracies in the 100mbit range with small packets caused by the
clock source resolution, so I've added a ktime() clocksource and am
currently busy auditing for integer overflows caused by the increased
clock rate. I'll clean up the patch once I'm done with that and post it.

>> [..]
>>
>>I think that point can be used to argue in favour of that Linux should
>>be able to manage effective throughput :)
>>
> 
> I think you have convinced me this is valuable I even suggest probes
> above to discover goodput;-). I hope i have convinced you how rude it
> would be to make extensive changes to compensate for goodput;->

Sure :) So far I haven't been able to measure any improvement by
accounting for link layer overhead, but probably because my test
scenario was chosen badly (very small overhead, large speed) and the
differences were lost in the noise.

>>>I am saying that #2 is the choice to go with hence my assertion earlier,
>>>it should be fine to tell the scheduler all it has is 1Mbps and nobody
>>>gets hurt. #1 if i could do it with minimal intrusion and still get to
>>>use it when i have 802.11g. 
>>>
>>>Not sure i made sense.
>>
>>HFSC is actually capable of handling this quite well. If you use it
>>in work-conserving mode (and the card doesn't do (much) internal
>>queueing) it will get clocked by successful transmissions. Using
>>link-sharing classes you can define proportions for use of available
>>bandwidth, possibly with upper limits. No hacks required :)
>>
> 
> 
> HFSC sounds very interesting - I should go and study it a little more.
> My understanding is though that it is a bit of a CPU pig, true?

It does more calculations at runtime than token-bucket based schedulers,
but it performs comparably to HTB with a large number of classes,
in which case the constant overhead is probably not visible anymore
because much more time is spent searching, walking lists and trees and
so on. I didn't do any comparisons of constant costs.

>>Anyway, this again goes more in the direction of handling link speed
>>changes.
>>
> 
> 
> The more we discuss this, the more i think they are the same thing ;->

Not really. Link speed changes can be expressed by constant factors
that apply to bandwidth and delay (bandwidth *= f, delay /= f). Link
layer overhead usually can't be expressed this way.
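
As a small illustration of the constant-factor case (a sketch only, not
taken from any posted patch; struct link_params and scale_link() are
made-up names), a link speed change by f = num/den could be applied
like this:

```c
#include <stdint.h>

/* Hypothetical parameters of a shaped link (names are illustrative only). */
struct link_params {
	uint64_t bandwidth_bps;	/* configured bandwidth, bits per second */
	uint64_t delay_us;	/* delay bound, microseconds */
};

/*
 * Apply a link speed change by the constant factor f = num/den:
 * bandwidth *= f, delay /= f.  Integer arithmetic only.
 */
static struct link_params scale_link(struct link_params p,
				     uint32_t num, uint32_t den)
{
	struct link_params r;

	r.bandwidth_bps = p.bandwidth_bps * num / den;
	r.delay_us = p.delay_us * den / num;
	return r;
}
```

ATM framing does not fit this shape: with the mapping discussed in this
thread a 48-byte packet occupies 53 bytes on the wire while a 49-byte
packet occupies 106, so no single f covers both.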

>>>ip dev add compensate_header 100 bytes
>>
>>[...]
>>
>>Unfortunately I can't think of a way to handle the ATM case without
>>a division .. or iteration.
>
> 
> I am not thinking straight right now but it does not sound like a big
> change to me i.e within reason.

I also got rid of the division ..
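
The thread does not say how the division was removed. Purely as an
illustration of one generic option, and not a description of the patch,
the cell count ceil(len / 48) can be computed with a multiply and a
shift. The function names and the constant below are assumptions made
for this sketch:

```c
#include <stdint.h>

#define ATM_CELL_SIZE		53	/* bytes on the wire per cell */
#define ATM_CELL_PAYLOAD	48	/* payload bytes per cell */

/*
 * Number of cells needed for len bytes, i.e. ceil(len / 48), computed
 * with a multiply and a shift instead of a division.
 * 43691 is ceil(2^21 / 48); the result is exact for len + 47 < 131072.
 */
static uint32_t atm_cells_nodiv(uint32_t len)
{
	return (uint32_t)((((uint64_t)len + ATM_CELL_PAYLOAD - 1) * 43691u) >> 21);
}

/* Wire length in bytes; an empty packet is counted as one cell. */
static uint32_t atm_wire_len_nodiv(uint32_t len)
{
	uint32_t cells = atm_cells_nodiv(len);

	return (cells ? cells : 1) * ATM_CELL_SIZE;
}
```

The range limit matters: the approximation error grows with len, so the
trick is only safe for packet sizes well below 128 KiB, which covers
typical MTUs.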

> Note, it may be valuable to think of
> this as related to the speed changing daemon as i stated earlier.
> Only in this case it is "static" discovery of link layer
> goodput/throughput vs some other way to dynamically discover things. 

I still think these are two quite different things. Link speed
changes also can't be handled very well by scaling packet sizes
since TBF-based qdiscs have configured maxima for the packet sizes
they can handle.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-22 Thread jamal
On Tue, 2006-20-06 at 18:51 +0200, Patrick McHardy wrote:
> jamal wrote:

> > The issue really is whether Linux should be interested in the
> > throughput it is told about or the goodput (also known as effective
> > throughput) the service provider offers. Two different issues by
> > definition. 

> In the case of PPPoE non-work-conserving qdiscs are already used
> to manage a link that is non-local with knowledge of its
> bandwidth, 

I think that is a different issue though- you are managing a
point-to-point link then you will be working under the assumption of 
throughput not goodput. 

If you had knowledge of the goodput you should use that for a working
assumption; i think in practice that approach is valuable.
My argument is against trying to make complex changes to compensate
the scheduler for such changes. Therefore i am not feeling sorry
for the poor guy who has to go and tell their PPP device "bandwidth is
only 1Mbps" when their ISP is claiming it is 2Mbps.

The ADSL case i have seen thus far is you trying to manage something
because a BRAS 3-4 hops down the path uses ATM. To use my earlier
example the argument is no different than saying 3-4 hops downlink
there is a wireless link which is 20% lossy. Armed with knowledge
like that you can tell something to the scheduler to resolve things.
The daemon in user space for example could be sending bandwidth
measuring probes and telling the kernel of the new goodput.

> contrary to a local link that would be best managed
> in work-conserving mode. And I think for better accuracy it is
> necessary to manage effective throughput, especially if you're
> interested in guaranteed delays.
> 

Indeed - but "fixing" the scheduler to achieve such management is not
the first choice (would be fine if it is generic and non-intrusive)

> >>>Yes, Linux cant tell if your service provider is lying to you.
> >>
> >>I wouldn't call it lying as long as they don't say "1.5mbps IP
> >>layer throughput". 
> > 
> > 
> > It is a scam for sure.
> > By definition of what throughput is - you are telling the truth; just
> > not the whole truth. Most users think in terms of goodput and not
> > throughput. 
> > i.e you are not telling the whole truth by not saying "it is 1.5Mbps ATM
> > throughput". Typically not an issue until somebody finds that by leaving
> > out "ATM" you meant throughput and not goodput. 
> 
> 
> I think that point can be used to argue in favour of that Linux should
> be able to manage effective throughput :)
> 

I think you have convinced me this is valuable I even suggest probes
above to discover goodput;-). I hope i have convinced you how rude it
would be to make extensive changes to compensate for goodput;->

> > I am saying that #2 is the choice to go with hence my assertion earlier,
> > it should be fine to tell the scheduler all it has is 1Mbps and nobody
> > gets hurt. #1 if i could do it with minimal intrusion and still get to
> > use it when i have 802.11g. 
> > 
> > Not sure i made sense.
> 
> HFSC is actually capable of handling this quite well. If you use it
> in work-conserving mode (and the card doesn't do (much) internal
> queueing) it will get clocked by successful transmissions. Using
> link-sharing classes you can define proportions for use of available
> bandwidth, possibly with upper limits. No hacks required :)
> 

HFSC sounds very interesting - I should go and study it a little more.
My understanding is though that it is a bit of a CPU pig, true?

> Anyway, this again goes more in the direction of handling link speed
> changes.
> 

The more we discuss this, the more i think they are the same thing ;->


> > ip dev add compensate_header 100 bytes
> 
> Something like that, but it's a bit more complicated.
> For ATM we need some mapping:
> [0-48]  -> 53
> [49-96] -> 106
> ...
> 
> for Ethernet we need:
> [0-60] -> 64
> [60-n] -> n + 4
> 

an upper bound check against MTU would be reasonable. 

> We could do something like this (feel free to imagine nicer names):
> 

The name should reflect that the table exists to "compensate for
goodput".

> ATM:
> table = {
>   .step = 53,
>   .map = {
>   [0..48] = 53,
>   [49..96] = 106,
>   ...
>   }
> };
> 
> Requiring a table of size 32 for typical MTUs.
> 
> Ethernet:
> 
> table = {
>   .step = 60,
>   .map = {
>   [0..60] = 60,
>   [...] = 0,
>   },
>   .fixed_overhead = 4,
> };
> 
> static inline unsigned int
> skb_wire_len(struct sk_buff *skb, struct net_device *dev)
> {
>   unsigned int idx, len;
> 
>   if (dev->lengthtable == NULL)
>           return skb->len;
>   idx = skb->len / dev->lengthtable->step;
>   len = dev->lengthtable->map[idx];
>   return dev->lengthtable->fixed_overhead + (len ? len : skb->len);
> }
> 
> Unfortunately I can't think of a way to handle the ATM case without
> a division .. or iteration.
> 

I am not thinking straight right now but it does not sound like a big
change to me i.e within reason.

Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-20 Thread Patrick McHardy
jamal wrote:
> On Tue, 2006-20-06 at 16:45 +0200, Patrick McHardy wrote:
> 
>>Actually in the PPPoE case Linux doesn't know about ethernet
>>headers either, since shaping is usually done on the PPP device.
>>But that doesn't really matter since the ethernet link is not
>>the bottleneck - although it does add some delay for packetization.
> 
> 
> good point. But one could argue that is within linux (local) as opposed
> to something downstream at the ISP i.e. i have knowledge of it and i
> could do clever things. The other is: I have to know that the ISP is
> using pigeons as the link layer downstream and compensate for it.
> 
> The issue really is whether Linux should be interested in the
> throughput it is told about or the goodput (also known as effective
> throughput) the service provider offers. Two different issues by
> definition. 


In the case of PPPoE non-work-conserving qdiscs are already used
to manage a link that is non-local with knowledge of its
bandwidth, contrary to a local link that would be best managed
in work-conserving mode. And I think for better accuracy it is
necessary to manage effective throughput, especially if you're
interested in guaranteed delays.

>>>Yes, Linux cant tell if your service provider is lying to you.
>>
>>I wouldn't call it lying as long as they don't say "1.5mbps IP
>>layer throughput". 
> 
> 
> It is a scam for sure.
> By definition of what throughput is - you are telling the truth; just
> not the whole truth. Most users think in terms of goodput and not
> throughput. 
> i.e you are not telling the whole truth by not saying "it is 1.5Mbps ATM
> throughput". Typically not an issue until somebody finds that by leaving
> out "ATM" you meant throughput and not goodput. 


I think that point can be used to argue that Linux should
be able to manage effective throughput :)

>>Ethernet doesn't provide 100mbit IP layer
>>throughput either, and with minimum sized IP packets its actually
>>well below that.
>
> 
> OTOH, nobody has ethernet MTUs of 64 bytes.


Sure, but I might not want my HFSC class with guaranteed delay of 140us
to be disturbed by someone sending small packets that need more time
on the wire than HFSC thinks.
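
To put rough numbers on this, here is a minimal sketch that compares the
time a scheduler would account for when sizing by skb->len with the time
the frame needs under the Ethernet mapping used in this thread
([0-60] -> 64, [60-n] -> n + 4). The function names are made up for the
example, and preamble and inter-frame gap are ignored:

```c
#include <stdint.h>

/* Wire length per the Ethernet mapping in this thread:
 * pad to 60 bytes, then add 4 bytes. */
static uint32_t eth_wire_len(uint32_t len)
{
	return (len < 60 ? 60 : len) + 4;
}

/* Transmission time in nanoseconds for `bytes` at `rate_bps` bits/s. */
static uint64_t tx_time_ns(uint32_t bytes, uint64_t rate_bps)
{
	return (uint64_t)bytes * 8 * 1000000000ull / rate_bps;
}
```

At 100mbit a 40-byte packet is accounted as 3200ns when sized by
skb->len, but takes 5120ns on the wire under this mapping, so a burst of
small packets eats into a 140us bound faster than the scheduler expects.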

> To be academic and pedantic: The schedulers should be focusing on
> throughput and not goodput.
> Look at it from another angle related to the nature of the link layer
> used:
> If i buy a 1.5 Mbps 802.11JHS (such a link layer technology doesnt
> exist, but assume for the sake of argument it does) from a wireless
> service provider, ethernet headers etc - but in this case the link is so
> bad (because of the link layer technology) i have to retransmit so much
> that 0.5 Mbps is wasted on retransmits, the question becomes: 
> 1)Do i fix the scheduler to compensate for this link layer retransmit?
> or
> 2)Do i find some other creative way to tell the scheduler that
> without making any changes to it that my ftp (despite the retransmits)
> should only chew 100Kbps.?
> 
> I am saying that #2 is the choice to go with hence my assertion earlier,
> it should be fine to tell the scheduler all it has is 1Mbps and nobody
> gets hurt. #1 if i could do it with minimal intrusion and still get to
> use it when i have 802.11g. 
> 
> Not sure i made sense.

HFSC is actually capable of handling this quite well. If you use it
in work-conserving mode (and the card doesn't do (much) internal
queueing) it will get clocked by successful transmissions. Using
link-sharing classes you can define proportions for use of available
bandwidth, possibly with upper limits. No hacks required :)

Anyway, this again goes more in the direction of handling link speed
changes.

>>A non intrusive way is preferred of course, but I can't really see
>>one if you want more than just a special-case solution that only
>>covers qdiscs using rate-tables and even ignores inner qdiscs.
>>HFSC and SFQ for example both need to calculate the wire length
>>at runtime.
>>
> 
> Agreed. That would be equivalent to #1 above.
> 
> 
>>Handling all qdiscs would mean adding a pointer to a mapping table
>>to struct net_device and using something like "skb_wire_len(skb, dev)"
>>instead of skb->len in the queueing layer. 
> 
> 
> That does seem sensible and simpler. I would suspect then that you will
> do this one time with something like
> ip dev add compensate_header 100 bytes

Something like that, but it's a bit more complicated.
For ATM we need some mapping:
[0-48]  -> 53
[49-96] -> 106
...

for Ethernet we need:
[0-60] -> 64
[60-n] -> n + 4

We could do something like this (feel free to imagine nicer names):

ATM:
table = {
.step = 53,
.map = {
[0..48] = 53,
[49..96] = 106,
...
}
};

Requiring a table of size 32 for typical MTUs.

Ethernet:

table = {
.step = 60,
.map = {
[0..60] = 60,
[...] = 0,
},
.fixed_overhead = 4,
};

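static inline unsigned int
skb_wire_len(struct sk_buff *skb, struct net_device *dev)
{
  unsigned int idx, len;

  if (dev->lengthtable == NULL)
          return skb->len;
  idx = skb->len / dev->lengthtable->step;
  len = dev->lengthtable->map[idx];
  return dev->lengthtable->fixed_overhead + (len ? len : skb->len);
}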
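
The table-driven lookup proposed above can be tried out in userspace.
The following is a sketch under assumptions, not kernel code: struct
lengthtable, its nslots field and wire_len() are stand-ins invented for
this example, with a bounds check on the table index and the overhead
added outside the conditional:

```c
#include <stddef.h>

/* Userspace stand-in for the proposed per-device table (hypothetical layout). */
struct lengthtable {
	unsigned int step;		/* packet bytes covered by one map slot */
	unsigned int fixed_overhead;	/* bytes added to every packet */
	unsigned int nslots;		/* number of entries in map[] */
	const unsigned int *map;	/* wire length per slot, 0 = use len */
};

static unsigned int wire_len(unsigned int len, const struct lengthtable *t)
{
	unsigned int idx, mapped;

	if (t == NULL || t->step == 0)
		return len;
	idx = len / t->step;
	/* Past the end of the table (e.g. above the MTU): no mapping. */
	mapped = idx < t->nslots ? t->map[idx] : 0;
	/* Parenthesised so the overhead is added to whichever length is used. */
	return t->fixed_overhead + (mapped ? mapped : len);
}

/* Ethernet example from the thread: pad to 60, then add 4 bytes. */
static const unsigned int eth_map[] = { 60 };
static const struct lengthtable eth_table = {
	.step = 60, .fixed_overhead = 4, .nslots = 1, .map = eth_map,
};
```

With eth_table, lengths below 60 map to 64 and a 1500-byte packet maps
to 1504, matching the Ethernet mapping given earlier in the message.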

Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-20 Thread jamal
On Tue, 2006-20-06 at 16:45 +0200, Patrick McHardy wrote:
> jamal wrote:

[..]

> Actually in the PPPoE case Linux doesn't know about ethernet
> headers either, since shaping is usually done on the PPP device.
> But that doesn't really matter since the ethernet link is not
> the bottleneck - although it does add some delay for packetization.

good point. But one could argue that is within linux (local) as opposed
to something downstream at the ISP i.e. i have knowledge of it and i
could do clever things. The other is: I have to know that the ISP is
using pigeons as the link layer downstream and compensate for it.

The issue really is whether Linux should be interested in the
throughput it is told about or the goodput (also known as effective
throughput) the service provider offers. Two different issues by
definition. 

> > Yes, Linux cant tell if your service provider is lying to you.
> 
> I wouldn't call it lying as long as they don't say "1.5mbps IP
> layer throughput". 

It is a scam for sure.
By definition of what throughput is - you are telling the truth; just
not the whole truth. Most users think in terms of goodput and not
throughput. 
i.e you are not telling the whole truth by not saying "it is 1.5Mbps ATM
throughput". Typically not an issue until somebody finds that by leaving
out "ATM" you meant throughput and not goodput. 

> Ethernet doesn't provide 100mbit IP layer
> throughput either, and with minimum sized IP packets its actually
> well below that.
> 

OTOH, nobody has ethernet MTUs of 64 bytes.

> >>The issue here is, that ATM does not have fixed overhead (due to alignment 
> >>and padding).  This means that a fixed reduction of the bandwidth is not 
> >>the solution.  We could reduce the bandwidth to the worst-case overhead, 
> >>which is 62%, I do not think that is a good solution...
> >>
> > 
> > I dont see it as wrong to be honest with you. Your mileage may vary.
> 
> Its wasteful, and it can be avoided.
> 

If it can be avoided by being generic and without being intrusive, then
by all means.

> > Dont have time to read your doc and dont get me wrong, there is a
> > "quark" practical problem: As practical as the hard disk manufacturer
> > who claims that they have 11G drive when it is 10G. It needs to be
> > resolved - but not in an intrusive way in my opinion.
> 
> Not sure what a "quark" problem is .. but I think you're focusing
> too much on the aspect of "somebody is lying, not our fault".

No no - that is not my intent; sorry if it comes out that way. 
I am saying there is a practical "problem". The problem being someone is
equating throughput to effective throughput (also known as goodput).

To be academic and pedantic: The schedulers should be focusing on
throughput and not goodput.
Look at it from another angle related to the nature of the link layer
used:
If i buy a 1.5 Mbps 802.11JHS (such a link layer technology doesnt
exist, but assume for the sake of argument it does) from a wireless
service provider, ethernet headers etc - but in this case the link is so
bad (because of the link layer technology) i have to retransmit so much
that 0.5 Mbps is wasted on retransmits, the question becomes: 
1)Do i fix the scheduler to compensate for this link layer retransmit?
or
2)Do i find some other creative way to tell the scheduler that
without making any changes to it that my ftp (despite the retransmits)
should only chew 100Kbps.?

I am saying that #2 is the choice to go with hence my assertion earlier,
it should be fine to tell the scheduler all it has is 1Mbps and nobody
gets hurt. #1 if i could do it with minimal intrusion and still get to
use it when i have 802.11g. 

Not sure i made sense.

> This is a real problem for any medium that adds link-layer headers.
> ATM is not even very special, the only thing special about it is
> that it has multiple "steps". But maybe I'm misunderstanding you,
> it has happened before :)
> 

I am not sure if i am making more sense now ;->

> A non intrusive way is preferred of course, but I can't really see
> one if you want more than just a special-case solution that only
> covers qdiscs using rate-tables and even ignores inner qdiscs.
> HFSC and SFQ for example both need to calculate the wire length
> at runtime.
> 

Agreed. That would be equivalent to #1 above.

> Handling all qdiscs would mean adding a pointer to a mapping table
> to struct net_device and using something like "skb_wire_len(skb, dev)"
> instead of skb->len in the queueing layer. 

That does seem sensible and simpler. I would suspect then that you will
do this one time with something like
ip dev add compensate_header 100 bytes

> That of course doesn't
> mean that we can't still provide pre-adjusted ratetables for qdiscs
> that use them.
> 

But what would the point be then if you can compensate as you did above?

Anyways, I have to go and meet The Man and i feel like i have hijacked
netdev this morning. So ttl.

cheers,
jamal


Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-20 Thread Patrick McHardy
jamal wrote:
> Heres the standard setup as i understand it(at least in north america, I
> know Europeans love their ATM with a little gravy on top):
> 
>
> |Linux| --ethernet-- |Modem| --DSL-- |DSLAM| --ATM-- |BRAS| 
> 
> 
> What this means is that Linux computes based on ethernet
> headers. Somewhere downstream ATM (refer to above) comes in and that
> causes mismatch in what Linux expects to be the bandwidth and what
> your service provider who doesnt account for the ATM overhead when
> they sell you "1.5Mbps".

Actually in the PPPoE case Linux doesn't know about ethernet
headers either, since shaping is usually done on the PPP device.
But that doesn't really matter since the ethernet link is not
the bottleneck - although it does add some delay for packetization.

> Yes, Linux cant tell if your service provider is lying to you.

I wouldn't call it lying as long as they don't say "1.5mbps IP
layer throughput". Ethernet doesn't provide 100mbit IP layer
throughput either, and with minimum sized IP packets its actually
well below that.

>>The patch is the solution to the classical problem people 
>>have when trying to configure traffic control on an ADSL link:
>>
>>Q: The packet scheduling does not work all the time?
>>A: Try to decrease the bandwidth.
>>
>>
>>The issue here is, that ATM does not have fixed overhead (due to alignment 
>>and padding).  This means that a fixed reduction of the bandwidth is not 
>>the solution.  We could reduce the bandwidth to the worst-case overhead, 
>>which is 62%, I do not think that is a good solution...
>>
> 
> I dont see it as wrong to be honest with you. Your mileage may vary.

It's wasteful, and it can be avoided.

> Dont have time to read your doc and dont get me wrong, there is a
> "quark" practical problem: As practical as the hard disk manufacturer
> who claims that they have 11G drive when it is 10G. It needs to be
> resolved - but not in an intrusive way in my opinion.

Not sure what a "quark" problem is .. but I think you're focusing
too much on the aspect of "somebody is lying, not our fault".
This is a real problem for any medium that adds link-layer headers.
ATM is not even very special, the only thing special about it is
that it has multiple "steps". But maybe I'm misunderstanding you,
it has happened before :)

A non intrusive way is preferred of course, but I can't really see
one if you want more than just a special-case solution that only
covers qdiscs using rate-tables and even ignores inner qdiscs.
HFSC and SFQ for example both need to calculate the wire length
at runtime.

Handling all qdiscs would mean adding a pointer to a mapping table
to struct net_device and using something like "skb_wire_len(skb, dev)"
instead of skb->len in the queueing layer. That of course doesn't
mean that we can't still provide pre-adjusted ratetables for qdiscs
that use them.



Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-20 Thread jamal

took lartc off the list because it doesnt allow me to post
and i refuse to subscribe.

On Mon, 2006-19-06 at 21:31 +0200, Jesper Dangaard Brouer wrote:
> 
> On Thu, 15 Jun 2006, jamal wrote:
> > It is probably doable by just looking at netdevice->type and figuring
> > the link layer technology. Totally in user space and building the
> > compensated for tables there before telling the kernel (advantage is no
> > kernel changes and therefore it would work with older kernels as well).
> 
> I think you have got the setup all wrong.
> 
> The linux middlebox/router has two ethernet interfaces, one of the 
> ethernet interfaces is connected to the ADSL modem.  Thus, the linux 
> ethernet card cannot determine that it is connected to an ADSL line.
> 

Actually you may be making my point for me.

Heres the standard setup as i understand it(at least in north america, I
know Europeans love their ATM with a little gravy on top):

   
|Linux| --ethernet-- |Modem| --DSL-- |DSLAM| --ATM-- |BRAS| 


What this means is that Linux computes based on ethernet
headers. Somewhere downstream ATM (refer to above) comes in and that
causes mismatch in what Linux expects to be the bandwidth and what
your service provider who doesnt account for the ATM overhead when
they sell you "1.5Mbps".
Reminds me of hard disk vendors who define 1K to be 1000 to show
how large their drives are.
Yes, Linux cant tell if your service provider is lying to you.

> 
> The patch is the solution to the classical problem people 
> have when trying to configure traffic control on an ADSL link:
> 
> Q: The packet scheduling does not work all the time?
> A: Try to decrease the bandwidth.
> 
>
> The issue here is, that ATM does not have fixed overhead (due to alignment 
> and padding).  This means that a fixed reduction of the bandwidth is not 
> the solution.  We could reduce the bandwidth to the worst-case overhead, 
> which is 62%, I do not think that is a good solution...
> 

I dont see it as wrong to be honest with you. Your mileage may vary.

> With the patch, you can now simply configure HTB to use the rate that was 
> specified by the ISP.
> 


Dont have time to read your doc and dont get me wrong, there is a
"quark" practical problem: As practical as the hard disk manufacturer
who claims that they have 11G drive when it is 10G. It needs to be
resolved - but not in an intrusive way in my opinion.

cheers,
jamal





Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-19 Thread Jesper Dangaard Brouer



On Thu, 15 Jun 2006, jamal wrote:

> On Thu, 2006-15-06 at 10:47 +1000, Russell Stuart wrote:
>> On Wed, 2006-06-14 at 11:57 +0100, Alan Cox wrote:
>>> The other problem I see with this code is it is very tightly tied to ATM
>>> cell sizes, not to solving the generic question of packetisation.
>>
>> Others have made this point also.  I can't speak for Jesper,
>> but I did consider making it generic.

I also considered making it generic, but chose to make my patch as
non-intrusive as possible to the kernel (and to handle as much in
userspace as possible).

Actually I do think that the kernel patch part is very generic.
The patch simply allows us to align the rate table/array.

With the kernel patch in place, we can work on the userspace TC program to
support more and more types of exotic link layer modeling.

>> The issue was that
>> doing so would add more code, but I don't personally know
>> of any real world situation that would use the generic
>> solution.  I didn't fancy the thought of arguing on these
>> lists for code that no one would actually use.

;-)

>> If someone could put up their hand and say "Hey, I need
>> this," then expanding the patch to accommodate them would
>> be a pleasure.  I like generic code too.
>
> It is probably doable by just looking at netdevice->type and figuring
> the link layer technology. Totally in user space and building the
> compensated for tables there before telling the kernel (advantage is no
> kernel changes and therefore it would work with older kernels as well).

I think you have got the setup all wrong.

The linux middlebox/router has two ethernet interfaces, one of the
ethernet interfaces is connected to the ADSL modem.  Thus, the linux
ethernet card cannot determine that it is connected to an ADSL line.

The patch is the solution to the classical problem people
have when trying to configure traffic control on an ADSL link:

Q: The packet scheduling does not work all the time?
A: Try to decrease the bandwidth.

The issue here is that ATM does not have fixed overhead (due to alignment
and padding).  This means that a fixed reduction of the bandwidth is not
the solution.  We could reduce the bandwidth to the worst-case overhead,
which is 62%, but I do not think that is a good solution...

With the patch, you can now simply configure HTB to use the rate that was
specified by the ISP.


Please read chapter 6 ("Achieving Queue Control"), pages 55-65, where I
demonstrate that the naive approach of reducing bandwidth does not work
when the packet distribution changes on the link.


 http://www.adsl-optimizer.dk/thesis/
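
A rough way to see why a fixed reduction cannot follow the packet
distribution is to compute, per packet size, how much of the ATM line
rate actually carries packet bytes. The sketch below is a simplification
for illustration only: it counts 48 payload bytes per 53-byte cell and
ignores the AAL5 trailer and any encapsulation headers, so its numbers
are not the ones from the thesis. The function names are made up:

```c
#include <stdint.h>

#define ATM_CELL_SIZE		53	/* bytes on the wire per cell */
#define ATM_CELL_PAYLOAD	48	/* payload bytes per cell */

/* Bytes sent on the ATM link for a packet of len bytes (len > 0). */
static uint32_t atm_wire_len(uint32_t len)
{
	uint32_t cells = (len + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;

	return cells * ATM_CELL_SIZE;
}

/* Share of the ATM line rate that carries packet bytes, in 1/1000ths.
 * Must not be called with len == 0. */
static uint32_t goodput_permille(uint32_t len)
{
	return len * 1000 / atm_wire_len(len);
}
```

Under these assumptions a 1500-byte packet uses about 88% of the line
rate for packet bytes, but a 49-byte packet only about 46%. A rate
reduction tuned for large packets therefore still overcommits the link
once small packets dominate.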

Cheers,
  Jesper Brouer

--
---
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
---


Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-15 Thread jamal
On Thu, 2006-15-06 at 10:47 +1000, Russell Stuart wrote:
> On Wed, 2006-06-14 at 11:57 +0100, Alan Cox wrote:
> > The other problem I see with this code is it is very tightly tied to ATM
> > cell sizes, not to solving the generic question of packetisation.
> 
> Others have made this point also.  I can't speak for Jesper,
> but I did consider making it generic.  The issue was that 
> doing so would add more code, but I don't personally know 
> of any real world situation that would use the generic 
> solution.  I didn't fancy the thought of arguing on these
> lists for code that no one would actually use.
> 
> If someone could put up their hand and say "Hey, I need
> this," then expanding the patch to accommodate them would
> be a pleasure.  I like generic code too.
> 

It is probably doable by just looking at netdevice->type and figuring
the link layer technology. Totally in user space and building the
compensated for tables there before telling the kernel (advantage is no
kernel changes and therefore it would work with older kernels as well).

cheers,
jamal





Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-14 Thread Russell Stuart
On Wed, 2006-06-14 at 11:57 +0100, Alan Cox wrote:
> The other problem I see with this code is it is very tightly tied to ATM
> cell sizes, not to solving the generic question of packetisation.

Others have made this point also.  I can't speak for Jesper,
but I did consider making it generic.  The issue was that 
doing so would add more code, but I don't personally know 
of any real world situation that would use the generic 
solution.  I didn't fancy the thought of arguing on these
lists for code that no one would actually use.

If someone could put up their hand and say "Hey, I need
this," then expanding the patch to accommodate them would
be a pleasure.  I like generic code too.


Russell



Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-14 Thread Jesper Dangaard Brouer
On Wed, 2006-06-14 at 11:57 +0100, Alan Cox wrote:
> Ar Mer, 2006-06-14 am 11:40 +0200, ysgrifennodd Jesper Dangaard Brouer:
> > option to calculate traffic transmission times (rate table)
> > over all ATM links, including ADSL, with perfect accuracy.
>
> The other problem I see with this code is it is very tightly tied to ATM
> cell sizes, not to solving the generic question of packetisation. 

Well, we did consider doing so, but we thought that it would be harder to
get it into the kernel.

Actually that's the reason for the defines:
 #define ATM_CELL_SIZE      53
 #define ATM_CELL_PAYLOAD   48

Changing these should make it possible to adapt to any other SAR
(Segmentation And Reassembly) link layer.

> I'm
> not sure if that matters but for modern processors I'm also sceptical
> that the clever computation is actually any faster than just doing the
> maths, especially if something cache intensive is also running.

I guess you are referring to the rate table lookup system, which is based
upon array lookups.  I do think that the rate table array lookup system
is outdated, as memory access is the bottleneck on modern CPUs.
But it was designed by Alexey a long time ago, when the hardware
restrictions were different.  It also avoids floating point operations in
the kernel.
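
For readers who have not looked at rate tables, the two approaches can
be put side by side in a small sketch. This is not the tc or kernel
code: the table size, the slot rounding and all function names are
assumptions for the example. It precomputes transmit times per size slot
with cell alignment, and also provides the direct integer computation:

```c
#include <stdint.h>

#define ATM_CELL_SIZE		53
#define ATM_CELL_PAYLOAD	48
#define RTAB_SLOTS		256	/* assumed table size */

/* Round a packet length up to whole ATM cells, in wire bytes. */
static uint32_t atm_align(uint32_t len)
{
	uint32_t cells = (len + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;

	return (cells ? cells : 1) * ATM_CELL_SIZE;
}

/* Direct computation: transmit time in microseconds, integers only. */
static uint32_t xmit_time_us(uint32_t wire_bytes, uint64_t rate_bps)
{
	return (uint32_t)((uint64_t)wire_bytes * 8 * 1000000 / rate_bps);
}

/*
 * Precompute one transmit time per slot of 2^cell_log bytes, using the
 * largest packet length in the slot after aligning it to cells.
 */
static void build_rtab(uint32_t rtab[RTAB_SLOTS], uint64_t rate_bps,
		       unsigned int cell_log)
{
	for (unsigned int i = 0; i < RTAB_SLOTS; i++) {
		uint32_t max_len = ((i + 1u) << cell_log) - 1;

		rtab[i] = xmit_time_us(atm_align(max_len), rate_bps);
	}
}

/* Table lookup on the fast path: a shift and an array read. */
static uint32_t rtab_lookup(const uint32_t rtab[RTAB_SLOTS], uint32_t len,
			    unsigned int cell_log)
{
	uint32_t idx = len >> cell_log;

	return rtab[idx < RTAB_SLOTS ? idx : RTAB_SLOTS - 1];
}
```

Because slots are 2^cell_log bytes wide while cells carry 48 bytes, a
slot can straddle a cell boundary. This sketch rounds up, so with 8-byte
slots a 48-byte packet is charged as two cells, whereas the direct
computation charges one.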

Thanks for your comments.

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network developer
  Cand. Scient Datalog / MSc.
  Author of http://adsl-optimizer.dk





Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-14 Thread Alan Cox
On Wed, 2006-06-14 at 11:40 +0200, Jesper Dangaard Brouer wrote:
> option to calculate traffic transmission times (rate table)
> over all ATM links, including ADSL, with perfect accuracy.


Only if the lowest level is encoded in a time linear manner. If you are
using NRZ, NRZI etc at the bottom level then you may still be out...



The other problem I see with this code is it is very tightly tied to ATM
cell sizes, not to solving the generic question of packetisation. I'm
not sure if that matters but for modern processors I'm also sceptical
that the clever computation is actually any faster than just doing the
maths, especially if something cache intensive is also running.

Alan



[PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-14 Thread Jesper Dangaard Brouer

The Linux traffic control engine inaccurately calculates
transmission times for packets sent over ADSL links.  For
some packet sizes the error rises to over 50%.  This occurs
because ADSL uses ATM as its link layer transport, and ATM
transmits packets in fixed-size 53-byte cells.
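
As a rough illustration of where an error of that size can come from,
the sketch below computes how far a scheduler underestimates the wire
size when it counts only the packet bytes.  It is a simplification made
for this example: it ignores the AAL5 trailer and any encapsulation
headers, which add further overhead, and the function name is
hypothetical.

```c
#include <assert.h>

#define ATM_CELL_SIZE     53   /* bytes on the wire per cell */
#define ATM_CELL_PAYLOAD  48   /* data bytes carried per cell */

/*
 * Hypothetical helper: percentage (rounded down) by which counting only
 * pkt_len bytes underestimates the bytes actually sent as ATM cells.
 */
static unsigned int atm_underestimate_pct(unsigned int pkt_len)
{
    unsigned int cells, wire;

    assert(ATM_CELL_PAYLOAD > 0);
    if (pkt_len == 0)
        return 0;

    cells = (pkt_len + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;
    wire = cells * ATM_CELL_SIZE;
    return (wire - pkt_len) * 100 / wire;
}
```

Under these assumptions a 48 byte packet is underestimated by 9%, a
1500 byte packet by 11%, and a 49 byte packet, which spills into a
second cell, by 53%.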

This changes the userspace tool iproute2/tc by adding an
option to calculate traffic transmission times (rate table)
over all ATM links, including ADSL, with perfect accuracy.

A longer presentation of the patch, its rationale, what it
does and how to use it can be found here:
   http://www.stuart.id.au/russell/files/tc/tc-atm/

An earlier version of the patch, and a _detailed_ empirical
investigation of its effects can be found here:
   http://www.adsl-optimizer.dk/

Signed-off-by: Jesper Dangaard Brouer <[EMAIL PROTECTED]>
Signed-off-by: Russell Stuart <[EMAIL PROTECTED]>
---

diff -Nurp iproute2.orig/include/linux/pkt_sched.h iproute2/include/linux/pkt_sched.h
--- iproute2.orig/include/linux/pkt_sched.h	2005-12-10 09:27:44.0 +1000
+++ iproute2/include/linux/pkt_sched.h	2006-06-13 11:53:27.0 +1000
@@ -77,8 +77,9 @@ struct tc_ratespec
 {
 	unsigned char	cell_log;
 	unsigned char	__reserved;
-	unsigned short	feature;
-	short		addend;
+	unsigned short	feature;	/* Always 0 in pre-atm patch kernels */
+	char		cell_align;	/* Always 0 in pre-atm patch kernels */
+	unsigned char	__unused;
 	unsigned short	mpu;
 	__u32		rate;
 };
diff -Nurp iproute2.orig/tc/m_police.c iproute2/tc/m_police.c
--- iproute2.orig/tc/m_police.c	2005-01-19 08:11:58.0 +1000
+++ iproute2/tc/m_police.c	2006-06-13 11:53:27.0 +1000
@@ -35,7 +35,7 @@ struct action_util police_action_util =
 static void explain(void)
 {
 	fprintf(stderr, "Usage: ... police rate BPS burst BYTES[/BYTES] [ mtu BYTES[/BYTES] ]\n");
-	fprintf(stderr, "[ peakrate BPS ] [ avrate BPS ]\n");
+	fprintf(stderr, "[ peakrate BPS ] [ avrate BPS ] [ overhead OVERHEAD ] [ atm ]\n");
 	fprintf(stderr, "[ ACTIONTERM ]\n");
 	fprintf(stderr, "Old Syntax ACTIONTERM := action [/NOTEXCEEDACT] \n");
 	fprintf(stderr, "New Syntax ACTIONTERM := conform-exceed [/NOTEXCEEDACT] \n");
@@ -134,7 +134,10 @@ int act_parse_police(struct action_util
 	__u32 ptab[256];
 	__u32 avrate = 0;
 	int presult = 0;
-	unsigned buffer=0, mtu=0, mpu=0;
+	unsigned buffer=0, mtu=0;
+	__u8 mpu=0;
+	__s8 overhead=0;
+	int atm=0;
 	int Rcell_log=-1, Pcell_log = -1;
 	struct rtattr *tail;
 
@@ -184,7 +187,7 @@ int act_parse_police(struct action_util
 				fprintf(stderr, "Double \"mpu\" spec\n");
 				return -1;
 			}
-			if (get_size(&mpu, *argv)) {
+			if (get_u8(&mpu, *argv, 10)) {
 				explain1("mpu");
 				return -1;
 			}
@@ -198,6 +201,18 @@ int act_parse_police(struct action_util
 				explain1("rate");
 				return -1;
 			}
+		} else if (strcmp(*argv, "overhead") == 0) {
+			NEXT_ARG();
+			if (p.rate.rate) {
+				fprintf(stderr, "Double \"overhead\" spec\n");
+				return -1;
+			}
+			if (get_s8(&overhead, *argv, 10)) {
+				explain1("overhead");
+				return -1;
+			}
+		} else if (strcmp(*argv, "atm") == 0) {
+			atm = 1;
 		} else if (strcmp(*argv, "avrate") == 0) {
 			NEXT_ARG();
 			if (avrate) {
@@ -264,22 +279,12 @@ int act_parse_police(struct action_util
 	}
 
 	if (p.rate.rate) {
-		if ((Rcell_log = tc_calc_rtable(p.rate.rate, rtab, Rcell_log, mtu, mpu)) < 0) {
-			fprintf(stderr, "TBF: failed to calculate rate table.\n");
-			return -1;
-		}
+		tc_calc_ratespec(&p.rate, rtab, p.rate.rate, Rcell_log, mtu, mpu, atm, overhead);
 		p.burst = tc_calc_xmittime(p.rate.rate, buffer);
-		p.rate.cell_log = Rcell_log;
-		p.rate.mpu = mpu;
 	}
 	p.mtu = mtu;
 	if (p.peakrate.rate) {
-		if ((Pcell_log = tc_calc_rtable(p.peakrate.rate, ptab, Pcell_log, mtu, mpu)) < 0) {
-			fprintf(stderr, "POLICE: failed to calculate peak rate table.\n");
-			return -1;
-		}
-		p.peakrate.cell_log = Pcell_log;
-		p.peakrate.mpu = mpu;
+