date:20060719

Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL (RTAB BUG)

2006-07-19 Thread Russell Stuart

On Thu, 2006-07-20 at 01:00 +0400, Alexey Kuznetsov wrote:
> Hello!

So you really do exist?  I thought it was just
rumour.

> Well, if fixed point arithmetics is not a problem.

It shouldn't be.  Any decimal number can be expressed
as a fraction, eg:

  0.00123 = 123/100000

Which can be calculated as a multiply and a divide. With
MTU's up to 2048, it should be possible to do this with
99.% accuracy (ie 2048/2^23).

With a bit more work in userspace (ie in tc), it can be
reduced to a multiply and a shift.
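
A minimal sketch of the multiply-and-shift idea, in plain C.  It
assumes tc would precompute a scaled multiplier for the configured
rate; the names fp_rate, fp_rate_init and fp_xmit_ns are made up for
illustration and are not part of tc or the kernel.

```c
#include <stdint.h>

/* Hypothetical fixed-point rate: nanoseconds per byte, scaled by 2^shift. */
struct fp_rate {
    uint32_t mult;   /* round(ns_per_byte * 2^shift), computed in userspace */
    uint8_t  shift;
};

/*
 * Userspace side: derive the multiplier from a rate in bytes per second.
 * The caller must pass a nonzero rate and pick a shift small enough that
 * the multiplier fits in 32 bits (assumption of this sketch).
 */
static struct fp_rate fp_rate_init(uint64_t bytes_per_sec, uint8_t shift)
{
    struct fp_rate r;
    uint64_t scaled = ((UINT64_C(1000000000) << shift) + bytes_per_sec / 2)
                      / bytes_per_sec;

    r.mult = (uint32_t)scaled;
    r.shift = shift;
    return r;
}

/* Per-packet side: one multiply and one shift, no table and no divide. */
static uint64_t fp_xmit_ns(const struct fp_rate *r, uint32_t len)
{
    return ((uint64_t)len * r->mult) >> r->shift;
}
```

For 125000 bytes/sec (1 Mbit/s) and a shift of 8, a 1500 byte packet
works out at 12000000 ns, ie 12 ms.  The shift has to be chosen per
rate so the multiplier stays within 32 bits.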

> Plus, remember, the function is not R*size, it is at least
> R*size+addend, to account for link overhead. Plus account for padding
> of small packets. Plus, when policing it should deaccount already added
> link headers, QoS counts only network payload.

Yes, it is flexible - and has served us well up until
now.  It doesn't work well for ATM, but with a small
bit of extra calculation in the kernel it could.
However, it turns out that ATM is a special case.  If 
ATM's cell payload was 58 bytes instead of 48 bytes 
(say), then it would not be possible to produce a RTAB 
that had small errors (eg < 10%) for smallish packet 
sizes (< 290 bytes).  I seem to have trouble 
explaining why in a concise way that people understand, 
so I won't try here.
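
For reference, a small sketch of the ATM framing being discussed: a
packet is carried in whole 53 byte cells, each holding 48 bytes of
payload.  The function name atm_wire_len and the single "overhead"
parameter are assumptions made for illustration only.

```c
#include <stdint.h>

#define ATM_CELL_PAYLOAD 48u   /* payload bytes per ATM cell */
#define ATM_CELL_SIZE    53u   /* payload plus 5 byte cell header */

/*
 * Bytes on the wire for a packet of `len` bytes plus `overhead` bytes of
 * encapsulation: round up to whole cells, then count full cell sizes.
 */
static uint32_t atm_wire_len(uint32_t len, uint32_t overhead)
{
    uint32_t cells = (len + overhead + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;

    return cells * ATM_CELL_SIZE;
}
```

The result is a step function: one extra byte of payload past a cell
boundary costs a whole extra 53 byte cell.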

So when Alan Cox said our ATM patch didn't solve the 
packetisation problem in general, he was right, as our
patch just built upon RTAB.  Patrick's STAB proposal 
doesn't solve it in general either for that matter, as
it is just another implementation of RTAB with the same
limitations.  The 
only way I can think of to solve it in general is to 
move many more calculations into the kernel - as I 
proposed in a long winded answer to Patrick earlier 
in this thread.

But doing so would get rid of the table implementation 
and the flexibility it has given us to date.  For that 
reason I feel uncomfortable with it.

The engineering decision becomes this - are there any
other protocols like ATM out there that could justify 
such a change?  (In my more cynical moments I think of 
it differently - has/is the world going to make a 
second engineering fuck up on the scale of ATM again?  
How on earth did anyone decide that pushing data 
packets over ATM, as happens in ADSL, was a good 
idea?)  I know of no other such protocols.  But then
I don't have an encyclopedic knowledge of comms
protocols, so that doesn't mean much.  I suspect you
know a good deal more about them than I do.  What say
you?

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: ipw2200: Driver lockup

2006-07-19 Thread Zhu Yi

On Wed, 2006-07-19 at 22:58 -0400, John W. Linville wrote:
> For what it is worth, there was a patch for this posted back
> in January.  It stirred-up a kerfluffle, so it never got merged.
> FWIW, it touches on 802.11e QoS and multiple TX queues -- my personal
> favorite wireless subject...NOT!
> 
> The thread is available here (first post not connected to follow-up
> thread for some reason):
> 
>   http://marc.theaimsgroup.com/?l=linux-netdev&m=113809246102858&w=2
>   http://marc.theaimsgroup.com/?l=linux-netdev&m=113814103024576&w=2
> 
> Given that half a year has passed, does anyone have any better ideas
> now?  Should I merge the patch?  Or is the cure worse than the disease?

The patch from Stefan Rompf in the second link has already been merged.
The first one was also merged (with slight differences). We can just
remove the ieee80211 warning now.

[PATCH] ieee80211: remove ieee80211_tx() is_queue_full warning

Signed-off-by: Zhu Yi <[EMAIL PROTECTED]>
---

--- a/net/ieee80211/ieee80211_tx.c
+++ b/net/ieee80211/ieee80211_tx.c
@@ -533,13 +533,6 @@ int ieee80211_xmit(struct sk_buff *skb, 
return 0;
}
 
-   if (ret == NETDEV_TX_BUSY) {
-   printk(KERN_ERR "%s: NETDEV_TX_BUSY returned; "
-  "driver should report queue full via "
-  "ieee_device->is_queue_full.\n",
-  ieee->dev->name);
-   }
-
ieee80211_txb_free(txb);
}
 

Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL

2006-07-19 Thread Russell Stuart

On Wed, 2006-07-19 at 16:50 +0200, Patrick McHardy wrote:
> Please excuse my silence, I was travelling and am still catching up
> with my mails.

Sorry.  Had I realised you were busy I would have
waited.

> > - As it stands, it doesn't help the qdiscs that use 
> >   RTAB.  So unless he proposes to remove RTAB entirely 
> >   the ATM patch as it stands will still have to go in.
> 
> Why? The length calculated by my STABs (or something similar)
> is used by _all_ qdiscs. Not only for transmission time calculation,
> but also for statistics and estimators.

Oh.  I didn't see where it is used for the time 
calculation in your patch.  Did I miss something,
or is that the unfinished bit?

This is possibly my stumbling block.  If you don't remove
RTAB the ATM patch as stands will be needed.  Your patch
didn't remove RTAB, and you didn't say it was intended to,
so I presume it wasn't going to.

>  If the length calculation
> doesn't fit for ATM, that can be fixed.

Yes of course.  Just to be clear: as far as I am concerned
this never was an issue.

> > - A bit of effort was put into making this current
> >   ATM patch both backwards and forwards compatible.
> >   Patrick's patch would work with newer kernels,
> >   obviously.  Older kernels, and in particular the
> >   kernel that Debian Etch is likely to distribute
> >   would miss out.
> 
> True, but it provides more consistency, and making current
> kernels behave better is more important than old kernels.

I guess provided the new "tc" works with older kernels this
is OK - although a disappointment to me.  "Works" here being
defined as "works as well as the previous version of tc does".
For me not working would be OK as well provided "tc" issued a 
warning message to the effect that it "needs kernel version 
XXX or above", but doing that would probably require it to 
look at the kernel version.  Looking at the kernel version 
in tc seems to be frowned upon.

> You seem to have misunderstood my patch. It doesn't need to
> touch RTABs, it just calculates the packet length as seen
> on the wire (wherever it is) and uses that throughout the
> entire qdisc layer.

No, you have it in reverse - as I said above.  My problem is 
that your patch does not touch RTAB.  Several qdiscs really 
don't care about the length of a packet (other than for 
keeping track of stats) - they just care about how long 
it takes to send.  Off the top of my head these are HTB, CBQ 
and TBF.  They use RTAB to make this calculation.  So unless
you replace RTAB with STAB the current ATM patch will still 
be needed.

> > One other point - the optimisation Patrick proposes
> > for STAB (over RTAB) was to make the number of entries
> > variable.  This seems like a good idea.  However there 
> > is no such thing as a free lunch, and if you did 
> > indeed reduce the number of entries to 16 for Ethernet 
> > (as I think Patrick suggested), then each entry would
> > cover 1500/16 = 93 different packet lengths.  Ie,
> > entry 0 would cover packet lengths 0..93, entry 1
> > 94..186, and so on.  A single entry can't be right
> > for all those packet lengths, so again we are back
> > to an average 30% error for typical VOIP length
> > packets.
> 
> My patch doesn't use fixed sized cells, so it can deal
> with anything, worst case is you use one cell per packet
> size. Optimizing size and lookup speed for ethernet makes
> a lot more sense than optimizing for ADSL.

I was just responding to a point you made earlier, when
you said STAB could only use 16 entries as opposed to the
256 used by RTAB.  I suspect nobody would actually do that 
because of the inaccuracy it creates, so the comparison is
perhaps unfair.  I agree the flexibility of making STAB 
variable length is a good idea, and comes at 0 cost in 
the kernel.

Andy Furniss wrote:
> > Russell Stuart wrote:
> >> The kernel will have to do a shift and a division
> >> for each packet, which I assume is permissible.
> > 
> > 
> > I guess that is for others to decide :-) I think Patrick has a point
> > about sfq/htb drr, Like you I guess, I thought that alot of extra per
> > packet calculations would have got an instant NO.
> 
> It's only done once per packet (currently, it might be interesting to
> override the length for specific classes and their children, for example
> if you do queueing on eth0 and have an DSL router one hop apart).
> The division is gone in my patch btw.

Unlike the packet length the time calculation can't be
cached in the skb.  Most classes in HTB/CBQ use different
packet transmission rates.


[PATCH] via-velocity: fix reported speed and link detected status

2006-07-19 Thread Jay Cliburn

The via-velocity driver reports incorrect speed and link detected status as 
viewed by ethtool (and probably other tools). This patch fixes those incorrect 
reports and prettifies a long line.

Signed-off-by:  Jay Cliburn <[EMAIL PROTECTED]>

--- linux-2.6.17.x86_64/drivers/net/via-velocity.c.orig 2006-07-19 18:34:15.0 -0500
+++ linux-2.6.17.x86_64/drivers/net/via-velocity.c  2006-07-19 18:55:05.0 -0500
@@ -2742,7 +2742,7 @@ static u32 check_connection_type(struct 
 
if (PHYSR0 & PHYSR0_SPDG)
status |= VELOCITY_SPEED_1000;
-   if (PHYSR0 & PHYSR0_SPD10)
+   else if (PHYSR0 & PHYSR0_SPD10)
status |= VELOCITY_SPEED_10;
else
status |= VELOCITY_SPEED_100;
@@ -2851,8 +2851,17 @@ static int velocity_get_settings(struct 
u32 status;
status = check_connection_type(vptr->mac_regs);
 
-   cmd->supported = SUPPORTED_TP | SUPPORTED_Autoneg | SUPPORTED_10baseT_Half | SUPPORTED_10baseT_Full | SUPPORTED_100baseT_Half | SUPPORTED_100baseT_Full | SUPPORTED_1000baseT_Half | SUPPORTED_1000baseT_Full;
-   if (status & VELOCITY_SPEED_100)
+   cmd->supported = SUPPORTED_TP |
+SUPPORTED_Autoneg |
+SUPPORTED_10baseT_Half |
+SUPPORTED_10baseT_Full |
+SUPPORTED_100baseT_Half |
+SUPPORTED_100baseT_Full |
+SUPPORTED_1000baseT_Half |
+SUPPORTED_1000baseT_Full;
+   if (status & VELOCITY_SPEED_1000)
+   cmd->speed = SPEED_1000;
+   else if (status & VELOCITY_SPEED_100)
cmd->speed = SPEED_100;
else
cmd->speed = SPEED_10;
@@ -2896,7 +2905,7 @@ static u32 velocity_get_link(struct net_
 {
struct velocity_info *vptr = netdev_priv(dev);
struct mac_regs __iomem * regs = vptr->mac_regs;
-   return BYTE_REG_BITS_IS_ON(PHYSR0_LINKGD, &regs->PHYSR0)  ? 0 : 1;
+   return BYTE_REG_BITS_IS_ON(PHYSR0_LINKGD, &regs->PHYSR0) ? 1 : 0;
 }
 
 static void velocity_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)


Re: ipw2200: Driver lockup

2006-07-19 Thread John W. Linville

On Wed, Jul 19, 2006 at 08:49:57PM -0400, Dan Williams wrote:
> On Wed, 2006-07-19 at 10:28 -0400, Ralf Baechle wrote:
> > I got the driver to die several times under the extreme condition at the
> > KS / OLS with dozens to hundreds of other machines in the same room.  The
> > last kernel message I got from about the time when wireless died was
> > 
> >   eth1: NETDEV_TX_BUSY returned; driver should report queue full via 
> > ieee_device->is_queue_full.
> 
> I actually get this quite a bit too, at least a couple times per day.
> Sometimes the device goes down and you have to rmmod ipw2200, other
> times it recovers.  But quite annoying anyway.
> 
> Anyone know exactly what that message means, and possibly how to fix it?

For what it is worth, there was a patch for this posted back
in January.  It stirred-up a kerfluffle, so it never got merged.
FWIW, it touches on 802.11e QoS and multiple TX queues -- my personal
favorite wireless subject...NOT!

The thread is available here (first post not connected to follow-up
thread for some reason):

http://marc.theaimsgroup.com/?l=linux-netdev&m=113809246102858&w=2
http://marc.theaimsgroup.com/?l=linux-netdev&m=113814103024576&w=2

Given that half a year has passed, does anyone have any better ideas
now?  Should I merge the patch?  Or is the cure worse than the disease?

John
-- 
John W. Linville
[EMAIL PROTECTED]

Re: [PATCH] drivers/net/wireless/d80211: Check configuration type in hw->config_interface.

2006-07-19 Thread Michael Wu

On Wednesday 19 July 2006 13:26, Jean-Mickael Guerin wrote:
> This patch prevents a NULL pointer dereferencing in AP mode:
> ieee80211_if_config will set conf->bssid only if device is of type STA
> or IBSS.
Why is that? Isn't there a BSSID in AP mode too? Perhaps it is calling 
config_interface before setting the BSSID?

adm8211 doesn't support AP mode yet, but it's good to know this crash won't 
occur when it does. :)

-Michael Wu



Re: [IPROUTE2]: update documentation on mirred and IFB

2006-07-19 Thread Andy Furniss


jamal wrote:
> About two more or so to complete these..
>
> cheers,
> jamal

+tc qdisc add dev lo eth0 ?

Andy.


Re: ipw2200: Driver lockup

2006-07-19 Thread Dan Williams

On Wed, 2006-07-19 at 10:28 -0400, Ralf Baechle wrote:
> I got the driver to die several times under the extreme condition at the
> KS / OLS with dozens to hundreds of other machines in the same room.  The
> last kernel message I got from about the time when wireless died was
> 
>   eth1: NETDEV_TX_BUSY returned; driver should report queue full via 
> ieee_device->is_queue_full.

I actually get this quite a bit too, at least a couple times per day.
Sometimes the device goes down and you have to rmmod ipw2200, other
times it recovers.  But quite annoying anyway.

Anyone know exactly what that message means, and possibly how to fix it?

Dan

>   Ralf


Re: drivers/net/wireless/d80211: Check configuration type in hw->config_interface.

2006-07-19 Thread Jean-Mickael Guerin





> When submitting a patch, please state what drivers you are really
> changing,
> your mail subject suggested a change to the dscape stack itself.
> Since I look at those patches at an irregular basis, while I am always
> checking for rt2x00 related patches I could have missed this one.
> Especially when you don't CC the driver maintainers about the patch
> for their drivers.
> Also I think most people would prefer if you split up patches when it
> affects multiple drivers,
> in this case rt2x00 and adm8211.

What I meant with this title is that this patch is for all current
d80211-based drivers in Linville's wireless-dev,
but I see your point...

> > diff --git a/drivers/net/wireless/d80211/rt2x00/rt2400pci.c b/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
> > index 946cf86..1d45851 100644
> > --- a/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
> > +++ b/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
> > @@ -1877,7 +1877,9 @@ rt2400pci_config_interface(struct net_de
> >  if (rt2x00pci->type == IEEE80211_IF_TYPE_MNTR)
> >  return 0;
> >
> > -rt2400pci_config_bssid(rt2x00pci, conf->bssid);
> > +if (conf->type == IEEE80211_IF_TYPE_STA ||
> > +conf->type == IEEE80211_IF_TYPE_IBSS)
> > +rt2400pci_config_bssid(rt2x00pci, conf->bssid);
>
> Should rt2400pci_config_bssid not simply run a check to see if the
> bssid argument is valid?
> This would prevent the risk of having a similar problem when the
> function is called from somewhere else as well.

I was thinking that a function named xxx_config_bssid() assumes a valid
bssid pointer; I would even add BUG_ON(conf->bssid==NULL) in
xxx_config_bssid().
Also, hw->config_interface already does some tests on conf->type,
and net/d80211 uses the same kind of test before setting conf->bssid.

Anyway, I don't mind if you make the patch the other way.

> Same comment applies for rt2500pci and rt61pci.
>
> Any particular reason why you applied this change to PCI drivers in
> rt2x00 only and not to the USB drivers?

Likely because I have only a PCI card :-)
The USB drivers need the same change too.

Thanks,

Jean-Mickael

Re: drivers/net/wireless/d80211: Check configuration type in hw->config_interface.

2006-07-19 Thread Ivo Van Doorn


Hi,


> This patch prevents a NULL pointer dereferencing in AP mode:
> ieee80211_if_config will set conf->bssid only if device is of type STA
> or IBSS.
> I see it using following commands right after module loading (with rt61)
> # iwconfig wlan0 mode Master
> # ifconfig wlan0 up


When submitting a patch, please state what drivers you are really changing,
your mail subject suggested a change to the dscape stack itself.
Since I look at those patches at an irregular basis, while I am always
checking for rt2x00 related patches I could have missed this one.
Especially when you don't CC the driver maintainers about the patch
for their drivers.
Also I think most people would prefer if you split up patches when it
affects multiple drivers,
in this case rt2x00 and adm8211.

As for the patch itself, see my comments below. ;)


> diff --git a/drivers/net/wireless/d80211/rt2x00/rt2400pci.c b/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
> index 946cf86..1d45851 100644
> --- a/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
> +++ b/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
> @@ -1877,7 +1877,9 @@ rt2400pci_config_interface(struct net_de
>  if (rt2x00pci->type == IEEE80211_IF_TYPE_MNTR)
>  return 0;
>
> -rt2400pci_config_bssid(rt2x00pci, conf->bssid);
> +if (conf->type == IEEE80211_IF_TYPE_STA ||
> +conf->type == IEEE80211_IF_TYPE_IBSS)
> +rt2400pci_config_bssid(rt2x00pci, conf->bssid);


Should rt2400pci_config_bssid not simply run a check to see if the
bssid argument is valid?
This would prevent the risk of having a similar problem when the
function is called from somewhere else as well.

Same comment applies for rt2500pci and rt61pci.

Any particular reason why you applied this change to PCI drivers in
rt2x00 only and not to the USB drivers?

Ivo

Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL (RTAB BUG)

2006-07-19 Thread Alexey Kuznetsov

Hello!

> Guess that Alexey wrote these RTAB lookup in a time where array lookups 
> was faster... now we have that memory lookups are the bottleneck.

No, they were slower from the very beginning. If I remember correctly,
there is comment about this somewhere.

I just did not find any simple way to do 32 bit fixed point arithmetics
scaling from bps to Gbps and was lazy to investigate this further,
tables are much simpler and more flexible.


> What about removing the RTAB system entirely?

Well, if fixed point arithmetics is not a problem.

Plus, remember, the function is not R*size, it is at least
R*size+addend, to account for link overhead. Plus account for padding
of small packets. Plus, when policing it should deaccount already added
link headers, QoS counts only network payload.
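
A rough sketch of that cost function in C, under the assumption that
the addend and the padding are both expressed in bytes and folded in
before multiplying by the rate.  The names link_len and xmit_cost_ns
are hypothetical, and the policing case (subtracting link headers that
were already added) is not covered.

```c
#include <stdint.h>

/* Bytes the link actually carries for a packet of `size` bytes. */
static uint32_t link_len(uint32_t size, uint32_t addend, uint32_t min_len)
{
    uint32_t len = size + addend;            /* account for link overhead */

    return len < min_len ? min_len : len;    /* small packets get padded */
}

/* Cost in nanoseconds: R * link_len, with R given in ns per byte. */
static uint64_t xmit_cost_ns(uint32_t ns_per_byte, uint32_t size,
                             uint32_t addend, uint32_t min_len)
{
    return (uint64_t)ns_per_byte * link_len(size, addend, min_len);
}
```

With 14 bytes of overhead and a 64 byte minimum, a 40 byte packet is
charged as 64 bytes and a 1500 byte packet as 1514 bytes.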

Alexey

Re: [PATCH] Qlogic qla3xxx driver v2.02.00-k36 for upstream inclusion.

2006-07-19 Thread Jeff Garzik


Ron Mercer wrote:
> Andrew,
>
> Attached is a patch to the qla3xxx driver in your -mm test kernel.  This
> patch makes the following changes:
>
> -Removed potential infinite loop in ql_sem_spinlock().
> -Relaxed hardware locking granularity.
> -Fixed irq_request() where shared flag was used in MSI environment.
> -Removed queue containing TX control blocks. This resource has a one to
>  one correspondence to each entry in the TX queue.
> -Removed unnecessary tx_lock.
> -Changed version to v2.02.00-k36.
>
> The above changes plus the changes from the k35 patch address Jeff
> Garzik's concerns from his response. His response can be reviewed at
> this URL:
>
> http://marc.theaimsgroup.com/?l=linux-netdev&m=115101855424635&w=2
>
> This driver has been through several iterations on the netdev list and
> we feel this driver is ready for inclusion in the upstream kernel.


Send me a patch via private email (for size reasons), include a proper 
Signed-off-by line per http://linux.yyz.us/patch-format.html and I will 
merge this driver straightaway.


Jeff




Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL (RTAB BUG)

2006-07-19 Thread Jesper Dangaard Brouer



Russell Stuart wrote:
> - As it stands, it doesn't help the qdiscs that use
>   RTAB.  So unless he proposes to remove RTAB entirely
>   the ATM patch as it stands will still have to go in.


There is a very important point here:

 The RTAB (rate-table) in the kernel is NOT aligned, this is the ONLY
 reason why we need to patch the kernel.

This is why the kernel RTAB patch is still needed even with Patrick's 
excellent STAB patch.  (Maybe the RTAB system should be removed entirely?)

I have discussed this RTAB issue with Patrick (off list) and he described 
this RTAB issue as a regular BUG.



This is the problem with the RTAB:
--
The "hash" function for packet size lookups is a simple binary shift
operation.

 rtab[pkt_len>>cell_log] = pkt_xmit_time;

With a cell_log of 3 (the default), this implies that:
pkt_len  0 to  7 goes into entry 0,
and pkt_len  8 to 15 goes into entry 1,
and pkt_len 16 to 23 goes into entry 2,
and pkt_len 24 to 31 goes into entry 3,
and so on...

Current mapping:
entry[0](maps: 0- 7)=xmit_size:0
entry[1](maps: 8-15)=xmit_size:8
entry[2](maps:16-23)=xmit_size:16
entry[3](maps:24-31)=xmit_size:24
entry[4](maps:32-39)=xmit_size:32
entry[5](maps:40-47)=xmit_size:40
entry[6](maps:48-55)=xmit_size:48
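
The mapping above can be reproduced with a small C sketch.  For
illustration the table stores the byte size charged per slot rather
than a transmit time, and the clamp to the last slot is an assumption
of this sketch; the names rtab_fill_lower and rtab_lookup are made up.

```c
#include <stdint.h>

#define RTAB_SLOTS 256u

/* Fill each slot with the lower boundary of the lengths it covers. */
static void rtab_fill_lower(uint32_t rtab[RTAB_SLOTS], unsigned cell_log)
{
    for (uint32_t i = 0; i < RTAB_SLOTS; i++)
        rtab[i] = i << cell_log;
}

/* Unaligned lookup: slot = pkt_len >> cell_log. */
static uint32_t rtab_lookup(const uint32_t rtab[RTAB_SLOTS],
                            uint32_t pkt_len, unsigned cell_log)
{
    uint32_t slot = pkt_len >> cell_log;

    if (slot >= RTAB_SLOTS)          /* assumption: clamp oversized packets */
        slot = RTAB_SLOTS - 1;
    return rtab[slot];
}
```

With cell_log 3, a 7 byte packet is charged as 0 bytes and a 47 byte
packet as 40 bytes, so every length that is not a multiple of 8 is
undercharged.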

When the table is constructed (in tc_core.c) the pkt_xmit_time is 
calculated from the lower boundary.  Meaning that transmitting a 7 byte 
packet "costs" 0 bytes to transmit.  The zero transmit cost is probably 
not a real-world problem as the IP header bounds the minimum packet size.


for (i=0; i<256; i++) {
  unsigned sz = (i<<cell_log);
  ...
}

The table can be aligned by changing the lookup to:

 rtab[(pkt_len-1)>>cell_log]

and adjusting the xmit_size accordingly.
Giving the table:

 entry[0](maps:1-8)=xmit_size:8
 entry[1](maps:9-16)=xmit_size:16
 entry[2](maps:17-24)=xmit_size:24
 entry[3](maps:25-32)=xmit_size:32
 entry[4](maps:33-40)=xmit_size:40
 entry[5](maps:41-48)=xmit_size:48
 entry[6](maps:49-56)=xmit_size:56

The xmit_size is done like this:

for (i=0; i<256; i++) {
  unsigned sz = ((i+1)<<cell_log);
  ...
}

with the lookup done as rtab[(pkt_len-1)>>cell_log].
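
A sketch of the aligned variant in C, again storing byte sizes instead
of transmit times for illustration.  The names rtab_fill_upper and
rtab_lookup_aligned are made up, and the handling of a zero length
packet and of oversized packets are assumptions of this sketch.

```c
#include <stdint.h>

#define RTAB_SLOTS 256u

/* Fill each slot with the upper boundary of the lengths it covers. */
static void rtab_fill_upper(uint32_t rtab[RTAB_SLOTS], unsigned cell_log)
{
    for (uint32_t i = 0; i < RTAB_SLOTS; i++)
        rtab[i] = (i + 1) << cell_log;
}

/* Aligned lookup: slot = (pkt_len - 1) >> cell_log. */
static uint32_t rtab_lookup_aligned(const uint32_t rtab[RTAB_SLOTS],
                                    uint32_t pkt_len, unsigned cell_log)
{
    /* assumption: treat length 0 like slot 0 to avoid unsigned underflow */
    uint32_t slot = pkt_len ? (pkt_len - 1) >> cell_log : 0;

    if (slot >= RTAB_SLOTS)          /* assumption: clamp oversized packets */
        slot = RTAB_SLOTS - 1;
    return rtab[slot];
}
```

With cell_log 3, lengths 1 to 8 are charged as 8 bytes and 41 to 48 as
48 bytes, so a full 48 byte ATM cell payload lands exactly on a slot
boundary and no length is undercharged.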


Another fix for the RTAB:
-
Remove the RTAB array lookups and calculate the pkt_xmit_time instead.

I guess Alexey wrote these RTAB lookups at a time when array lookups 
were faster... now memory lookups are the bottleneck.


What about removing the RTAB system entirely?


Cheers,
  Jesper Brouer

--
---
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
---

[PATCH] drivers/net/wireless/d80211: Check configuration type in hw->config_interface.

2006-07-19 Thread Jean-Mickael Guerin


Hello,

This patch prevents a NULL pointer dereferencing in AP mode:
ieee80211_if_config will set conf->bssid only if device is of type STA 
or IBSS.

I see it using following commands right after module loading (with rt61)
# iwconfig wlan0 mode Master
# ifconfig wlan0 up


Signed-off-by: Jean-Mickael Guerin <[EMAIL PROTECTED]>


adm8211/adm8211.c  |4 +++-
rt2x00/rt2400pci.c |4 +++-
rt2x00/rt2500pci.c |4 +++-
rt2x00/rt61pci.c   |4 +++-
4 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/d80211/adm8211/adm8211.c b/drivers/net/wireless/d80211/adm8211/adm8211.c
index 9fc5da7..53f05c2 100644
--- a/drivers/net/wireless/d80211/adm8211/adm8211.c
+++ b/drivers/net/wireless/d80211/adm8211/adm8211.c
@@ -1469,7 +1469,9 @@ static int adm8211_config_interface(stru
{
struct adm8211_priv *priv = ieee80211_dev_hw_data(dev);

-if (memcmp(conf->bssid, priv->bssid, ETH_ALEN)) {
+if ((conf->type == IEEE80211_IF_TYPE_STA ||
+ conf->type == IEEE80211_IF_TYPE_IBSS) &&
+ memcmp(conf->bssid, priv->bssid, ETH_ALEN)) {
adm8211_set_bssid(dev, conf->bssid);
memcpy(priv->bssid, conf->bssid, ETH_ALEN);
}
diff --git a/drivers/net/wireless/d80211/rt2x00/rt2400pci.c b/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
index 946cf86..1d45851 100644
--- a/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
+++ b/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
@@ -1877,7 +1877,9 @@ rt2400pci_config_interface(struct net_de
if (rt2x00pci->type == IEEE80211_IF_TYPE_MNTR)
return 0;

-rt2400pci_config_bssid(rt2x00pci, conf->bssid);
+if (conf->type == IEEE80211_IF_TYPE_STA ||
+conf->type == IEEE80211_IF_TYPE_IBSS)
+rt2400pci_config_bssid(rt2x00pci, conf->bssid);

return 0;
}
diff --git a/drivers/net/wireless/d80211/rt2x00/rt2500pci.c b/drivers/net/wireless/d80211/rt2x00/rt2500pci.c
index ca0edd5..8d2b3a7 100644
--- a/drivers/net/wireless/d80211/rt2x00/rt2500pci.c
+++ b/drivers/net/wireless/d80211/rt2x00/rt2500pci.c
@@ -2000,7 +2000,9 @@ rt2500pci_config_interface(struct net_de
if (conf->type == IEEE80211_IF_TYPE_MNTR)
return 0;

-rt2500pci_config_bssid(rt2x00pci, conf->bssid);
+if (conf->type == IEEE80211_IF_TYPE_STA ||
+conf->type == IEEE80211_IF_TYPE_IBSS)
+rt2500pci_config_bssid(rt2x00pci, conf->bssid);

return 0;
}
diff --git a/drivers/net/wireless/d80211/rt2x00/rt61pci.c b/drivers/net/wireless/d80211/rt2x00/rt61pci.c
index 0799f9f..47b2eaf 100644
--- a/drivers/net/wireless/d80211/rt2x00/rt61pci.c
+++ b/drivers/net/wireless/d80211/rt2x00/rt61pci.c
@@ -2463,7 +2463,9 @@ rt61pci_config_interface(struct net_devi
if (conf->type == IEEE80211_IF_TYPE_MNTR)
return 0;

-rt61pci_config_bssid(rt2x00pci, conf->bssid);
+if (conf->type == IEEE80211_IF_TYPE_STA ||
+conf->type == IEEE80211_IF_TYPE_IBSS)
+rt61pci_config_bssid(rt2x00pci, conf->bssid);

return 0;
}



Re: Netchannles: first stage has been completed. Further ideas.

2006-07-19 Thread Stephen Hemminger

On Wed, 19 Jul 2006 13:01:50 -0700 (PDT)
David Miller <[EMAIL PROTECTED]> wrote:

> From: Stephen Hemminger <[EMAIL PROTECTED]>
> Date: Wed, 19 Jul 2006 15:52:04 -0400
> 
> > As a related note, I am looking into fixing inet hash tables to use RCU.
> 
> IBM had posted a patch a long time ago, which would be not
> so hard to munge into the current tree.  See if you can
> spot it in the archives :)

Ben posted a patch in March, and IBM did one a while ago.
I am looking at both.

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-19 Thread David Miller

From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Wed, 19 Jul 2006 15:52:04 -0400

> As a related note, I am looking into fixing inet hash tables to use RCU.

IBM had posted a patch a long time ago, which would be not
so hard to munge into the current tree.  See if you can
spot it in the archives :)

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-19 Thread Stephen Hemminger

As a related note, I am looking into fixing inet hash tables to use RCU.

Re: 2.6.18-rc2 tg3 Dead loop on netdevice eth0 fix it urgently!

2006-07-19 Thread David Miller

From: Herbert Xu <[EMAIL PROTECTED]>
Date: Thu, 20 Jul 2006 01:30:39 +1000

> [NET]: Fix reversed error test in netif_tx_trylock
> 
> A non-zero return value indicates success from spin_trylock,
> not error.
> 
> Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Applied, thanks Herbert.
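
For readers following along, a userspace sketch of the convention the
fix relies on: a trylock that returns non-zero on success, so the
caller must bail out on zero.  This is not the kernel code; toy_lock,
toy_trylock, toy_unlock and try_send are hypothetical names built on
C11 atomics.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct toy_lock {
    atomic_flag held;
};

/* Returns non-zero when the lock was taken, zero when it was busy. */
static int toy_trylock(struct toy_lock *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

static void toy_unlock(struct toy_lock *l)
{
    atomic_flag_clear(&l->held);
}

/* Correct caller: zero from trylock means failure, so back off. */
static bool try_send(struct toy_lock *l)
{
    if (!toy_trylock(l))
        return false;        /* lock busy, nothing sent */
    /* ... transmit while holding the lock ... */
    toy_unlock(l);
    return true;
}
```

Testing the return value the other way round would skip the work
whenever the lock was acquired and proceed without it when it was not.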

Re: Wireless statistics for bcm43xx-d80211

2006-07-19 Thread Larry Finger


Dan Williams wrote:
> Actually, now that I think about it, why are _any_ applets
> screen-scraping /proc/net/wireless anymore?  If they profess to be a
> wireless applet, yet screenscrape /proc/net/wireless, that's suspect
> right there.  The ioctls for status are quite well-defined and haven't
> changed in a very long time (ie, SIOCGIWRANGE).
>
> On the flip side, /proc/net/wireless has been supported since the dawn
> of time (ok, not really) and is the textual interface for reporting
> wireless statistic, but maybe that shouldn't be the case anymore.



I finally found the source for the applet, and it is using iwlib to get the statistics. If the 
kernel's version of WE is really old, it will use /proc/net/wireless. For newer kernels, it is using 
the appropriate ioctl. When I get time, I will fix the error and submit a patch to KDE.


As an interim workaround, I downloaded and built kwlaninfo, a different KDE kicker applet. I don't 
like it very much but it does show the needed info.


Larry

ipw2200: Driver lockup

2006-07-19 Thread Ralf Baechle

I got the driver to die several times under the extreme condition at the
KS / OLS with dozens to hundreds of other machines in the same room.  The
last kernel message I got from about the time when wireless died was

  eth1: NETDEV_TX_BUSY returned; driver should report queue full via 
ieee_device->is_queue_full.

  Ralf

Re: [PATCH] Don't call request_region() for 3C90x chips

2006-07-19 Thread Jeff Garzik


Sergei Shtylyov wrote:
> Hello.
>
> Jeff Garzik wrote:
> > > It's generally not a good idea to call request_region() on an address
> > > returned by pci_iomap(), even less so on a MMIO address. And there
> > > was absolutely no point in claiming the region already claimed by the
> > > PCI core, especially with the same PCI generic owner's name. As this
> > > is the only case of the must_free_region flag being set, this flag
> > > may go away as well...
> > >
> > > Signed-off-by: Sergei Shtylyov <[EMAIL PROTECTED]>
> >
> > I agree you have identified a bug, but this is not a solution.
> >
> > The current driver bug is that it calls request_region() potentially
> > on an MMIO address, but the solution is _not_ to completely avoid
> > reserving the resource.
>
>    It's not even a MMIO/PIO address anymore after pci_iomap() -- it
> either went thru ioremap() or ioport_map() which both change the mapping
> from the physical to the virtual address (or some equivalent of it for
> I/O ports).

Yes.  _Obviously_ you must reserve the resource passed to
pci_iomap/ioremap, not the cookie returned by such.

> > The region is registered with the PCI core, but _not_ claimed by anyone.
> > Someone still needs to either call pci_{request,release}_regions() or
> > request_[mem_]region() to indicate that the resource is reserved.
>
>    Sigh, it seems I've missed that difference. So, I'll recast...

IMO it would be easiest to do pci_{request,release}_regions() in the
PCI-only code.  I believe this matches up well with the existing
EISA-specific code, which also performs request_region().


Jeff



Re: sky2 drops immediately after boot

2006-07-19 Thread Stephen Hemminger

There is a patch to add a debugging /proc/net/sky2/ethX interface to dump
status rings. This interface will not be part of released drivers and should
not be shipped in distribution kernels. It doesn't handle things like
device renaming and is purely released as is.

http://developer.osdl.org/shemminger/prototypes/sky2-proc-debug.patch

You can use this to see if status or tx ring are hanging.

Re: [PATCH] Don't call request_region() for 3C90x chips

2006-07-19 Thread Sergei Shtylyov


Hello.

Jeff Garzik wrote:
> > It's generally not a good idea to call request_region() on an address
> > returned by pci_iomap(), even less so on a MMIO address. And there was
> > absolutely no point in claiming the region already claimed by the PCI
> > core, especially with the same PCI generic owner's name. As this is
> > the only case of the must_free_region flag being set, this flag may go
> > away as well...
> >
> > Signed-off-by: Sergei Shtylyov <[EMAIL PROTECTED]>
>
> I agree you have identified a bug, but this is not a solution.
>
> The current driver bug is that it calls request_region() potentially on
> an MMIO address, but the solution is _not_ to completely avoid reserving
> the resource.

   It's not even a MMIO/PIO address anymore after pci_iomap() -- it either
went thru ioremap() or ioport_map() which both change the mapping from the
physical to the virtual address (or some equivalent of it for I/O ports).

> The region is registered with the PCI core, but _not_ claimed by anyone.
> Someone still needs to either call pci_{request,release}_regions() or
> request_[mem_]region() to indicate that the resource is reserved.

   Sigh, it seems I've missed that difference. So, I'll recast...

> This bug you have found was probably a missed detail during the
> conversion to the iomap API.

   Well, not only that: it was wrong once the driver started using MMIO
(which is of course a preference).

> Jeff


WBR, Sergei

Re: [PATCH] Don't call request_region() for 3C90x chips

2006-07-19 Thread Jeff Garzik


Sergei Shtylyov wrote:
> It's generally not a good idea to call request_region() on an address returned
> by pci_iomap(), even less so on a MMIO address. And there was absolutely no
> point in claiming the region already claimed by the PCI core, especially with
> the same PCI generic owner's name. As this is the only case of the
> must_free_region flag being set, this flag may go away as well...
>
> Signed-off-by: Sergei Shtylyov <[EMAIL PROTECTED]>

I agree you have identified a bug, but this is not a solution.

The current driver bug is that it calls request_region() potentially on
an MMIO address, but the solution is _not_ to completely avoid reserving
the resource.

The region is registered with the PCI core, but _not_ claimed by anyone.
Someone still needs to either call pci_{request,release}_regions() or
request_[mem_]region() to indicate that the resource is reserved.

This bug you have found was probably a missed detail during the
conversion to the iomap API.


Jeff



Re: Please pull 'upstream' branch of wireless-2.6

2006-07-19 Thread Jeff Garzik


John W. Linville wrote:
> These patches are to be queued for 2.6.19...
>
> ---
>
> The following changes since commit b312d799b324e895745ffe148def234fc60d5b74:
>   Daniel Drake:
>         zd1211rw: usb_clear_halt not allowed in IRQ context
>
> are found in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git upstream
>
> Daniel Drake:
>       zd1211rw: Add Sagem device ID's
>
> Larry Finger:
>       bcm43xx: improved statistics
>
> Michael Buesch:
>       bcm43xx: opencoded locking
>       bcm43xx: voluntary preemtion in the calibration loops
>
>  drivers/net/wireless/bcm43xx/bcm43xx.h         |   64 ++---
>  drivers/net/wireless/bcm43xx/bcm43xx_debugfs.c |   34 +++--
>  drivers/net/wireless/bcm43xx/bcm43xx_leds.c    |   10 +
>  drivers/net/wireless/bcm43xx/bcm43xx_main.c    |   64 +
>  drivers/net/wireless/bcm43xx/bcm43xx_phy.c     |   33 +++--
>  drivers/net/wireless/bcm43xx/bcm43xx_pio.c     |    4 -
>  drivers/net/wireless/bcm43xx/bcm43xx_sysfs.c   |   34 +++--
>  drivers/net/wireless/bcm43xx/bcm43xx_wx.c      |  162 ++--
>  drivers/net/wireless/zd1211rw/zd_usb.c         |    2


pulled



Re: [SLHC 1/4] Cleanup SLHC configuration

2006-07-19 Thread Jeff Garzik


applied updated patch #1, and patches 2-4


Re: Subject: [PATCH] sky2: add another PCI ID

2006-07-19 Thread Jeff Garzik


applied


Re: Weird TCP SACK problem. in Linux...

2006-07-19 Thread Oumer Teyeb


Oumer Teyeb wrote:
> Hi,
>
> Alexey Kuznetsov wrote:
> > Condition triggering start of fast retransmit is the same.
> > The behaviour while retransmit is different. FACKless code
> > behaves more like NewReno.
>
> Ok, that is a good point!!  Now at least I can convince myself that the
> CDFs for the first retransmissions, showing that SACK leads to earlier
> retransmissions than no SACK, are not wrong, and I can even convince
> myself that this is the real reason behind SACK/FACK's performance
> degradation in the no-timestamps case :-)

Actually, the increase in the number of retransmissions and in the
download time from no SACK to SACK in the timestamp case then seems to
make sense too. My reasoning is like this:

With timestamps there is reordering detection, so the number of
retransmissions is reduced because we avoid the time spent in fast
recovery. When we introduce SACK on top of timestamps, we enter fast
retransmit earlier than in the no-SACK case, as we seem to agree. Since
timestamps reduce the number of retransmissions once we are in fast
recovery, the retransmissions we see are basically the first few that
made us enter the false fast retransmits. So we get a small increase in
retransmissions and a small increase in download time.

When no timestamps are used, there is no reordering detection. SACK then
leads to fewer retransmissions because it retransmits selectively, but
it doesn't improve the download time, because it enters fast retransmit
earlier than the no-SACK case, and here the false fast retransmits are
very costly: they go undetected and lead to window reduction.

Am I making sense? :-) The DSACK case is still puzzling me.


Regards,
Oumer

Re: michael_mic in crypto api?

2006-07-19 Thread Herbert Xu

Jouni Malinen <[EMAIL PROTECTED]> wrote:
>
> However, at least for some time, there are two different TKIP
> implementations (net/ieee80211 and net/d80211) so this would mean
> duplicating Michael MIC implementation and I would rather not do that.

Good point, let's keep it for now.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: Weird TCP SACK problem. in Linux...

2006-07-19 Thread Oumer Teyeb


Hi,

Alexey Kuznetsov wrote:
> Condition triggering start of fast retransmit is the same.
> The behaviour while retransmit is different. FACKless code
> behaves more like NewReno.

Ok, that is a good point!!  Now at least I can convince myself that the
CDFs for the first retransmissions, showing that SACK leads to earlier
retransmissions than no SACK, are not wrong, and I can even convince
myself that this is the real reason behind SACK/FACK's performance
degradation in the no-timestamps case :-)

> > and it is disabled only when reordering is detected (and this is done
> > either through timestamps or DSACK, right?)...
> > so if neither DSACK and timestamps are enabled we are unable to detect
> > disorder, so basically there should be no difference between SACK and
> > FACK, cause it is always FACK used... and that seems to make sense  from
> > the results I have
>
> Yes. But FACKless tcp still retransmits less aggressively.
>
> > the # of retransmissions increases as shown in the second figure? isnt
> > that odd? shouldnt it be the other way around?
>
> The most odd is that I see no correlation between #of retransmits
> and download time in your graphs. Actually, the correlation is negative. :-)

Yeah, that is what confuses me the most. In
www.kom.auc.dk/~oumer/ret_vs_download.pdf
I have a plot summarising two hundred runs for the four combinations of
SACK (on/off) and timestamps (on/off). I just collected the
retransmission count from each run and averaged the download time for
each retransmission count. I see no clear pattern, which is why I was
focusing more on when retransmissions are triggered than on how many of
them there are. First, the earlier you enter the fast recovery phase (if
you don't revert it), the more time you spend in congestion avoidance,
and that hurts the throughput quite a lot. Second, the number of times
you enter fast retransmit is more harmful than the number of
retransmissions: unnecessary retransmissions during a fast recovery cost
some bandwidth, but they don't damage the "future" of the connection as
much as a retransmission that drives TCP into fast recovery.
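
The averaging described above could be sketched roughly as follows. This is
only an illustration of the bookkeeping: struct run, MAX_RETRANS and
mean_time_by_retrans() are names made up for this sketch and are not taken
from the scripts that produced the plot.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RETRANS 64   /* hypothetical upper bound on retransmissions per run */

struct run {
    int retrans;         /* retransmissions counted in this run */
    double download_s;   /* download time of this run, in seconds */
};

/*
 * For every retransmission count r, store in mean[r] the average download
 * time of the runs that had exactly r retransmissions, and in count[r] the
 * number of such runs. Empty buckets keep mean[r] == 0.0 and count[r] == 0.
 */
static void mean_time_by_retrans(const struct run *runs, size_t n,
                                 double mean[MAX_RETRANS + 1],
                                 int count[MAX_RETRANS + 1])
{
    for (int r = 0; r <= MAX_RETRANS; r++) {
        mean[r] = 0.0;
        count[r] = 0;
    }
    for (size_t i = 0; i < n; i++) {
        int r = runs[i].retrans;
        assert(r >= 0 && r <= MAX_RETRANS);
        mean[r] += runs[i].download_s;   /* sum first, divide afterwards */
        count[r]++;
    }
    for (int r = 0; r <= MAX_RETRANS; r++)
        if (count[r] > 0)
            mean[r] /= count[r];
}
```

Keeping count[] next to mean[] matters when reading such a plot, since a
bucket backed by one or two runs says very little about any trend.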


> > Also why does the # retransmissions in the timestamp case increases when
> > we use SACK/FACK as compared with no SACK case?
>
> Excessive retransmissions still happen. Undoing just restores cwnd
> and tries to increase reordering metric to avoid false retransmits.

Hmmm... I don't understand this. So if reordering can be detected (i.e.
we use timestamps or DSACK), the dupthreshold is increased temporarily?
OK, this adds to the explanation of why there are fewer retransmissions
in the timestamp case than in the non-timestamp case (in addition to the
fact that with timestamps we get out of fast recovery earlier, and hence
also see fewer retransmissions). But what I was referring to was: if you
use timestamps, why does the number of retransmissions increase when we
use FACK, SACK or DSACK compared to the no-SACK case? Is this
dupthreshold increase documented properly somewhere? In the Linux
congestion paper by you and Pasi, you mention it briefly in section 5:
"linux fast recovery does not fully follow RFC 2582.. the sender adjusts
the threshold for triggering fast retransmit dynamically, based on the
observed reordering in the network..." but it doesn't say exactly how
this dynamic adjustment is done.

> 1. Suppose, some segments, but not all, were delayed.
> 2. Sender sees dupack with a SACK. It is the first, SACK allows to open
>    window for one segment, you send one segment with snd.nxt.
> 3. Receiver receives it before delayed segments arrived.
> 4. When sender sees this SACK, it assumes that all the delayed
>    segments are lost.

Thanks! It is very clear now! But it is basically the same effect (for
the explanation that I am seeking) as in the trace you quoted, right?
Two duplicate ACKs leading to a retransmission.

> > OK ...but if timestamps are enabled, then I just couldnt figure out the
> > use of  DSACK, can it tell us something more than we can find using
> > timestamps??
>
> It depends. Normally, no. If the network is fast, timestamps are just
> too coarse to detect redundant retransmissions.
>
> Plus, the heuristics based on timestamps essentially rely on a bug
> in our timestamps processing code. Another side could have it fixed. :-)

Ok, for my studies it shouldn't matter because I am using the buggy code
on both the sender and receiver :-) (though I don't understand what
this bug you are referring to is about :-)

> Alexey


Re: michael_mic in crypto api?

2006-07-19 Thread Jouni Malinen

On Thu, Jul 20, 2006 at 01:39:05AM +1000, Herbert Xu wrote:
> Michael Wu <[EMAIL PROTECTED]> wrote:
> > Simplicity and consistency. Whereas the relatively simple mic part of the
> > TKIP algorithm is in crypto API, the (more important, more complicated)
> > key mixing part is not in crypto api.

> Sure, I don't mind either way.  I think Jouni wrote this originally,
> maybe he can share his thoughts with us?

I was more or less told that TKIP implementation cannot be included in
the kernel tree before this was moved into crypto api.. I don't really
care much where it is, but since it is now in crypto api, it would sound
easiest to just keep it there. If someone really wants to move it away
from there and into TKIP code in ieee80211/d80211, feel free to do that.
However, at least for some time, there are two different TKIP
implementations (net/ieee80211 and net/d80211) so this would mean
duplicating Michael MIC implementation and I would rather not do that.

-- 
Jouni Malinen                                            PGP id EFC895FA

[PATCH] [NET] pci_module_init() removal on some net-drivers

2006-07-19 Thread Henne

From: Henrik Kretzschmar <[EMAIL PROTECTED]>

Removes pci_module_init() from 56 net-subsys-drivers and replaces it with
pci_register_driver(), if the initialisation function just returns the
return value of those functions.
There are still some pci_module_init() calls left in the net drivers which
require a closer look.

Signed-off-by: Henrik Kretzschmar <[EMAIL PROTECTED]>

---

diff -ruN linux-2.6.18-rc2/drivers/net/8139cp.c linux/drivers/net/8139cp.c
--- linux-2.6.18-rc2/drivers/net/8139cp.c   2006-07-18 13:37:07.0 +0200
+++ linux/drivers/net/8139cp.c  2006-07-19 17:41:08.0 +0200
@@ -2098,7 +2098,7 @@
 #ifdef MODULE
printk("%s", version);
 #endif
-   return pci_module_init (&cp_driver);
+   return pci_register_driver(&cp_driver);
 }
 
 static void __exit cp_exit (void)
diff -ruN linux-2.6.18-rc2/drivers/net/8139too.c linux/drivers/net/8139too.c
--- linux-2.6.18-rc2/drivers/net/8139too.c  2006-07-18 13:37:07.0 +0200
+++ linux/drivers/net/8139too.c 2006-07-19 17:40:35.0 +0200
@@ -2629,7 +2629,7 @@
printk (KERN_INFO RTL8139_DRIVER_NAME "\n");
 #endif
 
-   return pci_module_init (&rtl8139_pci_driver);
+   return pci_register_driver(&rtl8139_pci_driver);
 }
 
 
diff -ruN linux-2.6.18-rc2/drivers/net/acenic.c linux/drivers/net/acenic.c
--- linux-2.6.18-rc2/drivers/net/acenic.c   2006-07-18 13:37:07.0 +0200
+++ linux/drivers/net/acenic.c  2006-07-19 16:45:07.0 +0200
@@ -725,7 +725,7 @@
 
 static int __init acenic_init(void)
 {
-   return pci_module_init(&acenic_pci_driver);
+   return pci_register_driver(&acenic_pci_driver);
 }
 
 static void __exit acenic_exit(void)
diff -ruN linux-2.6.18-rc2/drivers/net/amd8111e.c linux/drivers/net/amd8111e.c
--- linux-2.6.18-rc2/drivers/net/amd8111e.c 2006-07-18 13:37:07.0 +0200
+++ linux/drivers/net/amd8111e.c    2006-07-19 17:40:08.0 +0200
@@ -2158,7 +2158,7 @@
 
 static int __init amd8111e_init(void)
 {
-   return pci_module_init(&amd8111e_driver);
+   return pci_register_driver(&amd8111e_driver);
 }
 
 static void __exit amd8111e_cleanup(void)
diff -ruN linux-2.6.18-rc2/drivers/net/arcnet/com20020-pci.c linux/drivers/net/arcnet/com20020-pci.c
--- linux-2.6.18-rc2/drivers/net/arcnet/com20020-pci.c  2006-07-18 13:37:07.0 +0200
+++ linux/drivers/net/arcnet/com20020-pci.c 2006-07-19 17:39:43.0 +0200
@@ -177,7 +177,7 @@
 static int __init com20020pci_init(void)
 {
BUGLVL(D_NORMAL) printk(VERSION);
-   return pci_module_init(&com20020pci_driver);
+   return pci_register_driver(&com20020pci_driver);
 }
 
 static void __exit com20020pci_cleanup(void)
diff -ruN linux-2.6.18-rc2/drivers/net/b44.c linux/drivers/net/b44.c
--- linux-2.6.18-rc2/drivers/net/b44.c  2006-07-18 13:37:07.0 +0200
+++ linux/drivers/net/b44.c 2006-07-19 17:38:59.0 +0200
@@ -2354,7 +2354,7 @@
    dma_desc_align_mask = ~(dma_desc_align_size - 1);
    dma_desc_sync_size = max_t(unsigned int, dma_desc_align_size, sizeof(struct dma_desc));
 
-   return pci_module_init(&b44_driver);
+   return pci_register_driver(&b44_driver);
 }
 
 static void __exit b44_cleanup(void)
diff -ruN linux-2.6.18-rc2/drivers/net/bnx2.c linux/drivers/net/bnx2.c
--- linux-2.6.18-rc2/drivers/net/bnx2.c 2006-07-18 13:37:07.0 +0200
+++ linux/drivers/net/bnx2.c    2006-07-19 17:38:00.0 +0200
@@ -6015,7 +6015,7 @@
 
 static int __init bnx2_init(void)
 {
-   return pci_module_init(&bnx2_pci_driver);
+   return pci_register_driver(&bnx2_pci_driver);
 }
 
 static void __exit bnx2_cleanup(void)
diff -ruN linux-2.6.18-rc2/drivers/net/cassini.c linux/drivers/net/cassini.c
--- linux-2.6.18-rc2/drivers/net/cassini.c  2006-07-18 13:37:07.0 +0200
+++ linux/drivers/net/cassini.c 2006-07-19 17:37:18.0 +0200
@@ -5245,7 +5245,7 @@
else
link_transition_timeout = 0;
 
-   return pci_module_init(&cas_driver);
+   return pci_register_driver(&cas_driver);
 }
 
 static void __exit cas_cleanup(void)
diff -ruN linux-2.6.18-rc2/drivers/net/chelsio/cxgb2.c linux/drivers/net/chelsio/cxgb2.c
--- linux-2.6.18-rc2/drivers/net/chelsio/cxgb2.c    2006-07-18 13:37:07.0 +0200
+++ linux/drivers/net/chelsio/cxgb2.c   2006-07-19 17:36:38.0 +0200
@@ -1243,7 +1243,7 @@
 
 static int __init t1_init_module(void)
 {
-   return pci_module_init(&driver);
+   return pci_register_driver(&driver);
 }
 
 static void __exit t1_cleanup_module(void)
diff -ruN linux-2.6.18-rc2/drivers/net/dl2k.c linux/drivers/net/dl2k.c
--- linux-2.6.18-rc2/drivers/net/dl2k.c 2006-07-18 13:37:07.0 +0200
+++ linux/drivers/net/dl2k.c    2006-07-19 17:35:37.0 +0200
@@ -1815,7 +1815,7 @@
 static int __init
 rio_init (void)
 {
-   return pci_module_init (&rio_driver);
+   return pci_register_driver(&rio_driver);
 }
 
 static void __exit
diff -ruN linux-2.6.18-rc2/drivers/net/eepro10

Re: Weird TCP SACK problem. in Linux...

2006-07-19 Thread Alexey Kuznetsov

Hello!

> IsLost (SeqNum):
>  This routine returns whether the given sequence number is
>  considered to be lost.  The routine returns true when either
>  DupThresh discontiguous SACKed sequences have arrived above
>  'SeqNum' or (DupThresh * SMSS) bytes with sequence numbers greater
>  than 'SeqNum' have been SACKed.  Otherwise, the routine returns
>  false.

It is not used. The metric is just distance between snd.una and
the most forward sack.

It can be changed, but, to be honest, counting "discontiguous SACked sequences"
looks really weird and totally unjustified.

You can look for function tcp_time_to_recover() and replace
tcp_fackets_out(tp) > tp->reordering with something like
tp->sacked_out+1 > tp->reordering. It is not as weird as what the RFC
recommends, but it should make some difference.
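
To make the difference between the two conditions concrete, here is a small
userspace sketch. It is not kernel code: struct toy_tcp and both helper
functions are invented for illustration and only mirror the three fields
named above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the few struct tcp_sock fields discussed here. */
struct toy_tcp {
    unsigned int fackets_out; /* segments from snd.una up to the most forward SACK */
    unsigned int sacked_out;  /* segments actually reported as SACKed */
    unsigned int reordering;  /* reordering metric, 3 unless tuned or raised */
};

/* Existing check: distance between snd.una and the most forward SACK. */
static bool recover_by_fack(const struct toy_tcp *tp)
{
    return tp->fackets_out > tp->reordering;
}

/* Suggested variant: count only what has really been SACKed. */
static bool recover_by_sacked(const struct toy_tcp *tp)
{
    return tp->sacked_out + 1 > tp->reordering;
}
```

With a single SACKed segment sitting four segments above snd.una
(fackets_out = 4, sacked_out = 1, reordering = 3) the first check already
fires while the second one waits for more SACKed segments.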


> so you are saying, it doesnt matter whether I disable FACK or not, it is 
> basically set by default?

Condition triggering start of fast retransmit is the same.
The behaviour while retransmit is different. FACKless code
behaves more like NewReno.


> and it is disabled only when reordering is detected (and this is done 
> either through timestamps or DSACK, right?)...
> so if neither DSACK and timestamps are enabled we are unable to detect 
> disorder, so basically there should be no difference between SACK and 
> FACK, cause it is always FACK used... and that seems to make sense  from 
> the results I have 

Yes. But FACKless tcp still retransmits less aggressively.


> the # of retransmissions increases as shown in the second figure? isnt 
> that odd? shouldnt it be the other way around?

The most odd is that I see no correlation between #of retransmits
and download time in your graphs. Actually, the correlation is negative. :-)


> Also why does the # retransmissions in the timestamp case increases when 
> we use SACK/FACK as compared with no SACK case?

Excessive retransmissions still happen. Undoing just restores cwnd
and tries to increase reordering metric to avoid false retransmits.
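
A toy model of those two effects, only as a sketch: the struct, the function
names and the idea of keeping a saved prior_cwnd are simplifications made up
here, and the cap of 127 is assumed to mirror TCP_MAX_REORDERING. The real
code spreads this over several functions.

```c
#include <assert.h>

#define TOY_MAX_REORDERING 127  /* assumed cap, mirroring TCP_MAX_REORDERING */

struct toy_undo_state {
    unsigned int snd_cwnd;    /* current congestion window, in segments */
    unsigned int prior_cwnd;  /* window saved when recovery was entered */
    unsigned int reordering;  /* threshold compared against the FACK distance */
};

/* The metric only ever grows, and never beyond the cap. */
static void toy_update_reordering(struct toy_undo_state *tp, unsigned int metric)
{
    if (metric > tp->reordering)
        tp->reordering = metric < TOY_MAX_REORDERING ? metric
                                                     : TOY_MAX_REORDERING;
}

/* Recovery turned out to be false: put cwnd back, remember the distance. */
static void toy_undo(struct toy_undo_state *tp, unsigned int observed_distance)
{
    tp->snd_cwnd = tp->prior_cwnd;
    toy_update_reordering(tp, observed_distance);
}
```

Note what the model does not do: the segments that were already retransmitted
stay retransmitted, which is why the retransmission counters do not go down
even when the undo succeeds.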


> This one , I dont think I understood you. Could you please make it a bit 
> more clearer?

1. Suppose, some segments, but not all, were delayed.
2. Sender sees dupack with a SACK. It is the first, SACK allows to open
   window for one segment, you send one segment with snd.nxt.
3. Receiver receives it before delayed segments arrived.
4. When sender sees this SACK, it assumes that all the delayed
   segments are lost.


> OK ...but if timestamps are enabled, then I just couldnt figure out the 
> use of  DSACK, can it tell us something more than we can find using 
> timestamps??

It depends. Normally, no. If the network is fast, timestamps are just
too coarse to detect redundant retransmissions.

Plus, the heuristics based on timestamps essentially rely on a bug
in our timestamps processing code. Another side could have it fixed. :-)

Alexey

Re: michael_mic in crypto api?

2006-07-19 Thread Herbert Xu

Michael Wu <[EMAIL PROTECTED]> wrote:
>
> Simplicity and consistency. Whereas the relatively simple mic part of the
> TKIP algorithm is in crypto API, the (more important, more complicated)
> key mixing part is not in crypto api. It is unlikely that either the mic
> or key mixing part would be used separately or even outside of
> TKIP/802.11i code, and we don't want to encourage people anyways since
> they're just bandaids for problems associated with using rc4.

Sure, I don't mind either way.  I think Jouni wrote this originally,
maybe he can share his thoughts with us?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: 2.6.18-rc2 tg3 Dead loop on netdevice eth0 fix it urgently!

2006-07-19 Thread Herbert Xu

Ruben Puettmann <[EMAIL PROTECTED]> wrote:
>
> Yes But in the moment I thing  I have not enough informations.

Oops, it was a thinko on my part.

[NET]: Fix reversed error test in netif_tx_trylock

A non-zero return value indicates success from spin_trylock,
not error.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 76cc099..75f02d8 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -924,10 +924,10 @@ static inline void netif_tx_lock_bh(stru
 
 static inline int netif_tx_trylock(struct net_device *dev)
 {
-   int err = spin_trylock(&dev->_xmit_lock);
-   if (!err)
+   int ok = spin_trylock(&dev->_xmit_lock);
+   if (likely(ok))
        dev->xmit_lock_owner = smp_processor_id();
-   return err;
+   return ok;
 }
 
 static inline void netif_tx_unlock(struct net_device *dev)
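
For readers who want to see the convention in isolation, here is a userspace
sketch of the same pattern. The toy_* names are invented for this example and
a C11 atomic_flag stands in for the spinlock; the only property carried over
from the patch is that the trylock returns non-zero on success.

```c
#include <assert.h>
#include <stdatomic.h>

struct toy_dev {
    atomic_flag xmit_lock;  /* stand-in for the device's transmit lock */
    int xmit_lock_owner;    /* CPU holding the lock, -1 when free */
};

static void toy_dev_init(struct toy_dev *dev)
{
    atomic_flag_clear(&dev->xmit_lock);
    dev->xmit_lock_owner = -1;
}

/* Same convention as spin_trylock(): non-zero means the lock was taken. */
static int toy_trylock(atomic_flag *lock)
{
    return !atomic_flag_test_and_set(lock);
}

/* Record the owner only when the lock was really acquired. */
static int toy_tx_trylock(struct toy_dev *dev, int cpu)
{
    int ok = toy_trylock(&dev->xmit_lock);

    if (ok)
        dev->xmit_lock_owner = cpu;
    return ok;
}

static void toy_tx_unlock(struct toy_dev *dev)
{
    dev->xmit_lock_owner = -1;
    atomic_flag_clear(&dev->xmit_lock);
}
```

With the reversed test, a caller that failed to get the lock would overwrite
xmit_lock_owner while somebody else still held it, and a caller that did get
the lock would leave the owner unset.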

Bind TUN to IPv4 Subnet

2006-07-19 Thread Georg Wicherski


Hey Folks,

I need to bind a TUN device previously created by open("/dev/net/tun",
O_RDWR) to a subnet, let's say 10.254.0.0/16. The ultimate goal is to have
a tunnel endpoint for an IPv4 tunnel where, after writing a packet that
came over the tunnel to the TUN device, I can use an arbitrary userland
socket bound to one of the 2^16 IPs, or to all of them with INADDR_ANY.

What is the sequence of ioctls to configure the TUN device into
`listening' on a whole subnet? If possible, I want to just specify a
base address and a prefix length.

Digging into the current 2.6.x kernel sources didn't help me much (yes,
they are badly commented).


Thanks,
Georg Wicherski


Re: [2.6 patch] drivers/net/wireless/zd1211rw/: possible cleanups

2006-07-19 Thread Adrian Bunk

On Mon, Jul 17, 2006 at 11:29:51PM +0200, Ulrich Kunitz wrote:
> On 06-07-16 14:17 Daniel Drake wrote:
> 
> > Adrian Bunk wrote:
> > >This patch contains the following possible cleanups:
> > >- make needlessly global functions static
> > >- #if 0 unused functions
> > >
> > >Please review which of these functions do make sense and which do 
> > >conflict with pending patches.
> > 
> > Thanks Adrian. I have put this in my tree and made an additional change 
> > along the same lines (your patch introduced an unused function warning 
> > to the non-debug build). If Ulrich signifies acceptance, I will send 
> > this on to John.
> > 
> > I have also sent in a patch to add a MAINTAINERS entry for zd1211rw, in 
> > hope that this will help you send patches with myself and/or Ulrich CC'd 
> > in future :)
> > 
> > Thanks.
> > Daniel
> 
> Adrian, I would like to see this patch split up into three at
> least. 
> 
> Patch 1: Remove unused IO emulation functions
> Patch 2: Remove other unused stuff, which could be split up
>  further for each C file and header
> Patch 3: Change DEBUG ifdefs to #if 0
> 
> The purpose of patch 3 is bogus, because the follow-up patch will
> be called "removed useless #if 0 stuff". Keep in mind there is

That's not the purpose.

#if 0'ed code is both marked as unused and does no longer bloat the 
kernel.

> some reason, why I have such code there. If the ifdefs are not
> acceptable I will make this code dependent on a module parameter
> and compile it into the production module. We have a lot of
> different devices from different vendors out there and people
> report "stuff isn't working" but almost nothing more.

You are misunderstanding this part of my patch.
It does NOT remove used debug code.
It does #if 0 code that was not used with CONFIG_ZD1211RW_DEBUG=y.

> The problem with patch 1 and 2 is, that almost all of the functions
> are completing the interface and some of them are even only static
> inlines. They are there because they should be used, before
> somebody reinvents the wheel or makes something completely stupid.
> However if such reasoning is not acceptable I'm ready to
> compromise.

I've only #if 0'ed "static inline" functions when they were the only 
user of an otherwise unused global function.

The main question is what your "should be used" means.

Will they be used within the next days/weeks or only in some far future?

In the first case, I agree that my patch shouldn't be applied now.
In the latter case, there's no reason to bloat the kernel binary now for 
some future usage that might or might not happen.

> Uli Kunitz

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: Weird TCP SACK problem. in Linux...

2006-07-19 Thread Oumer Teyeb


Hi ,

Alexey Kuznetsov wrote:
> Hello!
>
> > DSACK)  is used, the retransmissions seem to happen earlier .
>
> Yes. With SACK/FACK retransmissions can be triggered earlier,
> if an ACK SACKs a segment which is far enough from current snd.una.
> That's what happens f.e. in T_SACK_dump5.dat
>
> 01:28:15.681050 < 192.38.55.34.51137 > 192.168.110.111.42238: P 18825:20273[31857](1448) ack 1/5841 win 5840/0  [|] (DF)(ttl 64, id 19165)
> 01:28:15.800946 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0  (DF) [tos 0x8]  (ttl 62, id 45508)
> 01:28:15.860773 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0  (DF) [tos 0x8]  (ttl 62, id 45509)
> 01:28:15.860781 < 192.38.55.34.51137 > 192.168.110.111.42238: . 8689:10137[31857](1448) ack 1/5841 win 5840/0  [|] (DF) (ttl 64, id 19166)
>
> The second sack confirms that 13033..14481 already arrived.
>
> And this is even not a mistake, the third dupack arrived immediately:
> 01:28:15.901382 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) ack 8689/31857 win 23168/0  (DF) [tos 0x8]  (ttl 62, id 45510)
 

Thanks a lot, Alexey, for pointing that out! That was more or less
what I was assuming, but is this feature of Linux TCP documented
somewhere? As far as I can see, I couldn't find it in Pasi's paper. In
the conservative SACK-based recovery RFC (RFC 3517), it is clearly
stated that:


  Upon the receipt of the first (DupThresh - 1) duplicate ACKs, the
  scoreboard is to be updated as normal.  Note: The first and second
  duplicate ACKs can also be used to trigger the transmission of
  previously unsent segments using the Limited Transmit algorithm
  [RFC3042].

  When a TCP sender receives the duplicate ACK corresponding to
  DupThresh ACKs, the scoreboard MUST be updated with the new SACK
  information (via Update ()).  If no previous loss event has occurred
  on the connection or the cumulative acknowledgment point is beyond
  the last value of RecoveryPoint, a loss recovery phase SHOULD be
  initiated, per the fast retransmit algorithm outlined in [RFC2581].

Of course, once we are in the fast recovery phase we are able to mark a packet
as lost based on the following criterion (also from the same RFC):

IsLost (SeqNum):
 This routine returns whether the given sequence number is
 considered to be lost.  The routine returns true when either
 DupThresh discontiguous SACKed sequences have arrived above
 'SeqNum' or (DupThresh * SMSS) bytes with sequence numbers greater
 than 'SeqNum' have been SACKed.  Otherwise, the routine returns
 false.
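
The IsLost() rule quoted above can be sketched in C as follows. This is
only an illustration under stated assumptions, not the Linux code: the
scoreboard is modelled as an array of non-overlapping SACK blocks, the
tested sequence number is assumed to be a hole (not inside any block),
and sequence-number wraparound is ignored. The names sack_block and
is_lost are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One SACKed range [start, end) in sequence-number space. */
struct sack_block {
    uint32_t start;
    uint32_t end;
};

/*
 * Sketch of RFC 3517 IsLost(SeqNum): true when either dup_thresh
 * discontiguous SACKed blocks lie above seq, or at least
 * dup_thresh * smss bytes above seq have been SACKed.
 * Assumes seq is not inside any block and no wraparound occurs.
 */
bool is_lost(const struct sack_block *sb, size_t n, uint32_t seq,
             unsigned dup_thresh, uint32_t smss)
{
    unsigned blocks_above = 0;
    uint64_t bytes_above = 0;

    assert(sb != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (sb[i].start > seq) {
            blocks_above++;
            bytes_above += (uint64_t)(sb[i].end - sb[i].start);
        }
    }
    return blocks_above >= dup_thresh ||
           bytes_above >= (uint64_t)dup_thresh * smss;
}
```

If, for illustration, 13033..14481 is taken to be the only SACKed block
while 8689 is still unacknowledged (SMSS 1448, DupThresh 3), this sketch
reports the segment at 8689 as not lost, since only one block and 1448
bytes have been SACKed above it.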

But from the trace portion you cut out, it seems the SACK
implementation in Linux simply checked the sequence number of the newly
SACKed segment and, finding that there are two blocks in between, treated
it as if it were the DupThresh-th duplicate ACK and retransmitted. So if
we were not using SACK, the retransmission would have occurred after
01:28:15.90..., so TCP SACK retransmitted around 50ms earlier in this
case. It might be larger in some cases. (I will try to look into the
traces to find larger time differences, but you can see there is a clear
difference by looking at the plots of the CDF of the time of occurrence
of the first retransmissions for the different cases at
http://kom.aau.dk/~oumer/first_transmission_times.pdf ) So I am on
the verge of concluding that TCP SACK is worse than non-SACK TCP in case
of persistent reordering, if only I could find a reference about the
Linux TCP SACK behaviour we discussed above :-)...



> Actually, it is the reason why the FACK heuristics is not disabled
> even when FACK disabled. Experiments showed that relaxing it severely
> damages recovery in presense of real multiple losses.
> And when it happens to be reordering, undoing works really well.
 

So you are saying it doesn't matter whether I disable FACK or not, it is
basically set by default?
And it is disabled only when reordering is detected (and this is done
either through timestamps or DSACK, right?)...
So if neither DSACK nor timestamps are enabled we are unable to detect
reordering, so basically there should be no difference between SACK and
FACK, because it is always FACK that is used... and that seems to make
sense from the results I have (i.e. referring to

http://kom.aau.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_DT.pdf
http://kom.aau.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_ret.pdf
)...

Now let's introduce DSACK and no timestamps... That means we are able to
detect some reordering and the download time should decrease, and it does
so, as shown in the first of the figures I just gave the links to. However,
the # of retransmissions increases, as shown in the second figure. Isn't
that odd? Shouldn't it be the other way around?


Also, why does the # of retransmissions in the timestamp case increase when
we use SACK/FACK as compared with the no-SACK case?...and as you mentioned
earlier reordering undoing wo

Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL

2006-07-19 Thread Patrick McHardy

Andy Furniss wrote:
> Russell Stuart wrote:
> 
>> - As it stands, it doesn't help the qdiscs that use   RTAB.  So unless
>> he proposes to remove RTAB entirely   the ATM patch as it will still
>> have to go in.
> 
> 
> Hmm - I was just looking at the kernel changes to htb. The only
> difference is the len - I am blindly assuming that it does/will return
> the link lengths properly for atm.
> 
> So for atm, qdisc_tx_len(skb) will always return lengths that are
> multiples of 53.
> 
> If nothing else were done we would suffer innacuarcy from the cell_log
> just like eth.
> 
> But no other kernel hack would be needed to do it perfectly - rather
> like we (who patch for atm already) just fill the tc generated rate
> table with what we like, that would be an option.

That is how it should work. If the calculation doesn't fit, let's fix it.

>> The kernel will have to do a shift and a division
>> for each packet, which I assume is permissible.
> 
> 
> I guess that is for others to decide :-) I think Patrick has a point
> about sfq/htb drr, Like you I guess, I thought that alot of extra per
> packet calculations would have got an instant NO.

It's only done once per packet (currently, it might be interesting to
override the length for specific classes and their children, for example
if you do queueing on eth0 and have a DSL router one hop apart).
The division is gone in my patch btw.

Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL

2006-07-19 Thread Patrick McHardy

Russell Stuart wrote:
> On Tue, 2006-07-18 at 22:46 +0100, Andy Furniss wrote: 
> 
>>FWIW I think it may be possible to do it Patricks' way, as if I read it 
>>properly he will end up with the ATM cell train length which gets 
>>shifted by cell_log and looked up as before. The ATM length will be in 
>>steps of 53 so with cell_log 3 or 4 I think there will be no collisions 
>>- so special rate tables for ATM can still be made perfect.

Please excuse my silence, I was travelling and am still catching up
with my mails.

> Patrick is proposing that the packet lengths be sent to 
> the kernel in a similar way to how transmission times (ie
> RTAB) is sent now.  I agree that is how things should be 
> done - but it doesn't have much to do with the ATM patch, 
> other than he has allowed for ATM in the way he does the 
> calculation in the kernel [1].
> 
> In particular:
> 
> - As it stands, it doesn't help the qdiscs that use 
>   RTAB.  So unless he proposes to remove RTAB entirely 
>   the ATM patch as it will still have to go in.

Why? The length calculated by my STABs (or something similar)
is used by _all_ qdiscs. Not only for transmission time calculation,
but also for statistics and estimators. If the length calculation
doesn't fit for ATM, that can be fixed.

> - A bit of effort was put into making this current
>   ATM patch both backwards and forwards compatible.
>   Patricks patch would work with newer kernels,
>   obviously.  Older kernels, and in particular the
>   kernel that Debian is Etch is likely to distribute
>   would miss out.

True, but it provides more consistency, and making current
kernels behave better is more important than old kernels.

> If Patrick did intend remove RTAB entirely then he
> needs to add a fair bit more into his patch.  Since 
> RTAB is just STAB scaled, its certainly possible.
> The kernel will have to do a shift and a division
> for each packet, which I assume is permissible.

You seem to have misunderstood my patch. It doesn't need to
touch RTABs, it just calculates the packet length as seen
on the wire (wherever it is) and uses that throughout the
entire qdisc layer.

>>As you say, I think mpu should be added aswell - so eth/other can benefit.
> 
> 
> Not really.  The MPU is reflected in the STAB table,
> just as it is for RTAB.
> 
> One other point - the optimisation Patrick proposes
> for STAB (over RTAB) was to make the number of entries
> variable.  This seems like a good idea.  However there 
> is no such thing as a free lunch, and if you did 
> indeed reduce the number of entries to 16 for Ethernet 
> (as I think Patrick suggested), then each entry would
> cover 1500/16 = 93 different packet lengths.  Ie,
> entry 0 would cover packet lengths 0..93, entry 1
> 94..186, and so on.  A single entry can't be right
> for all those packet lengths, so again we are back
> to a average 30% error for typical VOIP length
> packets.

My patch doesn't use fixed-size cells, so it can deal
with anything, worst case is you use one cell per packet
size. Optimizing size and lookup speed for ethernet makes
a lot more sense than optimizing for ADSL.
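
Russell's concern about coarse fixed-size cells can be illustrated with
the existing power-of-two indexing (len >> cell_log). The sketch below
is not taken from either patch. It assumes, as a simplification, that
each table slot is filled with the cost of the slot's upper bound, i.e.
((index + 1) << cell_log) bytes, and reports how much a packet is
overcharged. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Table slot used for a packet of len bytes. */
uint32_t slot_index(uint32_t len, unsigned cell_log)
{
    assert(cell_log < 32);
    return len >> cell_log;
}

/*
 * Length the packet is charged for, assuming each slot holds the
 * cost of its (exclusive) upper bound: (index + 1) << cell_log.
 */
uint32_t slot_charged_len(uint32_t len, unsigned cell_log)
{
    return (slot_index(len, cell_log) + 1) << cell_log;
}

/* Overcharge relative to the real length, in percent, rounded down. */
uint32_t overcharge_percent(uint32_t len, unsigned cell_log)
{
    uint32_t charged;

    assert(len > 0);
    charged = slot_charged_len(len, cell_log);
    return (charged - len) * 100 / len;
}
```

Under that assumption a 40-byte packet is charged as 48 bytes with
8-byte cells (cell_log 3), but as 128 bytes with 128-byte cells
(cell_log 7), which is why shrinking the table hurts small packets most.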

Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL

2006-07-19 Thread Andy Furniss


Russell Stuart wrote:
> On Tue, 2006-07-18 at 22:46 +0100, Andy Furniss wrote:
>
>> FWIW I think it may be possible to do it Patricks' way, as if I read it
>> properly he will end up with the ATM cell train length which gets
>> shifted by cell_log and looked up as before. The ATM length will be in
>> steps of 53 so with cell_log 3 or 4 I think there will be no collisions
>> - so special rate tables for ATM can still be made perfect.
>
> Patrick is proposing that the packet lengths be sent to
> the kernel in a similar way to how transmission times (ie
> RTAB) is sent now.  I agree that is how things should be
> done - but it doesn't have much to do with the ATM patch,
> other than he has allowed for ATM in the way he does the
> calculation in the kernel [1].
>
> In particular:
>
> - As it stands, it doesn't help the qdiscs that use
>   RTAB.  So unless he proposes to remove RTAB entirely
>   the ATM patch as it will still have to go in.


Hmm - I was just looking at the kernel changes to htb. The only 
difference is the len - I am blindly assuming that it does/will return 
the link lengths properly for atm.


So for atm, qdisc_tx_len(skb) will always return lengths that are 
multiples of 53.


If nothing else were done we would suffer inaccuracy from the cell_log
just like eth.


But no other kernel hack would be needed to do it perfectly - rather 
like we (who patch for atm already) just fill the tc generated rate 
table with what we like, that would be an option.
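
The "multiples of 53" and "no collisions" arguments above can be checked
with a small sketch. It is an illustration only: atm_wire_len is a
hypothetical helper, not the qdisc_tx_len() from Patrick's patch, and
the overhead argument is assumed to already include every per-packet
byte added before the payload is cut into 48-byte cells.

```c
#include <assert.h>
#include <stdint.h>

#define ATM_CELL_PAYLOAD 48u  /* payload bytes carried per cell */
#define ATM_CELL_SIZE    53u  /* bytes per cell on the wire */

/* Wire length of a packet once it is cut into whole ATM cells. */
uint32_t atm_wire_len(uint32_t len, uint32_t overhead)
{
    uint32_t cells = (len + overhead + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;

    return cells * ATM_CELL_SIZE;
}

/*
 * Returns 1 if cell trains of 1..max_cells cells all index distinct
 * table slots after shifting by cell_log, 0 if two of them collide.
 * Slots grow monotonically, so comparing neighbours is sufficient.
 */
int atm_slots_distinct(unsigned cell_log, uint32_t max_cells)
{
    uint32_t prev = 0; /* slot of an empty (0-cell) train */

    assert(cell_log < 32);
    for (uint32_t n = 1; n <= max_cells; n++) {
        uint32_t slot = (n * ATM_CELL_SIZE) >> cell_log;

        if (slot == prev)
            return 0;
        prev = slot;
    }
    return 1;
}
```

With an example overhead of 10 bytes, a 40-byte packet needs two cells
(106 bytes on the wire). Because consecutive trains differ by 53 bytes,
slots of 8 or 16 bytes (cell_log 3 or 4) never hold two different train
lengths, whereas 64-byte slots (cell_log 6) do.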




> - A bit of effort was put into making this current
>   ATM patch both backwards and forwards compatible.
>   Patricks patch would work with newer kernels,
>   obviously.  Older kernels, and in particular the
>   kernel that Debian is Etch is likely to distribute
>   would miss out.
>
> If Patrick did intend remove RTAB entirely then he
> needs to add a fair bit more into his patch.  Since
> RTAB is just STAB scaled, its certainly possible.
> The kernel will have to do a shift and a division
> for each packet, which I assume is permissible.


I guess that is for others to decide :-) I think Patrick has a point
about sfq/htb drr. Like you, I guess, I thought that a lot of extra
per-packet calculations would have got an instant NO.





>> As you say, I think mpu should be added aswell - so eth/other can benefit.
>
> Not really.  The MPU is reflected in the STAB table,
> just as it is for RTAB.


OK, I was thinking of what Jamal said about helping others, so 
everything TC should be capable of accepting mpu and overhead with these 
patches - or is more work needed?


It will be good to be able to say

tc ... police rate 500kbit mpu 60 overhead 24 ... for eth.
(Assuming eth mpu/overhead are really 46/38 - p in mpu is payload AIUI 
so 60 and 24 come from allowing for skb->len being IP+14)


or for ATM + pppoa something like

tc ... police rate 500kbit overhead 10 atm ...

In the case of eth someone already added mpu/overhead for HTB and it 
doesn't need any extra per packet calcs. I guess this way it would.




> One other point - the optimisation Patrick proposes
> for STAB (over RTAB) was to make the number of entries
> variable.  This seems like a good idea.  However there
> is no such thing as a free lunch, and if you did
> indeed reduce the number of entries to 16 for Ethernet
> (as I think Patrick suggested), then each entry would
> cover 1500/16 = 93 different packet lengths.  Ie,
> entry 0 would cover packet lengths 0..93, entry 1
> 94..186, and so on.  A single entry can't be right
> for all those packet lengths, so again we are back
> to a average 30% error for typical VOIP length
> packets.


I agree less accuracy will not be nice. But as an option it could be the 
only way you can do 1/10Gig + jumbo frames.


Andy.

Re: Wireless statistics for bcm43xx-d80211

2006-07-19 Thread Dan Williams

On Tue, 2006-07-18 at 23:24 -0500, Larry Finger wrote:
> I have gotten most things working to produce wireless statistics through
> /proc/net/wireless for bcm43xx-d80211; however, I have one problem that I
> have not yet been able to solve. When I do a 'cat /proc/net/wireless', the
> following is printed:
>
> Inter-| sta-|   Quality|   Discarded packets   | Missed | WE
>   face | tus | link level noise |  nwid  crypt   frag  retry   misc | beacon | 20
> wmaster0:   100.0.0.   0  0  0  0  00
>   wlan1:   100.  -26.  -67.   0  0  0  0  00
>
> Based on the numbers obtained using bcm43xx-softmac for my interface, the
> numbers for level and noise for wlan1 are what I expected (in dBm). The link
> value has not yet been finished. The main problem is that the wireless kicker
> applet for KDE, which I use for a display, is only looking at the first line,
> and never sees the wlan1 data - only the wmaster0 results.
>
> Is there some way to detect that the master interface is being interrogated,
> and return data for the attached STA instead?

Actually, now that I think about it, why are _any_ applets
screen-scraping /proc/net/wireless anymore?  If they profess to be a
wireless applet, yet screenscrape /proc/net/wireless, that's suspect
right there.  The ioctls for status are quite well-defined and haven't
changed in a very long time (ie, SIOCGIWRANGE).

On the flip side, /proc/net/wireless has been supported since the dawn
of time (ok, not really) and is the textual interface for reporting
wireless statistics, but maybe that shouldn't be the case anymore.

Dan



Re: Weird TCP SACK problem. in Linux...

2006-07-19 Thread Alexey Kuznetsov

Hello!

> DSACK)  is used, the retransmissions seem to happen earlier .

Yes. With SACK/FACK retransmissions can be triggered earlier,
if an ACK SACKs a segment which is far enough from current snd.una.
That's what happens f.e. in T_SACK_dump5.dat

01:28:15.681050 < 192.38.55.34.51137 > 192.168.110.111.42238: P 
18825:20273[31857](1448) ack 1/5841 win 5840/0  [|] (DF)(ttl 64, id 19165)
01:28:15.800946 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) 
ack 8689/31857 win 23168/0  (DF) [tos 0x8]  (ttl 62, id 45508)
01:28:15.860773 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) 
ack 8689/31857 win 23168/0  (DF) [tos 0x8]  (ttl 62, id 45509)
01:28:15.860781 < 192.38.55.34.51137 > 192.168.110.111.42238: . 
8689:10137[31857](1448) ack 1/5841 win 5840/0  [|] (DF) (ttl 64, id 19166)

The second sack confirms that 13033..14481 already arrived.

And this is even not a mistake, the third dupack arrived immediately:
01:28:15.901382 < 192.168.110.111.42238 > 192.38.55.34.51137: . 1:1[5841](0) 
ack 8689/31857 win 23168/0  (DF) [tos 0x8]  (ttl 62, id 45510)

Actually, it is the reason why the FACK heuristic is not disabled
even when FACK is disabled. Experiments showed that relaxing it severely
damages recovery in the presence of real multiple losses.
And when it happens to be reordering, undoing works really well.


There is one more thing, which probably happens in your experiments,
though I did not find it in dumps. If reordering exceeds RTT, i.e.
we receive a SACK for a segment which was sent as part of forward
retransmission after a hole was detected, fast retransmit is entered
immediately. Two dupacks are enough for this: the first triggers forward
transmission, and if the second SACKs the segment which has just been
sent, we are there.

> One more thing, say I have FRTO, DSACK and timestamps enabled, which 
> algorithm takes precedence ?

They live together; essentially, they are not dependent on each other.

Alexey

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-19 Thread Alexey Kuznetsov

Hello!

> There is no socket spinlock anymore.
> Above lock is skb_queue lock which is held inside
> skb_dequeue/skb_queue_tail calls.

Lock is named differently, but it is still here.
BTW for UDP even the name is the same.

 
> > Equivalent of socket user lock.
> 
> No, it is an equivalent for hash lock in socket table.

OK. But you have to introduce socket mutex somewhere in any case.
Even in ATCP.


> Just an example - tcp_established() can be called with bh disabled under
> the socket lock.

When we have a process context in hands, it is not.

Did you ask yourself why we do not put all the packets to backlog/prequeue
and just wait until the user reads the data? It would be 100% equivalent
to "netchannels".

The answer is simple: because we cannot wait. If user delays for 200msec,
wait for connection collapse due to retransmissions. If the segment is
out of order, immediate attention is required. Any scheme, which tries
to wait for user unconditionally, at least has to run a watchdog timer,
which fires before sender senses the gap.

And this is what we have been doing for ages. Grep for "VJ" in sources. :-)

netchannels have nothing to do with it, it is a much older idea.



> In that case one copies the whole data into userspace, so access for 20
> bytes of headers completely does not matter.

For short packets it matters.

But I said not this. I said it looks _worse_. A bit, but worse.


> Hmm, for 80 bytes sized packets win was about 2.5 times. Could you
> please show me lines inside existing code, which should be commented, so
> I got 50Mbyte/sec for that?

If I knew it would be done. :-)

Actually, it is the action, which I would expect. This, but
not dropping all the TCP stack.


> I showed there, that using existing stack it is imposible

Please, understand, it is such statements that compromise your work.
If it is impossible then it is not interesting.

Alexey

Re: Wireless statistics for bcm43xx-d80211

2006-07-19 Thread Michael Buesch

On Wednesday 19 July 2006 06:24, Larry Finger wrote:
> I have gotten most things working to produce wireless statistics through
> /proc/net/wireless for bcm43xx-d80211; however, I have one problem that I
> have not yet been able to solve. When I do a 'cat /proc/net/wireless', the
> following is printed:
>
> Inter-| sta-|   Quality|   Discarded packets   | Missed | WE
>   face | tus | link level noise |  nwid  crypt   frag  retry   misc | beacon | 20
> wmaster0:   100.0.0.   0  0  0  0  00
>   wlan1:   100.  -26.  -67.   0  0  0  0  00
>
> Based on the numbers obtained using bcm43xx-softmac for my interface, the
> numbers for level and noise for wlan1 are what I expected (in dBm). The link
> value has not yet been finished. The main problem is that the wireless kicker
> applet for KDE, which I use for a display, is only looking at the first line,

So it is broken. Full stop.
Bug KDE for this.

-- 
Greetings Michael.

Re: Patch "[PKT_SCHED]: PSCHED_TADD() and PSCHED_TADD2() can result,tv_usec >= 1000000" seems wrong

2006-07-19 Thread Guillaume Chazarain


Shuya MAEDA wrote:
> "while (__delta > USEC_PER_SEC){ ... }", but I think it should be
> "while (__delta >= USEC_PER_SEC){ ... }". Is it right?


I agree, good catch :-)

Thanks.

--
Guillaume

In PSCHED_TADD and PSCHED_TADD2, if delta is less than tv.tv_usec (so, less
than USEC_PER_SEC too) then tv_res will be smaller than tv. The
assignment "(tv_res).tv_usec = __delta;" is wrong.
The fix is to revert to the original code before
4ee303dfeac6451b402e3d8512723d3a0f861857 and change the 'if' into a 'while'.

[Shuya MAEDA: "while (__delta >= USEC_PER_SEC){ ... }" instead of
"while (__delta > USEC_PER_SEC){ ... }"]

Signed-off-by: Guillaume Chazarain <[EMAIL PROTECTED]>
---

 pkt_sched.h |   18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -169,23 +169,17 @@ psched_tod_diff(int delta_sec, int bound
 
 #define PSCHED_TADD2(tv, delta, tv_res) \
 ({ \
-	   int __delta = (delta); \
-	   (tv_res) = (tv); \
-	   while(__delta >= USEC_PER_SEC){ \
-		 (tv_res).tv_sec++; \
-		 __delta -= USEC_PER_SEC; \
-	   } \
+	   int __delta = (tv).tv_usec + (delta); \
+	   (tv_res).tv_sec = (tv).tv_sec; \
+	   while (__delta >= USEC_PER_SEC) { (tv_res).tv_sec++; __delta -= USEC_PER_SEC; } \
 	   (tv_res).tv_usec = __delta; \
 })
 
 #define PSCHED_TADD(tv, delta) \
 ({ \
-	   int __delta = (delta); \
-	   while(__delta >= USEC_PER_SEC){ \
-		 (tv).tv_sec++; \
-		 __delta -= USEC_PER_SEC; \
-	   } \
-	   (tv).tv_usec = __delta; \
+	   (tv).tv_usec += (delta); \
+	   while ((tv).tv_usec >= USEC_PER_SEC) { (tv).tv_sec++; \
+		 (tv).tv_usec -= USEC_PER_SEC; } \
 })
 
 /* Set/check that time is in the "past perfect";
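
The carry logic of the patched macros can be written as an ordinary
function for illustration. This is a sketch, not kernel code: struct tv
is a stand-in for the kernel's timeval, and tv_add_usec is a
hypothetical name. It adds the delta to the existing microseconds first
and then normalises with a loop using >=, as discussed in this thread.

```c
#include <assert.h>

#define USEC_PER_SEC 1000000L

/* Stand-in for struct timeval, to keep the sketch self-contained. */
struct tv {
    long tv_sec;
    long tv_usec;
};

/* Return t advanced by delta microseconds, with tv_usec < USEC_PER_SEC. */
struct tv tv_add_usec(struct tv t, long delta)
{
    struct tv res;
    long usec;

    assert(delta >= 0 && t.tv_usec >= 0 && t.tv_usec < USEC_PER_SEC);

    usec = t.tv_usec + delta;   /* start from the old microseconds */
    res.tv_sec = t.tv_sec;
    while (usec >= USEC_PER_SEC) {  /* >= so exactly 1000000 carries too */
        res.tv_sec++;
        usec -= USEC_PER_SEC;
    }
    res.tv_usec = usec;
    return res;
}
```

For example, adding 1 microsecond to {0 s, 999999 us} must yield
{1 s, 0 us}. With a strict > comparison the result would keep
tv_usec == 1000000, which is the case named in the subject line.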

Re: Repost: Re: [VLAN]: translate IF_OPER_DORMANT to netif_dormant_on()

2006-07-19 Thread Patrick McHardy

Stefan Rompf wrote:
> VLAN devices did not get registered as admin up in 2.6.16 and IMHO also
> not in 2.6.17. So update patch description.
> 
> Ok,
> 
> the following patch should fix the problem. Patrick, can you give it a
> try? As the bug did not affect me through my testing, I want to be sure it
> works now. This is stuff for 2.6.18 and 2.6.17-stable.

Sorry for the delay. Just tested by unplugging the cable from eth0,
adding a bunch of VLANs and plugging the cable again, everything
works fine.

> [VLAN]: Fix link state propagation
> 
> When the queue of the underlying device is stopped at initialization time
> or the device is marked "not present", the state will be propagated to the
> vlan device and never change. Based on an analysis by Patrick McHardy.

ACKed-by: Patrick McHardy <[EMAIL PROTECTED]>

Re: [PATCH] clear skb cb on IP input

2006-07-19 Thread Guillaume Chazarain


Herbert Xu wrote:
> Probably. Patches are welcome :)

Here they are. In both cases I checked that the stuff to clear
was not already cleared, but I could not produce any misbehavior
by writing random junk instead of clearing the data. All my tests
were on the loopback using UML.

For IPv4, the added safety seems worth it, but for IPv6 it's less clear.

Thanks.

--
Guillaume

Clear the accumulated junk in IP6CB when starting to handle an
IPV6 packet.

Signed-off-by: Guillaume Chazarain <[EMAIL PROTECTED]>
---

 ip6_input.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -70,6 +70,8 @@ int ipv6_rcv(struct sk_buff *skb, struct
 		IP6_INC_STATS_BH(IPSTATS_MIB_INDISCARDS);
 		goto out;
 	}
+
+	memset(IP6CB(skb), 0, sizeof(struct inet6_skb_parm));
 
 	/*
 	 * Store incoming device index. When the packet will

Clear the whole IPCB, this clears also IPCB(skb)->flags.

Signed-off-by: Guillaume Chazarain <[EMAIL PROTECTED]>
---

 ip_input.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -429,7 +429,7 @@ int ip_rcv(struct sk_buff *skb, struct n
 	}
 
 	/* Remove any debris in the socket control block */
-	memset(&(IPCB(skb)->opt), 0, sizeof(struct ip_options));
+	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
 
 	return NF_HOOK(PF_INET, NF_IP_PRE_ROUTING, skb, dev, NULL,
 		   ip_rcv_finish);
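
The difference between the old and the new memset() in the IPv4 hunk can
be shown with a stand-alone sketch. The structures below are hypothetical
stand-ins, not the kernel's struct inet_skb_parm or struct ip_options.
They only assume a control block that holds an options member followed
by a flags member.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the option and control-block layouts. */
struct demo_opts {
    unsigned char optlen;
    unsigned char srr;
};

struct demo_cb {
    struct demo_opts opt;
    unsigned char flags;
};

/* Old behaviour: only the options member is cleared. */
void clear_opt_only(struct demo_cb *cb)
{
    assert(cb != NULL);
    memset(&cb->opt, 0, sizeof(cb->opt));
}

/* New behaviour: the whole control block is cleared, flags included. */
void clear_whole_cb(struct demo_cb *cb)
{
    assert(cb != NULL);
    memset(cb, 0, sizeof(*cb));
}
```

After clear_opt_only() any stale value left in flags by a previous user
of the buffer survives, while clear_whole_cb() resets it as well.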

Re: [PATCH RFC]rfkill - Hardware button support for Wireless cards

2006-07-19 Thread Ivo Van Doorn


Hi,


>> I have been quite busy lately, hence the reason for this late continuance
>> of the Hardware button support for Wireless cards discussion.
>> I have CC'ed the people who discussed this in earlier threads.
>
> no problem. Look good, just one thing I'm missing:
>
>> + For each registered hardware button an input device will be created.
>> + If this input device has been opened by the user, rfkill will send a
>> + signal to userspace instead of the hardware about the new button
>> + status. This will allow userpace to perform the correct steps
>> + in order to bring down all interfaces.
>
>> + if (rfkill->input_dev->users) {
>> + input_report_key(rfkill->input_dev,
>> + KEY_RFKILL, new_status);
>> + input_sync(rfkill->input_dev);
>
> Shouldn't there be a continue to avoid calling enable/disable_radio()?

True, totally overlooked that part. Will fix this immediately.

>> + }
>
> Stefan
>
> PS: This rfkill stuff is really caught between two stools. Sending a netlink
> event for the device with an additional TLV for radio button status seems as
> valid as sending an input event...


Hmm, not sure about this one. Personally I would think that support for
a button would belong more to the input layer, even when the button
would only be useful for networking.

ivo

Re: Wireless statistics for bcm43xx-d80211

2006-07-19 Thread Dan Williams

On Tue, 2006-07-18 at 23:24 -0500, Larry Finger wrote:
> I have gotten most things working to produce wireless statistics through
> /proc/net/wireless for bcm43xx-d80211; however, I have one problem that I
> have not yet been able to solve. When I do a 'cat /proc/net/wireless', the
> following is printed:
>
> Inter-| sta-|   Quality|   Discarded packets   | Missed | WE
>   face | tus | link level noise |  nwid  crypt   frag  retry   misc | beacon | 20
> wmaster0:   100.0.0.   0  0  0  0  00
>   wlan1:   100.  -26.  -67.   0  0  0  0  00
>
> Based on the numbers obtained using bcm43xx-softmac for my interface, the
> numbers for level and noise for wlan1 are what I expected (in dBm). The link
> value has not yet been finished. The main problem is that the wireless kicker
> applet for KDE, which I use for a display, is only looking at the first line,
> and never sees the wlan1 data - only the wmaster0 results.
>
> Is there some way to detect that the master interface is being interrogated,
> and return data for the attached STA instead?

I'm not entirely sure how you configure d80211-based drivers [1], but I
think this is Kicker's problem?  Doesn't it know which device you are
actually connecting with, i.e. wlan1?  Kicker should know this, but we
may need to update userland applications like Kicker and NetworkManager
for the new device model that d80211 presents.  wmaster0 shouldn't
have any quality, because it's not connected to anything, and nothing
connects to it.  It's not exposed over the air in any way, AFAIUI.

Let's fix this the right way, in userspace apps, not working around this
stuff in the drivers.

Dan

[1] i.e., can you have more than one STA attached to a master?  I
thought so, but could be wrong.
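
As an illustration of handling this in userspace, the sketch below looks
up the row for a named interface in /proc/net/wireless-style text
instead of taking the first data row. It is not Kicker's code: the names
wstats and wireless_stats_for are hypothetical, and the row layout
(name, colon, hexadecimal status, then link, level and noise) is an
assumption about the file format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* First four fields of one interface row (assumed layout). */
struct wstats {
    unsigned status;
    float link;
    float level;
    float noise;
};

/*
 * Find the row for ifname in text and parse it into out.
 * Returns 0 on success, -1 if the interface is missing or malformed.
 */
int wireless_stats_for(const char *text, const char *ifname,
                       struct wstats *out)
{
    size_t nlen;
    const char *line = text;

    assert(text != NULL && ifname != NULL && out != NULL);
    nlen = strlen(ifname);

    while (line != NULL && *line != '\0') {
        const char *p = line;

        while (*p == ' ' || *p == '\t')  /* names are right-aligned */
            p++;
        if (strncmp(p, ifname, nlen) == 0 && p[nlen] == ':') {
            if (sscanf(p + nlen + 1, "%x %f %f %f", &out->status,
                       &out->link, &out->level, &out->noise) == 4)
                return 0;
            return -1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;  /* move to the start of the next row */
    }
    return -1;
}
```

Called with "wlan1", such a lookup skips the wmaster0 row no matter
which row comes first.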


Re: Netchannles: first stage has been completed. Further ideas.

2006-07-19 Thread Jörn Engel

On Tue, 18 July 2006 23:08:01 +0400, Evgeniy Polyakov wrote:
> On Tue, Jul 18, 2006 at 02:15:17PM +0200, Jörn Engel ([EMAIL PROTECTED]) wrote:
> > 
> > Your description makes it sound as if you would take a huge leap,
> > changing all in-kernel code _and_ the userspace interface in a single
> > patch.  Am I wrong?  Or am I right and would it make sense to extract
> > small incremental steps from your patch similar to those Van did in
> > his non-published work?
> 
> My first implementation used existing kernel code and showed small
> performance win - there was binding of the socket to netchannel and all
> protocol processing was moved into process context.

Iirc, Van didn't show performance numbers but rather cpu utilization
numbers.  And those went down significantly without changing the
userspace interface.

Did you look at cpu utilization as well?  If you did and your numbers
are worse than Van's, he either did something smarter than you or
forged his numbers (quite unlikely).

Jörn

-- 
Sometimes, asking the right question is already the answer.
-- Unknown

Re: [PATCH 3/9] d80211: better deallocation of mdev

2006-07-19 Thread Jiri Benc

On Tue, 18 Jul 2006 20:07:34 -0700, Simon Barber wrote:
> I have been thinking about a slightly different approach for the master
> device. Since the master device represents the physical hardware, I am
> thinking that the hardware driver could register the master device
> directly itself. It would use the normal netdev_register call to do so.

How will you allocate structures for 802.11 specific data? You will
require the driver to call some ieee80211_allocate function anyway, so
you will probably want to leave allocation of the net_device structure to
this function too. And it is probably better to have a corresponding
ieee80211_register function to prevent confusion (even if the only thing
done in ieee80211_register is calling netdev_register).

> Received frames would be marked as having a protocol type of 802.11, and
> the 802.11 code would register itself as a protocol handler for this
> protocol type. Now netif_rx() could be used within the hardware driver
> to pass frames to the kernel 802.11 code

How will you pass data like signal strength or rate to 802.11 stack?

> - this has the benefit of
> better performance for the hardirq/softirq transition than the current
> scheme.

I don't see much benefit in this. Instead of rewriting a big part of the
stack (and drivers as well) for a questionable performance gain I would
rather see work on getting the stack into mainline - the only blockers
left are SMP safety and the stack<->hostapd netlink protocol. Oh, and better
communication with tools like NetworkManager.

> On the transmit side - currently the 802.11 code does much
> processing within the hard_start_xmit function on the master device. I
> would move all this processing into the 802.11 qdisc - putting it into
> the dequeue function.

This makes sense, especially fragmentation is a good candidate (and this
implies nearly the whole processing needs to be moved there as well).
Though most of these problems will vanish when the Ethernet<->802.11
conversion routines are removed and the stack starts working with native
802.11 frames.

> Now the hard_start_xmit function would be the real
> hardware drivers transmit function.

No need for this.

I also think that moving xmit processing to qdisc doesn't need to be
done now. It can be done without changing drivers, so we can do it after
the stack goes into mainline.

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs

Re: Weird TCP SACK problem. in Linux...

2006-07-19 Thread Oumer Teyeb


Hi David,

I am using an emulator that I developed using netfilter (see
http://kom.aau.dk/~oumer/publications/VTC05.pdf for a description of the
emulator), and I emulate a UMTS network with an RTT of 150ms, and I use a
384kbps connection. There is a UMTS frame erasure rate of 10%, but I have
persistent link layer retransmission, which means nothing is actually
lost. So due to these link layer errors, some packets arrive out of order,
and the effect of that on TCP performance is what I am after. I am using
Linux 2.4.


I have put more detailed traces at
www.kom.auc.dk/~oumer/sackstuff.tar.gz
I have run the different cases 10 times each:

NT_NSACK[1-10].dat --- no timestamp, no SACK
NT_SACK[1-10].dat  --- no timestamp, SACK
T_NSACK[1-10].dat  --- timestamp, no SACK
T_SACK[1-10].dat   --- timestamp, SACK

(By "no SACK" I mean that SACK, DSACK and FACK are disabled. I also have 
results for when they are enabled; see below for curves illustrating the 
different cases.)


The files without an extension are two-column files that summarize the 
ten runs for each of the four cases: the first column is the number of 
retransmissions and the second column is the download time. The values are 
gathered from tcptrace.


The two eps files are plots summarizing the average download time and 
the average number of retransmissions for each case.


One more thing: in the trace files you will find 3 TCP connections. The 
first one is not modified by my emulator that causes the reordering 
(that is the connection through which I reset the destination cache, 
which stores some metrics from previous runs, using some commands 
via ssh), the second one is the ftp control channel, and the third one is 
the ftp data channel. The emulator affects the last two channels 
and causes reordering once in a while.
Please don't hesitate to ask me if anything is not clear.

Also, I have put the final curves of all my emulations, showing the 
download times and the percentage of retransmissions (number of 
retransmissions / total packets sent), at
www.kom.auc.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_DT.pdf
www.kom.auc.dk/~oumer/384_100Kbyte_Timestamps_SACK_FACK_DSACK_10FER_ret.pdf

There are a lot of other things that I don't understand in these two 
curves. However, the most bizarre one (apart from the SACK issue that 
started this discussion) is why DSACK leads to increased retransmissions 
when used without timestamps. (The behaviour is OK in terms of download 
time, which it reduces, showing that DSACK-based spurious 
retransmission detection is at work.)


Thanks a lot for taking the time

Regards,
Oumer







Xiaoliang (David) Wei wrote:


> Hi Oumer,
>
>    Your result is interesting. Just a few questions (along with your 
> texts):
>
> > So I looked further into the results, and what I found was that when
> > SACK (when I refer to SACK here, I mean SACK only, without FACK and
> > DSACK) is used, the retransmissions seem to happen earlier.
> > At www.kom.auc.dk/~oumer/first_transmission_times.pdf
> > you can find the plot of the CDF of the time when the first TCP
> > retransmission occurred for the four combinations of SACK and timestamps
> > after hundreds of downloads of a 100K file for the different conditions
> > under network reordering...
>
> Could you give a few more details on the scenarios? For example:
> What are your RTT, capacity, etc.? Linux versions? Packet size is
> 1.5K? Then 100K is about 66 packets. Do flows finish slow start or
> not? Also, what is the reordering level? Are you using Dummynet or
> a real network?
>
> > ...but I couldn't figure out why the retransmissions occur earlier for
> > SACK than for non-SACK TCP. As far as I know, for both SACK and non-SACK
> > cases, we need three (or more, according to the setting) duplicate ACKs
> > to enter the fast retransmission/recovery state, which would have
> > resulted in the same behaviour for the first occurrence of a
> > retransmission. Or is there some undocumented enhancement in Linux
> > TCP when using SACK that makes it enter fast retransmit earlier... the
> > only explanation I could imagine is something like this
>
> Are you sure FACK is turned OFF? FACK might retransmit earlier if you
> have packet reordering, I think.
>
> > non SACK case
> > =============
> > 1 2 3 4 5 6 7 8 9 10... were sent and 2 was reordered... and assume we
> > are using delayed ACKs... and we get a triple duplicate ACK after pkt#8
> > is received (i.e. 3&4: first duplicate ACK, 5&6: second duplicate ACK
> > and 7&8: third duplicate ACK)...
> >
> > so if SACK behaved like this...
> >
> > 3&4 SACKed... 2 packets out of order received
> > 5&6 SACKed... 4 packets out of order received... start fast
> > retransmission, as reordered is greater than 3 (this is true when
> > it comes to marking packets as lost during fast recovery, but is it true
> > also for the first retransmission?)
>
> I guess delayed ACK is turned off when there is packet reordering. The
> receiver will send one ACK for each data packet whenever there are
> out-of-order packets in its queue. So we will get duplicate ACKs earlier
> than what you explain above...

Re: Weird TCP SACK problem. in Linux...

2006-07-19 Thread Xiaoliang (David) Wei


Hi Oumer,

   Your result is interesting. Just a few questions (along with your texts):


> So I looked further into the results, and what I found was that when
> SACK (when I refer to SACK here, I mean SACK only, without FACK and
> DSACK) is used, the retransmissions seem to happen earlier.
> At www.kom.auc.dk/~oumer/first_transmission_times.pdf
> you can find the plot of the CDF of the time when the first TCP
> retransmission occurred for the four combinations of SACK and timestamps
> after hundreds of downloads of a 100K file for the different conditions
> under network reordering...

Could you give a few more details on the scenarios? For example:
What are your RTT, capacity, etc.? Linux versions? Packet size is
1.5K? Then 100K is about 66 packets. Do flows finish slow start or
not? Also, what is the reordering level? Are you using Dummynet or
a real network?

> ...but I couldn't figure out why the retransmissions occur earlier for
> SACK than for non-SACK TCP. As far as I know, for both SACK and non-SACK
> cases, we need three (or more, according to the setting) duplicate ACKs
> to enter the fast retransmission/recovery state, which would have
> resulted in the same behaviour for the first occurrence of a
> retransmission. Or is there some undocumented enhancement in Linux
> TCP when using SACK that makes it enter fast retransmit earlier... the
> only explanation I could imagine is something like this

Are you sure FACK is turned OFF? FACK might retransmit earlier if you
have packet reordering, I think.

> non SACK case
> =============
> 1 2 3 4 5 6 7 8 9 10... were sent and 2 was reordered... and assume we
> are using delayed ACKs... and we get a triple duplicate ACK after pkt#8
> is received (i.e. 3&4: first duplicate ACK, 5&6: second duplicate ACK
> and 7&8: third duplicate ACK)...
>
> so if SACK behaved like this...
>
> 3&4 SACKed... 2 packets out of order received
> 5&6 SACKed... 4 packets out of order received... start fast
> retransmission, as reordered is greater than 3 (this is true when
> it comes to marking packets as lost during fast recovery, but is it true
> also for the first retransmission?)

I guess delayed ACK is turned off when there is packet reordering. The
receiver will send one ACK for each data packet whenever there are
out-of-order packets in its queue. So we will get duplicate ACKs earlier
than what you explain above...
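
The two acknowledgement models in this exchange can be compared with a small stand-alone sketch. It only illustrates the arithmetic of the thread and is not taken from the Linux TCP code: the function names are made up, the late packet is assumed to arrive after all the others, and everything before it is assumed to be ACKed already. `sacked_trigger` models the hypothesis raised in the question (retransmit once more than `reordering` segments are SACKed), which nobody in the thread has confirmed.

```c
/*
 * Packets 1..n are sent; packet `late` is reordered and arrives last.
 * segs_per_ack = 2 models a receiver that keeps delaying ACKs;
 * segs_per_ack = 1 models an immediate ACK per out-of-order segment.
 */

/* Packet whose arrival produces the thresh-th duplicate ACK, or 0 if
 * the threshold is never reached. */
static int dupack_trigger(int n, int late, int segs_per_ack, int thresh)
{
	int dupacks = 0;
	int pending = 0;	/* out-of-order segments not yet ACKed */
	int pkt;

	for (pkt = late + 1; pkt <= n; pkt++) {
		if (++pending < segs_per_ack)
			continue;
		pending = 0;
		if (++dupacks == thresh)
			return pkt;
	}
	return 0;
}

/* Hypothetical rule: packet whose ACK first reports more than
 * `reordering` SACKed segments, or 0 if that never happens. */
static int sacked_trigger(int n, int late, int segs_per_ack, int reordering)
{
	int sacked = 0;
	int pending = 0;
	int pkt;

	for (pkt = late + 1; pkt <= n; pkt++) {
		sacked++;
		if (++pending < segs_per_ack)
			continue;
		pending = 0;
		if (sacked > reordering)
			return pkt;
	}
	return 0;
}
```

With packets 1..10 and packet 2 late, `dupack_trigger(10, 2, 2, 3)` gives 8 (delayed ACKs), `dupack_trigger(10, 2, 1, 3)` gives 5 (one ACK per out-of-order segment), and `sacked_trigger(10, 2, 2, 3)` gives 6.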



> One more thing: say I have FRTO, DSACK and timestamps enabled, which
> algorithm takes precedence? If FRTO is enabled, is all spurious
> timeout detection done through FRTO, or through a combination?


They are compatible, I think?

When the retransmission timer expires, TCP first tries to go through
FRTO. If FRTO finds that it is a real loss, it goes to the traditional
timeout processing, as specified in the FRTO algorithm.

-David

--
Xiaoliang (David) Wei  Graduate Student, [EMAIL PROTECTED]
http://davidwei.org
***

sky2 drops immediately after boot

2006-07-19 Thread Daniel Drake


Hi Stephen,

Here's another sky2 issue for you to look at when you have some time. 
[EMAIL PROTECTED] reported this at http://bugs.gentoo.org/136508 
but I have tried to include all relevant info here.


The connection seems to work very briefly during boot, so the user can 
get an IP over DHCP. After that point, the connection stops working.


The adapter is integrated into a MSI K8N nVidia 4 Platinum motherboard.
sky2 v1.5 addr 0xfe6fc000 irq 217 Yukon-EC (0xb6) rev 1
lspci: http://bugs.gentoo.org/attachment.cgi?id=92181&action=view

This has all been reproduced on 2.6.18-rc2 with the "sky2: NAPI poll 
fix" patch on top (sky2 v1.5).


sky2 v0.15 worked, and sk98lin works when he can get it to compile.

This is being used on a half-duplex 10 Mbps connection -- I'm only 
guessing, but I wonder if this might have something to do with the failure.


Here is dmesg:
http://bugs.gentoo.org/attachment.cgi?id=92182&action=view
and /proc/interrupts
http://bugs.gentoo.org/attachment.cgi?id=92183&action=view

We also tried with the disable_msi parameter:
dmesg: http://bugs.gentoo.org/attachment.cgi?id=92184&action=view
interrupts: http://bugs.gentoo.org/attachment.cgi?id=92185&action=view

When the connection drops, the interrupt counter is still increasing as 
normal.


Any ideas? What extra info can we provide?

Thanks,
Daniel


[PATCH] au1000_eth.c power management and driver registration support

2006-07-19 Thread Rodolfo Giometti

Hello,

Attached you can find my patch to add power management and driver
registration to the new version of the file "drivers/net/au1000_eth.c"
that implements the PHY-layer support.

Ciao,

Rodolfo Giometti

Signed-off-by: Rodolfo Giometti <[EMAIL PROTECTED]>

-- 

GNU/Linux Solutions      e-mail: [EMAIL PROTECTED]
Linux Device Driver              [EMAIL PROTECTED]
Embedded Systems                 [EMAIL PROTECTED]
UNIX programming         phone:  +39 349 2432127
diff --git a/arch/mips/au1000/common/au1xxx_irqmap.c b/arch/mips/au1000/common/au1xxx_irqmap.c
index 7acfe9b..d94bde1 100644
--- a/arch/mips/au1000/common/au1xxx_irqmap.c
+++ b/arch/mips/au1000/common/au1xxx_irqmap.c
@@ -117,7 +117,7 @@ #elif defined(CONFIG_SOC_AU1500)
 	{ AU1000_USB_DEV_SUS_INT, INTC_INT_RISE_EDGE, 0 },
 	{ AU1000_USB_HOST_INT, INTC_INT_LOW_LEVEL, 0 },
 	{ AU1000_ACSYNC_INT, INTC_INT_RISE_EDGE, 0 },
-	{ AU1500_MAC0_DMA_INT, INTC_INT_HIGH_LEVEL, 0},
+	{ AU1000_MAC0_DMA_INT, INTC_INT_HIGH_LEVEL, 0},
 	{ AU1500_MAC1_DMA_INT, INTC_INT_HIGH_LEVEL, 0},
 	{ AU1000_AC97C_INT, INTC_INT_RISE_EDGE, 0 },
 
@@ -151,7 +151,7 @@ #elif defined(CONFIG_SOC_AU1100)
 	{ AU1000_USB_DEV_SUS_INT, INTC_INT_RISE_EDGE, 0 },
 	{ AU1000_USB_HOST_INT, INTC_INT_LOW_LEVEL, 0 },
 	{ AU1000_ACSYNC_INT, INTC_INT_RISE_EDGE, 0 },
-	{ AU1100_MAC0_DMA_INT, INTC_INT_HIGH_LEVEL, 0},
+	{ AU1000_MAC0_DMA_INT, INTC_INT_HIGH_LEVEL, 0},
 	/*{ AU1000_GPIO215_208_INT, INTC_INT_HIGH_LEVEL, 0},*/
 	{ AU1100_LCD_INT, INTC_INT_HIGH_LEVEL, 0},
 	{ AU1000_AC97C_INT, INTC_INT_RISE_EDGE, 0 },
diff --git a/arch/mips/au1000/common/platform.c b/arch/mips/au1000/common/platform.c
index 8fd203d..ec81d4b 100644
--- a/arch/mips/au1000/common/platform.c
+++ b/arch/mips/au1000/common/platform.c
@@ -15,6 +15,78 @@ #include 
 
 #include 
 
+#if defined(CONFIG_MIPS_AU1X00_ENET) || defined(CONFIG_MIPS_AU1X00_ENET_MODULE)
+/* Ethernet controllers */
+static struct resource au1xxx_eth0_resources[] = {
+	[0] = {
+		.name	= "eth-base",
+		.start	= ETH0_BASE,
+		.end	= ETH0_BASE + MAC_IOSIZE - 1,
+		.flags	= IORESOURCE_MEM,
+	},
+	[1] = {
+		.name	= "eth-mac",
+		.start	= MAC0_ENABLE,
+		.end	= MAC0_ENABLE + 4 - 1,
+		.flags	= IORESOURCE_MEM,
+	},
+	[2] = {
+		.name	= "eth-irq",
+#if defined(CONFIG_SOC_AU1550)
+		.start	= AU1550_MAC0_DMA_INT,
+		.end	= AU1550_MAC0_DMA_INT,
+#else
+		.start	= AU1000_MAC0_DMA_INT,
+		.end	= AU1000_MAC0_DMA_INT,
+#endif
+		.flags	= IORESOURCE_IRQ,
+	},
+};
+
+static struct platform_device au1xxx_eth0_device = {
+	.name		= "au1000_eth",
+	.id		= 0,
+	.num_resources	= ARRAY_SIZE(au1xxx_eth0_resources),
+	.resource	= au1xxx_eth0_resources,
+};
+
+#if defined(CONFIG_SOC_AU1000) || \
+defined(CONFIG_SOC_AU1500) || defined(CONFIG_SOC_AU1550)
+static struct resource au1xxx_eth1_resources[] = {
+	[0] = {
+		.name	= "eth-base",
+		.start	= ETH1_BASE,
+		.end	= ETH1_BASE + MAC_IOSIZE - 1,
+		.flags	= IORESOURCE_MEM,
+	},
+	[1] = {
+		.name	= "eth-mac",
+		.start	= MAC1_ENABLE,
+		.end	= MAC1_ENABLE + 4 - 1,
+		.flags	= IORESOURCE_MEM,
+	},
+	[2] = {
+		.name	= "eth-irq",
+#if defined(CONFIG_SOC_AU1550)
+		.start	= AU1550_MAC1_DMA_INT,
+		.end	= AU1550_MAC1_DMA_INT,
+#else
+		.start	= AU1000_MAC1_DMA_INT,
+		.end	= AU1000_MAC1_DMA_INT,
+#endif
+		.flags	= IORESOURCE_IRQ,
+	},
+};
+
+static struct platform_device au1xxx_eth1_device = {
+	.name		= "au1000_eth",
+	.id		= 1,
+	.num_resources	= ARRAY_SIZE(au1xxx_eth1_resources),
+	.resource	= au1xxx_eth1_resources,
+};
+#endif
+#endif
+
 /* OHCI (USB full speed host controller) */
 static struct resource au1xxx_usb_ohci_resources[] = {
 	[0] = {
@@ -270,7 +367,14 @@ static struct platform_device smc91x_dev
 
 #endif
 
 static struct platform_device *au1xxx_platform_devices[] __initdata = {
+#if defined(CONFIG_MIPS_AU1X00_ENET) || defined(CONFIG_MIPS_AU1X00_ENET_MODULE)
+	&au1xxx_eth0_device,
+#if defined(CONFIG_SOC_AU1000) || \
+defined(CONFIG_SOC_AU1500) || defined(CONFIG_SOC_AU1550)
+	&au1xxx_eth1_device,
+#endif
+#endif
 	&au1xxx_usb_ohci_device,
 	&au1x00_pcmcia_device,
 #ifdef CONFIG_FB_AU1100
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 27d465f..47c624b 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -453,13 +453,13 @@ config MIPS_GT96100ETH

Re: Patch "[PKT_SCHED]: PSCHED_TADD() and PSCHED_TADD2() can result,tv_usec >= 1000000" seems wrong

2006-07-19 Thread Shuya MAEDA


Hello.

> In PSCHED_TADD2, if delta is less than tv.tv_usec (so, less than 
> USEC_PER_SEC too) then tv_res will be before tv. The
> assignment (tv_res).tv_usec = __delta; is wrong. The same applies to 
> PSCHED_TADD.

You are right. It is my mistake.

> I think the correct fix is simply to restore the original code and 
> change the 'if' into a 'while'.

In Guillaume's patch the loop is written as
"while (__delta > USEC_PER_SEC){ ... }", but I think it should be
"while (__delta >= USEC_PER_SEC){ ... }". Is that right?
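
The boundary case behind this question can be checked with a small user-space sketch. This is not the kernel macro; `struct ts` and `ts_add_usec` are names invented for the illustration, and the sketch assumes a non-negative delta and an already normalized input. It shows why the comparison probably wants to be `>=`: with `>`, a sum of exactly USEC_PER_SEC would leave tv_usec equal to 1000000, which is the condition named in the subject line.

```c
#define USEC_PER_SEC 1000000L

/* Hypothetical stand-in for struct timeval, for illustration only. */
struct ts {
	long tv_sec;
	long tv_usec;
};

/*
 * Return tv + delta microseconds with 0 <= tv_usec < USEC_PER_SEC.
 * Assumes delta >= 0 and that tv is already normalized.
 */
static struct ts ts_add_usec(struct ts tv, long delta)
{
	struct ts res;
	long usec = tv.tv_usec + delta;

	res.tv_sec = tv.tv_sec;
	/* ">=" rather than ">": a sum of exactly USEC_PER_SEC must carry. */
	while (usec >= USEC_PER_SEC) {
		res.tv_sec++;
		usec -= USEC_PER_SEC;
	}
	res.tv_usec = usec;
	return res;
}
```

For example, adding 1 microsecond to {5, 999999} yields {6, 0} here, whereas a `>` comparison would leave it at {5, 1000000}. The loop (rather than a single `if`) also covers deltas larger than one second.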

Thank you very much.
--
Shuya MAEDA

Re: Strange TCP SACK behaviour in Linux TCP

2006-07-19 Thread Oumer Teyeb


Could you please CC your answers to me? thanx!

Oumer Teyeb wrote:


Hi Stephen,

Thanks for the quick response.

I have done what you asked and you can find the files at
www.kom.auc.dk/~oumer/sackstuff.tar.gz
I have run the different cases 10 times each:

NT_NSACK[1-10].dat --- no timestamp, no SACK
NT_SACK[1-10].dat  --- no timestamp, SACK
T_NSACK[1-10].dat  --- timestamp, no SACK
T_SACK[1-10].dat   --- timestamp, SACK

The files without an extension are two-column files that summarize 
the ten runs for each of the four cases: the first column is the number 
of retransmissions and the second column is the download time. The values 
are gathered from tcptrace.


The two eps files are plots summarizing the average download time and 
the average number of retransmissions for each case.


One more thing: in the trace files you will find 3 TCP connections. 
The first one is not modified by my emulator that causes the 
reordering (that is the connection through which I reset the 
destination cache, which stores some metrics from previous runs, using 
some commands via ssh), the second one is the ftp control channel, and 
the third one is the ftp data channel. The emulator affects the last 
two channels and causes reordering once in a while.
Please don't hesitate to ask me if anything is not clear.

Thanks a lot for taking the time

Regards,
Oumer

Stephen Hemminger wrote:


On Tue, 18 Jul 2006 18:20:47 +0200
Oumer Teyeb <[EMAIL PROTECTED]> wrote:

 


Hello Guys,

I have some questions regarding the TCP SACK implementation in Linux.
As I am not a subscriber, could you please CC the reply to me? Thanks!


I am doing these experiments to find out the impact of reordering. 
So I have different TCP variants (NewReno, SACK, FACK, DSACK, 
FRTO, ...) as implemented in Linux, and I am trying their 
combinations to see how they behave. What struck me was that when I 
don't use timestamps, introducing SACK increases the download time 
but decreases the total number of retransmissions.
When timestamps are used, SACK leads to an increase in both the 
download time and the retransmissions.


So I looked further into the results, and what I found was that when 
SACK is used, the retransmissions seem to happen earlier.

At www.kom.auc.dk/~oumer/first_transmission_times.pdf
you can find the plot of the CDF of the time when the first TCP 
retransmission occurred for the four combinations of SACK and 
timestamps after hundreds of downloads of a 100K file for the 
different conditions under network reordering...


This explains why the download time increases with SACK: 
the earlier we go into fast recovery, the longer the time we 
spend in congestion avoidance, and the longer the download time.


...but I couldn't figure out why the retransmissions occur earlier 
for SACK than for non-SACK TCP. As far as I know, for both SACK and 
non-SACK cases, we need three (or more, according to the setting) 
duplicate ACKs to enter the fast retransmission/recovery state, 
which would have resulted in the same behaviour for the first 
occurrence of a retransmission. Or is there some undocumented 
enhancement in Linux TCP when using SACK that makes it enter fast 
retransmit earlier... the only explanation I could imagine is 
something like this


non SACK case
=============
1 2 3 4 5 6 7 8 9 10... were sent and 2 was reordered... and assume 
we are using delayed ACKs... and we get a triple duplicate ACK after 
pkt#8 is received (i.e. 3&4: first duplicate ACK, 5&6: second 
duplicate ACK and 7&8: third duplicate ACK)...


so if SACK behaved like this...

3&4 SACKed... 2 packets out of order received
5&6 SACKed... 4 packets out of order received... start fast 
retransmission, as reordered is greater than 3 (this is true 
when it comes to marking packets as lost during fast recovery, but 
is it true also for the first retransmission?)


Any ideas why this is happening?


Thanks in advance,
Oumer
  



Could you post some short tcpdump snapshot summaries to 
[EMAIL PROTECTED]
 






