Re: [patch 1/1] ipmr_cache_unresolved() fix

2006-03-25 Thread David S. Miller
From: [EMAIL PROTECTED]
Date: Sat, 25 Mar 2006 00:33:46 -0800

 A possible bug:
 
   rt_fill_info() calls ipmr_get_route().
 
 ipmr_get_route() calls ipmr_cache_unresolved()
 
   ipmr_cache_unresolved() gets an error and does kfree_skb(skb)
 
   ipmr_cache_unresolved() returns a -ve errno to ipmr_get_route()
 
 ipmr_get_route() returns the errno to rt_fill_info()
 
   rt_fill_info() diddles with the now-kfreed skb.

Yes, this comes up every few weeks and I keep giving the same
explanation of what needs to happen here each and every time. :)

You'll notice that ipmr_get_route() tries to use the RTNETLINK skb
passed in to send a multicast IPv4 packet out, in an attempt to resolve
an unresolved entry. That will obviously corrupt the RTNETLINK skb
(which the caller clearly didn't intend ipmr_get_route() to use for
sending an IPv4 packet) and just about everything else.

So fixing some stray kfree_skb() here is basically pointless.  This
area of the ipmr code needs to be totally rewritten.
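
For readers following the quoted call chain, here is a minimal userspace
sketch of the reported pattern. All toy_* names are hypothetical stand-ins
and a flag replaces the real kfree_skb(); it only illustrates the stray
free, not the deeper skb reuse problem described in the reply above.

```c
#include <errno.h>

/* Toy stand-in for an sk_buff; `freed` replaces a real kfree_skb(). */
struct toy_skb { int freed; int touched_after_free; };

static void toy_touch(struct toy_skb *skb)
{
	if (skb->freed)
		skb->touched_after_free = 1;	/* would be a use-after-free */
}

/* Models ipmr_cache_unresolved(): frees the buffer on the error path. */
static int toy_cache_unresolved(struct toy_skb *skb, int fail)
{
	if (fail) {
		skb->freed = 1;
		return -ENOBUFS;
	}
	return 0;
}

/* Models ipmr_get_route(): passes the errno straight through. */
static int toy_get_route(struct toy_skb *skb, int fail)
{
	return toy_cache_unresolved(skb, fail);
}

/* Models rt_fill_info() as reported: keeps using skb after an error. */
int toy_fill_info(struct toy_skb *skb, int fail)
{
	int err = toy_get_route(skb, fail);

	toy_touch(skb);		/* the caller still believes it owns skb */
	return err;
}
```

Calling toy_fill_info() with fail set leaves touched_after_free at 1, which
is the "diddles with the now-kfreed skb" step in the report.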
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [NET]: Take RTNL when unregistering notifier

2006-03-25 Thread David S. Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Sat, 25 Mar 2006 16:00:51 +1100

 The netdev notifier call chain is currently unregistered without taking
 any locks outside the notifier system.  Because the notifier system itself
 does not synchronise unregistration with respect to the calling of the
 chain, we as its user need to do our own locking.
 
 We are supposed to take the RTNL for all calls to netdev notifiers, so
 taking the RTNL should be sufficient to protect it.
 
 The registration path in dev.c already takes the RTNL so it's OK.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Looks great, applied.

Did you actually encounter some bug due to this, or is it purely
from a code audit?

Thanks.


Re: [SCTP]: Fix up sctp_rcv return value

2006-03-25 Thread David S. Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Sat, 25 Mar 2006 16:42:56 +1100

 I was working on the ipip/xfrm problem and as usual I get side-tracked by
 other problems.

It is the nature of the game :-)

 As part of an attempt to change the IPv4 protocol handler calling
 convention I found that SCTP violated the existing convention.
 
 It's returning non-zero values after freeing the skb.  This is doubly bad
 as 1) the skb gets resubmitted; 2) the return value is interpreted as a
 protocol number.
 
 This patch changes those return values to zero.
 
 IPv6 doesn't suffer from this problem because it uses a positive return
 value as an indication for resubmission.  So the only effect of this patch
 there is to increment the IPSTATS_MIB_INDELIVERS counter which IMHO is
 the right thing to do.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Good catch, applied.

Thanks Herbert.
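
The IPv4 convention Herbert describes can be modeled in userspace roughly
as below. This is a sketch under assumptions: the toy_* names and the
dispatch loop are hypothetical, and the handler's -1 is just an example of
a non-zero return after the buffer has been freed. A negative return is
treated as "resubmit as protocol -ret".

```c
#include <stddef.h>

#define NPROTO     256
#define PROTO_ICMP 1
#define PROTO_SCTP 132

struct toy_skb { int freed; int touched_after_free; };
typedef int (*proto_handler)(struct toy_skb *);

static int toy_icmp_rcv(struct toy_skb *skb)       { skb->freed = 1; return 0; }
static int toy_sctp_rcv_buggy(struct toy_skb *skb) { skb->freed = 1; return -1; }
static int toy_sctp_rcv_fixed(struct toy_skb *skb) { skb->freed = 1; return 0; }

/* Dispatch loop; returns how many handlers were invoked. */
static int toy_deliver(struct toy_skb *skb, proto_handler *table, int protocol)
{
	int calls = 0;

	for (;;) {
		proto_handler h = table[protocol & (NPROTO - 1)];
		int ret;

		if (!h)
			return calls;
		if (skb->freed)
			skb->touched_after_free = 1;	/* resubmitting a freed skb */
		calls++;
		ret = h(skb);
		if (ret >= 0)
			return calls;
		protocol = -ret;	/* return value reinterpreted as a protocol */
	}
}

/* Run one delivery with either the buggy or the fixed SCTP-like handler. */
int toy_run(int use_fixed, struct toy_skb *skb)
{
	proto_handler table[NPROTO] = { NULL };

	table[PROTO_ICMP] = toy_icmp_rcv;
	table[PROTO_SCTP] = use_fixed ? toy_sctp_rcv_fixed : toy_sctp_rcv_buggy;
	skb->freed = skb->touched_after_free = 0;
	return toy_deliver(skb, table, PROTO_SCTP);
}
```

With the buggy handler the freed buffer is handed to a second handler
(protocol 1 in this model); returning zero ends the loop after one call.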


Re: [PATCH] ip6_tunnel: fix a soft lockup when there is no active tunnel for an encapsulated packet

2006-03-25 Thread David S. Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Sat, 25 Mar 2006 08:45:39 +1100

 Hugo Santos [EMAIL PROTECTED] wrote:
  This patch fixes a soft lockup in ip6_tunnel when not using
  xfrm6_tunnel (CONFIG_INET6_TUNNEL). It is triggered when an
  encapsulated packet reaches ip6ip6_rcv() and there is no tunnel
  associated with it. The error path returns a positive value (1) which
  will trigger ip6_input to re-submit the packet for processing. As no
  skb parameters have been changed, ip6ip6_rcv() will continue to be
  called with the exact same context. Also, ip6ip6_rcv() should free the
  skb when discarding it.
  
  Signed-off-by: Hugo Santos [EMAIL PROTECTED]
 
 OK this is a bit ugly but will do for now.  Could you please do it for
 the ICMP packet and IPv4 as well?

I don't want to apply this for now; it's correct but really ugly :)

I'd rather the suggested cleanup occur to solve this, and I think
the fix is not so urgent that we can't wait for the correct version
to get coded up.

Thanks.
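
The lockup described in Hugo's patch can be sketched in userspace as
follows. The names are hypothetical and the loop is capped so the demo
terminates; the only behaviour taken from the thread is that a positive
return asks the input path to process the same packet again.

```c
struct toy_pkt6 { int nexthdr; int freed; };
typedef int (*handler6)(struct toy_pkt6 *);

/* No tunnel found: returns 1 and leaves the packet exactly as it was. */
int toy_tunnel_rcv_buggy(struct toy_pkt6 *p)
{
	(void)p;
	return 1;
}

/* No tunnel found: discards the packet and reports it as consumed. */
int toy_tunnel_rcv_fixed(struct toy_pkt6 *p)
{
	p->freed = 1;
	return 0;
}

/* Input loop: a positive return means "resubmit"; max_rounds caps the demo. */
int toy_input_loop(struct toy_pkt6 *p, handler6 h, int max_rounds)
{
	int rounds = 0;

	while (rounds < max_rounds) {
		rounds++;
		if (h(p) <= 0)
			break;
	}
	return rounds;
}
```

Since nothing about the packet changes between rounds, the buggy handler
keeps being selected until the cap is hit; without a cap that is the soft
lockup.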


Re: [PATCH] set default max TCP buffers from memory pool size

2006-03-25 Thread David S. Miller
From: Rick Jones [EMAIL PROTECTED]
Date: Fri, 24 Mar 2006 09:45:51 -0800

 Stephen Hemminger wrote:
  We should also tag tcp_rmem/tcp_wmem as __read_mostly
 
 That would apply to just about all the tcp sysctl's yes?

Yes, absolutely.


Re: [PATCH] set default max TCP buffers from memory pool size

2006-03-25 Thread David S. Miller
From: John Heffner [EMAIL PROTECTED]
Date: Fri, 24 Mar 2006 11:47:29 -0500

 This patch sets the maximum TCP buffer sizes (available to automatic buffer 
 tuning, not to setsockopt) based on the TCP memory pool size.  The maximum 
 sndbuf and rcvbuf each will be up to 4 MB, but no more than 1/128 of the 
 memory pressure threshold.
 
 Signed-off-by: John Heffner [EMAIL PROTECTED]

Applied.

I made a minor modification, I calculated the max_shared thing using
unsigned long as the type for the max() calculation.  With the
shifts, the second argument perhaps could get shifted so high as to
become zero with large enough parameters on 64-bit and large
PAGE_SIZE.

Maybe it doesn't matter, but I felt safer doing it this way.
:-)

Thanks.
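
A small sketch of the overflow concern, using fixed-width types so the
result does not depend on the platform. The function names, the
page-count parameter and the shift by (page_shift - 7) for "1/128 of the
threshold in bytes" are assumptions made for illustration, not the code
that was applied.

```c
#include <stdint.h>

#define MAX_BUF_BYTES (4u * 1024u * 1024u)	/* the 4 MB cap from the patch text */

/* 1/128 of the threshold in bytes, computed in 32 bits (page_shift >= 7). */
uint32_t share_narrow(uint32_t pressure_pages, unsigned page_shift)
{
	return pressure_pages << (page_shift - 7);
}

/* Same computation, widened before the shift. */
uint64_t share_wide(uint32_t pressure_pages, unsigned page_shift)
{
	return (uint64_t)pressure_pages << (page_shift - 7);
}

/* Apply the 4 MB upper bound. */
uint64_t clamp_max_buf(uint64_t share)
{
	return share < MAX_BUF_BYTES ? share : MAX_BUF_BYTES;
}
```

With 2^23 pages and 64 KB pages (page_shift 16) the narrow version shifts
every set bit out and yields zero, while the wide version yields 2^32 and
is then clamped to 4 MB.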


Re: [PATCH] set default max TCP buffers from memory pool size

2006-03-25 Thread David S. Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Fri, 24 Mar 2006 09:23:27 -0800

 We should also tag tcp_rmem/tcp_wmem as __read_mostly

I've done this, thanks for the suggestion.


Re: [PATCH] [IPV4] PMTU issues due to TOS field manipulation

2006-03-25 Thread David S. Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Fri, 24 Mar 2006 22:51:16 +1100

 On Thu, Feb 16, 2006 at 10:04:14PM +0200, Ilia Sotnikov wrote:
  
  Here it is, against 2.6.16-rc3.
 
 OK, I've brought this patch up-to-date with 2.6.16 and got rid of a few
 more references to tos in ip_rt_redirect.  Please note that the author
 of this patch is Ilia Sotnikov [EMAIL PROTECTED].
 
 From: Ilia Sotnikov [EMAIL PROTECTED]

Thanks a lot for doing this Herbert.

Applied, thanks everyone.


Re: [NET]: Take RTNL when unregistering notifier

2006-03-25 Thread Herbert Xu
On Sat, Mar 25, 2006 at 01:24:55AM -0800, David S. Miller wrote:

 Looks great, applied.
 
 Did you actually encounter some bug due to this, or is it purely
 from a code audit?

It's code inspection arising out of the parameterised crypto stuff
that I'm currently working on.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [RFC][UPDATED PATCH 2.6.16] [Patch 9/9] Generic netlink interface for delay accounting

2006-03-25 Thread Balbir Singh
On Fri, Mar 24, 2006 at 08:19:25PM -0500, jamal wrote:
 On Fri, 2006-24-03 at 20:24 +0530, Balbir Singh wrote:
 
  Hmm... Would it be ok to send one message with the following format
  
  1. TLV=TASKSTATS_TYPE_PID
  2. TLV=TASKSTATS_TYPE_STATS
  3. TLV=TASKSTATS_TYPE_TGID
  4. TLV=TASKSTATS_TYPE_STATS
  
  It would still be one message, except that 3 and 4 would be optional.
  What do you think?
  
 
 No, that wont work since #2 and #4 are basically the same TLV. [Recall
 that T is used to index an array]. Your other alternative is to have
 #4 perhaps called TASKSTATS_TGID_STATS and #2 TASKSTATS_PID_STATS
 although that would smell a little.
 Dont be afraid to do the nest, it will be a little painful initially but
 i am sure once you figure it out you will appreciate it.


Thanks for the advice, I will dive into nesting. I could not find any
in-tree users of nesting, so I have a few questions.

nla_nest_start() accepts two parameters: an skb and an attribute type.
Do I have to create a new attribute type like TASKSTATS_TYPE_AGGR to
contain the nested attributes?

TASKSTATS_TYPE_AGGR
   TASKSTATS_TYPE_PID/TGID
   TASKSTATS_TYPE_STATS

but this will lead to


TASKSTATS_TYPE_AGGR
   TASKSTATS_TYPE_PID
   TASKSTATS_TYPE_STATS
TASKSTATS_TYPE_AGGR
   TASKSTATS_TYPE_TGID
   TASKSTATS_TYPE_STATS

being returned from taskstats_exit_pid().

The other option is to nest

TASKSTATS_TYPE_PID/TGID
   TASKSTATS_TYPE_STATS

but the problem with this approach is, nla_len contains the length of
all attributes including the nested attribute. So it is hard to find
the offset of TASKSTATS_TYPE_STATS in the buffer.

Do I understand NLA nesting at all? Maybe I am missing something obvious.

Thanks,
Balbir


[PATCH] Allow skb headroom to be overridden

2006-03-25 Thread Anton Blanchard

Previously we added NET_IP_ALIGN so an architecture can override the
padding done to align headers. The next step is to allow the skb
headroom to be overridden.

We currently always reserve 16 bytes to grow into, meaning all DMAs
start 16 bytes into a cacheline. On ppc64 we really want DMA writes to
start on a cacheline boundary, so we increase that headroom to one
cacheline.

Signed-off-by: Anton Blanchard [EMAIL PROTECTED]
---

Index: kernel/include/linux/skbuff.h
===
--- kernel.orig/include/linux/skbuff.h  2006-03-22 17:53:33.250531451 -0600
+++ kernel/include/linux/skbuff.h   2006-03-22 18:02:31.25608 -0600
@@ -956,6 +956,25 @@ static inline void skb_reserve(struct sk
 #define NET_IP_ALIGN   2
 #endif
 
+/*
+ * The networking layer reserves some headroom in skb data (via
+ * dev_alloc_skb). This is used to avoid having to reallocate skb data when
+ * the header has to grow. In the default case, if the header has to grow
+ * 16 bytes or less we avoid the reallocation.
+ *
+ * Unfortunately this headroom changes the DMA alignment of the resulting
+ * network packet. As for NET_IP_ALIGN, this unaligned DMA is expensive
+ * on some architectures. An architecture can override this value,
+ * perhaps setting it to a cacheline in size (since that will maintain
+ * cacheline alignment of the DMA). It must be a power of 2.
+ *
+ * Various parts of the networking layer expect at least 16 bytes of
+ * headroom, you should not reduce this.
+ */
+#ifndef NET_SKB_PAD
+#define NET_SKB_PAD	16
+#endif
+
 extern int ___pskb_trim(struct sk_buff *skb, unsigned int len, int realloc);
 
 static inline void __skb_trim(struct sk_buff *skb, unsigned int len)
@@ -1045,9 +1064,9 @@ static inline void __skb_queue_purge(str
 static inline struct sk_buff *__dev_alloc_skb(unsigned int length,
 					      gfp_t gfp_mask)
 {
-	struct sk_buff *skb = alloc_skb(length + 16, gfp_mask);
+	struct sk_buff *skb = alloc_skb(length + NET_SKB_PAD, gfp_mask);
 	if (likely(skb))
-		skb_reserve(skb, 16);
+		skb_reserve(skb, NET_SKB_PAD);
 	return skb;
 }
 #else
@@ -1085,13 +1104,15 @@ static inline struct sk_buff *dev_alloc_
  */
 static inline int skb_cow(struct sk_buff *skb, unsigned int headroom)
 {
-	int delta = (headroom > 16 ? headroom : 16) - skb_headroom(skb);
+	int delta = (headroom > NET_SKB_PAD ? headroom : NET_SKB_PAD) -
+			skb_headroom(skb);
 
 	if (delta < 0)
 		delta = 0;
 
 	if (delta || skb_cloned(skb))
-		return pskb_expand_head(skb, (delta + 15) & ~15, 0, GFP_ATOMIC);
+		return pskb_expand_head(skb, (delta + (NET_SKB_PAD-1)) &
+				~(NET_SKB_PAD-1), 0, GFP_ATOMIC);
 	return 0;
 }
 
Index: kernel/include/asm-powerpc/system.h
===
--- kernel.orig/include/asm-powerpc/system.h	2006-03-22 17:53:33.250531451 -0600
+++ kernel/include/asm-powerpc/system.h 2006-03-22 17:54:06.487187558 -0600
@@ -363,8 +363,11 @@ __cmpxchg(volatile void *ptr, unsigned l
  * powers of 2 writes until it reaches sufficient alignment).
  *
  * Based on this we disable the IP header alignment in network drivers.
+ * We also modify NET_SKB_PAD to be a cacheline in size, thus maintaining
+ * cacheline alignment of buffers.
  */
-#define NET_IP_ALIGN   0
+#define NET_IP_ALIGN   0
+#define NET_SKB_PAD	L1_CACHE_BYTES
 #endif
 
 #define arch_align_stack(x) (x)
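
The "must be a power of 2" requirement in the NET_SKB_PAD comment follows
from the mask-based rounding in skb_cow(). A userspace sketch of that
arithmetic, with hypothetical function names and the pad passed as a
parameter instead of a macro:

```c
/* Round delta up to the next multiple of pad; pad must be a power of two. */
static unsigned int round_up_pad(unsigned int delta, unsigned int pad)
{
	return (delta + (pad - 1)) & ~(pad - 1);
}

/*
 * How many bytes of headroom would be added: at least `pad` bytes of
 * headroom are always wanted, and the shortfall is rounded up to a
 * multiple of `pad`. The skb_cloned() case is left out of this sketch.
 */
unsigned int headroom_to_add(unsigned int wanted, unsigned int have,
			     unsigned int pad)
{
	unsigned int target = wanted > pad ? wanted : pad;

	if (target <= have)
		return 0;
	return round_up_pad(target - have, pad);
}
```

For example, with a pad of 16 a shortfall of 4 bytes grows the head by 16,
and with a pad of 128 a shortfall of 184 bytes grows it by 256.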



Re: [RFC][UPDATED PATCH 2.6.16] [Patch 9/9] Generic netlink interface for delay accounting

2006-03-25 Thread jamal
On Sat, 2006-25-03 at 15:11 +0530, Balbir Singh wrote:

 
 Thanks for the advice, I will dive into nesting. I could not find any 
 in tree users who use nesting, so I have a few questions
 

Hrm - I have to say i am surprised there's nothing; i could have sworn
Thomas had done some conversions already.

 nla_nest_start() accepts two parameters an skb and an attribute type.
 Do I have to create a new attribute type like TASKSTATS_TYPE_AGGR to
 contain the nested attributes 
 
 TASKSTATS_TYPE_AGGR
TASKSTATS_TYPE_PID/TGID
TASKSTATS_TYPE_STATS
 

 but this will lead to
 
 TASKSTATS_TYPE_AGGR
TASKSTATS_TYPE_PID
TASKSTATS_TYPE_STATS
 TASKSTATS_TYPE_AGGR
TASKSTATS_TYPE_TGID
TASKSTATS_TYPE_STATS
 
 being returned from taskstats_exit_pid().
 

no this is wrong by virtue of having TASKSTATS_TYPE_AGGR twice.
Again invoke the rule i cited earlier.
What you could do instead is a second AGGR; and your nesting would be:

TASKSTATS_TYPE_AGGR1 --- nest start with this type
   TASKSTATS_TYPE_PID -- NLA_PUT_U32
   TASKSTATS_TYPE_STATS -- NLA_PUT_TYPE
   -- nest end of TASKSTATS_TYPE_AGGR1
TASKSTATS_TYPE_AGGR2 --- nest start with this type
   TASKSTATS_TYPE_TGID -- NLA_PUT_U32
   TASKSTATS_TYPE_STATS -- NLA_PUT_TYPE
   -- nest end of TASKSTATS_TYPE_AGGR2

 The other option is to nest
 
 TASKSTATS_TYPE_PID/TGID
TASKSTATS_TYPE_STATS
 

The advantage being you dont introduce another T.

 but the problem with this approach is, nla_len contains the length of
 all attributes including the nested attribute. So it is hard to find
 the offset of TASKSTATS_TYPE_STATS in the buffer.
 

So you would distinguish the two with something like:

TASKSTATS_TYPE_PID
   u32 pid
   TASKSTATS_TYPE_STATS
TASKSTATS_TYPE_TGID
   u32 tgid
   TASKSTATS_TYPE_STATS
or
TASKSTATS_TYPE_PID
   u32 pid
TASKSTATS_TYPE_TGID
   u32 tgid

both should be fine. The difference between the two is the length in the
second case will be 4 and in the other case will be larger. 

But come to think of it, this will introduce unneeded semantics; you
have very few items to do, so forget it. Go with scheme #1 but change
the names to TASKSTATS_TYPE_AGGR_PID and TASKSTATS_TYPE_AGGR_TGID.

cheers,
jamal
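
To make scheme #1 concrete, here is a self-contained userspace sketch of
two nested containers, each holding an id and a stats blob. It does not
use the kernel netlink API; the tlv_* helpers, the type numbers and the
buffer size are all hypothetical, and only the length/type header with
4-byte alignment mirrors how netlink attributes are laid out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type numbers; only the shape follows the thread. */
enum { T_AGGR_PID = 1, T_AGGR_TGID, T_PID, T_TGID, T_STATS };

struct tlv_hdr { uint16_t len; uint16_t type; };  /* len = header + payload */
#define TLV_HDRLEN   ((unsigned)sizeof(struct tlv_hdr))
#define TLV_ALIGN(n) (((n) + 3u) & ~3u)

struct msg { unsigned char buf[256]; unsigned used; };

/* Append one attribute; returns the offset of its header, or -1 if full. */
static int tlv_put(struct msg *m, uint16_t type, const void *data, unsigned dlen)
{
	unsigned start = m->used, total = TLV_ALIGN(TLV_HDRLEN + dlen);
	struct tlv_hdr h = { (uint16_t)(TLV_HDRLEN + dlen), type };

	if (start + total > sizeof(m->buf))
		return -1;
	memset(m->buf + start, 0, total);
	memcpy(m->buf + start, &h, sizeof(h));
	if (dlen)
		memcpy(m->buf + start + TLV_HDRLEN, data, dlen);
	m->used = start + total;
	return (int)start;
}

/* Close a nest: its length becomes everything written since its header. */
static void tlv_nest_end(struct msg *m, int start)
{
	struct tlv_hdr h;

	memcpy(&h, m->buf + start, sizeof(h));
	h.len = (uint16_t)(m->used - (unsigned)start);
	memcpy(m->buf + start, &h, sizeof(h));
}

/* First attribute of `type` in [off, end); returns payload offset or -1. */
int tlv_find(const struct msg *m, unsigned off, unsigned end,
	     uint16_t type, unsigned *plen)
{
	while (off + TLV_HDRLEN <= end) {
		struct tlv_hdr h;

		memcpy(&h, m->buf + off, sizeof(h));
		if (h.len < TLV_HDRLEN || off + h.len > end)
			return -1;
		if (h.type == type) {
			*plen = h.len - TLV_HDRLEN;
			return (int)(off + TLV_HDRLEN);
		}
		off += TLV_ALIGN(h.len);
	}
	return -1;
}

/* Build AGGR_PID{PID, STATS} followed by AGGR_TGID{TGID, STATS}. */
int build_reply(struct msg *m, uint32_t pid, uint32_t tgid,
		const void *stats, unsigned stats_len)
{
	int nest;

	m->used = 0;
	if ((nest = tlv_put(m, T_AGGR_PID, NULL, 0)) < 0 ||
	    tlv_put(m, T_PID, &pid, sizeof(pid)) < 0 ||
	    tlv_put(m, T_STATS, stats, stats_len) < 0)
		return -1;
	tlv_nest_end(m, nest);
	if ((nest = tlv_put(m, T_AGGR_TGID, NULL, 0)) < 0 ||
	    tlv_put(m, T_TGID, &tgid, sizeof(tgid)) < 0 ||
	    tlv_put(m, T_STATS, stats, stats_len) < 0)
		return -1;
	tlv_nest_end(m, nest);
	return 0;
}
```

Because each container type appears once at the top level, a receiver can
index by type, then run tlv_find() again over the container's payload to
locate the stats, so no manual offset arithmetic is needed.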




Re: [PATCH] ip6_tunnel: fix a soft lockup when there is no active tunnel for an encapsulated packet

2006-03-25 Thread Hugo Santos
 I'd rather the suggested cleanup occur to solve this, and I think
 the fix is not so urgent that we can't wait for the correct version
 to get coded up.

   I would be glad to code a better version like i specified in an
 earlier mail. I just didn't do it yet because Herbert said he would do
 it.  And Dave, i'm not sure i agree with it not being urgent. Surely
 the number of people that have ip6_tunnel loaded is reduced, but any
 host in the internet is able to hang any such machine by sending an
 easily crafted packet. If Herbert doesn't have the time, i can code the
 patch today as this fix is important for me.

   Hugo




Re: [PATCH] ip6_tunnel: fix a soft lockup when there is no active tunnel for an encapsulated packet

2006-03-25 Thread Hugo Santos
  host in the internet is able to hang any such machine by sending an

   The ipv6-enabled internet of course :-)

   Hugo




Re: [stable] [PATCH 0/2] Host AP driver update

2006-03-25 Thread John W. Linville
On Fri, Mar 24, 2006 at 10:40:38PM -0800, Greg KH wrote:
 On Fri, Mar 24, 2006 at 09:24:53PM -0800, Jouni Malinen wrote:
  Please apply following two patches to Host AP driver in wireless-2.6.
  The second patch (Fix EAPOL frame encryption) is a trivial bug fix for
  a somewhat unfortunate bug and it could be a good candidate for a
  2.6.16.x stable release ([EMAIL PROTECTED] cc'ed).
 
 John, I'll let you decide if these should be added to the stable tree or
 not.

The Fix EAPOL frame encryption patch looks like a good candidate for
stable to me.  I don't think Jouni intended the other one for stable
anyway, so I'd leave it out of there.

Thanks,

John
-- 
John W. Linville
[EMAIL PROTECTED]


[patch] spectrum: select FW_LOADER

2006-03-25 Thread maximilian attems
The spectrum_cs driver uses request_firmware()
and thus needs to select FW_LOADER.

Signed-off-by: maximilian attems [EMAIL PROTECTED]

diff --git a/drivers/net/wireless/Kconfig b/drivers/net/wireless/Kconfig
index 6a1033e..3f02b87 100644
--- a/drivers/net/wireless/Kconfig
+++ b/drivers/net/wireless/Kconfig
@@ -400,6 +400,7 @@ config PCMCIA_HERMES
 config PCMCIA_SPECTRUM
 	tristate "Symbol Spectrum24 Trilogy PCMCIA card support"
 	depends on NET_RADIO && PCMCIA && HERMES
+	select FW_LOADER
 	---help---
 
  This is a driver for 802.11b cards using RAM-loadable Symbol


Re: [RFC][UPDATED PATCH 2.6.16] [Patch 9/9] Generic netlink interface for delay accounting

2006-03-25 Thread Balbir Singh
On Sat, Mar 25, 2006 at 07:52:13AM -0500, jamal wrote:
 On Sat, 2006-25-03 at 15:11 +0530, Balbir Singh wrote:
 
  
  Thanks for the advice, I will dive into nesting. I could not find any 
  in tree users who use nesting, so I have a few questions
  
 
 Hrm - I have to say i am surprised there's nothing; i could have sworn
 Thomas had done some conversions already.

I used cscope to check for global references and callers of nla_nest_.*.
I could not find anything.

 
  nla_nest_start() accepts two parameters an skb and an attribute type.
  Do I have to create a new attribute type like TASKSTATS_TYPE_AGGR to
  contain the nested attributes 
  
  TASKSTATS_TYPE_AGGR
 TASKSTATS_TYPE_PID/TGID
 TASKSTATS_TYPE_STATS
  
 
  but this will lead to
  
  TASKSTATS_TYPE_AGGR
 TASKSTATS_TYPE_PID
 TASKSTATS_TYPE_STATS
  TASKSTATS_TYPE_AGGR
 TASKSTATS_TYPE_TGID
 TASKSTATS_TYPE_STATS
  
  being returned from taskstats_exit_pid().
  
 
 no this is wrong by virtue of having TASKSTATS_TYPE_AGGR twice.
 Again invoke the rule i cited earlier.

Yes, that's why I wanted to point it out to you. Thanks for explaining the
rule.

 What you could do instead is a second AGGR; and your nesting would be:
 
 TASKSTATS_TYPE_AGGR1 --- nest start with this type
    TASKSTATS_TYPE_PID -- NLA_PUT_U32
    TASKSTATS_TYPE_STATS -- NLA_PUT_TYPE
    -- nest end of TASKSTATS_TYPE_AGGR1
 TASKSTATS_TYPE_AGGR2 --- nest start with this type
    TASKSTATS_TYPE_TGID -- NLA_PUT_U32
    TASKSTATS_TYPE_STATS -- NLA_PUT_TYPE
    -- nest end of TASKSTATS_TYPE_AGGR2
 
  The other option is to nest
  
  TASKSTATS_TYPE_PID/TGID
 TASKSTATS_TYPE_STATS
  
 
 The advantage being you dont introduce another T.
 
  but the problem with this approach is, nla_len contains the length of
  all attributes including the nested attribute. So it is hard to find
  the offset of TASKSTATS_TYPE_STATS in the buffer.
  
 
 So you would distinguish the two with something like:
 
 TASKSTATS_TYPE_PID
u32 pid
TASKSTATS_TYPE_STATS
 TASKSTATS_TYPE_TGID
u32 tgid
TASKSTATS_TYPE_STATS
 or
 TASKSTATS_TYPE_PID
u32 pid
 TASKSTATS_TYPE_TGID
u32 tgid
 
 both should be fine. The difference between the two is the length in the
 second case will be 4 and in the other case will be larger. 
 
 But come to think of it, this will introduce unneeded semantics; you
 have very few items to do, so forget it. Go with scheme #1 but change
 the names to TASKSTATS_TYPE_AGGR_PID and TASKSTATS_TYPE_AGGR_TGID.


I prefer #1 as well. The overloaded use of the same type with different lengths
can be confusing.
 
 cheers,
 jamal
 

Here is another attempt (one more iteration) at trying to get it right.
Thank you for your patience and help in getting it right.

Changelog
-

As discussed in our email.

Thanks,
Balbir

Signed-off-by: Shailabh Nagar [EMAIL PROTECTED]
Signed-off-by: Balbir Singh [EMAIL PROTECTED]

---

 include/linux/delayacct.h |   11 +
 include/linux/taskstats.h |  113 +
 init/Kconfig  |   16 ++
 kernel/Makefile   |1 
 kernel/delayacct.c|   44 ++
 kernel/taskstats.c|  291 ++
 6 files changed, 473 insertions(+), 3 deletions(-)

diff -puN include/linux/delayacct.h~delayacct-genetlink include/linux/delayacct.h
--- linux-2.6.16/include/linux/delayacct.h~delayacct-genetlink	2006-03-22 11:56:03.0 +0530
+++ linux-2.6.16-balbir/include/linux/delayacct.h	2006-03-22 11:56:03.0 +0530
@@ -15,6 +15,7 @@
 #define _LINUX_TASKDELAYS_H
 
 #include <linux/sched.h>
+#include <linux/taskstats.h>
 
 #ifdef CONFIG_TASK_DELAY_ACCT
 extern int delayacct_on;   /* Delay accounting turned on/off */
@@ -25,6 +26,7 @@ extern void __delayacct_tsk_exit(struct 
 extern void __delayacct_blkio_start(void);
 extern void __delayacct_blkio_end(void);
 extern unsigned long long __delayacct_blkio_ticks(struct task_struct *);
+extern int __delayacct_add_tsk(struct taskstats *, struct task_struct *);
 
 static inline void delayacct_tsk_init(struct task_struct *tsk)
 {
@@ -72,4 +74,13 @@ static inline unsigned long long delayac
return 0;
 }
 #endif /* CONFIG_TASK_DELAY_ACCT */
+#ifdef CONFIG_TASKSTATS
+static inline int delayacct_add_tsk(struct taskstats *d,
+				    struct task_struct *tsk)
+{
+	if (!tsk->delays)
+		return -EINVAL;
+	return __delayacct_add_tsk(d, tsk);
+}
+#endif
 #endif /* _LINUX_TASKDELAYS_H */
diff -puN /dev/null include/linux/taskstats.h
--- /dev/null   2004-06-24 23:34:38.0 +0530
+++ linux-2.6.16-balbir/include/linux/taskstats.h	2006-03-25 20:56:55.0 +0530
@@ -0,0 +1,113 @@
+/* taskstats.h - exporting per-task statistics
+ *
+ * Copyright (C) Shailabh Nagar, IBM Corp. 2006
+ *   (C) Balbir Singh,   IBM Corp. 2006
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the 

Re: [RFC] bcm43xx: please review for 2.6.17

2006-03-25 Thread John W. Linville
On Sat, Mar 25, 2006 at 03:16:20PM +0100, Michael Buesch wrote:
 Hm, and I think someone already reported that issue, John:
 
 [EMAIL PROTECTED]:~/develop/git/buesch-wireless-2.6$ git pull linville softmac
 error: no such remote ref refs/heads/softmac
 Fetch failure: 
 git://kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git
 
 Is that intentional? New repository layout?

Yes the old softmac branch has been renamed as devel for the
time being.  Eventually it will go away.

The name softmac didn't make sense for the branch after the softmac
code was merged upstream -- at least that was the thinking.

I apologize for the confusion!

John
-- 
John W. Linville
[EMAIL PROTECTED]


Re: [RFC][UPDATED PATCH 2.6.16] [Patch 9/9] Generic netlink interface for delay accounting

2006-03-25 Thread jamal
On Sat, 2006-25-03 at 21:06 +0530, Balbir Singh wrote:
 On Sat, Mar 25, 2006 at 07:52:13AM -0500, jamal wrote:


I didn't pay attention to failure paths etc; i suppose your testing
should catch those. Getting there, a couple more comments:


 +enum {
 +	TASKSTATS_CMD_UNSPEC = 0,	/* Reserved */
 +	TASKSTATS_CMD_GET,		/* user->kernel request */
 +	TASKSTATS_CMD_NEW,		/* kernel->user event */

Should the comment read "kernel->user event/get-response"?


 +
 +static int taskstats_send_stats(struct sk_buff *skb, struct genl_info *info)
 +{


 +
 +	if (info->attrs[TASKSTATS_CMD_ATTR_PID]) {
 +		u32 pid = nla_get_u32(info->attrs[TASKSTATS_CMD_ATTR_PID]);
 +		rc = fill_pid((pid_t)pid, NULL, stats);
 +		if (rc < 0)
 +			goto err;
 +
 +		na = nla_nest_start(rep_skb, TASKSTATS_TYPE_AGGR_PID);
 +		NLA_PUT_U32(rep_skb, TASKSTATS_TYPE_PID, pid);
 +	} else if (info->attrs[TASKSTATS_CMD_ATTR_TGID]) {

in regards to the else-if above:
Could you not have both PID and TGID passed? From my earlier
understanding it seemed legit, no? if the answer is yes, then you will have
to do your sizes + reply TLVs at the end.

Also in regards to the nesting, isn't there a need for nla_nest_cancel in
case of failures to add TLVs?

cheers,
jamal




Re: [RFC][UPDATED PATCH 2.6.16] [Patch 9/9] Generic netlink interface for delay accounting

2006-03-25 Thread Balbir Singh
On 3/25/06, jamal [EMAIL PROTECTED] wrote:
 On Sat, 2006-25-03 at 21:06 +0530, Balbir Singh wrote:
  On Sat, Mar 25, 2006 at 07:52:13AM -0500, jamal wrote:


 I didnt pay attention to failure paths etc; i suppose your testing
 should catch those. Getting there, a couple more comments:


Yes, I have tried several negative test cases.


  +enum {
  +	TASKSTATS_CMD_UNSPEC = 0,	/* Reserved */
  +	TASKSTATS_CMD_GET,		/* user->kernel request */
  +	TASKSTATS_CMD_NEW,		/* kernel->user event */

 Should the comment read "kernel->user event/get-response"?


Yes, good catch. I will update the comment.


  +
  +static int taskstats_send_stats(struct sk_buff *skb, struct genl_info *info)
  +{


  +
  +	if (info->attrs[TASKSTATS_CMD_ATTR_PID]) {
  +		u32 pid = nla_get_u32(info->attrs[TASKSTATS_CMD_ATTR_PID]);
  +		rc = fill_pid((pid_t)pid, NULL, stats);
  +		if (rc < 0)
  +			goto err;
  +
  +		na = nla_nest_start(rep_skb, TASKSTATS_TYPE_AGGR_PID);
  +		NLA_PUT_U32(rep_skb, TASKSTATS_TYPE_PID, pid);
  +	} else if (info->attrs[TASKSTATS_CMD_ATTR_TGID]) {

 in regards to the elseif above:
 Could you not have both PID and TGID passed? From my earlier
 understanding it seemed legit, no? if answer is yes, then you will have
 to do your sizes + reply TLVs at the end.

No, we cannot have both passed. If both a PID and a TGID are passed,
the code returns just the stats for the PID.


 Also in regards to the nesting, isnt there a need for nla_nest_cancel in
 case of failures to add TLVs?


I thought about it, but when I looked at the code of genlmsg_cancel()
and nla_nest_cancel(), it seemed that genlmsg_cancel() should
suffice.

<snippet>
static inline int genlmsg_cancel(struct sk_buff *skb, void *hdr)
{
	return nlmsg_cancel(skb, hdr - GENL_HDRLEN - NLMSG_HDRLEN);
}

static inline int nlmsg_cancel(struct sk_buff *skb, struct nlmsghdr *nlh)
{
	skb_trim(skb, (unsigned char *) nlh - skb->data);

	return -1;
}

static inline int nla_nest_cancel(struct sk_buff *skb, struct nlattr *start)
{
	if (start)
		skb_trim(skb, (unsigned char *) start - skb->data);

	return -1;
}

</snippet>

genlmsg_cancel() seemed more generic, since it handles skb_trim from
the nlmsghdr down to skb->data, whereas nla_nest_cancel() does it
only from the start of the nested attributes to skb->data.

Is my understanding correct?
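
One way to picture the two helpers quoted above, as a userspace sketch
with hypothetical toy_* names. It assumes that trimming to an offset
keeps everything before that offset and drops everything after it, which
is how I read the skb_trim() calls in the snippet.

```c
/* Toy stand-in for an sk_buff: only the amount of data written so far. */
struct toy_skb { unsigned len; };

/* Shrink the buffer to new_len bytes; never grows it. */
static void toy_trim(struct toy_skb *skb, unsigned new_len)
{
	if (new_len < skb->len)
		skb->len = new_len;
}

/* Drop only the nest that starts at nest_off; earlier attributes survive. */
int toy_nest_cancel(struct toy_skb *skb, unsigned nest_off)
{
	toy_trim(skb, nest_off);
	return -1;
}

/* Drop the whole message that starts at msg_off, nested attributes included. */
int toy_msg_cancel(struct toy_skb *skb, unsigned msg_off)
{
	toy_trim(skb, msg_off);
	return -1;
}
```

Under that reading, cancelling the message also removes any half-built
nest inside it, while cancelling the nest alone keeps the message header
and the attributes written before the nest.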


 cheers,
 jamal


Thanks,
Balbir


[2.6 patch] make UNIX a bool

2006-03-25 Thread Adrian Bunk
On Sat, Feb 25, 2006 at 11:46:31PM +0100, Olaf Hering wrote:
  On Sat, Feb 25, Adrian Bunk wrote:
 
  CONFIG_UNIX=m doesn't make much sense.
 
 There is likely more code to support a modular unix.ko, this has to go
 as well.

Sounds reasonable, updated patch below.

cu
Adrian


--  snip  --


CONFIG_UNIX=m doesn't make that much sense and requires us to export 
things we don't want to export to modules.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

 fs/file_table.c|1 -
 include/net/af_unix.h  |2 --
 net/unix/Kconfig   |2 +-
 net/unix/af_unix.c |   18 --
 net/unix/sysctl_net_unix.c |9 +
 5 files changed, 2 insertions(+), 30 deletions(-)

--- linux-2.6.16-mm1-full/net/unix/Kconfig.old	2006-03-25 20:10:47.0 +0100
+++ linux-2.6.16-mm1-full/net/unix/Kconfig	2006-03-25 20:10:54.0 +0100
@@ -3,7 +3,7 @@
 #
 
 config UNIX
-	tristate "Unix domain sockets"
+	bool "Unix domain sockets"
---help---
  If you say Y here, you will include support for Unix domain sockets;
  sockets are the standard Unix mechanism for establishing and
--- linux-2.6.16-mm1-full/include/net/af_unix.h.old	2006-03-25 20:13:33.0 +0100
+++ linux-2.6.16-mm1-full/include/net/af_unix.h	2006-03-25 20:13:40.0 +0100
@@ -85,10 +85,8 @@
 #ifdef CONFIG_SYSCTL
 extern int sysctl_unix_max_dgram_qlen;
 extern void unix_sysctl_register(void);
-extern void unix_sysctl_unregister(void);
 #else
 static inline void unix_sysctl_register(void) {}
-static inline void unix_sysctl_unregister(void) {}
 #endif
 #endif
 #endif
--- linux-2.6.16-mm1-full/net/unix/af_unix.c.old	2006-03-25 20:11:07.0 +0100
+++ linux-2.6.16-mm1-full/net/unix/af_unix.c	2006-03-25 20:14:05.0 +0100
@@ -475,7 +475,6 @@
 
 static const struct proto_ops unix_stream_ops = {
.family =   PF_UNIX,
-   .owner =THIS_MODULE,
.release =  unix_release,
.bind = unix_bind,
.connect =  unix_stream_connect,
@@ -496,7 +495,6 @@
 
 static const struct proto_ops unix_dgram_ops = {
.family =   PF_UNIX,
-   .owner =THIS_MODULE,
.release =  unix_release,
.bind = unix_bind,
.connect =  unix_dgram_connect,
@@ -517,7 +515,6 @@
 
 static const struct proto_ops unix_seqpacket_ops = {
.family =   PF_UNIX,
-   .owner =THIS_MODULE,
.release =  unix_release,
.bind = unix_bind,
.connect =  unix_stream_connect,
@@ -538,7 +535,6 @@
 
 static struct proto unix_proto = {
 	.name = "UNIX",
-   .owner= THIS_MODULE,
.obj_size = sizeof(struct unix_sock),
 };
 
@@ -2012,7 +2008,6 @@
 }
 
 static struct file_operations unix_seq_fops = {
-   .owner  = THIS_MODULE,
.open   = unix_seq_open,
.read   = seq_read,
.llseek = seq_lseek,
@@ -2024,7 +2019,6 @@
 static struct net_proto_family unix_family_ops = {
.family = PF_UNIX,
.create = unix_create,
-   .owner  = THIS_MODULE,
 };
 
 static int __init af_unix_init(void)
@@ -2053,16 +2047,4 @@
return rc;
 }
 
-static void __exit af_unix_exit(void)
-{
-	sock_unregister(PF_UNIX);
-	unix_sysctl_unregister();
-	proc_net_remove("unix");
-	proto_unregister(&unix_proto);
-}
-
 module_init(af_unix_init);
-module_exit(af_unix_exit);
-
-MODULE_LICENSE("GPL");
-MODULE_ALIAS_NETPROTO(PF_UNIX);
--- linux-2.6.16-mm1-full/net/unix/sysctl_net_unix.c.old 2006-03-25 20:14:10.0 +0100
+++ linux-2.6.16-mm1-full/net/unix/sysctl_net_unix.c 2006-03-25 20:15:21.0 +0100
@@ -46,15 +46,8 @@
{ .ctl_name = 0 }
 };
 
-static struct ctl_table_header * unix_sysctl_header;
-
 void unix_sysctl_register(void)
 {
-   unix_sysctl_header = register_sysctl_table(unix_root_table, 0);
-}
-
-void unix_sysctl_unregister(void)
-{
-   unregister_sysctl_table(unix_sysctl_header);
+   register_sysctl_table(unix_root_table, 0);
 }
 
--- linux-2.6.16-mm1-full/fs/file_table.c.old 2006-03-25 20:39:30.0 +0100
+++ linux-2.6.16-mm1-full/fs/file_table.c 2006-03-25 20:39:44.0 +0100
@@ -62,7 +62,6 @@
 {
return files_stat.max_files;
 }
-EXPORT_SYMBOL_GPL(get_max_files);
 
 /*
  * Handle nr_files sysctl



[PATCH] acxsm: Reduce the number of ACX_PACKED instructions

2006-03-25 Thread Carlos Martin
Up to now, we were using ACX_PACKED after every field. I've finally
found out how to use only one at the end of each struct whilst
maintaining the typedef where it is now.

This should also apply to acx with a bit of fuzz, but I consider it to
be in maintenance mode, so this doesn't qualify for it.
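
As a rough illustration of why the two spellings are interchangeable, here is a minimal sketch. It assumes ACX_PACKED expands to GCC's `__attribute__((packed))`, which this mail does not show, and it uses made-up struct names (`rx_old_t`, `rx_new_t`) rather than anything from acx_struct.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: ACX_PACKED is GCC's packed attribute. */
#define ACX_PACKED __attribute__((packed))

/* Old style: the attribute is repeated after every field. */
typedef struct rx_old {
	uint16_t len  ACX_PACKED;
	uint32_t time ACX_PACKED;
} rx_old_t;

/* New style: one attribute after the closing brace, typedef name last. */
typedef struct rx_new {
	uint16_t len;
	uint32_t time;
} ACX_PACKED rx_new_t;

/* Aborts if the two spellings ever disagree about layout. */
static void check_packed_layouts(void)
{
	assert(sizeof(rx_old_t) == sizeof(rx_new_t));
	assert(offsetof(rx_old_t, len) == offsetof(rx_new_t, len));
	assert(offsetof(rx_old_t, time) == offsetof(rx_new_t, time));
}
```

Calling check_packed_layouts() from a test harness is enough to confirm that no padding is inserted between `len` and `time` in either form.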

Signed-off-by: Carlos Martin [EMAIL PROTECTED]

---

 acx_struct.h |  874 +-
 common.c |   26 +-
 ioctl.c  |   10 -
 usb.c|8 -
 4 files changed, 457 insertions(+), 461 deletions(-)

5fb2fdfd8c028a40921bbf9ef7ec4c53c03fcab4
diff --git a/acx_struct.h b/acx_struct.h
index 227b6e3..8c92302 100644
--- a/acx_struct.h
+++ b/acx_struct.h
@@ -522,9 +522,9 @@ DEF_IE(111_IE_DOT11_INVAL_1013, 0x1013
  * --vda
  */
 typedef struct phy_hdr {
-   u8  unknown[4] ACX_PACKED;
-   u8  acx111_unknown[4] ACX_PACKED;
-} phy_hdr_t;
+   u8  unknown[4];
+   u8  acx111_unknown[4];
+} ACX_PACKED phy_hdr_t;
 
 /* seems to be a bit similar to hfa384x_rx_frame.
  * These fields are still not quite obvious, though.
@@ -584,22 +584,22 @@ time: 4 bytes:
 */
 
 typedef struct rxbuffer {
-   u16 mac_cnt_rcvd ACX_PACKED; /* only 12 bits are len! (0xfff) */
-   u8  mac_cnt_mblks ACX_PACKED;
-   u8  mac_status ACX_PACKED;
-   u8  phy_stat_baseband ACX_PACKED;   /* bit 0x80: used LNA (Low-Noise Amplifier) */
-   u8  phy_plcp_signal ACX_PACKED;
-   u8  phy_level ACX_PACKED;   /* PHY stat */
-   u8  phy_snr ACX_PACKED; /* PHY stat */
-   u32 time ACX_PACKED; /* timestamp upon MAC rcv first byte */
+   u16 mac_cnt_rcvd;   /* only 12 bits are len! (0xfff) */
+   u8  mac_cnt_mblks;
+   u8  mac_status;
+   u8  phy_stat_baseband;  /* bit 0x80: used LNA (Low-Noise Amplifier) */
+   u8  phy_plcp_signal;
+   u8  phy_level;  /* PHY stat */
+   u8  phy_snr; /* PHY stat */
+   u32 time;   /* timestamp upon MAC rcv first byte */
 /* 4-byte (acx100) or 8-byte (acx111) phy header will be here
 ** if RX_CFG1_INCLUDE_PHY_HDR is in effect:
 ** phy_hdr_t phy   */
-   wlan_hdr_a3_t hdr_a3 ACX_PACKED;
+   wlan_hdr_a3_t hdr_a3;
/* maximally sized data part of wlan packet */
-   u8  data_a3[WLAN_A4FR_MAXLEN_WEP_FCS - WLAN_HDR_A3_LEN] ACX_PACKED;
+   u8  data_a3[WLAN_A4FR_MAXLEN_WEP_FCS - WLAN_HDR_A3_LEN];
/* can add hdr/data_a4 if needed */
-} rxbuffer_t;
+} ACX_PACKED rxbuffer_t;
 
 
 /*--- Firmware statistics */
@@ -609,93 +609,93 @@ typedef struct rxbuffer {
 #define FW_STATS_FUTURE_EXTENSION  100
 
 typedef struct fw_stats_tx {
-   u32 tx_desc_of ACX_PACKED;
-} fw_stats_tx_t;
+   u32 tx_desc_of;
+} ACX_PACKED fw_stats_tx_t;
 
 typedef struct fw_stats_rx {
-   u32 rx_oom ACX_PACKED;
-   u32 rx_hdr_of ACX_PACKED;
-   u32 rx_hw_stuck ACX_PACKED; /* old: u32 rx_hdr_use_next */
-   u32 rx_dropped_frame ACX_PACKED;
-   u32 rx_frame_ptr_err ACX_PACKED;
-   u32 rx_xfr_hint_trig ACX_PACKED;
-   u32 rx_aci_events ACX_PACKED; /* later versions only */
-   u32 rx_aci_resets ACX_PACKED; /* later versions only */
-} fw_stats_rx_t;
+   u32 rx_oom;
+   u32 rx_hdr_of;
+   u32 rx_hw_stuck; /* old: u32 rx_hdr_use_next */
+   u32 rx_dropped_frame;
+   u32 rx_frame_ptr_err;
+   u32 rx_xfr_hint_trig;
+   u32 rx_aci_events; /* later versions only */
+   u32 rx_aci_resets; /* later versions only */
+} ACX_PACKED fw_stats_rx_t;
 
 typedef struct fw_stats_dma {
-   u32 rx_dma_req ACX_PACKED;
-   u32 rx_dma_err ACX_PACKED;
-   u32 tx_dma_req ACX_PACKED;
-   u32 tx_dma_err ACX_PACKED;
-} fw_stats_dma_t;
+   u32 rx_dma_req;
+   u32 rx_dma_err;
+   u32 tx_dma_req;
+   u32 tx_dma_err;
+} ACX_PACKED fw_stats_dma_t;
 
 typedef struct fw_stats_irq {
-   u32 cmd_cplt ACX_PACKED;
-   u32 fiq ACX_PACKED;
-   u32 rx_hdrs ACX_PACKED;
-   u32 rx_cmplt ACX_PACKED;
-   u32 rx_mem_of ACX_PACKED;
-   u32 rx_rdys ACX_PACKED;
-   u32 irqs ACX_PACKED;
-   u32 tx_procs ACX_PACKED;
-   u32 decrypt_done ACX_PACKED;
-   u32 dma_0_done ACX_PACKED;
-   u32 dma_1_done ACX_PACKED;
-   u32 tx_exch_complet ACX_PACKED;
-   u32 commands ACX_PACKED;
-   u32 rx_procs ACX_PACKED;
-   u32 hw_pm_mode_changes ACX_PACKED;
-   u32 host_acks ACX_PACKED;
-   u32 pci_pm ACX_PACKED;
-   u32 acm_wakeups ACX_PACKED;
-} fw_stats_irq_t;
+   u32 cmd_cplt;
+   u32 fiq;
+   u32 rx_hdrs;
+   u32 rx_cmplt;
+   

Re: sky2: hangs on 2.6.16

2006-03-25 Thread MichaelM
On Fri, Mar 24, 2006 at 02:32:41PM -0800, Stephen Hemminger wrote:
 On Fri, 24 Mar 2006 22:13:54 +
 Michael Menegakis [EMAIL PROTECTED] wrote:
 
 
  were they any helpful?
 
 The first thing to check is whether packets are showing up (and being
 transmitted), by doing
   ethtool -S eth0
 Since in this driver stats come out of the PHY, it is possible for the PHY
 to be receiving packets but have the bus interface wedged.
 
 It also will tell you if you have pause frames going back and forth.
 You might have a bad switch that doesn't do flow-control properly.
 
 Next you can turn on debug with:
   ethtool -s eth0 msglvl 0xfff
 
 and see if packets are being received and transmitted. 
 
 In your case, it looks like the driver is receiving and transmitting fine;
 so it probably is in the upper layers. So look into higher level statistics
 like: netstat (or ip and ss).
 
 
 Other possibilities:
 
 * turn off TSO
   ethtool -K eth0 tso off
 * turn off Tx checksum
   ethtool -K eth0 tx off
 * turn off Rx checksum
   ethtool -K eth0 rx off
 
 If you get things really wedged, you can dig into the driver and look at
 all the registers, but it really requires lots of time to decode...
 
   ethtool -d eth0 raw on > /tmp/eth0.dump
   hexdump /tmp/eth0.dump

I hope this helps at all, since my knowledge of all this is very limited
if not nonexistent.

The iface seems to receive but not transmit after the hang. The debug
options above produced output in the logs similar to debug=16. The
turn-off options didn't seem to alter the logging or affect networking,
though I may have missed something.

One indicator I managed to get was that connections open during the
hang seem to be stuck in netstat as FIN_WAIT1. That happens if I
Ctrl-C the application which tests downloading over multiple HTTP
connections, or if I wait for those connections to time out.

In ethereal, before/during/after the hang, I got many "TCP segment of a
reassembled PDU" sent from here, "TCP Dup ACK" sent from the other end,
"TCP retransmission - TCP segment of a reassembled PDU" from here, "TCP
Keep Alive" from here, "TCP Out-of-order - TCP segment of a reassembled
PDU" ..


Re: [RFC] bcm43xx: please review for 2.6.17

2006-03-25 Thread Benjamin Herrenschmidt
On Thu, 2006-03-23 at 22:01 -0800, David S. Miller wrote:

 Otherwise looks fine.
 
 Please find a non-x86_64 64-bit system to at least cross compile test
 into, preferably big-endian to really get all the nasties out :-)

That reminds me that I should really add something to the ppc64 iommu
code to be able to limit allocations of virtual dma space to the dma
mask of the device... should be easy on the apple machines (though
impossible on IBM pSeries that partition the DMA space between bridges)
and thus would probably fix using that driver on G5s...

/me curses Broadcom for that...

Ben.




Re: Output packet processing (was stretch ACKs, etc.)

2006-03-25 Thread Mark Butler

David S. Miller wrote:


From: Mark Butler [EMAIL PROTECTED]
Date: Fri, 24 Mar 2006 22:37:26 -0700

 

On a more general note, I find the idea that a current dst entry doesn't 
actually reflect the interface (even a logical interface) and nexthop 
that will be used to deliver a packet a little disturbing.  It would 
seem to me that any filter that is going to re-route a packet to a 
different address or a different interface should be a logical device 
(with its own IP address) or logical interface, respectively.   
Otherwise what is going on is completely invisible to the transport 
protocol, as well as users of tools like traceroute.
   



Welcome to firewalls and NAT.
 

A true firewall should never need to do anything but drop packets and 
reset connections.  Changes to the way packets are routed should be done 
at the routing layer, using the flow information from the transport 
layer.  Simple firewall rules should be implemented the same way.  By 
the time a dst entry is returned, the need for NF output chain 
processing should be minimal to non-existent. 

Serialized processing of every IP packet, whether it needs it or not, is 
ridiculously inefficient.  No high capacity router would operate that 
way.  A route decision for a flow would be made once, and data in most 
flows would use a fast (generally hardware) path without further 
consideration.


Of course NAT processing only needs to be done on the NF forward chain, 
not the input or output chains. No need to affect local transport 
protocols at all.  The need for any kind of NF processing should be 
reflected in the routing tables, and echoed in the dst entry (or dst 
entry stack). 

There has been discussion of Van Jacobson style optimization of the 
input chain. Well the quickest way to optimize the output chain would be 
to return filtered routing information to the transport layer so that a 
transport protocol could run its own output processing.  For example, 
why should IPSEC encryption be delayed to the moment of transmission?  
Why should a re-transmitted packet be re-encrypted?  Performance would 
be improved significantly if a transport  protocol could arrange for 
IPSEC transformations to be done in advance, so that when a congestion 
window opening ACK arrived, data could be transmitted without further 
delay.  Same deal for retransmissions.  IPSEC encryption would then 
generally occur in the process context of the sender, rather than 
softirq context at the last possible moment.


Same thing for Neighbor discovery delays and IP fragmentation.  Instead 
of holding a packet somewhere in the IP layer waiting for an ARP reply, 
the transport driver should just get an appropriate notification.  Then 
it could (for example) bundle additional data into the same packet in 
the meantime.


Transports could easily hold IP fragments for further processing as 
well. Some of them (notably DCCP) can profitably make use of IP 
datagrams with missing segments. Other transports could use the 
information to make better determinations about congestion and packet 
loss.  In any case, IP segmentation and reassembly at the transport layer 
would be more efficient and would be a straightforward extension of 
what is already present for anything more sophisticated than UDP.



You don't know anything until the packet is examined by the filter,
because it's impossible to know what rule would be matched until the
packet is actually built, since the rule matching is on packet
contents (such as the source and destination IP addresses, and source
and destination ports, but more obscure matching is also possible, like
matching by TOS or other IP header flags).
 

The flowi structure already contains all that information for routing 
purposes.  No reason why it could not be used to do early netfilter 
reduction as well. Right?


- Mark B.




Re: [2.6.16-gitX] heavy performance regression in ipw2200 wireless driver

2006-03-25 Thread Alessandro Suardi
On 3/24/06, Zhu Yi [EMAIL PROTECTED] wrote:
 On Thu, 2006-03-23 at 15:02 +0100, Alessandro Suardi wrote:
  That scp test shows 50%ish - but that was a quickie. The VNC
   client even reported a 719Kbps throughput down from the more
   usual 11500Kbps it starts off with. The first scp I tried when the
   sluggishness was intolerable was going at 200KB/s - which
   shows the problem can easily get in the neighborhood of an
   order of magnitude.

 What kind of wireless encryption do you use? We turned off hardware
 encryption by default recently as a workaround for a firmware restart
 bug. You might want to load module with modprobe ipw2200 hwcrypto=1
 and retest.

The issue seems to have vanished in more recent kernel snapshots
 (namely, 2.6.16-git3 and -git5 exhibited the problem; -git8 and -git9
 did not).

I will holler if the problem pops up again... thanks,

--alessandro

 Dreamer ? Each one of us is a dreamer. We just push it down deep because
   we are repeatedly told that we are not allowed to dream in real life
 (Reinhold Ziegler)


Re: Output packet processing (was stretch ACKs, etc.)

2006-03-25 Thread Andi Kleen
On Saturday 25 March 2006 23:32, Mark Butler wrote:

 A true firewall should never need to do anything but drop packets and 
 reset connections.  Changes to the way packets are routed should be done 
 at the routing layer, using the flow information from the transport 
 layer. 

The real world doesn't work this way.

 The flowi structure already contains all that information for routing 
 purposes.  No reason why it could not be used to do early netfilter 
 reduction as well. Right?

netfilter is unfortunately too powerful for that. It can make many complex
dynamic decisions per packet that are impossible to cache or predict.

In theory you could try to build such a fast path for some simple 
filtering that implements a subset of full netfilter, but nobody has 
attempted to do so so far.
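
To make the trade-off concrete, here is a minimal userspace sketch of such a fast path: a direct-mapped cache that remembers one verdict per flow key. Everything in it (`flow_key`, `verdict_cache`, `slow_path()`, the telnet rule) is hypothetical and is not netfilter or flowi code. It is only correct when the rule set depends on nothing but the key fields, which is exactly the restriction that full netfilter does not satisfy.

```c
#include <assert.h>
#include <stdint.h>

enum verdict { VERDICT_NONE = 0, VERDICT_ACCEPT, VERDICT_DROP };

/* Hypothetical flow key: the fields a simple static rule could match on. */
struct flow_key {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t  proto;
};

#define CACHE_SLOTS 256

struct cache_entry {
	struct flow_key key;
	enum verdict verdict;           /* VERDICT_NONE marks an empty slot */
};

struct verdict_cache {
	struct cache_entry slot[CACHE_SLOTS];
	unsigned long slow_path_calls;  /* how often the full rule walk ran */
};

static void cache_init(struct verdict_cache *c)
{
	unsigned i;

	for (i = 0; i < CACHE_SLOTS; i++)
		c->slot[i].verdict = VERDICT_NONE;
	c->slow_path_calls = 0;
}

static int key_equal(const struct flow_key *a, const struct flow_key *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport &&
	       a->proto == b->proto;
}

static unsigned hash_key(const struct flow_key *k)
{
	uint32_t h = k->saddr ^ (k->daddr * 2654435761u) ^
		     (((uint32_t)k->sport << 16) | k->dport) ^ k->proto;

	h ^= h >> 16;
	return h % CACHE_SLOTS;
}

/* Stand-in for the full rule walk: drop TCP (proto 6) to port 23. */
static enum verdict slow_path(const struct flow_key *k)
{
	return (k->proto == 6 && k->dport == 23) ? VERDICT_DROP : VERDICT_ACCEPT;
}

/* Fast path: reuse the cached verdict, else run the rules once and store. */
static enum verdict classify(struct verdict_cache *c, const struct flow_key *k)
{
	struct cache_entry *e;

	assert(c && k);
	e = &c->slot[hash_key(k)];
	if (e->verdict != VERDICT_NONE && key_equal(&e->key, k))
		return e->verdict;
	c->slow_path_calls++;
	e->key = *k;
	e->verdict = slow_path(k);
	return e->verdict;
}
```

Any rule that looks at packet payload, connection state, rate limits or time would make the cached verdict stale, so a cache like this could only ever cover a subset of rules.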

-Andi


[KJ][Patch] fix array overflows in de4x5.c

2006-03-25 Thread Darren Jenkins
G'day list

Coverity found 3 'OVERRUN_STATIC' in de4x5.c, @ lines 4814, 5115 and
5125.

Looking at the code, these look like very minor problems, but as they are
easy to fix I thought I would do a patch.

The patch below just adds an explicit check for the array index in
type3_infoblock() and corrects a loop exit check, to eliminate an 
(array+1) error in mii_get_phy().

Note: A better solution in type3_infoblock() may be to change 
- #define MOTO_SROM_BUG ((lp->active == 8) && (((le32_to_cpu(get_unaligned(((s32 *)dev->dev_addr))))&0x00ffffff)==0x3e0008))
+ #define MOTO_SROM_BUG ((lp->active >= DE4X5_MAX_PHY) || (((le32_to_cpu(get_unaligned(((s32 *)dev->dev_addr))))&0x00ffffff)==0x3e0008))

as this seems to make sense, but as I am not sure that this bug happens
when (lp->active > 8) I am reluctant to change this.



Signed-off-by: Darren Jenkins [EMAIL PROTECTED]

--- linux-2.6.16-git8/drivers/net/tulip/de4x5.c.orig 2006-03-26 14:36:17.0 +1100
+++ linux-2.6.16-git8/drivers/net/tulip/de4x5.c 2006-03-26 17:06:06.0 +1100
@@ -4810,7 +4810,8 @@ type3_infoblock(struct net_device *dev, 
 if (lp->state == INITIALISED) {
 lp->ibn = 3;
 lp->active = *p++;
-   if (MOTO_SROM_BUG) lp->active = 0;
+   if (MOTO_SROM_BUG || lp->active >= DE4X5_MAX_PHY)
+   lp->active = 0;
lp->phy[lp->active].gep = (*p ? p : NULL); p += (2 * (*p) + 1);
lp->phy[lp->active].rst = (*p ? p : NULL); p += (2 * (*p) + 1);
lp->phy[lp->active].mc  = TWIDDLE(p); p += 2;
@@ -5111,7 +5112,7 @@ mii_get_phy(struct net_device *dev)
break;
}
if ((j == limit) && (i < DE4X5_MAX_MII)) {
-   for (k=0; lp->phy[k].id && (k < DE4X5_MAX_PHY); k++);
+   for (k=0; lp->phy[k].id && (k < DE4X5_MAX_PHY - 1); k++);
lp->phy[k].addr = i;
lp->phy[k].id = id;
lp->phy[k].spd.reg = GENERIC_REG;  /* ANLPA register */
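
The (array+1) case in mii_get_phy() can be reproduced outside the driver. The sketch below is a userspace stand-in, not de4x5 code: `MAX_PHY` and `struct phy_slot` are placeholders for DE4X5_MAX_PHY and the driver's phy table, and the first helper tests the bound before the slot so the example itself never reads past the array.

```c
#include <assert.h>

#define MAX_PHY 8               /* stands in for DE4X5_MAX_PHY */

struct phy_slot {
	int id;                 /* 0 means the slot is free */
	int addr;
};

/*
 * Index the original search ends on: the first free slot, or MAX_PHY
 * (one past the end) when every slot is taken.
 */
static int slot_before_patch(const struct phy_slot *phy)
{
	int k;

	for (k = 0; (k < MAX_PHY) && phy[k].id; k++)
		;
	return k;
}

/*
 * Same search with the patched bound, in the patch's own test order:
 * k stops at MAX_PHY - 1 at the latest, so phy[k] is always valid.
 */
static int slot_after_patch(const struct phy_slot *phy)
{
	int k;

	for (k = 0; phy[k].id && (k < MAX_PHY - 1); k++)
		;
	return k;
}

/* Store a PHY at index k, refusing any index outside the table. */
static void store_phy(struct phy_slot *phy, int k, int id, int addr)
{
	assert(k >= 0 && k < MAX_PHY);
	phy[k].id = id;
	phy[k].addr = addr;
}
```

With a full table, the pre-patch index is MAX_PHY and store_phy() would abort; the patched index is MAX_PHY - 1, which means the last entry is overwritten rather than memory past the array.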

