Re: [Bugme-new] [Bug 9773] New: pptp/ppp connection die at high speed on Athlon X2 6000+

2008-01-17 Thread Andrew Morton
On Thu, 17 Jan 2008 23:33:54 -0800 (PST) [EMAIL PROTECTED] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9773
> 
>Summary: pptp/ppp connection die at high speed on Athlon X2 6000+
>Product: Networking
>Version: 2.5
>  KernelVersion: 2.6.23.12
>   Platform: All
> OS/Version: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: blocking
>   Priority: P1
>  Component: Other
> AssignedTo: [EMAIL PROTECTED]
> ReportedBy: [EMAIL PROTECTED]
> 
> 
> Distribution: debian sid, vanilla kernel
> Hardware Environment: Athlon64 X2 6000+ Nforce5
> Software Environment: pptp 1.7.0 pppd 2.4.4
> 
> Problem Description:
> when I download data at high speed (1,5 MB/s or more) connection (pptp dies)
> drops and reconnect begins (persist option used for pppd) and after reconnect
> it dies again and again. connection made by "pppd call provider"
> 
> before upgrade of hardware (was Athlon64 3500+ NForce4) all works fine at any
> speed.
> 
> Connection will be working if I do some CPU aggressive work such as kernel
> compile at the same time when I download. When CPU load become low it will 
> drop
> again.
> 
> kernel 2.6.24rc8 makes it worst. As I see it just hangs my console with 
> message
> something like "waiting for ppp0 finish" or so... I don't remember exactly
> message.
> 
> 
> Steps to reproduce:
> I think you can use any X2 Athlon, make pptp connection and try to download
> something...

(weird)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] [POWERPC] fsl_soc: add support for gianfar for fixed-link property

2008-01-17 Thread Kumar Gala
On Fri, 7 Dec 2007, Vitaly Bordug wrote:

>
> fixed-link says: register new "Fixed/emulated PHY", i.e. PHY that
> not connected to the real MDIO bus.
>
> Signed-off-by: Vitaly Bordug <[EMAIL PROTECTED]>
> Signed-off-by: Anton Vorontsov <[EMAIL PROTECTED]>
>
> ---
>
>  Documentation/powerpc/booting-without-of.txt |4 +
>  arch/powerpc/sysdev/fsl_soc.c|   79 
> --
>  2 files changed, 66 insertions(+), 17 deletions(-)
>

applied.

- k
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] [POWERPC] MPC8349E-mITX: Vitesse 7385 PHY is not connected to the MDIO bus

2008-01-17 Thread Kumar Gala
On Fri, 7 Dec 2007, Vitaly Bordug wrote:

>
> ...thus use fixed-link to register proper "Fixed PHY"
>
> Signed-off-by: Anton Vorontsov <[EMAIL PROTECTED]>
> Signed-off-by: Vitaly Bordug <[EMAIL PROTECTED]>
>

applied.

- k
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] [NET] phy/fixed.c: rework to not duplicate PHY layer functionality

2008-01-17 Thread Kumar Gala
On Fri, 7 Dec 2007, Vitaly Bordug wrote:

>
> With that patch fixed.c now fully emulates MDIO bus, thus no need
> to duplicate PHY layer functionality. That, in turn, drastically
> simplifies the code, and drops down line count.
>
> As an additional bonus, now there is no need to register MDIO bus
> for each PHY, all emulated PHYs placed on the platform fixed MDIO bus.
> There is also no more need to pre-allocate PHYs via .config option,
> this is all now handled dynamically.
>
>
> Signed-off-by: Anton Vorontsov <[EMAIL PROTECTED]>
> Signed-off-by: Vitaly Bordug <[EMAIL PROTECTED]>
> Acked-by: Jeff Garzik <[EMAIL PROTECTED]>
>
> ---
>
>  drivers/net/phy/Kconfig   |   32 +--
>  drivers/net/phy/fixed.c   |  445 
> +
>  include/linux/phy_fixed.h |   51 ++---
>  3 files changed, 195 insertions(+), 333 deletions(-)
>

applied.

- k
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Timo Teräs
Herbert Xu wrote:
> On Thu, Jan 17, 2008 at 03:31:14PM +0200, Timo Teräs wrote:
>> I guess the idea was that application should know about the SAs it
>> created. Though a SA dump needs to be done if you want to check
>> for existing entries (created by other processes, or if you are
>> recovering from a crash).
> 
> That's what SADB_GET is for.  In any case KMs cannot coexist so this
> is pointless.  After a crash you should simply flush all states and
> policies.

I thought KMs could coexist. Actually, in strongSwan you have two
daemons: charon and pluto. Both can run at the same time. The other
is for IKEv1 and the other one for IKEv2. It uses netlink, though.
Looking the way pfkey works, it looks like being designed to work
with multiple KMs (e.g. acquires are sent to all registered sockets).

>> SPD dumping is still a must if you want to work nicely with kernel.
> 
> No it isn't.  Look at how Openswan does it.  No dumping anywhere at all.

Then you have to have all policy/static association configuration in
the application configuration. ipsec-tools wanted separate that. As
this is more robust if someone decided to run multiple KMs.

My point (as an ipsec-tools developer) is that the patch would fix
a lot of problems until ipsec-tools is netlink enabled. It would
handle perfectly the dumping even as non-atomic. And that ipsec-tools
is the KM you get by default in almost all distributions.

But apparently you are too worried that all the so many other KMs
still using pfkey (is there others? I think most others use netlink
already) in Linux might break because the rely on an unspecified
feature of pfkey.

I'm also tired of arguing since this is going nowhere. I'll run my
patched kernel and try to get ipsec-tools fixed to use netlink...
eventually.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv4: Enable use of 240/4 address space

2008-01-17 Thread YOSHIFUJI Hideaki / 吉藤英明
In article <[EMAIL PROTECTED]> (at Fri, 18 Jan 2008 11:13:19 +0900 (JST)), 
YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> says:

> Assuming IN_BADCLASS() is still there, we should not reuse the name
> of "ipv6_is_badclass" because the their meanings are different.

Again, ipv4_is_badclass()
My hands almost automatically type "6" after "ipv"...

--yoshfuji
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv4: Enable use of 240/4 address space

2008-01-17 Thread YOSHIFUJI Hideaki / 吉藤英明
In article <[EMAIL PROTECTED]> (at Fri, 18 Jan 2008 02:52:08 +0100 (CET)), Jan 
Engelhardt <[EMAIL PROTECTED]> says:

> 
> On Jan 18 2008 10:26, YOSHIFUJI Hideaki / 吉藤英明 wrote:
> >> -#define   IN_EXPERIMENTAL(a)  long int) (a)) & 0xf000) == 
> >> 0xf000)
> >> -#define   IN_BADCLASS(a)  IN_EXPERIMENTAL((a))
> >
> >No, please keep these macros.
> >
> >> @@ -264,7 +261,7 @@ static inline bool ipv4_is_local_multicast(__be32 addr)
> >>  
> >>  static inline bool ipv4_is_badclass(__be32 addr)
> >>  {
> >> -  return (addr & htonl(0xf000)) == htonl(0xf000);
> >> +  return addr == 0x;
> >>  }
> >>  
> >
> >To (un)align the IN_BADCLASS macro and ipv6_is_badclass() definition,
> 
> Unalign? IPv6? "Limited" broadcast?

Sorry, ipv4_is_badclass().
Assuming IN_BADCLASS() is still there, we should not reuse the name
of "ipv6_is_badclass" because the their meanings are different.

> -static inline bool ipv4_is_badclass(__be32 addr)
> +static inline bool ipv4_is_broadcast(__be32 addr)
>  {

I'm just afraid that people might think ipv4_is_broadcast
is for testing subnet broadcast address.

255.255.255.255 is "limited broadcast address"
(vs subnet broadcast address, which can be forwarded by routers).

--yoshfuji
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


ipv6 syn cookie questions

2008-01-17 Thread Glenn Griffin
In researching the linux implementation of syn cookies I stumbled on a few
points that aren't initially clear to me.  I was hoping somehow could elaborate
and shed some light onto what I'm missing.

at net/ipv6/tcp_ipv6.c:1249 within tcp_v6_conn_request()

There is the following comment:

/*
 *  There are no SYN attacks on IPv6, yet...
 */

Also above that in the same file within tcp_v6_hnd_req() it looks like pieces
of an ipv6 syn cookie implementation were left but have been commented out.

I was not aware of ipv6 making any changes to how tcp behaved (ignoring obvious
details such as pseudo-header changes in checksum calculations, etc.), so the
comment above puzzles me.  How does ipv6 change the viability of syn attacks
against tcp?

Also if anyone can provide any background that may exist on an ipv6
implementation of syn cookies that would be appreciated.  I'm considering
working on an implementation, but would like to know if anyone is already
working on one or it's been submitted but deemed unnecessary for some reasons.

--Glenn
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv4: Enable use of 240/4 address space

2008-01-17 Thread Jan Engelhardt

On Jan 18 2008 10:26, YOSHIFUJI Hideaki / 吉藤英明 wrote:
>> -#define IN_EXPERIMENTAL(a)  long int) (a)) & 0xf000) == 
>> 0xf000)
>> -#define IN_BADCLASS(a)  IN_EXPERIMENTAL((a))
>
>No, please keep these macros.
>
>> @@ -264,7 +261,7 @@ static inline bool ipv4_is_local_multicast(__be32 addr)
>>  
>>  static inline bool ipv4_is_badclass(__be32 addr)
>>  {
>> -return (addr & htonl(0xf000)) == htonl(0xf000);
>> +return addr == 0x;
>>  }
>>  
>
>To (un)align the IN_BADCLASS macro and ipv6_is_badclass() definition,

Unalign? IPv6? "Limited" broadcast?

>you should change the name anyway, e.g., ipv6_is_limited_broadcast()
>or some something alike.

===
Author: Jan Engelhardt <[EMAIL PROTECTED]>
Date:   Fri Jan 18 02:51:34 2008 +0100

IPv4: enable use of 240/4 address space

This short patch modifies the IPv4 networking to enable use of the
240.0.0.0/4 (aka "class-E") address space as propsed in the internet
draft draft-fuller-240space-00.txt.

Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>

diff --git a/include/linux/in.h b/include/linux/in.h
index 27d8a5a..4887768 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -262,9 +262,9 @@ static inline bool ipv4_is_local_multicast(__be32 addr)
return (addr & htonl(0xff00)) == htonl(0xe000);
 }
 
-static inline bool ipv4_is_badclass(__be32 addr)
+static inline bool ipv4_is_broadcast(__be32 addr)
 {
-   return (addr & htonl(0xf000)) == htonl(0xf000);
+   return addr == 0x;
 }
 
 static inline bool ipv4_is_zeronet(__be32 addr)
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 8b1509b..f5bd33b 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -262,7 +262,7 @@ static inline int ipv6_isatap_eui64(u8 *eui, __be32 addr)
  ipv4_is_private_172(addr) || ipv4_is_test_192(addr) ||
  ipv4_is_anycast_6to4(addr) || ipv4_is_private_192(addr) ||
  ipv4_is_test_198(addr) || ipv4_is_multicast(addr) ||
- ipv4_is_badclass(addr)) ? 0x00 : 0x02;
+ ipv4_is_broadcast(addr)) ? 0x00 : 0x02;
eui[1] = 0;
eui[2] = 0x5E;
eui[3] = 0xFE;
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index d18fdb1..9d6e7af 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -2266,7 +2266,7 @@ static void mod_cur_headers(struct pktgen_dev *pkt_dev)
 
while (ipv4_is_loopback(s) ||
   ipv4_is_multicast(s) ||
-  ipv4_is_badclass(s) ||
+  ipv4_is_broadcast(s) ||
   ipv4_is_zeronet(s) ||
   ipv4_is_local_multicast(s)) {
t = random32() % (imx - imn) + 
imn;
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 8ddcd3f..367e097 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -176,7 +176,7 @@ static inline unsigned __inet_dev_addr_type(struct net *net,
unsigned ret = RTN_BROADCAST;
struct fib_table *local_table;
 
-   if (ipv4_is_zeronet(addr) || ipv4_is_badclass(addr))
+   if (ipv4_is_zeronet(addr) || ipv4_is_broadcast(addr))
return RTN_BROADCAST;
if (ipv4_is_multicast(addr))
return RTN_MULTICAST;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 1e59c0d..9f0ea73 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1154,7 +1154,7 @@ void ip_rt_redirect(__be32 old_gw, __be32 daddr, __be32 
new_gw,
return;
 
if (new_gw == old_gw || !IN_DEV_RX_REDIRECTS(in_dev)
-   || ipv4_is_multicast(new_gw) || ipv4_is_badclass(new_gw)
+   || ipv4_is_multicast(new_gw) || ipv4_is_broadcast(new_gw)
|| ipv4_is_zeronet(new_gw))
goto reject_redirect;
 
@@ -1634,7 +1634,7 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 
daddr, __be32 saddr,
if (in_dev == NULL)
return -EINVAL;
 
-   if (ipv4_is_multicast(saddr) || ipv4_is_badclass(saddr) ||
+   if (ipv4_is_multicast(saddr) || ipv4_is_broadcast(saddr) ||
ipv4_is_loopback(saddr) || skb->protocol != htons(ETH_P_IP))
goto e_inval;
 
@@ -1891,7 +1891,7 @@ static int ip_route_input_slow(struct sk_buff *skb, 
__be32 daddr, __be32 saddr,
   by fib_lookup.
 */
 
-   if (ipv4_is_multicast(saddr) || ipv4_is_badclass(saddr) ||
+   if (ipv4_is_multicast(saddr) || ipv4_is_broadcast(saddr) ||
ipv4_is_loopback(saddr))
goto martian_source;
 
@@ -1904,7 +1904,7 @@ static int ip_route_input_slow(struct sk_buff *skb, 
__be32 daddr, __be32 saddr,
if (ipv4_is_zeronet(saddr))
goto martian_source;
 
-   if (ipv4_is_badclas

Re: [patch for 2.6.24? 1/1] bonding: locking fix

2008-01-17 Thread Jay Vosburgh
Krzysztof Oledzki <[EMAIL PROTECTED]> wrote:

>> Andrew Morton <[EMAIL PROTECTED]> wrote:
>> [...]
>>> Can we get this bug fixed please?  Today?  It has been known about for more
>>> than two months.
>>
>>  I just reposted the complete fix; it's #1 of the series of 7.
>
>Bad news. :( 2.6.24-rc7 + patch #1 (bonding: fix locking in sysfs
>primary/active selection):
[...]
>=
>[ INFO: possible irq lock inversion dependency detected ]
>2.6.24-rc7 #1
>-
>events/0/9 just changed the state of lock:
> (&mc->mca_lock){-+..}, at: [] mld_ifc_timer_expire+0x130/0x1fb
>but this lock took another, soft-read-irq-unsafe lock in the past:
> (&bond->lock){-.--}

None of the seven patches I posted just a bit ago will fix this
lockdep warning (which is a different thing that the bug Andrew inquired
about); I'm still working on that one.

For that one, I had posted this work in progress patch:

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77d004d..6906dbc 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3937,8 +3937,6 @@ static void bond_set_multicast_list(struct net_device 
*bond_dev)
struct bonding *bond = bond_dev->priv;
struct dev_mc_list *dmi;
 
-   write_lock_bh(&bond->lock);
-
/*
 * Do promisc before checking multicast_mode
 */
@@ -3959,6 +3957,8 @@ static void bond_set_multicast_list(struct net_device 
*bond_dev)
bond_set_allmulti(bond, -1);
}
 
+   read_lock(&bond->lock);
+
bond->flags = bond_dev->flags;
 
/* looking for addresses to add to slaves' mc list */
@@ -3979,7 +3979,7 @@ static void bond_set_multicast_list(struct net_device 
*bond_dev)
bond_mc_list_destroy(bond);
bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
 
-   write_unlock_bh(&bond->lock);
+   read_unlock(&bond->lock);
 }
 
 /*

which makes the warning go away, but Herbert Xu pointed out that
there is a potential problem with bond_enslave accessing the mc_lists
without sufficient locking.  It's not the only offender, either, and the
bond->mc_list references really need to be protected by the bond_lock,
and the whole thing probably ought to use dev_mc_sync/unsync instead of
what it does now.

Since the bond_enslave, et al, business isn't a new problem, and
I've never heard of it being hit, I'm thinking now to just leave the
bond_enslave part for 2.6.25, and fix the lockdep warning for 2.6.24.

But I wanted to make sure dropping the write lock to a read lock
wasn't going to make something worse.

-J

---
-Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv4: Enable use of 240/4 address space

2008-01-17 Thread YOSHIFUJI Hideaki / 吉藤英明
In article <[EMAIL PROTECTED]> (at Fri, 18 Jan 2008 02:13:52 +0100 (CET)), Jan 
Engelhardt <[EMAIL PROTECTED]> says:

> diff --git a/include/linux/in.h b/include/linux/in.h
> index 27d8a5a..b01bf75 100644
> --- a/include/linux/in.h
> +++ b/include/linux/in.h
> @@ -216,9 +216,6 @@ struct sockaddr_in {
>  #define  IN_MULTICAST(a) IN_CLASSD(a)
>  #define IN_MULTICAST_NET 0xF000
>  
> -#define  IN_EXPERIMENTAL(a)  long int) (a)) & 0xf000) == 
> 0xf000)
> -#define  IN_BADCLASS(a)  IN_EXPERIMENTAL((a))
> -
>  /* Address to accept any incoming messages. */
>  #define  INADDR_ANY  ((unsigned long int) 0x)
>  

No, please keep these macros.

> @@ -264,7 +261,7 @@ static inline bool ipv4_is_local_multicast(__be32 addr)
>  
>  static inline bool ipv4_is_badclass(__be32 addr)
>  {
> - return (addr & htonl(0xf000)) == htonl(0xf000);
> + return addr == 0x;
>  }
>  

To (un)align the IN_BADCLASS macro and ipv6_is_badclass() definition,
you should change the name anyway, e.g., ipv6_is_limited_broadcast()
or some something alike.

--yoshfuji
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] IPv4: Enable use of 240/4 address space

2008-01-17 Thread Jan Engelhardt

On Jan 7 2008 17:10, Vince Fuller wrote:
>
>This set of diffs modify the 2.6.20 kernel to enable use of the 240/4
>(aka "class-E") address space as consistent with the Internet Draft
>draft-fuller-240space-00.txt.
>

Below is a patch against davem/net-2.6.25. It might look very spartan, 
but that is actually all that is needed on a sane system. No class E 
macros or so needed.

Only the ipv4_is_badclass() might need renaming if you think it really 
needs a name change. Or maybe a comment. Comments please :)

===
ancestor 7651a1f7ebe567f9088283f6354a5634b5dccb8e
commit 44762168d7cbefc4f8753a79d99a761cbd9875d9
Author: Jan Engelhardt <[EMAIL PROTECTED]>
Date:   Fri Jan 18 02:10:44 2008 +0100

IPv4: enable use of 240/4 address space

This short patch modifies the IPv4 networking to enable use of the
240.0.0.0/4 (aka "class-E") address space as propsed in the internet
draft draft-fuller-240space-00.txt.

Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>

diff --git a/include/linux/in.h b/include/linux/in.h
index 27d8a5a..b01bf75 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -216,9 +216,6 @@ struct sockaddr_in {
 #defineIN_MULTICAST(a) IN_CLASSD(a)
 #define IN_MULTICAST_NET   0xF000
 
-#defineIN_EXPERIMENTAL(a)  long int) (a)) & 0xf000) == 
0xf000)
-#defineIN_BADCLASS(a)  IN_EXPERIMENTAL((a))
-
 /* Address to accept any incoming messages. */
 #defineINADDR_ANY  ((unsigned long int) 0x)
 
@@ -264,7 +261,7 @@ static inline bool ipv4_is_local_multicast(__be32 addr)
 
 static inline bool ipv4_is_badclass(__be32 addr)
 {
-   return (addr & htonl(0xf000)) == htonl(0xf000);
+   return addr == 0x;
 }
 
 static inline bool ipv4_is_zeronet(__be32 addr)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 001/001] ipv4: enable use of 240/4 address space

2008-01-17 Thread Vince Fuller
On Fri, Jan 18, 2008 at 12:04:06AM +0100, Jan Engelhardt wrote:
> 
> On Jan 7 2008 17:10, Vince Fuller wrote:
> >--- net/ipv4/devinet.c.orig  2007-04-12 10:16:23.0 -0700
> >+++ net/ipv4/devinet.c   2008-01-07 16:55:59.0 -0800
> >@@ -594,6 +594,8 @@ static __inline__ int inet_abc_len(__be3
> > rc = 16;
> > else if (IN_CLASSC(haddr))
> > rc = 24;
> >+else if (IN_CLASSE(haddr))
> >+rc = 24;
> > }
> > 
> > return rc;
> 
> Any particular reason why 24?

Using the same default as old-style "class-C" seemed to be the most expedient
thing to do.

In normal practice, the mask should be explicitly configured so this case
should not be frequently encountered.

--Vince
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] bonding: fix lock ordering for rtnl and bonding_rwsem

2008-01-17 Thread Andrew Morton
On Thu, 17 Jan 2008 16:25:02 -0800
Jay Vosburgh <[EMAIL PROTECTED]> wrote:

> Fix the handling of rtnl and the bonding_rwsem to always be acquired
> in a consistent order (rtnl, then bonding_rwsem).
> 
> The existing code sometimes acquired them in this order, and sometimes
> in the opposite order, which opens a window for deadlock between ifenslave
> and sysfs.
> 
> ...
>
>  int bond_create(char *name, struct bond_params *params, struct bonding 
> **newbond)
>  {
>   struct net_device *bond_dev;
> + struct bonding *bond, *nxt;
>   int res;
>  
>   rtnl_lock();
> + down_write(&bonding_rwsem);
> +
> + /* Check to see if the bond already exists. */
> + list_for_each_entry_safe(bond, nxt, &bond_dev_list, bond_list)

this could (should) use list_for_each_entry().

> + if (strnicmp(bond->dev->name, name, IFNAMSIZ) == 0) {
> + printk(KERN_ERR DRV_NAME
> +": cannot add bond %s; it already exists\n",
> +name);
> + res = -EPERM;
> + goto out_rtnl;
> + }
> +
>   bond_dev = alloc_netdev(sizeof(struct bonding), name ? name : "",
>   ether_setup);
>   if (!bond_dev) {
> @@ -4915,10 +4928,12 @@ int bond_create(char *name, struct bond_params 
> *params, struct bonding **newbond
>  
>   netif_carrier_off(bond_dev);
>  
> + up_write(&bonding_rwsem);
>   rtnl_unlock(); /* allows sysfs registration of net device */
>   res = bond_create_sysfs_entry(bond_dev->priv);
>   if (res < 0) {
>   rtnl_lock();
> + down_write(&bonding_rwsem);
>   goto out_bond;
>   }
>  
> @@ -4929,6 +4944,7 @@ out_bond:
>  out_netdev:
>   free_netdev(bond_dev);
>  out_rtnl:
> + up_write(&bonding_rwsem);
>   rtnl_unlock();
>   return res;
>  }
> @@ -4949,6 +4965,9 @@ static int __init bonding_init(void)
>  #ifdef CONFIG_PROC_FS
>   bond_create_proc_dir();
>  #endif
> +
> + init_rwsem(&bonding_rwsem);

It would be better to initialise this at compile time with DECLARE_RWSEM().


But neither of those things need to be done for 2.6.24.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch for 2.6.24? 1/1] bonding: locking fix

2008-01-17 Thread Jay Vosburgh

Andrew Morton <[EMAIL PROTECTED]> wrote:
[...]
>Can we get this bug fixed please?  Today?  It has been known about for more
>than two months.

I just reposted the complete fix; it's #1 of the series of 7.

-J

---
-Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/7] bonding: Don't hold lock when calling rtnl_unlock

2008-01-17 Thread Jay Vosburgh
Change bond_mii_monitor to not hold any locks when calling rtnl_unlock,
as rtnl_unlock can sleep (when acquring another mutex in netdev_run_todo).

Bug reported by Makito SHIOKAWA <[EMAIL PROTECTED]>, who
included a different patch.

Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]>
---
 drivers/net/bonding/bond_main.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2c6da49..49a1982 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2386,7 +2386,9 @@ void bond_mii_monitor(struct work_struct *work)
rtnl_lock();
read_lock(&bond->lock);
__bond_mii_monitor(bond, 1);
-   rtnl_unlock();
+   read_unlock(&bond->lock);
+   rtnl_unlock();  /* might sleep, hold no other locks */
+   read_lock(&bond->lock);
}
 
delay = ((bond->params.miimon * HZ) / 1000) ? : 1;
-- 
1.5.3.4.206.g58ba4-dirty

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/7] bonding: Fix up parameter parsing

2008-01-17 Thread Jay Vosburgh
A recent change to add an additional hash policy modified
bond_parse_parm, but it now does not correctly match parameters passed in
via sysfs.

Rewrote bond_parse_parm to handle (a) parameter matches that
are substrings of one another and (b) user input with whitespace (e.g.,
sysfs input often has a trailing newline).

Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]>
---
 drivers/net/bonding/bond_main.c  |   23 ---
 drivers/net/bonding/bond_sysfs.c |8 
 drivers/net/bonding/bonding.h|4 +++-
 3 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 3ede0a2..379c5d8 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4540,18 +4540,27 @@ static void bond_free_all(void)
 
 /*
  * Convert string input module parms.  Accept either the
- * number of the mode or its string name.
+ * number of the mode or its string name.  A bit complicated because
+ * some mode names are substrings of other names, and calls from sysfs
+ * may have whitespace in the name (trailing newlines, for example).
  */
-int bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl)
+int bond_parse_parm(const char *buf, struct bond_parm_tbl *tbl)
 {
-   int i;
+   int mode = -1, i, rv;
+   char modestr[BOND_MAX_MODENAME_LEN + 1] = { 0, };
+
+   rv = sscanf(buf, "%d", &mode);
+   if (!rv) {
+   rv = sscanf(buf, "%20s", modestr);
+   if (!rv)
+   return -1;
+   }
 
for (i = 0; tbl[i].modename; i++) {
-   if ((isdigit(*mode_arg) &&
-tbl[i].mode == simple_strtol(mode_arg, NULL, 0)) ||
-   (strcmp(mode_arg, tbl[i].modename) == 0)) {
+   if (mode == tbl[i].mode)
+   return tbl[i].mode;
+   if (strcmp(modestr, tbl[i].modename) == 0)
return tbl[i].mode;
-   }
}
 
return -1;
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 28a2d80..bff4f2b 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -423,7 +423,7 @@ static ssize_t bonding_store_mode(struct device *d,
goto out;
}
 
-   new_value = bond_parse_parm((char *)buf, bond_mode_tbl);
+   new_value = bond_parse_parm(buf, bond_mode_tbl);
if (new_value < 0)  {
printk(KERN_ERR DRV_NAME
   ": %s: Ignoring invalid mode value %.*s.\n",
@@ -478,7 +478,7 @@ static ssize_t bonding_store_xmit_hash(struct device *d,
goto out;
}
 
-   new_value = bond_parse_parm((char *)buf, xmit_hashtype_tbl);
+   new_value = bond_parse_parm(buf, xmit_hashtype_tbl);
if (new_value < 0)  {
printk(KERN_ERR DRV_NAME
   ": %s: Ignoring invalid xmit hash policy value %.*s.\n",
@@ -518,7 +518,7 @@ static ssize_t bonding_store_arp_validate(struct device *d,
int new_value;
struct bonding *bond = to_bond(d);
 
-   new_value = bond_parse_parm((char *)buf, arp_validate_tbl);
+   new_value = bond_parse_parm(buf, arp_validate_tbl);
if (new_value < 0) {
printk(KERN_ERR DRV_NAME
   ": %s: Ignoring invalid arp_validate value %s\n",
@@ -941,7 +941,7 @@ static ssize_t bonding_store_lacp(struct device *d,
goto out;
}
 
-   new_value = bond_parse_parm((char *)buf, bond_lacp_tbl);
+   new_value = bond_parse_parm(buf, bond_lacp_tbl);
 
if ((new_value == 1) || (new_value == 0)) {
bond->params.lacp_fast = new_value;
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index e1e4734..6d83be4 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -141,6 +141,8 @@ struct bond_parm_tbl {
int mode;
 };
 
+#define BOND_MAX_MODENAME_LEN 20
+
 struct vlan_entry {
struct list_head vlan_list;
__be32 vlan_ip;
@@ -314,7 +316,7 @@ void bond_mii_monitor(struct work_struct *);
 void bond_loadbalance_arp_mon(struct work_struct *);
 void bond_activebackup_arp_mon(struct work_struct *);
 void bond_set_mode_ops(struct bonding *bond, int mode);
-int bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl);
+int bond_parse_parm(const char *mode_arg, struct bond_parm_tbl *tbl);
 void bond_select_active_slave(struct bonding *bond);
 void bond_change_active_slave(struct bonding *bond, struct slave *new_active);
 void bond_register_arp(struct bonding *);
-- 
1.5.3.4.206.g58ba4-dirty

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/7] bonding: fix lock ordering for rtnl and bonding_rwsem

2008-01-17 Thread Jay Vosburgh
Fix the handling of rtnl and the bonding_rwsem to always be acquired
in a consistent order (rtnl, then bonding_rwsem).

The existing code sometimes acquired them in this order, and sometimes
in the opposite order, which opens a window for deadlock between ifenslave
and sysfs.

Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]>
---
 drivers/net/bonding/bond_main.c  |   19 
 drivers/net/bonding/bond_sysfs.c |   43 ++---
 2 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 379c5d8..2c6da49 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4874,9 +4874,22 @@ static struct lock_class_key 
bonding_netdev_xmit_lock_key;
 int bond_create(char *name, struct bond_params *params, struct bonding 
**newbond)
 {
struct net_device *bond_dev;
+   struct bonding *bond, *nxt;
int res;
 
rtnl_lock();
+   down_write(&bonding_rwsem);
+
+   /* Check to see if the bond already exists. */
+   list_for_each_entry_safe(bond, nxt, &bond_dev_list, bond_list)
+   if (strnicmp(bond->dev->name, name, IFNAMSIZ) == 0) {
+   printk(KERN_ERR DRV_NAME
+  ": cannot add bond %s; it already exists\n",
+  name);
+   res = -EPERM;
+   goto out_rtnl;
+   }
+
bond_dev = alloc_netdev(sizeof(struct bonding), name ? name : "",
ether_setup);
if (!bond_dev) {
@@ -4915,10 +4928,12 @@ int bond_create(char *name, struct bond_params *params, 
struct bonding **newbond
 
netif_carrier_off(bond_dev);
 
+   up_write(&bonding_rwsem);
rtnl_unlock(); /* allows sysfs registration of net device */
res = bond_create_sysfs_entry(bond_dev->priv);
if (res < 0) {
rtnl_lock();
+   down_write(&bonding_rwsem);
goto out_bond;
}
 
@@ -4929,6 +4944,7 @@ out_bond:
 out_netdev:
free_netdev(bond_dev);
 out_rtnl:
+   up_write(&bonding_rwsem);
rtnl_unlock();
return res;
 }
@@ -4949,6 +4965,9 @@ static int __init bonding_init(void)
 #ifdef CONFIG_PROC_FS
bond_create_proc_dir();
 #endif
+
+   init_rwsem(&bonding_rwsem);
+
for (i = 0; i < max_bonds; i++) {
res = bond_create(NULL, &bonding_defaults, NULL);
if (res)
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index bff4f2b..90a1f31 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -109,11 +109,10 @@ static ssize_t bonding_store_bonds(struct class *cls, 
const char *buffer, size_t
 {
char command[IFNAMSIZ + 1] = {0, };
char *ifname;
-   int res = count;
+   int rv, res = count;
struct bonding *bond;
struct bonding *nxt;
 
-   down_write(&(bonding_rwsem));
sscanf(buffer, "%16s", command); /* IFNAMSIZ*/
ifname = command + 1;
if ((strlen(command) <= 1) ||
@@ -121,39 +120,28 @@ static ssize_t bonding_store_bonds(struct class *cls, 
const char *buffer, size_t
goto err_no_cmd;
 
if (command[0] == '+') {
-
-   /* Check to see if the bond already exists. */
-   list_for_each_entry_safe(bond, nxt, &bond_dev_list, bond_list)
-   if (strnicmp(bond->dev->name, ifname, IFNAMSIZ) == 0) {
-   printk(KERN_ERR DRV_NAME
-   ": cannot add bond %s; it already 
exists\n",
-   ifname);
-   res = -EPERM;
-   goto out;
-   }
-
printk(KERN_INFO DRV_NAME
": %s is being created...\n", ifname);
-   if (bond_create(ifname, &bonding_defaults, &bond)) {
-   printk(KERN_INFO DRV_NAME
-   ": %s interface already exists. Bond creation 
failed.\n",
-   ifname);
-   res = -EPERM;
+   rv = bond_create(ifname, &bonding_defaults, &bond);
+   if (rv) {
+   printk(KERN_INFO DRV_NAME ": Bond creation failed.\n");
+   res = rv;
}
goto out;
}
 
if (command[0] == '-') {
+   rtnl_lock();
+   down_write(&bonding_rwsem);
+
list_for_each_entry_safe(bond, nxt, &bond_dev_list, bond_list)
if (strnicmp(bond->dev->name, ifname, IFNAMSIZ) == 0) {
-   rtnl_lock();
/* check the ref count on the bond's kobject.
 * If it's > expected, then there's a file open,
   

[PATCH 1/7] bonding: fix locking in sysfs primary/active selection

2008-01-17 Thread Jay Vosburgh
Fix the functions that store the primary and active slave
options via sysfs to hold the correct locks in the correct order.

The bond_change_active_slave and bond_select_active_slave
functions both require rtnl, bond->lock for read and curr_slave_lock for
write_bh, and no other locks.  This is so that the lower level
mode-specific functions (notably for balance-alb mode) can release locks
down to just rtnl in order to call, e.g., dev_set_mac_address with the
locks it expects (rtnl only).

Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]>
Signed-off-by: Andy Gospodarek <[EMAIL PROTECTED]>
---
 drivers/net/bonding/bond_sysfs.c |   15 ++-
 1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 11b76b3..28a2d80 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device *d,
struct slave *slave;
struct bonding *bond = to_bond(d);
 
-   write_lock_bh(&bond->lock);
+   rtnl_lock();
+   read_lock(&bond->lock);
+   write_lock_bh(&bond->curr_slave_lock);
+
if (!USES_PRIMARY(bond->params.mode)) {
printk(KERN_INFO DRV_NAME
   ": %s: Unable to set primary slave; %s is in mode %d\n",
@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device *d,
}
}
 out:
-   write_unlock_bh(&bond->lock);
-
+   write_unlock_bh(&bond->curr_slave_lock);
+   read_unlock(&bond->lock);
rtnl_unlock();
 
return count;
@@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct device 
*d,
struct bonding *bond = to_bond(d);
 
rtnl_lock();
-   write_lock_bh(&bond->lock);
+   read_lock(&bond->lock);
+   write_lock_bh(&bond->curr_slave_lock);
 
if (!USES_PRIMARY(bond->params.mode)) {
printk(KERN_INFO DRV_NAME
@@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct device 
*d,
}
}
 out:
-   write_unlock_bh(&bond->lock);
+   write_unlock_bh(&bond->curr_slave_lock);
+   read_unlock(&bond->lock);
rtnl_unlock();
 
return count;
-- 
1.5.3.4.206.g58ba4-dirty

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/7] bonding: release slaves when master removed via sysfs

2008-01-17 Thread Jay Vosburgh
Add a call to bond_release_all in the bonding netdev event
handler for the master.  This releases the slaves for the case of, e.g.,
"echo -bond0 > /sys/class/net/bonding_masters", which otherwise will spin
forever waiting for references to be released.

Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]>
---
 drivers/net/bonding/bond_main.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77d004d..3ede0a2 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3401,9 +3401,7 @@ static int bond_master_netdev_event(unsigned long event, 
struct net_device *bond
case NETDEV_CHANGENAME:
return bond_event_changename(event_bond);
case NETDEV_UNREGISTER:
-   /*
-* TODO: remove a bond from the list?
-*/
+   bond_release_all(event_bond->dev);
break;
default:
break;
-- 
1.5.3.4.206.g58ba4-dirty

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/7] bonding: fix ASSERT_RTNL that produces spurious warnings

2008-01-17 Thread Jay Vosburgh
Move an ASSERT_RTNL down to where we should hold only RTNL;
the existing check produces spurious warnings because we hold additional
locks at _bh, tripping a debug warning in spin_lock_mutex().

Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]>
---
 drivers/net/bonding/bond_alb.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 25b8dbf..9b55a12 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1601,9 +1601,6 @@ void bond_alb_handle_active_change(struct bonding *bond, 
struct slave *new_slave
struct slave *swap_slave;
int i;
 
-   if (new_slave)
-   ASSERT_RTNL();
-
if (bond->curr_active_slave == new_slave) {
return;
}
@@ -1649,6 +1646,8 @@ void bond_alb_handle_active_change(struct bonding *bond, 
struct slave *new_slave
write_unlock_bh(&bond->curr_slave_lock);
read_unlock(&bond->lock);
 
+   ASSERT_RTNL();
+
/* curr_active_slave must be set before calling alb_swap_mac_addr */
if (swap_slave) {
/* swap mac address */
-- 
1.5.3.4.206.g58ba4-dirty

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/7] bonding: fix locking during alb failover and slave removal

2008-01-17 Thread Jay Vosburgh
alb_fasten_mac_swap (actually rlb_teach_disabled_mac_on_primary)
requries RTNL and no other locks.  This could cause dev_set_promiscuity
and/or dev_set_mac_address to be called with improper locking.

Changed callers to hold only RTNL during calls to alb_fasten_mac_swap
or functions calling it.  Updated header comments in affected functions to
reflect proper reality of locking requirements.

Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]>
---
 drivers/net/bonding/bond_alb.c  |   18 --
 drivers/net/bonding/bond_main.c |   14 --
 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 9b55a12..b57bc94 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -979,7 +979,7 @@ static void alb_swap_mac_addr(struct bonding *bond, struct 
slave *slave1, struct
 /*
  * Send learning packets after MAC address swap.
  *
- * Called with RTNL and bond->lock held for read.
+ * Called with RTNL and no other locks
  */
 static void alb_fasten_mac_swap(struct bonding *bond, struct slave *slave1,
struct slave *slave2)
@@ -987,6 +987,8 @@ static void alb_fasten_mac_swap(struct bonding *bond, 
struct slave *slave1,
int slaves_state_differ = (SLAVE_IS_OK(slave1) != SLAVE_IS_OK(slave2));
struct slave *disabled_slave = NULL;
 
+   ASSERT_RTNL();
+
/* fasten the change in the switch */
if (SLAVE_IS_OK(slave1)) {
alb_send_learning_packets(slave1, slave1->dev->dev_addr);
@@ -1031,7 +1033,7 @@ static void alb_fasten_mac_swap(struct bonding *bond, 
struct slave *slave1,
  * a slave that has @slave's permanet address as its current address.
  * We'll make sure that that slave no longer uses @slave's permanent address.
  *
- * Caller must hold bond lock
+ * Caller must hold RTNL and no other locks
  */
 static void alb_change_hw_addr_on_detach(struct bonding *bond, struct slave 
*slave)
 {
@@ -1542,7 +1544,12 @@ int bond_alb_init_slave(struct bonding *bond, struct 
slave *slave)
return 0;
 }
 
-/* Caller must hold bond lock for write */
+/*
+ * Remove slave from tlb and rlb hash tables, and fix up MAC addresses
+ * if necessary.
+ *
+ * Caller must hold RTNL and no other locks
+ */
 void bond_alb_deinit_slave(struct bonding *bond, struct slave *slave)
 {
if (bond->slave_cnt > 1) {
@@ -1658,12 +1665,11 @@ void bond_alb_handle_active_change(struct bonding 
*bond, struct slave *new_slave
   bond->alb_info.rlb_enabled);
}
 
-   read_lock(&bond->lock);
-
if (swap_slave) {
alb_fasten_mac_swap(bond, swap_slave, new_slave);
+   read_lock(&bond->lock);
} else {
-   /* fasten bond mac on new current slave */
+   read_lock(&bond->lock);
alb_send_learning_packets(new_slave, bond->dev->dev_addr);
}
 
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index b0b2603..77d004d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1746,7 +1746,9 @@ int bond_release(struct net_device *bond_dev, struct 
net_device *slave_dev)
 * has been cleared (if our_slave == old_current),
 * but before a new active slave is selected.
 */
+   write_unlock_bh(&bond->lock);
bond_alb_deinit_slave(bond, slave);
+   write_lock_bh(&bond->lock);
}
 
if (oldcurrent == slave) {
@@ -1905,6 +1907,12 @@ static int bond_release_all(struct net_device *bond_dev)
slave_dev = slave->dev;
bond_detach_slave(bond, slave);
 
+   /* now that the slave is detached, unlock and perform
+* all the undo steps that should not be called from
+* within a lock.
+*/
+   write_unlock_bh(&bond->lock);
+
if ((bond->params.mode == BOND_MODE_TLB) ||
(bond->params.mode == BOND_MODE_ALB)) {
/* must be called only after the slave
@@ -1915,12 +1923,6 @@ static int bond_release_all(struct net_device *bond_dev)
 
bond_compute_features(bond);
 
-   /* now that the slave is detached, unlock and perform
-* all the undo steps that should not be called from
-* within a lock.
-*/
-   write_unlock_bh(&bond->lock);
-
bond_destroy_slave_symlinks(bond_dev, slave_dev);
bond_del_vlans_from_slave(bond, slave_dev);
 
-- 
1.5.3.4.206.g58ba4-dirty

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/7] bonding: 7 fixes for 2.6.24

2008-01-17 Thread Jay Vosburgh
Following are seven patches to fix locking problems,
silence locking-related warnings and resolve one recent regression
in the current 2.6.24-rc.

The first three patches are reposts, the rest are new.

patch 1: fix locking in sysfs primary/active selection

Call core network functions with expected locks to
eliminate potential deadlock and silence warnings.

patch 2: fix ASSERT_RTNL that produces spurious warnings

Relocate ASSERT_RTNL to remove a false warning; after patch,
ASSERT is located in code that holds only RTNL (additional locks were
causing the ASSERT to trip)

patch 3: fix locking during alb failover and slave removal

Fix all call paths into alb_fasten_mac_swap to hold only RTNL.
Eliminates potential deadlock and silences warnings.

patch 4: release slaves when master removed via sysfs

Insure that all slaves are removed when bond is destroyed via
sysfs.

patch 5: Fix up parameter parsing

Recent changes broke parameter parsing; this fixes things.

patch 6: Fix lock ordering for rtnl and bonding_rwsem

Resolves some lockdep warnings related to ordering between
rtnl and bonding_rwsem.

patch 7: Don't hold lock when calling rtnl_unlock

Since rtnl_unlock can sleep, don't hold any other locks when
calling it.

Patches are against the current netdev-2.6#upstream branch.

Please apply for 2.6.24.

-J

---
-Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] [IrDA] af_irda memory leak fixes

2008-01-17 Thread Samuel Ortiz
Hi Dave,

Here goes an IrDA patch against your latest net-2.6 tree.

This patch fixes some af_irda memory leaks.
It also checks for irias_new_obect() return value.

Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]>
Signed-off-by: Samuel Ortiz <[EMAIL PROTECTED]>
---
 net/irda/af_irda.c |   30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

Index: net-2.6-davem/net/irda/af_irda.c
===
--- net-2.6-davem.orig/net/irda/af_irda.c   2008-01-17 06:00:10.0 
+0100
+++ net-2.6-davem/net/irda/af_irda.c2008-01-18 08:37:36.0 +0100
@@ -802,12 +802,18 @@
}
 #endif /* CONFIG_IRDA_ULTRA */
 
+   self->ias_obj = irias_new_object(addr->sir_name, jiffies);
+   if (self->ias_obj == NULL)
+   return -ENOMEM;
+
err = irda_open_tsap(self, addr->sir_lsap_sel, addr->sir_name);
-   if (err < 0)
+   if (err < 0) {
+   kfree(self->ias_obj->name);
+   kfree(self->ias_obj);
return err;
+   }
 
/*  Register with LM-IAS */
-   self->ias_obj = irias_new_object(addr->sir_name, jiffies);
irias_add_integer_attrib(self->ias_obj, "IrDA:TinyTP:LsapSel",
 self->stsap_sel, IAS_KERNEL_ATTR);
irias_insert_object(self->ias_obj);
@@ -1825,7 +1831,7 @@
struct irda_ias_set*ias_opt;
struct ias_object  *ias_obj;
struct ias_attrib * ias_attr;   /* Attribute in IAS object */
-   int opt;
+   int opt, free_ias = 0;
 
IRDA_DEBUG(2, "%s(%p)\n", __FUNCTION__, self);
 
@@ -1881,11 +1887,20 @@
/* Create a new object */
ias_obj = irias_new_object(ias_opt->irda_class_name,
   jiffies);
+   if (ias_obj == NULL) {
+   kfree(ias_opt);
+   return -ENOMEM;
+   }
+   free_ias = 1;
}
 
/* Do we have the attribute already ? */
if(irias_find_attrib(ias_obj, ias_opt->irda_attrib_name)) {
kfree(ias_opt);
+   if (free_ias) {
+   kfree(ias_obj->name);
+   kfree(ias_obj);
+   }
return -EINVAL;
}
 
@@ -1904,6 +1919,11 @@
if(ias_opt->attribute.irda_attrib_octet_seq.len >
   IAS_MAX_OCTET_STRING) {
kfree(ias_opt);
+   if (free_ias) {
+   kfree(ias_obj->name);
+   kfree(ias_obj);
+   }
+
return -EINVAL;
}
/* Add an octet sequence attribute */
@@ -1932,6 +1952,10 @@
break;
default :
kfree(ias_opt);
+   if (free_ias) {
+   kfree(ias_obj->name);
+   kfree(ias_obj);
+   }
return -EINVAL;
}
irias_insert_object(ias_obj);

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch for 2.6.24? 1/1] bonding: locking fix

2008-01-17 Thread Andrew Morton
On Mon, 14 Jan 2008 15:01:13 -0800
Jay Vosburgh <[EMAIL PROTECTED]> wrote:

> Andrew Morton <[EMAIL PROTECTED]> wrote:
> [...]
> >That's bond_lock.
> >
> >This patch (below) addresses what appears to me to be an obvious
> >imbalance in rtnl_lock.
> >
> >I don't care how it's fixed, really.  Someone please fix it?
> 
>   I posted a correct patch for this a few days ago:
> 
> http://marc.info/?l=linux-netdev&m=119975746803886&w=2
> 
>   The correct fix requires more than simply removing the rtnl calls.
> 
>   I've got a few other patches in the pipeline, so I'm planning to
> repost the set the above patch was a part of plus a few others, most
> likely tomorrow.

Can we get this bug fixed please?  Today?  It has been known about for more
than two months.

I can only assume that people don't use this feature much because this bug
will kill your kernel, every time.

Applying this:

--- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
+++ a/drivers/net/bonding/bond_sysfs.c
@@ -,8 +,6 @@ static ssize_t bonding_store_primary(str
 out:
write_unlock_bh(&bond->lock);
 
-   rtnl_unlock();
-
return count;
 }
 static DEVICE_ATTR(primary, S_IRUGO | S_IWUSR, bonding_show_primary, 
bonding_store_primary);


is better than doing nothing.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFT] sky2: wake-on-lan configuration issues

2008-01-17 Thread Andrew Morton
On Mon, 14 Jan 2008 22:05:28 +0100
supersud501 <[EMAIL PROTECTED]> wrote:

> 
> 
> Stephen Hemminger wrote:
> > Please test this patch against Linus's current (approx 2.6.24-rc7-git5).
> > Ignore Andrew's premature reversion attempt...
> > 
> > This patch disables config mode access after clearing PCI settings.
> > 
>
> ...
> 
> yes, that did it! just tested it (current linus git tree with patched 
> sky with above patch), everything is clean now (dmesg output) and wol 
> works even with the commit ac93a3946b676025fa55356180e8321639744b31
> 
> so it the bug is fixed without the need to revert 
> ac93a3946b676025fa55356180e8321639744b31.

Are we putting this into 2.6.24?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 001/001] ipv4: enable use of 240/4 address space

2008-01-17 Thread Jan Engelhardt

On Jan 7 2008 17:10, Vince Fuller wrote:
>--- net/ipv4/devinet.c.orig2007-04-12 10:16:23.0 -0700
>+++ net/ipv4/devinet.c 2008-01-07 16:55:59.0 -0800
>@@ -594,6 +594,8 @@ static __inline__ int inet_abc_len(__be3
>   rc = 16;
>   else if (IN_CLASSC(haddr))
>   rc = 24;
>+  else if (IN_CLASSE(haddr))
>+  rc = 24;
>   }
> 
>   return rc;

Any particular reason why 24?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Pull request for 'ipg-fixes' branch (try #2)

2008-01-17 Thread Francois Romieu
Please pull from branch 'ipg-fixes' in repository

git://git.kernel.org/pub/scm/linux/kernel/git/romieu/netdev-2.6.git ipg-fixes

to get the changes below.

Distance from 'master' (d8c89eb3a12f0da96d049bd515c7fa3702e511c5)
-

47cccd7d7cc1f2b6f34aadc9041fb991c6293cdd
dafdec746f8c468bebf6b99f32a392ee6c8d0212
0da1b995aee447656c0eb77e4e32468e37f868a3
227bc24d675d80de1cfb3ab72891cc932dadbc3b

Diffstat


 drivers/net/ipg.c |   36 
 1 files changed, 12 insertions(+), 24 deletions(-)

Shortlog


Francois Romieu (4):
  ipg: balance locking in irq handler
  ipg: plug Tx completion leak
  ipg: fix queue stop condition in the xmit handler
  ipg: fix Tx completion irq request

Patch
-

diff --git a/drivers/net/ipg.c b/drivers/net/ipg.c
index dbd23bb..50f0c17 100644
--- a/drivers/net/ipg.c
+++ b/drivers/net/ipg.c
@@ -857,21 +857,14 @@ static void init_tfdlist(struct net_device *dev)
 static void ipg_nic_txfree(struct net_device *dev)
 {
struct ipg_nic_private *sp = netdev_priv(dev);
-   void __iomem *ioaddr = sp->ioaddr;
-   unsigned int curr;
-   u64 txd_map;
-   unsigned int released, pending;
-
-   txd_map = (u64)sp->txd_map;
-   curr = ipg_r32(TFD_LIST_PTR_0) -
-   do_div(txd_map, sizeof(struct ipg_tx)) - 1;
+   unsigned int released, pending, dirty;
 
IPG_DEBUG_MSG("_nic_txfree\n");
 
pending = sp->tx_current - sp->tx_dirty;
+   dirty = sp->tx_dirty % IPG_TFDLIST_LENGTH;
 
for (released = 0; released < pending; released++) {
-   unsigned int dirty = sp->tx_dirty % IPG_TFDLIST_LENGTH;
struct sk_buff *skb = sp->TxBuff[dirty];
struct ipg_tx *txfd = sp->txd + dirty;
 
@@ -882,11 +875,8 @@ static void ipg_nic_txfree(struct net_device *dev)
 * If the TFDDone bit is set, free the associated
 * buffer.
 */
-   if (dirty == curr)
-   break;
-
-   /* Setup TFDDONE for compatible issue. */
-   txfd->tfc |= cpu_to_le64(IPG_TFC_TFDDONE);
+   if (!(txfd->tfc & cpu_to_le64(IPG_TFC_TFDDONE)))
+break;
 
/* Free the transmit buffer. */
if (skb) {
@@ -898,6 +888,7 @@ static void ipg_nic_txfree(struct net_device *dev)
 
sp->TxBuff[dirty] = NULL;
}
+   dirty = (dirty + 1) % IPG_TFDLIST_LENGTH;
}
 
sp->tx_dirty += released;
@@ -1630,6 +1621,8 @@ static irqreturn_t ipg_interrupt_handler(int irq, void 
*dev_inst)
 #ifdef JUMBO_FRAME
ipg_nic_rxrestore(dev);
 #endif
+   spin_lock(&sp->lock);
+
/* Get interrupt source information, and acknowledge
 * some (i.e. TxDMAComplete, RxDMAComplete, RxEarly,
 * IntRequested, MacControlFrame, LinkEvent) interrupts
@@ -1647,9 +1640,7 @@ static irqreturn_t ipg_interrupt_handler(int irq, void 
*dev_inst)
handled = 1;
 
if (unlikely(!netif_running(dev)))
-   goto out;
-
-   spin_lock(&sp->lock);
+   goto out_unlock;
 
/* If RFDListEnd interrupt, restore all used RFDs. */
if (status & IPG_IS_RFD_LIST_END) {
@@ -1733,9 +1724,9 @@ out_enable:
ipg_w16(IPG_IE_TX_DMA_COMPLETE | IPG_IE_RX_DMA_COMPLETE |
IPG_IE_HOST_ERROR | IPG_IE_INT_REQUESTED | IPG_IE_TX_COMPLETE |
IPG_IE_LINK_EVENT | IPG_IE_UPDATE_STATS, INT_ENABLE);
-
+out_unlock:
spin_unlock(&sp->lock);
-out:
+
return IRQ_RETVAL(handled);
 }
 
@@ -1943,10 +1934,7 @@ static int ipg_nic_hard_start_xmit(struct sk_buff *skb, 
struct net_device *dev)
 */
if (sp->tenmbpsmode)
txfd->tfc |= cpu_to_le64(IPG_TFC_TXINDICATE);
-   else if (!((sp->tx_current - sp->tx_dirty + 1) >
-   IPG_FRAMESBETWEENTXDMACOMPLETES)) {
-   txfd->tfc |= cpu_to_le64(IPG_TFC_TXDMAINDICATE);
-   }
+   txfd->tfc |= cpu_to_le64(IPG_TFC_TXDMAINDICATE);
/* Based on compilation option, determine if FCS is to be
 * appended to transmit frame by IPG.
 */
@@ -2003,7 +1991,7 @@ static int ipg_nic_hard_start_xmit(struct sk_buff *skb, 
struct net_device *dev)
ipg_w32(IPG_DC_TX_DMA_POLL_NOW, DMA_CTRL);
 
if (sp->tx_current == (sp->tx_dirty + IPG_TFDLIST_LENGTH))
-   netif_wake_queue(dev);
+   netif_stop_queue(dev);
 
spin_unlock_irqrestore(&sp->lock, flags);
 
-- 
Ueimor
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFT] sky2: wake-on-lan configuration issues

2008-01-17 Thread Tino Keitel
Hi,

I tried 2.6.24-rc8 on my Mac mini Core Duo and noticed that WOL didn't
work anymore, whereas I never had a WOL failure with 2.6.23. The above
patch fixes WOL with 2.6.24-rc8 for me.

Thanks for tracking it down even before I was hit by this regression.
:-)

Regards,
Tino
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Herbert Xu
On Thu, Jan 17, 2008 at 03:31:14PM +0200, Timo Teräs wrote:
> 
> I guess the idea was that application should know about the SAs it
> created. Though a SA dump needs to be done if you want to check
> for existing entries (created by other processes, or if you are
> recovering from a crash).

That's what SADB_GET is for.  In any case KMs cannot coexist so this
is pointless.  After a crash you should simply flush all states and
policies.

> SPD dumping is still a must if you want to work nicely with kernel.

No it isn't.  Look at how Openswan does it.  No dumping anywhere at all.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] PHYLIB: Locking fixes for PHY I/O potentially sleeping

2008-01-17 Thread Benjamin Herrenschmidt

On Thu, 2008-01-03 at 17:36 -0600, Nate Case wrote:
> PHY read/write functions can potentially sleep (e.g., a PHY accessed
> via I2C).  The following changes were made to account for this:
> 
> * Change spin locks to mutex locks
> * Add a BUG_ON() to phy_read() phy_write() to warn against
>   calling them from an interrupt context.
> * Use work queue for PHY state machine handling since
>   it can potentially sleep
> * Change phydev lock from spinlock to mutex
> 
> Signed-off-by: Nate Case <[EMAIL PROTECTED]>

Excellent ! I've been wanting to do that for some time. I'll be able to
convert EMAC to phylib now :-) I'll review the patch in detail as I do
this convesion, maybe next week. Thanks !

Ben.

> ---
>  drivers/net/phy/mdio_bus.c   |2 +-
>  drivers/net/phy/phy.c|   68 -
>  drivers/net/phy/phy_device.c |   11 +++
>  include/linux/phy.h  |5 ++-
>  4 files changed, 55 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
> index c30196d..6e9f619 100644
> --- a/drivers/net/phy/mdio_bus.c
> +++ b/drivers/net/phy/mdio_bus.c
> @@ -49,7 +49,7 @@ int mdiobus_register(struct mii_bus *bus)
>   int i;
>   int err = 0;
>  
> - spin_lock_init(&bus->mdio_lock);
> + mutex_init(&bus->mdio_lock);
>  
>   if (NULL == bus || NULL == bus->name ||
>   NULL == bus->read ||
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 7c9e6e3..12fccb1 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -26,7 +26,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -72,9 +71,11 @@ int phy_read(struct phy_device *phydev, u16 regnum)
>   int retval;
>   struct mii_bus *bus = phydev->bus;
>  
> - spin_lock_bh(&bus->mdio_lock);
> + BUG_ON(in_interrupt());
> +
> + mutex_lock(&bus->mdio_lock);
>   retval = bus->read(bus, phydev->addr, regnum);
> - spin_unlock_bh(&bus->mdio_lock);
> + mutex_unlock(&bus->mdio_lock);
>  
>   return retval;
>  }
> @@ -95,9 +96,11 @@ int phy_write(struct phy_device *phydev, u16 regnum, u16 
> val)
>   int err;
>   struct mii_bus *bus = phydev->bus;
>  
> - spin_lock_bh(&bus->mdio_lock);
> + BUG_ON(in_interrupt());
> +
> + mutex_lock(&bus->mdio_lock);
>   err = bus->write(bus, phydev->addr, regnum, val);
> - spin_unlock_bh(&bus->mdio_lock);
> + mutex_unlock(&bus->mdio_lock);
>  
>   return err;
>  }
> @@ -428,7 +431,7 @@ int phy_start_aneg(struct phy_device *phydev)
>  {
>   int err;
>  
> - spin_lock_bh(&phydev->lock);
> + mutex_lock(&phydev->lock);
>  
>   if (AUTONEG_DISABLE == phydev->autoneg)
>   phy_sanitize_settings(phydev);
> @@ -449,13 +452,14 @@ int phy_start_aneg(struct phy_device *phydev)
>   }
>  
>  out_unlock:
> - spin_unlock_bh(&phydev->lock);
> + mutex_unlock(&phydev->lock);
>   return err;
>  }
>  EXPORT_SYMBOL(phy_start_aneg);
>  
> 
>  static void phy_change(struct work_struct *work);
> +static void phy_state_machine(struct work_struct *work);
>  static void phy_timer(unsigned long data);
>  
>  /**
> @@ -476,6 +480,7 @@ void phy_start_machine(struct phy_device *phydev,
>  {
>   phydev->adjust_state = handler;
>  
> + INIT_WORK(&phydev->state_queue, phy_state_machine);
>   init_timer(&phydev->phy_timer);
>   phydev->phy_timer.function = &phy_timer;
>   phydev->phy_timer.data = (unsigned long) phydev;
> @@ -493,11 +498,12 @@ void phy_start_machine(struct phy_device *phydev,
>  void phy_stop_machine(struct phy_device *phydev)
>  {
>   del_timer_sync(&phydev->phy_timer);
> + cancel_work_sync(&phydev->state_queue);
>  
> - spin_lock_bh(&phydev->lock);
> + mutex_lock(&phydev->lock);
>   if (phydev->state > PHY_UP)
>   phydev->state = PHY_UP;
> - spin_unlock_bh(&phydev->lock);
> + mutex_unlock(&phydev->lock);
>  
>   phydev->adjust_state = NULL;
>  }
> @@ -541,9 +547,9 @@ static void phy_force_reduction(struct phy_device *phydev)
>   */
>  void phy_error(struct phy_device *phydev)
>  {
> - spin_lock_bh(&phydev->lock);
> + mutex_lock(&phydev->lock);
>   phydev->state = PHY_HALTED;
> - spin_unlock_bh(&phydev->lock);
> + mutex_unlock(&phydev->lock);
>  }
>  
>  /**
> @@ -705,10 +711,10 @@ static void phy_change(struct work_struct *work)
>   if (err)
>   goto phy_err;
>  
> - spin_lock_bh(&phydev->lock);
> + mutex_lock(&phydev->lock);
>   if ((PHY_RUNNING == phydev->state) || (PHY_NOLINK == phydev->state))
>   phydev->state = PHY_CHANGELINK;
> - spin_unlock_bh(&phydev->lock);
> + mutex_unlock(&phydev->lock);
>  
>   atomic_dec(&phydev->irq_disable);
>   enable_irq(phydev->irq);
> @@ -735,7 +741,7 @@ phy_err:
>   */
>  void phy_stop(struct phy_device *phydev)
>  {
> - spin_lock_bh(&phyde

Re: [PATCH] net: NEWEMAC: Remove "rgmii-interface" from rgmii matching table

2008-01-17 Thread Benjamin Herrenschmidt

On Wed, 2008-01-16 at 10:37 +0100, Stefan Roese wrote:
> With the removal the the "rgmii-interface" device_type property from the
> dts files, the newemac driver needs an update to only rely on compatible
> property.
> 
> Signed-off-by: Stefan Roese <[EMAIL PROTECTED]>
> ---

Acked-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

Jeff, please stick that into your queue for 2.6.25, thanks !

Ben.

>  drivers/net/ibm_newemac/rgmii.c |1 -
>  1 files changed, 0 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/ibm_newemac/rgmii.c b/drivers/net/ibm_newemac/rgmii.c
> index 9bc1132..5757788 100644
> --- a/drivers/net/ibm_newemac/rgmii.c
> +++ b/drivers/net/ibm_newemac/rgmii.c
> @@ -302,7 +302,6 @@ static int __devexit rgmii_remove(struct of_device *ofdev)
>  static struct of_device_id rgmii_match[] =
>  {
>   {
> - .type   = "rgmii-interface",
>   .compatible = "ibm,rgmii",
>   },
>   {

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/8] [PATCH] Split up rndis_host.c

2008-01-17 Thread Jussi Kivilinna
On Tue, 2008-01-08 at 03:19 -0800, David Brownell wrote:
> > Our 
> > module version checks OID_GEN_PHYSICAL_MEDIUM in generic_rndis_bind,
> > with rndis_host bind fails if OID is supported and wireless media
type
> > is returned, with rndis_wext if OID isn't supported or type isn't
> > wireless. Should this be ok?
> 
> Sounds like if you can go the different-modules route, then
rndis_bind()
> should maybe accept a parameter giving it that option.  Then
> 
>   rndis_wlan_bind(...) == rndis_bind(..., check_wlan_media)
>   rndis_generic_bind(...) == rndis_bind(..., non_wlan_media)
> 
> That would be getting pretty complicated, since as you say the same
> USB-level binding (hence hotplugging) would need to kick in ... but
> having different modules for different RNDIS flavors would need a 
> new hotplug mechanism keying on media type.  Ugh.

Different modules works with rndis_generic_bind(..., media type option).
dmesg shows following:

[11214.496773] usb 3-3: new high speed USB device using ehci_hcd and
address 7
[11214.622861] usb 3-3: configuration #1 chosen from 1 choice
[11214.784874] usbcore: registered new interface driver cdc_ether
[11214.805700] rndis_host 3-3:1.0: driver requires wired physical
medium, but device is wireless.
[11214.806823] usbcore: registered new interface driver rndis_host
[11215.190824] wlan0: register 'rndis_wext' at usb-:00:10.3-3,
Wireless RNDIS device, 00:xx:xx:xx:xx:xx
[11215.190853] usbcore: registered new interface driver rndis_wext
[11215.398850] rndis_wext 3-3:1.0: rndis media disconnect

I assume that rndis_host gets always bind first as it's first in
alphabetic, so rndis_wext is only loaded if RNDIS device has wireless
media type.

I have new patchset that prepares rndis_host/usbnet for separate
rndis_wext but not sure should I submit (I haven't managed to get into
contact with Bjorge Dijkstra for 3,5 weeks now to have him decide how to
proceed). 

 - Jussi Kivilinna

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv6 support for NFS server

2008-01-17 Thread J. Bruce Fields
On Thu, Jan 17, 2008 at 06:04:35PM +0100, Aurélien Charbon wrote:
> Hi Bruce.
>
> Thanks for your comments.
> Here is the patch with some cleanups.

Thanks for the revisions.  We need to submit this with a patch comment
that:

- Explains more precisely what this does (fixes export
  interfaces to allow ipv6) and what remains to be done (?)
- Credits the folks (like Brian Haley) who have provided
  feedback.

I'll help clean up that comment if needed but please make sure it's
always included with the patch when you resend it.

--b.

>
> Regards,
> Aurélien
>
>
> -- 
>
> 
>   Aurelien Charbon
>   Linux NFSv4 team
>   Bull SAS
> Echirolles - France
> http://nfsv4.bullopensource.org/
> 
>

> >From 51755892e19186cd18230bac3f783b0382bf9ae0 Mon Sep 17 00:00:00 2001
> From: Aurelien Charbon <[EMAIL PROTECTED]>
> Date: Thu, 17 Jan 2008 14:55:03 +0100
> Subject: [PATCH 1/1] IPv6 support for NFS server
> 
> ---
>  fs/nfsd/export.c   |9 ++-
>  fs/nfsd/nfsctl.c   |   15 +-
>  include/linux/sunrpc/svcauth.h |4 +-
>  include/net/ipv6.h |9 +++
>  net/sunrpc/svcauth_unix.c  |  118 
> +++-
>  5 files changed, 110 insertions(+), 45 deletions(-)
> 
> diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
> index 66d0aeb..208db3a 100644
> --- a/fs/nfsd/export.c
> +++ b/fs/nfsd/export.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define NFSDDBG_FACILITY NFSDDBG_EXPORT
>  
> @@ -1556,6 +1557,7 @@ exp_addclient(struct nfsctl_client *ncp)
>  {
>   struct auth_domain  *dom;
>   int i, err;
> + struct in6_addr addr6;
>  
>   /* First, consistency check. */
>   err = -EINVAL;
> @@ -1574,9 +1576,10 @@ exp_addclient(struct nfsctl_client *ncp)
>   goto out_unlock;
>  
>   /* Insert client into hashtable. */
> - for (i = 0; i < ncp->cl_naddr; i++)
> - auth_unix_add_addr(ncp->cl_addrlist[i], dom);
> -
> + for (i = 0; i < ncp->cl_naddr; i++) {
> + ipv6_addr_set_v4mapped(ncp->cl_addrlist[i].s_addr, &addr6);
> + auth_unix_add_addr(&addr6, dom);
> + }
>   auth_unix_forget_old(dom);
>   auth_domain_put(dom);
>  
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 77dc989..13d6b6b 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -37,6 +37,7 @@
>  #include 
>  
>  #include 
> +#include 
>  
>  /*
>   *   We have a single directory with 9 nodes in it.
> @@ -222,6 +223,7 @@ static ssize_t write_getfs(struct file *file, char *buf, 
> size_t size)
>   struct auth_domain *clp;
>   int err = 0;
>   struct knfsd_fh *res;
> + struct in6_addr in6;
>  
>   if (size < sizeof(*data))
>   return -EINVAL;
> @@ -236,7 +238,11 @@ static ssize_t write_getfs(struct file *file, char *buf, 
> size_t size)
>   res = (struct knfsd_fh*)buf;
>  
>   exp_readlock();
> - if (!(clp = auth_unix_lookup(sin->sin_addr)))
> +
> + ipv6_addr_set_v4mapped(sin->sin_addr.s_addr, &in6);
> +
> + clp = auth_unix_lookup(&in6);
> + if (!clp)
>   err = -EPERM;
>   else {
>   err = exp_rootfh(clp, data->gd_path, res, data->gd_maxlen);
> @@ -257,6 +263,7 @@ static ssize_t write_getfd(struct file *file, char *buf, 
> size_t size)
>   int err = 0;
>   struct knfsd_fh fh;
>   char *res;
> + struct in6_addr in6;
>  
>   if (size < sizeof(*data))
>   return -EINVAL;
> @@ -271,7 +278,11 @@ static ssize_t write_getfd(struct file *file, char *buf, 
> size_t size)
>   res = buf;
>   sin = (struct sockaddr_in *)&data->gd_addr;
>   exp_readlock();
> - if (!(clp = auth_unix_lookup(sin->sin_addr)))
> +
> + ipv6_addr_set_v4mapped(sin->sin_addr.s_addr,&in6);
> +
> + clp = auth_unix_lookup(&in6);
> + if (!clp)
>   err = -EPERM;
>   else {
>   err = exp_rootfh(clp, data->gd_path, &fh, NFS_FHSIZE);
> diff --git a/include/linux/sunrpc/svcauth.h b/include/linux/sunrpc/svcauth.h
> index 22e1ef8..9e6fb86 100644
> --- a/include/linux/sunrpc/svcauth.h
> +++ b/include/linux/sunrpc/svcauth.h
> @@ -120,10 +120,10 @@ extern void svc_auth_unregister(rpc_authflavor_t 
> flavor);
>  
>  extern struct auth_domain *unix_domain_find(char *name);
>  extern void auth_domain_put(struct auth_domain *item);
> -extern int auth_unix_add_addr(struct in_addr addr, struct auth_domain *dom);
> +extern int auth_unix_add_addr(struct in6_addr *addr, struct auth_domain 
> *dom);
>  extern struct auth_domain *auth_domain_lookup(char *name, struct auth_domain 
> *new);
>  extern struct auth_domain *auth_domain_find(char *name);
> -extern struct auth_domain *auth_unix_lookup(struct in_addr addr);
> +extern struct auth_domain *auth_unix_lookup(struct in6_addr *addr);
>  extern int auth_unix_forget_

[PATCH][TRIVIAL] ibm_emac/ibm_emac_mal.c:mal_poll: Fix MAL_DBG2 invocation

2008-01-17 Thread Hal Rosenstock
ibm_emac/ibm_emac_mal.c:mal_poll: Fix MAL_DBG2 invocation

Signed-off-by: Hal Rosenstock <[EMAIL PROTECTED]>

diff --git a/drivers/net/ibm_emac/ibm_emac_mal.c 
b/drivers/net/ibm_emac/ibm_emac_mal.c
index dcd8826..1977791 100644
--- a/drivers/net/ibm_emac/ibm_emac_mal.c
+++ b/drivers/net/ibm_emac/ibm_emac_mal.c
@@ -279,7 +279,7 @@ static int mal_poll(struct napi_struct *napi, int budget)
struct list_head *l;
int received = 0;
 
-   MAL_DBG2("%d: poll(%d) %d ->" NL, mal->def->index, *budget,
+   MAL_DBG2("%d: poll(%d) %d ->" NL, mal->def->index, budget,
 rx_work_limit);
   again:
/* Process TX skbs */

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv6 support for NFS server

2008-01-17 Thread Brian Haley

Aurélien Charbon wrote:

Thanks for your comments.
Here is the patch with some cleanups.


Hi Aurelien,

Just two nits.


--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -400,6 +400,15 @@ static inline int ipv6_addr_v4mapped(const struct in6_addr 
*a)
 a->s6_addr32[2] == htonl(0x));
 }
 
+static inline void ipv6_addr_set_v4mapped(const __be32 addr,

+ struct in6_addr *v4mapped)
+{
+   ipv6_addr_set(v4mapped,
+   0, 0,
+   htonl(0x),
+   addr);
+}


I think Bruce wanted you to put as much on one line here as possible.


@@ -641,9 +668,24 @@ static int unix_gid_find(uid_t uid, struct group_info 
**gip,
 int
 svcauth_unix_set_client(struct svc_rqst *rqstp)
 {
-   struct sockaddr_in *sin = svc_addr_in(rqstp);
+   struct sockaddr_in *sin;
+   struct sockaddr_in6 *sin6, sin6_storage;
struct ip_map *ipm;
 
+	switch (rqstp->rq_addr.ss_family) {

+   case AF_INET:
+   sin = svc_addr_in(rqstp);
+   sin6 = &sin6_storage;
+   ipv6_addr_set(&sin6->sin6_addr, 0, 0,
+   htonl(0x), sin->sin_addr.s_addr);
+   break;


ipv6_addr_set_v4mapped(sin->sin_addr.s_addr, &sin6->sin6_addr);

-Brian
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv6 support for NFS server

2008-01-17 Thread Aurélien Charbon

Hi Bruce.

Thanks for your comments.
Here is the patch with some cleanups.

Regards,
Aurélien


--


  Aurelien Charbon
  Linux NFSv4 team
  Bull SAS
Echirolles - France
http://nfsv4.bullopensource.org/


>From 51755892e19186cd18230bac3f783b0382bf9ae0 Mon Sep 17 00:00:00 2001
From: Aurelien Charbon <[EMAIL PROTECTED]>
Date: Thu, 17 Jan 2008 14:55:03 +0100
Subject: [PATCH 1/1] IPv6 support for NFS server

---
 fs/nfsd/export.c   |9 ++-
 fs/nfsd/nfsctl.c   |   15 +-
 include/linux/sunrpc/svcauth.h |4 +-
 include/net/ipv6.h |9 +++
 net/sunrpc/svcauth_unix.c  |  118 +++-
 5 files changed, 110 insertions(+), 45 deletions(-)

diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 66d0aeb..208db3a 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define NFSDDBG_FACILITY	NFSDDBG_EXPORT
 
@@ -1556,6 +1557,7 @@ exp_addclient(struct nfsctl_client *ncp)
 {
 	struct auth_domain	*dom;
 	int			i, err;
+	struct in6_addr addr6;
 
 	/* First, consistency check. */
 	err = -EINVAL;
@@ -1574,9 +1576,10 @@ exp_addclient(struct nfsctl_client *ncp)
 		goto out_unlock;
 
 	/* Insert client into hashtable. */
-	for (i = 0; i < ncp->cl_naddr; i++)
-		auth_unix_add_addr(ncp->cl_addrlist[i], dom);
-
+	for (i = 0; i < ncp->cl_naddr; i++) {
+		ipv6_addr_set_v4mapped(ncp->cl_addrlist[i].s_addr, &addr6);
+		auth_unix_add_addr(&addr6, dom);
+	}
 	auth_unix_forget_old(dom);
 	auth_domain_put(dom);
 
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 77dc989..13d6b6b 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -37,6 +37,7 @@
 #include 
 
 #include 
+#include 
 
 /*
  *	We have a single directory with 9 nodes in it.
@@ -222,6 +223,7 @@ static ssize_t write_getfs(struct file *file, char *buf, size_t size)
 	struct auth_domain *clp;
 	int err = 0;
 	struct knfsd_fh *res;
+	struct in6_addr in6;
 
 	if (size < sizeof(*data))
 		return -EINVAL;
@@ -236,7 +238,11 @@ static ssize_t write_getfs(struct file *file, char *buf, size_t size)
 	res = (struct knfsd_fh*)buf;
 
 	exp_readlock();
-	if (!(clp = auth_unix_lookup(sin->sin_addr)))
+
+	ipv6_addr_set_v4mapped(sin->sin_addr.s_addr, &in6);
+
+	clp = auth_unix_lookup(&in6);
+	if (!clp)
 		err = -EPERM;
 	else {
 		err = exp_rootfh(clp, data->gd_path, res, data->gd_maxlen);
@@ -257,6 +263,7 @@ static ssize_t write_getfd(struct file *file, char *buf, size_t size)
 	int err = 0;
 	struct knfsd_fh fh;
 	char *res;
+	struct in6_addr in6;
 
 	if (size < sizeof(*data))
 		return -EINVAL;
@@ -271,7 +278,11 @@ static ssize_t write_getfd(struct file *file, char *buf, size_t size)
 	res = buf;
 	sin = (struct sockaddr_in *)&data->gd_addr;
 	exp_readlock();
-	if (!(clp = auth_unix_lookup(sin->sin_addr)))
+
+	ipv6_addr_set_v4mapped(sin->sin_addr.s_addr,&in6);
+
+	clp = auth_unix_lookup(&in6);
+	if (!clp)
 		err = -EPERM;
 	else {
 		err = exp_rootfh(clp, data->gd_path, &fh, NFS_FHSIZE);
diff --git a/include/linux/sunrpc/svcauth.h b/include/linux/sunrpc/svcauth.h
index 22e1ef8..9e6fb86 100644
--- a/include/linux/sunrpc/svcauth.h
+++ b/include/linux/sunrpc/svcauth.h
@@ -120,10 +120,10 @@ extern void	svc_auth_unregister(rpc_authflavor_t flavor);
 
 extern struct auth_domain *unix_domain_find(char *name);
 extern void auth_domain_put(struct auth_domain *item);
-extern int auth_unix_add_addr(struct in_addr addr, struct auth_domain *dom);
+extern int auth_unix_add_addr(struct in6_addr *addr, struct auth_domain *dom);
 extern struct auth_domain *auth_domain_lookup(char *name, struct auth_domain *new);
 extern struct auth_domain *auth_domain_find(char *name);
-extern struct auth_domain *auth_unix_lookup(struct in_addr addr);
+extern struct auth_domain *auth_unix_lookup(struct in6_addr *addr);
 extern int auth_unix_forget_old(struct auth_domain *dom);
 extern void svcauth_unix_purge(void);
 extern void svcauth_unix_info_release(void *);
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index ae328b6..9394710 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -400,6 +400,15 @@ static inline int ipv6_addr_v4mapped(const struct in6_addr *a)
 		 a->s6_addr32[2] == htonl(0x));
 }
 
+static inline void ipv6_addr_set_v4mapped(const __be32 addr,
+	  struct in6_addr *v4mapped)
+{
+	ipv6_addr_set(v4mapped,
+			0, 0,
+			htonl(0x),
+			addr);
+}
+
 /*
  * find the first different bit between two addresses
  * length of address must be a multiple of 32bits
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index 4114794..10ba208 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -11,7 +11,8 @@
 #include 
 #include 
 #include 
-
+#include 
+#include 
 #define RPCDBG_FACILITY	RPCDBG_AUTH
 
 
@@ -84,7 +85,7 @@ static void svcauth_unix_domain_release(struct auth_domain *dom)
 struct ip_map {
 	struct cache_head	

Re: [PATCH] request_irq() always returns -EINVAL with a NULL handler.

2008-01-17 Thread Stephen Hemminger
On Thu, 17 Jan 2008 17:57:58 +1100
Rusty Russell <[EMAIL PROTECTED]> wrote:

> I assume that these ancient network drivers were trying to find out if
> an irq is available.  eepro.c expecting +EBUSY was doubly wrong.
> 
> I'm not sure that can_request_irq() is the right thing, but these drivers
> are definitely wrong.
> 
> request_irq should BUG() on bad input, and these would have been found
> earlier.
> 
> Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
> ---
>  drivers/net/3c503.c |2 +-
>  drivers/net/e2100.c |2 +-
>  drivers/net/eepro.c |2 +-
>  drivers/net/hp.c|2 +-
>  kernel/irq/manage.c |1 +
>  5 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff -r 0b7e4fbb6238 drivers/net/3c503.c
> --- a/drivers/net/3c503.c Thu Jan 17 15:49:34 2008 +1100
> +++ b/drivers/net/3c503.c Thu Jan 17 16:40:28 2008 +1100
> @@ -379,7 +379,7 @@ el2_open(struct net_device *dev)
>  
>   outb(EGACFR_NORM, E33G_GACFR);  /* Enable RAM and interrupts. */
>   do {
> - if (request_irq (*irqp, NULL, 0, "bogus", dev) != -EBUSY) {
> + if (can_request_irq(*irqp, 0)) {
>   /* Twinkle the interrupt, and check if it's seen. */
>   unsigned long cookie = probe_irq_on();
>   outb_p(0x04 << ((*irqp == 9) ? 2 : *irqp), E33G_IDCFR);
> diff -r 0b7e4fbb6238 drivers/net/e2100.c
> --- a/drivers/net/e2100.c Thu Jan 17 15:49:34 2008 +1100
> +++ b/drivers/net/e2100.c Thu Jan 17 16:40:28 2008 +1100
> @@ -202,7 +202,7 @@ static int __init e21_probe1(struct net_
>   if (dev->irq < 2) {
>   int irqlist[] = {15,11,10,12,5,9,3,4}, i;
>   for (i = 0; i < 8; i++)
> - if (request_irq (irqlist[i], NULL, 0, "bogus", NULL) != 
> -EBUSY) {
> + if (can_request_irq(irqlist[i], 0)) {
>   dev->irq = irqlist[i];
>   break;
>   }
> diff -r 0b7e4fbb6238 drivers/net/eepro.c
> --- a/drivers/net/eepro.c Thu Jan 17 15:49:34 2008 +1100
> +++ b/drivers/net/eepro.c Thu Jan 17 16:40:28 2008 +1100
> @@ -914,7 +914,7 @@ static inteepro_grab_irq(struct net_dev
>  
>   eepro_sw2bank0(ioaddr); /* Switch back to Bank 0 */
>  
> - if (request_irq (*irqp, NULL, IRQF_SHARED, "bogus", dev) != 
> EBUSY) {
> + if (can_request_irq(*irqp, IRQF_SHARED)) {
>   unsigned long irq_mask;
>   /* Twinkle the interrupt, and check if it's seen */
>   irq_mask = probe_irq_on();
> diff -r 0b7e4fbb6238 drivers/net/hp.c
> --- a/drivers/net/hp.cThu Jan 17 15:49:34 2008 +1100
> +++ b/drivers/net/hp.cThu Jan 17 16:40:28 2008 +1100
> @@ -170,7 +170,7 @@ static int __init hp_probe1(struct net_d
>   int *irqp = wordmode ? irq_16list : irq_8list;
>   do {
>   int irq = *irqp;
> - if (request_irq (irq, NULL, 0, "bogus", NULL) != 
> -EBUSY) {
> + if (can_request_irq(irq, 0)) {
>   unsigned long cookie = probe_irq_on();
>   /* Twinkle the interrupt, and check if it's 
> seen. */
>   outb_p(irqmap[irq] | HP_RUN, ioaddr + 
> HP_CONFIGURE);
> diff -r 0b7e4fbb6238 kernel/irq/manage.c
> --- a/kernel/irq/manage.c Thu Jan 17 15:49:34 2008 +1100
> +++ b/kernel/irq/manage.c Thu Jan 17 16:40:28 2008 +1100
> @@ -252,6 +252,7 @@ int can_request_irq(unsigned int irq, un
>  
>   return !action;
>  }
> +EXPORT_SYMBOL(can_request_irq);
>  
>  void compat_irq_chip_set_default_handler(struct irq_desc *desc)
>  {

Isn't this just inherently racy, like the old check_resource stuff that got 
pulled
out 2.5?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Timo Teräs
Herbert Xu wrote:
> On Thu, Jan 17, 2008 at 07:42:30AM -0500, jamal wrote:
>> Looking at the pfkey RFC one more time, heres a funny quote:
>> "
>> The dump message is used for debugging
>> purposes only and is not intended for production use.
>> "
> 
> In fact it goes much further:
> 
>Support for the dump message MAY be discontinued in future versions
>of PF_KEY.  Key management applications MUST NOT depend on this
>message for basic operation.

I guess the idea was that application should know about the SAs it
created. Though a SA dump needs to be done if you want to check
for existing entries (created by other processes, or if you are
recovering from a crash).

SPD dumping is still a must if you want to work nicely with kernel.

As noted earlier pfkey is not really standardized. E.g. the SPD
dumping message are not in the RFC as David noted. The above RFC
comments and the fact that SPD stuff is unspecified made me think
that making non-atomic dumps would be a lot better alternative then
leaving the socket to bad state which would make the application
completely unusable.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread jamal
On Thu, 2008-17-01 at 23:50 +1100, Herbert Xu wrote:

> In fact it goes much further:
> 
>Support for the dump message MAY be discontinued in future versions
>of PF_KEY.  Key management applications MUST NOT depend on this
>message for basic operation.

No doubt PF_KEY being an RFC has caused a lot of damage. 
Once something is deployed, unfortunately, it grows a foot and sometimes
a head[1]. 
Note: it's a big dilema in my mind as well and i agree in principle with
both Dave and you (we really should not be helping pfkey grow another
ear on the forehead); the only way i am convincing myself otherwise is
to note that Racoon/ipsec-tools is out there, shipped as default ipsec
user management tools by most if not all linux distros. If we really
want to stop the beast lets cut out the umbilicall code - just take out
pfkey altogether from Linux ;->


cheers,
jamal

[1] Whatever fix/approach Timo has will eventually show up in the BSDs
and solaris for example 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Herbert Xu
On Thu, Jan 17, 2008 at 07:42:30AM -0500, jamal wrote:
> 
> Looking at the pfkey RFC one more time, heres a funny quote:
> "
> The dump message is used for debugging
> purposes only and is not intended for production use.
> "

In fact it goes much further:

   Support for the dump message MAY be discontinued in future versions
   of PF_KEY.  Key management applications MUST NOT depend on this
   message for basic operation.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread jamal
On Thu, 2008-17-01 at 07:54 +0200, Timo Teräs wrote:

> You listen for the events. It is guaranteed that if the dumping code
> does return the entry to be deleted, the deletion notification will
> occur after that dump entry.

Ok, sounds reasonable - as long as there is a known order for
occurances, then there will be no ambiguity.
I am assuming that the same ordering will happen with
updates/modifications?
To go back to what i suggested earlier - is it possible to have this in
two stages? First pfkey with expected behavior being the same as current
netlink; then later the optimizations you are talking about.

Looking at the pfkey RFC one more time, heres a funny quote:
"
The dump message is used for debugging
purposes only and is not intended for production use.
"

One thing Dave mentioned thats extremely important is to ensure no ABI 
breakage. 
Think of racoon 0.6 which knows nothing of this; it should continue to work.

Dave: One reason i paid attention to this is because it was on your TODO
list from netconf 2005 ;-> It has just been sitting in the background
memory cells since.

cheers,
jamal

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [TCP]: Uninline tcp_set_state

2008-01-17 Thread Andi Kleen
"Ilpo Järvinen" <[EMAIL PROTECTED]> writes:
>
> Besides, it not always that obvious that gcc is able to determine "the
> constant state", considering e.g., the complexity in the cases with
> tcp_rcv_synsent_state_process, tcp_close_state, tcp_done. In such cases
> uninlining should be done and gcc is probably not able to mix both cases
> nicely for a single function?

I think it would be cleanest to completely unswitch the function 
and split into tcp_set_closed()  / tcp_set_established() / tcp_other_state() 
called by the callers directly.

That would probably lose the state trace, but I never found 
that useful for anything.

-Andi
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Timo Teräs
David Miller wrote:
> From: Timo_Teräs <[EMAIL PROTECTED]>
> Date: Thu, 17 Jan 2008 13:00:09 +0200
> 
>> IMHO, it's a lot better then losing >50% of entries and the end
>> of sequence message on big dumps. SPD and SADB are not that
>> volatile; in most of the cases the dump would be as good as an
>> atomic one.
> 
> I humbly disagree with you.  Interface behavior stability
> is more important.

Small SPDs/SADBs would still be dumped atomically. The patch
affects only the cases when the receive queue is getting full.

>> I'm not sure if there's other major applications that we should
>> be concerned about, but at least ipsec-tools racoon does not
>> expect to get atomic dumps (which btw, comes originally from BSD).
> 
> Racoon was written as an addon to the BSD stack by an IPV6/IPSEC
> project in Japan named KAME, it did not "come from BSD".  It was
> added to BSD.
> 
> There are also other BSD based IPSEC daemons such as the one written
> by the OpenBSD folks.

Yes. I meant that it was originally written to be used in BSD. The
Linux port came later. Sorry for the ambiguous wording.

> I don't think this is arguable at all.  We're not changing semantics
> over what we've done for 4+ years and applications might depend upon.
> It's for a deprecated interface, which makes any semantic changes that
> much less inviting.
> 
> You can argue all you want, but it will not change the invariants in
> the previous paragraph.

True. If no one else agrees with me, I'll drop it. I can always run
my own patched kernel.

I'd appreciate feedback on the xfrm changes. I'll try to make that
part usable patch against net-2.6.25 git tree next week.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread jamal
On Thu, 2008-17-01 at 22:11 +1100, Herbert Xu wrote:

> Sure racoon uses pfkey but the question is does it use pfkey dumping?
> 

it does when trying to purge phase 2 SAs...

cheers,
jamal

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Timo Teräs
Herbert Xu wrote:
> On Thu, Jan 17, 2008 at 07:54:32AM +0200, Timo Teräs wrote:
>>> Racoon doesn't use pfkey dumping as far as I know.
>> ipsec-tools racoon uses pfkey and only pfkey. And it's non trivial to
>> make it use netlink; it relies heavily all around the code to pfkey
>> structs. It also runs on BSD so we cannot rip pfkey away; adding a
>> layer to work with both pfkey and netlink would be doable, but just a
>> lot of work.
> 
> Sure racoon uses pfkey but the question is does it use pfkey dumping?

Yes it does.

It does SPD dump at startup and keeps the SP database in sync by
listening to notifications.

It also does SA dumps when it is figuring out which SAs to purge.

I started to work on the xfrm/pfkey patch only because racoon is
having problems with (as is anything else using pfkey).

- Timo

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Broken "Make ip6_frags per namespace" patch

2008-01-17 Thread Daniel Lezcano

Alexey Dobriyan wrote:

On Thu, Jan 17, 2008 at 11:40:42AM +0100, Daniel Lezcano wrote:

Alexey Dobriyan wrote:

commit c064c4811b3e87ff8202f5a966ff4eea0bc54575
Author: Daniel Lezcano <[EMAIL PROTECTED]>
Date:   Thu Jan 10 02:56:03 2008 -0800

   [NETNS][IPV6]: Make ip6_frags per namespace.
   
   The ip6_frags is moved to the network namespace structure.  Because

   there can be multiple instances of the network namespaces, and the
   ip6_frags is no longer a global static variable, a helper function has
   been added to facilitate the initialization of the variables.
   
   Until the ipv6 protocol is not per namespace, the variables are

   accessed relatively from the initial network namespace.
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -11,6 +13,7 @@ struct netns_sysctl_ipv6 {
#ifdef CONFIG_SYSCTL
struct ctl_table_header *table;
#endif
+   struct inet_frags_ctl frags;
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -632,6 +625,11 @@ static struct inet6_protocol frag_protocol =
.flags  =   INET6_PROTO_NOPOLICY,
};

+void ipv6_frag_sysctl_init(struct net *net)
+{
+   ip6_frags.ctl = &net->ipv6.sysctl.frags;
+}

_This_ can't work. ip6frags is only one and ->ctl pointer is flipped
onto per-netns data. Changelog is also misleading: ip6_frags_ctl is
moved to netns not all ip6_frags.

Oopsing place below -- f->ctl dereference in preparation of mod_timer() 
call.




BUG: unable to handle kernel paging request at virtual address f5da8fc8
printing eip: c11d868a *pdpt = 3001 *pde = 01728067 
*pte = 35da8000 Oops:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: ebt_ip ebt_dnat ebt_arpreply ebt_arp ebt_among 
ebtable_nat ip6t_REJECT ip6table_filter ip6_tables ebtable_filter 
ebtable_broute ebt_802_3 ebtables des_generic nf_conntrack_netbios_ns 
nf_conntrack_ipv4 xt_state nf_conntrack xt_tcpudp ipt_REJECT 
iptable_filter ip_tables deflate zlib_deflate zlib_inflate cryptomgr 
crypto_hash cpufreq_stats cpufreq_ondemand cdrom cbc bridge llc blkcipher 
crypto_algapi arpt_mangle arptable_filter arp_tables x_tables ah6 
af_packet ipv6


Pid: 0, comm: swapper Not tainted (2.6.24-rc7-net-2.6.25-nf-sysfs-n #30)
EIP: 0060:[] EFLAGS: 00010246 CPU: 1
EIP is at inet_frag_secret_rebuild+0xaa/0xd0
EAX: f5da8fbc EBX:  ECX: c131 EDX: 0100
ESI: f7cba000 EDI: f898f7a0 EBP: 0040 ESP: c1310f90
DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
Process swapper (pid: 0, ti=c131 task=f7c9a580 task.ti=f7c9b000)
Stack: f898f7a8 f898f8a8 000ddcbd f898f7a0 f7cba000 c1310fc4 0100 
c1026d60 0002 0001 c1191183 c4779ddc c11d85e0 f898c860 
  f898c860 c12c4a88 0001 c1308da0 000a c1023477 0001 
  c130b640 c130b640 f7c9bf34 Call Trace:

[] run_timer_softirq+0x120/0x190
[] net_rx_action+0x53/0x220
[] inet_frag_secret_rebuild+0x0/0xd0
[] __do_softirq+0x87/0x100
[] do_softirq+0xaf/0x110
[] irq_exit+0x83/0x90
[] smp_apic_timer_interrupt+0x57/0x90
[] apic_timer_interrupt+0x29/0x38
[] apic_timer_interrupt+0x33/0x38
[] default_idle+0x0/0x60
[] default_idle+0x40/0x60
[] cpu_idle+0x73/0xb0
===
Code: 8b 10 85 d2 89 13 74 03 89 5a 04 89 18 89 43 04 85 f6 89 f3 75 bb 45 
83 fd 40 75 a5 8b 44 24 04 e8 4c 3f 01 00 8b 87 50 01 00 00 <8b> 50 0c 01 
54 24 08 8d 87 38 01 00 00 8b 54 24 08 83 c4 0c 5b EIP: [] 
inet_frag_secret_rebuild+0xaa/0xd0 SS:ESP 0068:c1310f90

Kernel panic - not syncing: Fatal exception in interrupt

Hi Alexey,

does it happen after unsharing the network ?


Yep. clone(CLONE_NEWNET) in a loop and sooner or later you'll see this.


Thanks.

The network namespace is not yet complete, this is normal that you have 
not ip6_frag per namespace. Pavel is doing that.


Perhaps, I should disable ipv6_frag_sysctl_init when not in the init_net 
and enable it again when Pavel send its fragment patchset ?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] bonding: Fix some RTNL taking

2008-01-17 Thread Jarek Poplawski
On Thu, Jan 17, 2008 at 02:30:37PM +0900, Makito SHIOKAWA wrote:
>> Maybe I'm wrong, but since this read_lock() is given and taken anyway,
>> it seems this looks a bit better to me (why hold this rtnl longer
>> than needed?):
>>  read_unlock(&bond->lock);
>>  rtnl_unlock();
>>  read_lock(&bond->lock);
> Seems better.
>
>> On the other hand, probably 'if (bond->kill_timers)' could be repeated
>> after this read_lock() retaking.
> Sorry, what do you mean? (A case that bond->kill_timers = 1 is done during 
> lock retaking, and work being queued only to do 'if (bond->kill_timers)'? 
> If so, I think that won't differ much.)

Probably the difference is not much, but since this all double locking,
unlocking and something between could take a while, and such a check
looks cheaper than re-queueing... But I don't persist in this.

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Broken "Make ip6_frags per namespace" patch

2008-01-17 Thread Alexey Dobriyan
On Thu, Jan 17, 2008 at 11:40:42AM +0100, Daniel Lezcano wrote:
> Alexey Dobriyan wrote:
> >>commit c064c4811b3e87ff8202f5a966ff4eea0bc54575
> >>Author: Daniel Lezcano <[EMAIL PROTECTED]>
> >>Date:   Thu Jan 10 02:56:03 2008 -0800
> >>
> >>[NETNS][IPV6]: Make ip6_frags per namespace.
> >>
> >>The ip6_frags is moved to the network namespace structure.  Because
> >>there can be multiple instances of the network namespaces, and the
> >>ip6_frags is no longer a global static variable, a helper function has
> >>been added to facilitate the initialization of the variables.
> >>
> >>Until the ipv6 protocol is not per namespace, the variables are
> >>accessed relatively from the initial network namespace.
> >
> >>--- a/include/net/netns/ipv6.h
> >>+++ b/include/net/netns/ipv6.h
> >
> >>@@ -11,6 +13,7 @@ struct netns_sysctl_ipv6 {
> >> #ifdef CONFIG_SYSCTL
> >>struct ctl_table_header *table;
> >> #endif
> >>+   struct inet_frags_ctl frags;
> >
> >>--- a/net/ipv6/reassembly.c
> >>+++ b/net/ipv6/reassembly.c
> >
> >>@@ -632,6 +625,11 @@ static struct inet6_protocol frag_protocol =
> >>.flags  =   INET6_PROTO_NOPOLICY,
> >> };
> >> 
> >>+void ipv6_frag_sysctl_init(struct net *net)
> >>+{
> >>+   ip6_frags.ctl = &net->ipv6.sysctl.frags;
> >>+}
> >
> >_This_ can't work. ip6frags is only one and ->ctl pointer is flipped
> >onto per-netns data. Changelog is also misleading: ip6_frags_ctl is
> >moved to netns not all ip6_frags.
> >
> >Oopsing place below -- f->ctl dereference in preparation of mod_timer() 
> >call.
> >
> >
> >
> >BUG: unable to handle kernel paging request at virtual address f5da8fc8
> >printing eip: c11d868a *pdpt = 3001 *pde = 01728067 
> >*pte = 35da8000 Oops:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
> >Modules linked in: ebt_ip ebt_dnat ebt_arpreply ebt_arp ebt_among 
> >ebtable_nat ip6t_REJECT ip6table_filter ip6_tables ebtable_filter 
> >ebtable_broute ebt_802_3 ebtables des_generic nf_conntrack_netbios_ns 
> >nf_conntrack_ipv4 xt_state nf_conntrack xt_tcpudp ipt_REJECT 
> >iptable_filter ip_tables deflate zlib_deflate zlib_inflate cryptomgr 
> >crypto_hash cpufreq_stats cpufreq_ondemand cdrom cbc bridge llc blkcipher 
> >crypto_algapi arpt_mangle arptable_filter arp_tables x_tables ah6 
> >af_packet ipv6
> >
> >Pid: 0, comm: swapper Not tainted (2.6.24-rc7-net-2.6.25-nf-sysfs-n #30)
> >EIP: 0060:[] EFLAGS: 00010246 CPU: 1
> >EIP is at inet_frag_secret_rebuild+0xaa/0xd0
> >EAX: f5da8fbc EBX:  ECX: c131 EDX: 0100
> >ESI: f7cba000 EDI: f898f7a0 EBP: 0040 ESP: c1310f90
> > DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
> >Process swapper (pid: 0, ti=c131 task=f7c9a580 task.ti=f7c9b000)
> >Stack: f898f7a8 f898f8a8 000ddcbd f898f7a0 f7cba000 c1310fc4 0100 
> >c1026d60 0002 0001 c1191183 c4779ddc c11d85e0 f898c860 
> >   f898c860 c12c4a88 0001 c1308da0 000a c1023477 0001 
> >   c130b640 c130b640 f7c9bf34 Call Trace:
> > [] run_timer_softirq+0x120/0x190
> > [] net_rx_action+0x53/0x220
> > [] inet_frag_secret_rebuild+0x0/0xd0
> > [] __do_softirq+0x87/0x100
> > [] do_softirq+0xaf/0x110
> > [] irq_exit+0x83/0x90
> > [] smp_apic_timer_interrupt+0x57/0x90
> > [] apic_timer_interrupt+0x29/0x38
> > [] apic_timer_interrupt+0x33/0x38
> > [] default_idle+0x0/0x60
> > [] default_idle+0x40/0x60
> > [] cpu_idle+0x73/0xb0
> >===
> >Code: 8b 10 85 d2 89 13 74 03 89 5a 04 89 18 89 43 04 85 f6 89 f3 75 bb 45 
> >83 fd 40 75 a5 8b 44 24 04 e8 4c 3f 01 00 8b 87 50 01 00 00 <8b> 50 0c 01 
> >54 24 08 8d 87 38 01 00 00 8b 54 24 08 83 c4 0c 5b EIP: [] 
> >inet_frag_secret_rebuild+0xaa/0xd0 SS:ESP 0068:c1310f90
> >Kernel panic - not syncing: Fatal exception in interrupt
> 
> Hi Alexey,
> 
> does it happen after unsharing the network ?

Yep. clone(CLONE_NEWNET) in a loop and sooner or later you'll see this.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-2.6.25] net: Improve cache line coherency of ingress qdisc

2008-01-17 Thread Neil Turton
Move the ingress qdisc members of struct net_device from the transmit
cache line to the receive cache line to avoid cache line ping-pong.
These members are only used on the receive path.

Signed-off-by: Neil Turton <[EMAIL PROTECTED]>
---

--- net-2.6.25.git-orig/include/linux/netdevice.h   2008-01-15 
17:43:08.0 +
+++ net-2.6.25.git-ndt1/include/linux/netdevice.h   2008-01-16 
09:46:19.0 +
@@ -597,37 +597,37 @@ struct net_device
 /*
  * Cache line mostly used on receive path (including eth_type_trans())
  */
unsigned long   last_rx;/* Time of last Rx  */
/* Interface address info used in eth_type_trans() */
unsigned char   dev_addr[MAX_ADDR_LEN]; /* hw address, (before 
bcast 
because most packets 
are unicast) */
 
unsigned char   broadcast[MAX_ADDR_LEN];/* hw bcast add 
*/
 
+   /* ingress path synchronizer */
+   spinlock_t  ingress_lock;
+   struct Qdisc*qdisc_ingress;
+
 /*
  * Cache line mostly used on queue transmit path (qdisc)
  */
/* device queue lock */
spinlock_t  queue_lock cacheline_aligned_in_smp;
struct Qdisc*qdisc;
struct Qdisc*qdisc_sleeping;
struct list_headqdisc_list;
unsigned long   tx_queue_len;   /* Max frames per queue allowed 
*/
 
/* Partially transmitted GSO packet. */
struct sk_buff  *gso_skb;
 
-   /* ingress path synchronizer */
-   spinlock_t  ingress_lock;
-   struct Qdisc*qdisc_ingress;
-
 /*
  * One part is mostly used on xmit path (device)
  */
/* hard_start_xmit synchronizer */
spinlock_t  _xmit_lock cacheline_aligned_in_smp;
/* cpu id of processor entered to hard_start_xmit or -1,
   if nobody entered there.
 */
int xmit_lock_owner;
void*priv;  /* pointer to private data  */

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] bonding: Fix work rearming

2008-01-17 Thread Jarek Poplawski
On Thu, Jan 17, 2008 at 02:30:36PM +0900, Makito SHIOKAWA wrote:
>> But, since during this change from sysfs cancel_delayed_work_sync()
>> could be probably used, and it's rather efficient with killing
>> rearming works, it seems this check could be unnecessary yet.
> What going to be cancelled in bonding_store_miimon() when setting miimon to 
> 0 is arp monitor, not mii monitor. So, this check will be needed to stop 
> rearming mii monitor when value 0 is set to miimon.

Hmm... I'm not sure I understand your point, but it seems both
bonding_store_arp_interval() and bonding_store_miimon() where this
field could be changed, currently use cancel_delayed_work() with
flush_workqueue(), so I presume, there is no rtnl_lock() nor
write_lock(&bond->lock) held, so cancel_delayed_work_sync() could
be used, which doesn't require this additional check.

...Unless you mean that despite miimon value is changed there,
mii_work for some reason can't be cancelled at the same time?

Of course, if there is such a reason for doing this check each time
a work runs instead of controlling where the value changes, then OK!

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Herbert Xu
On Thu, Jan 17, 2008 at 07:54:32AM +0200, Timo Teräs wrote:
>
> > Racoon doesn't use pfkey dumping as far as I know.
> 
> ipsec-tools racoon uses pfkey and only pfkey. And it's non trivial to
> make it use netlink; it relies heavily all around the code to pfkey
> structs. It also runs on BSD so we cannot rip pfkey away; adding a
> layer to work with both pfkey and netlink would be doable, but just a
> lot of work.

Sure racoon uses pfkey but the question is does it use pfkey dumping?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread David Miller
From: Timo_Teräs <[EMAIL PROTECTED]>
Date: Thu, 17 Jan 2008 13:00:09 +0200

> IMHO, it's a lot better then losing >50% of entries and the end
> of sequence message on big dumps. SPD and SADB are not that
> volatile; in most of the cases the dump would be as good as an
> atomic one.

I humbly disagree with you.  Interface behavior stability
is more important.

> I'm not sure if there's other major applications that we should
> be concerned about, but at least ipsec-tools racoon does not
> expect to get atomic dumps (which btw, comes originally from BSD).

Racoon was written as an addon to the BSD stack by an IPV6/IPSEC
project in Japan named KAME, it did not "come from BSD".  It was
added to BSD.

There are also other BSD based IPSEC daemons such as the one written
by the OpenBSD folks.

I don't think this is arguable at all.  We're not changing semantics
over what we've done for 4+ years and applications might depend upon.
It's for a deprecated interface, which makes any semantic changes that
much less inviting.

You can argue all you want, but it will not change the invariants in
the previous paragraph.

All of the time you've spent arguing is time not spent on adding
netlink support to the daemons that do not do so already.  And that
would be 2 steps forwards compared to the 1 step backwards your
desired change would be.

I've stated my position as well as I can at this point so
respectfully, since I have tons of other things to do, I'm stepping
out of this specific discussion for now.

Thank you.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Timo Teräs
David Miller wrote:
> From: Timo_Teräs <[EMAIL PROTECTED]>
> Date: Thu, 17 Jan 2008 12:01:17 +0200
> 
>> David Miller wrote:
>>> This is an inherent aspect of AF_KEY (and what it was
>>> derived from, BSD routing sockets).
>> Yes, this is the way BSD does it.
>>  
>>> It has to provide dumps atomically, and if there is no
>>> space there is no way to provide those entries which
>>> would require more rcvbuf space.
>> RFC does not say it has to be atomic.
> 
> Every application out there in the universe expects BSD socket
> semantics, and therefore atomic dumps.  You cannot "fix" things
> without breaking applications.

IMHO, it's a lot better then losing >50% of entries and the end
of sequence message on big dumps. SPD and SADB are not that
volatile; in most of the cases the dump would be as good as an
atomic one.

Even if it did change during ongoing dump you still get an usable
dump. All the entries reflect real data and there is no dependency
between different entries.

I'm not sure if there's other major applications that we should
be concerned about, but at least ipsec-tools racoon does not
expect to get atomic dumps (which btw, comes originally from BSD).

Cheers,
  Timo

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3 net-2.6.25] call FIB rule->action in the correct namespace

2008-01-17 Thread Daniel Lezcano

Denis V. Lunev wrote:

FIB rule->action should operate in the same namespace as fib_lookup.
This is definitely missed right now.

There are two ways to implement this: pass struct net into another rules
API call (2 levels) or place netns into rule struct directly. The second
approach seems better as the code will grow less.

Additionally, the patchset cleanups struct net from
fib_rules_register/unregister to have network namespace context at the
time of default rules creation.

Signed-off-by: Denis V. Lunev <[EMAIL PROTECTED]>


Acked-by: Daniel Lezcano <[EMAIL PROTECTED]>

--






















































Sauf indication contraire ci-dessus:
Compagnie IBM France
Sie`ge Social : Tour Descartes, 2, avenue Gambetta, La De'fense 5, 92400
Courbevoie
RCS Nanterre 552 118 465
Forme Sociale : S.A.S.
Capital Social : 542.737.118 ?
SIREN/SIRET : 552 118 465 02430
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Broken "Make ip6_frags per namespace" patch

2008-01-17 Thread Daniel Lezcano

Alexey Dobriyan wrote:

commit c064c4811b3e87ff8202f5a966ff4eea0bc54575
Author: Daniel Lezcano <[EMAIL PROTECTED]>
Date:   Thu Jan 10 02:56:03 2008 -0800

[NETNS][IPV6]: Make ip6_frags per namespace.

The ip6_frags is moved to the network namespace structure.  Because

there can be multiple instances of the network namespaces, and the
ip6_frags is no longer a global static variable, a helper function has
been added to facilitate the initialization of the variables.

Until the ipv6 protocol is not per namespace, the variables are

accessed relatively from the initial network namespace.



--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h



@@ -11,6 +13,7 @@ struct netns_sysctl_ipv6 {
 #ifdef CONFIG_SYSCTL
struct ctl_table_header *table;
 #endif
+   struct inet_frags_ctl frags;



--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c



@@ -632,6 +625,11 @@ static struct inet6_protocol frag_protocol =
.flags  =   INET6_PROTO_NOPOLICY,
 };
 
+void ipv6_frag_sysctl_init(struct net *net)

+{
+   ip6_frags.ctl = &net->ipv6.sysctl.frags;
+}


_This_ can't work. ip6frags is only one and ->ctl pointer is flipped
onto per-netns data. Changelog is also misleading: ip6_frags_ctl is
moved to netns not all ip6_frags.

Oopsing place below -- f->ctl dereference in preparation of mod_timer() call.



BUG: unable to handle kernel paging request at virtual address f5da8fc8
printing eip: c11d868a *pdpt = 3001 *pde = 01728067 *pte = 35da8000 
Oops:  [#1] PREEMPT SMP DEBUG_PAGEALLOC

Modules linked in: ebt_ip ebt_dnat ebt_arpreply ebt_arp ebt_among ebtable_nat 
ip6t_REJECT ip6table_filter ip6_tables ebtable_filter ebtable_broute ebt_802_3 
ebtables des_generic nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state 
nf_conntrack xt_tcpudp ipt_REJECT iptable_filter ip_tables deflate zlib_deflate 
zlib_inflate cryptomgr crypto_hash cpufreq_stats cpufreq_ondemand cdrom cbc 
bridge llc blkcipher crypto_algapi arpt_mangle arptable_filter arp_tables 
x_tables ah6 af_packet ipv6

Pid: 0, comm: swapper Not tainted (2.6.24-rc7-net-2.6.25-nf-sysfs-n #30)
EIP: 0060:[] EFLAGS: 00010246 CPU: 1
EIP is at inet_frag_secret_rebuild+0xaa/0xd0
EAX: f5da8fbc EBX:  ECX: c131 EDX: 0100
ESI: f7cba000 EDI: f898f7a0 EBP: 0040 ESP: c1310f90
 DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
Process swapper (pid: 0, ti=c131 task=f7c9a580 task.ti=f7c9b000)
Stack: f898f7a8 f898f8a8 000ddcbd f898f7a0 f7cba000 c1310fc4 0100 c1026d60 
   0002 0001 c1191183 c4779ddc c11d85e0 f898c860 f898c860 c12c4a88 
   0001 c1308da0 000a c1023477 0001 c130b640 c130b640 f7c9bf34 
Call Trace:

 [] run_timer_softirq+0x120/0x190
 [] net_rx_action+0x53/0x220
 [] inet_frag_secret_rebuild+0x0/0xd0
 [] __do_softirq+0x87/0x100
 [] do_softirq+0xaf/0x110
 [] irq_exit+0x83/0x90
 [] smp_apic_timer_interrupt+0x57/0x90
 [] apic_timer_interrupt+0x29/0x38
 [] apic_timer_interrupt+0x33/0x38
 [] default_idle+0x0/0x60
 [] default_idle+0x40/0x60
 [] cpu_idle+0x73/0xb0
===
Code: 8b 10 85 d2 89 13 74 03 89 5a 04 89 18 89 43 04 85 f6 89 f3 75 bb 45 83 fd 40 75 a5 8b 44 24 04 e8 4c 3f 01 00 8b 87 50 01 00 00 <8b> 50 0c 01 54 24 08 8d 87 38 01 00 00 8b 54 24 08 83 c4 0c 5b 
EIP: [] inet_frag_secret_rebuild+0xaa/0xd0 SS:ESP 0068:c1310f90

Kernel panic - not syncing: Fatal exception in interrupt


Hi Alexey,

does it happen after unsharing the network ?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] spidernet: add missing initialization

2008-01-17 Thread Ishizaki Kou
Jens-san,

> Hi Ishizaki,
>
> Linas has left the company and is no longer doing kernel related stuff,
> so I suggest, given Jeff is ok with that, that the two of us take over
> spidernet maintainership.
 (snip)
> Change maintainership for spidernet.
>
> Signed-off-by: Jens Osterkamp <[EMAIL PROTECTED]>

I apologize to my late reply.

I hope to accept your suggestion. But I have to get authorization
to take maintainership in my company. I have started negotiation
to my boss.


I can't check that spidernet driver works on Cell Blade, because I
don't have one.  So I hope you check spidernet driver works on Cell
Blade when it changes.

And then, will you review our latest patches?

Best regards,
Kou Ishizaki
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3 net-2.6.25] Process FIB rule action in the context of the namespace.

2008-01-17 Thread Denis V. Lunev
Save namespace context on the fib rule at the rule creation time and call
routing lookup in the correct namespace.

Signed-off-by: Denis V. Lunev <[EMAIL PROTECTED]>
Acked-by: Daniel Lezcano <[EMAIL PROTECTED]>
---
 include/net/fib_rules.h |1 +
 net/core/fib_rules.c|2 ++
 net/ipv4/fib_rules.c|2 +-
 3 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index 7f9f4ae..34349f9 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -22,6 +22,7 @@ struct fib_rule
u32 target;
struct fib_rule *   ctarget;
struct rcu_head rcu;
+   struct net *fr_net;
 };
 
 struct fib_lookup_arg
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 3cd4f13..42ccaf5 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -29,6 +29,7 @@ int fib_default_rule_add(struct fib_rules_ops *ops,
r->pref = pref;
r->table = table;
r->flags = flags;
+   r->fr_net = ops->fro_net;
 
/* The lock is not required here, the list in unreacheable
 * at the moment this function is called */
@@ -242,6 +243,7 @@ static int fib_nl_newrule(struct sk_buff *skb, struct 
nlmsghdr* nlh, void *arg)
err = -ENOMEM;
goto errout;
}
+   rule->fr_net = net;
 
if (tb[FRA_PRIORITY])
rule->pref = nla_get_u32(tb[FRA_PRIORITY]);
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 3b7affd..d2001f1 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -91,7 +91,7 @@ static int fib4_rule_action(struct fib_rule *rule, struct 
flowi *flp,
goto errout;
}
 
-   if ((tbl = fib_get_table(&init_net, rule->table)) == NULL)
+   if ((tbl = fib_get_table(rule->fr_net, rule->table)) == NULL)
goto errout;
 
err = tbl->tb_lookup(tbl, flp, (struct fib_result *) arg->result);
-- 
1.5.3.rc5

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3 net-2.6.25] Add netns to fib_rules_ops.

2008-01-17 Thread Denis V. Lunev
The backward link from FIB rules operations to the network namespace will
allow to simplify the API a bit.

Signed-off-by: Denis V. Lunev <[EMAIL PROTECTED]>
Acked-by: Daniel Lezcano <[EMAIL PROTECTED]>
---
 include/net/fib_rules.h |1 +
 net/decnet/dn_rules.c   |1 +
 net/ipv4/fib_rules.c|2 ++
 net/ipv6/fib6_rules.c   |1 +
 4 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index 4f47250..6910e01 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -67,6 +67,7 @@ struct fib_rules_ops
const struct nla_policy *policy;
struct list_headrules_list;
struct module   *owner;
+   struct net  *fro_net;
 };
 
 #define FRA_GENERIC_POLICY \
diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c
index c1fae23..964e658 100644
--- a/net/decnet/dn_rules.c
+++ b/net/decnet/dn_rules.c
@@ -249,6 +249,7 @@ static struct fib_rules_ops dn_fib_rules_ops = {
.policy = dn_fib_rule_policy,
.rules_list = LIST_HEAD_INIT(dn_fib_rules_ops.rules_list),
.owner  = THIS_MODULE,
+   .fro_net= &init_net,
 };
 
 void __init dn_fib_rules_init(void)
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 72232ab..8d0ebe7 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -315,6 +315,8 @@ int __net_init fib4_rules_init(struct net *net)
if (ops == NULL)
return -ENOMEM;
INIT_LIST_HEAD(&ops->rules_list);
+   ops->fro_net = net;
+
fib_rules_register(net, ops);
 
err = fib_default_rules_init(ops);
diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c
index 76437a1..ead5ab2 100644
--- a/net/ipv6/fib6_rules.c
+++ b/net/ipv6/fib6_rules.c
@@ -249,6 +249,7 @@ static struct fib_rules_ops fib6_rules_ops = {
.policy = fib6_rule_policy,
.rules_list = LIST_HEAD_INIT(fib6_rules_ops.rules_list),
.owner  = THIS_MODULE,
+   .fro_net= &init_net,
 };
 
 static int __init fib6_default_rules_init(void)
-- 
1.5.3.rc5

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3 net-2.6.25] [NETNS] FIB rules API cleanup.

2008-01-17 Thread Denis V. Lunev
Remove struct net from fib_rules_register(unregister)/notify_change paths
and diet code size a bit.

add/remove: 0/0 grow/shrink: 10/12 up/down: 35/-100 (-65)
function old new   delta
notify_rule_change   273 280  +7
trie_show_stats  471 475  +4
fn_trie_delete   473 477  +4
fib_rules_unregister 144 148  +4
fib4_rule_compare119 123  +4
resize  28422845  +3
fn_trie_select_default   515 518  +3
inet_sk_rebuild_header   836 838  +2
fib_trie_seq_show764 766  +2
__devinet_sysctl_register276 278  +2
fn_trie_lookup  11241123  -1
ip_fib_check_default 133 131  -2
devinet_conf_sysctl  223 221  -2
snmp_fold_field  126 123  -3
fn_trie_insert  20912086  -5
inet_create  876 870  -6
fib4_rules_init  197 191  -6
fib_sync_down452 444  -8
inet_gso_send_check  334 325  -9
fib_create_info 30032991 -12
fib_nl_delrule   568 553 -15
fib_nl_newrule   883 852 -31

Signed-off-by: Denis V. Lunev <[EMAIL PROTECTED]>
Acked-by: Daniel Lezcano <[EMAIL PROTECTED]>
---
 include/net/fib_rules.h |4 ++--
 net/core/fib_rules.c|   20 +---
 net/decnet/dn_rules.c   |4 ++--
 net/ipv4/fib_rules.c|6 +++---
 net/ipv6/fib6_rules.c   |4 ++--
 5 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index 6910e01..7f9f4ae 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -102,8 +102,8 @@ static inline u32 frh_get_table(struct fib_rule_hdr *frh, 
struct nlattr **nla)
return frh->table;
 }
 
-extern int fib_rules_register(struct net *, struct fib_rules_ops *);
-extern void fib_rules_unregister(struct net *, struct fib_rules_ops *);
+extern int fib_rules_register(struct fib_rules_ops *);
+extern void fib_rules_unregister(struct fib_rules_ops *);
 extern void fib_rules_cleanup_ops(struct fib_rules_ops *);
 
 extern int fib_rules_lookup(struct fib_rules_ops *,
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 541728a..3cd4f13 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -37,8 +37,7 @@ int fib_default_rule_add(struct fib_rules_ops *ops,
 }
 EXPORT_SYMBOL(fib_default_rule_add);
 
-static void notify_rule_change(struct net *net, int event,
-  struct fib_rule *rule,
+static void notify_rule_change(int event, struct fib_rule *rule,
   struct fib_rules_ops *ops, struct nlmsghdr *nlh,
   u32 pid);
 
@@ -72,10 +71,13 @@ static void flush_route_cache(struct fib_rules_ops *ops)
ops->flush_cache();
 }
 
-int fib_rules_register(struct net *net, struct fib_rules_ops *ops)
+int fib_rules_register(struct fib_rules_ops *ops)
 {
int err = -EEXIST;
struct fib_rules_ops *o;
+   struct net *net;
+
+   net = ops->fro_net;
 
if (ops->rule_size < sizeof(struct fib_rule))
return -EINVAL;
@@ -112,8 +114,9 @@ void fib_rules_cleanup_ops(struct fib_rules_ops *ops)
 }
 EXPORT_SYMBOL_GPL(fib_rules_cleanup_ops);
 
-void fib_rules_unregister(struct net *net, struct fib_rules_ops *ops)
+void fib_rules_unregister(struct fib_rules_ops *ops)
 {
+   struct net *net = ops->fro_net;
 
spin_lock(&net->rules_mod_lock);
list_del_rcu(&ops->list);
@@ -333,7 +336,7 @@ static int fib_nl_newrule(struct sk_buff *skb, struct 
nlmsghdr* nlh, void *arg)
else
list_add_rcu(&rule->list, &ops->rules_list);
 
-   notify_rule_change(net, RTM_NEWRULE, rule, ops, nlh, 
NETLINK_CB(skb).pid);
+   notify_rule_change(RTM_NEWRULE, rule, ops, nlh, NETLINK_CB(skb).pid);
flush_route_cache(ops);
rules_ops_put(ops);
return 0;
@@ -423,7 +426,7 @@ static int fib_nl_delrule(struct sk_buff *skb, struct 
nlmsghdr* nlh, void *arg)
}
 
synchronize_rcu();
-   notify_rule_change(net, RTM_DELRULE, rule, ops, nlh,
+   notify_rule_change(RTM_DELRULE, rule, ops, nlh,
   NETLINK_CB(skb).pid);
fib_rule_put(rule);
flush_route_cache(ops);
@@ -561,13 +564,15 @@ static int fib_nl_dumprule(struct sk_buff *skb, struct 
netlink

[PATCH 0/3 net-2.6.25] call FIB rule->action in the correct namespace

2008-01-17 Thread Denis V. Lunev
FIB rule->action should operate in the same namespace as fib_lookup.
This is definitely missed right now.

There are two ways to implement this: pass struct net into another rules
API call (2 levels) or place netns into rule struct directly. The second
approach seems better as the code will grow less.

Additionally, the patchset cleanups struct net from
fib_rules_register/unregister to have network namespace context at the
time of default rules creation.

Signed-off-by: Denis V. Lunev <[EMAIL PROTECTED]>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread David Miller
From: Timo_Teräs <[EMAIL PROTECTED]>
Date: Thu, 17 Jan 2008 12:01:17 +0200

> David Miller wrote:
> > This is an inherent aspect of AF_KEY (and what it was
> > derived from, BSD routing sockets).
> 
> Yes, this is the way BSD does it.
>  
> > It has to provide dumps atomically, and if there is no
> > space there is no way to provide those entries which
> > would require more rcvbuf space.
> 
> RFC does not say it has to be atomic.

Every application out there in the universe expects BSD socket
semantics, and therefore atomic dumps.  You cannot "fix" things
without breaking applications.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Broken "Make ip6_frags per namespace" patch

2008-01-17 Thread Alexey Dobriyan
> commit c064c4811b3e87ff8202f5a966ff4eea0bc54575
> Author: Daniel Lezcano <[EMAIL PROTECTED]>
> Date:   Thu Jan 10 02:56:03 2008 -0800
> 
> [NETNS][IPV6]: Make ip6_frags per namespace.
> 
> The ip6_frags is moved to the network namespace structure.  Because
> there can be multiple instances of the network namespaces, and the
> ip6_frags is no longer a global static variable, a helper function has
> been added to facilitate the initialization of the variables.
> 
> Until the ipv6 protocol is not per namespace, the variables are
> accessed relatively from the initial network namespace.

> --- a/include/net/netns/ipv6.h
> +++ b/include/net/netns/ipv6.h

> @@ -11,6 +13,7 @@ struct netns_sysctl_ipv6 {
>  #ifdef CONFIG_SYSCTL
>   struct ctl_table_header *table;
>  #endif
> + struct inet_frags_ctl frags;

> --- a/net/ipv6/reassembly.c
> +++ b/net/ipv6/reassembly.c

> @@ -632,6 +625,11 @@ static struct inet6_protocol frag_protocol =
>   .flags  =   INET6_PROTO_NOPOLICY,
>  };
>  
> +void ipv6_frag_sysctl_init(struct net *net)
> +{
> + ip6_frags.ctl = &net->ipv6.sysctl.frags;
> +}

_This_ can't work. ip6frags is only one and ->ctl pointer is flipped
onto per-netns data. Changelog is also misleading: ip6_frags_ctl is
moved to netns not all ip6_frags.

Oopsing place below -- f->ctl dereference in preparation of mod_timer() call.



BUG: unable to handle kernel paging request at virtual address f5da8fc8
printing eip: c11d868a *pdpt = 3001 *pde = 01728067 *pte = 
35da8000 
Oops:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: ebt_ip ebt_dnat ebt_arpreply ebt_arp ebt_among ebtable_nat 
ip6t_REJECT ip6table_filter ip6_tables ebtable_filter ebtable_broute ebt_802_3 
ebtables des_generic nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state 
nf_conntrack xt_tcpudp ipt_REJECT iptable_filter ip_tables deflate zlib_deflate 
zlib_inflate cryptomgr crypto_hash cpufreq_stats cpufreq_ondemand cdrom cbc 
bridge llc blkcipher crypto_algapi arpt_mangle arptable_filter arp_tables 
x_tables ah6 af_packet ipv6

Pid: 0, comm: swapper Not tainted (2.6.24-rc7-net-2.6.25-nf-sysfs-n #30)
EIP: 0060:[] EFLAGS: 00010246 CPU: 1
EIP is at inet_frag_secret_rebuild+0xaa/0xd0
EAX: f5da8fbc EBX:  ECX: c131 EDX: 0100
ESI: f7cba000 EDI: f898f7a0 EBP: 0040 ESP: c1310f90
 DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
Process swapper (pid: 0, ti=c131 task=f7c9a580 task.ti=f7c9b000)
Stack: f898f7a8 f898f8a8 000ddcbd f898f7a0 f7cba000 c1310fc4 0100 c1026d60 
   0002 0001 c1191183 c4779ddc c11d85e0 f898c860 f898c860 c12c4a88 
   0001 c1308da0 000a c1023477 0001 c130b640 c130b640 f7c9bf34 
Call Trace:
 [] run_timer_softirq+0x120/0x190
 [] net_rx_action+0x53/0x220
 [] inet_frag_secret_rebuild+0x0/0xd0
 [] __do_softirq+0x87/0x100
 [] do_softirq+0xaf/0x110
 [] irq_exit+0x83/0x90
 [] smp_apic_timer_interrupt+0x57/0x90
 [] apic_timer_interrupt+0x29/0x38
 [] apic_timer_interrupt+0x33/0x38
 [] default_idle+0x0/0x60
 [] default_idle+0x40/0x60
 [] cpu_idle+0x73/0xb0
===
Code: 8b 10 85 d2 89 13 74 03 89 5a 04 89 18 89 43 04 85 f6 89 f3 75 bb 45 83 
fd 40 75 a5 8b 44 24 04 e8 4c 3f 01 00 8b 87 50 01 00 00 <8b> 50 0c 01 54 24 08 
8d 87 38 01 00 00 8b 54 24 08 83 c4 0c 5b 
EIP: [] inet_frag_secret_rebuild+0xaa/0xd0 SS:ESP 0068:c1310f90
Kernel panic - not syncing: Fatal exception in interrupt

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Timo Teräs
David Miller wrote:
> From: Timo_Teräs <[EMAIL PROTECTED]>
> Date: Thu, 17 Jan 2008 11:38:13 +0200
> 
>> The af_key issue is that in big dumps you get only first X
>> entries. The rest of the entries are dropped because the
>> socket receive buffer goes full. You get data corruption:
>> missing entries.
> 
> This is an inherent aspect of AF_KEY (and what it was
> derived from, BSD routing sockets).

Yes, this is the way BSD does it.
 
> It has to provide dumps atomically, and if there is no
> space there is no way to provide those entries which
> would require more rcvbuf space.

RFC does not say it has to be atomic.

It does say that the dump is terminated with SADB_DUMP
message having sadb_seq field set to zero. Currently
that is dropped too when the problem occurs. Thus the
socket is left in a bad state: dump ends never. This
can cause applications without any workarounds to hang.

- Timo
 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-17 Thread David Miller
From: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>
Date: Thu, 17 Jan 2008 07:40:07 -0200

> I'll update this machine today to 2.6.24-rc8-git + net-2.6 and try again
> to reproduce.

Thanks for the datapoints and testing.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread David Miller
From: Timo_Teräs <[EMAIL PROTECTED]>
Date: Thu, 17 Jan 2008 11:38:13 +0200

> The af_key issue is that in big dumps you get only first X
> entries. The rest of the entries are dropped because the
> socket receive buffer goes full. You get data corruption:
> missing entries.

This is an inherent aspect of AF_KEY (and what it was
derived from, BSD routing sockets).

It has to provide dumps atomically, and if there is no
space there is no way to provide those entries which
would require more rcvbuf space.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-17 Thread Arnaldo Carvalho de Melo
Em Thu, Jan 17, 2008 at 12:00:02AM -0800, David Miller escreveu:
> From: Frans Pop <[EMAIL PROTECTED]>
> Date: Thu, 17 Jan 2008 08:51:55 +0100
> 
> > On Thursday 17 January 2008, David Miller wrote:
> > > From: "Brandeburg, Jesse" <[EMAIL PROTECTED]>
> > >
> > > > We spent Wednesday trying to reproduce (without the patch) these issues
> > > > without much luck, and have applied the patch cleanly and will continue
> > > > testing it.  Given the simplicity of the changes, and the community
> > > > testing, I'll give my ack and we will continue testing.
> > >
> > > You need a slow CPU, and you need to make sure you do actually
> > > trigger the TX limiting code there.
> > 
> > Hmmm. Is a dual core Pentium D 3.20GHz considered slow these days?
> 
> No of course :-)  I guess it therefore depends upon the load
> as well.

I saw it just once, yesterday:

[EMAIL PROTECTED] ~]# uname -r
2.6.24-rc5
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue <0>
  TDH  <58>
  TDT  <8f>
  next_to_use  <8f>
  next_to_clean<55>
buffer_info[next_to_clean]
  time_stamp   <105e973a9>
  next_to_watch<56>
  jiffies  <105e97992>
  next_to_watch.status <1>
[EMAIL PROTECTED] ~]#

on a lenovo T60W, core2duo machine (2GHz), when using it to stress test
another machine, I was using netperf TCP_STREAM ranging from 1 to 8
streams + a ping -f using various packet sizes.

I'll update this machine today to 2.6.24-rc8-git + net-2.6 and try again
to reproduce.

I also applied David's patch while trying some RT experiments on
another, 8 way machine used as a server, but on this machine I didn't
experience the Tx Unit Hang message with or without the patch.

- Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Timo Teräs
David Miller wrote:
> From: Timo_Teräs <[EMAIL PROTECTED]>
> Date: Thu, 17 Jan 2008 11:20:42 +0200
> 
>> Where as the pfkey bug fix is non-intrusive and helps all
>> legacy applications still using af_key by _fixing a bug in
>> kernel_.
> 
> It's not a bug.  You're fixing a speed issue, not a crash
> or a case where AF_KEY is providing incorrect data.
> 
> That is what I mean when I mean "life support", we fix crashes and
> data corruption.  We don't make performance tweaks.

No. The speed issue is complitely handled in xfrm_state
and xfrm_user changes.

The af_key issue is that in big dumps you get only first X
entries. The rest of the entries are dropped because the
socket receive buffer goes full. You get data corruption:
missing entries.

- Timo

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread David Miller
From: Timo_Teräs <[EMAIL PROTECTED]>
Date: Thu, 17 Jan 2008 11:20:42 +0200

> Where as the pfkey bug fix is non-intrusive and helps all
> legacy applications still using af_key by _fixing a bug in
> kernel_.

It's not a bug.  You're fixing a speed issue, not a crash
or a case where AF_KEY is providing incorrect data.

That is what I mean when I mean "life support", we fix crashes and
data corruption.  We don't make performance tweaks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Timo Teräs
David Miller wrote:
> From: Timo_Teräs <[EMAIL PROTECTED]>
> Date: Thu, 17 Jan 2008 10:11:17 +0200
> 
>> I thought my patch would qualify as "life support" bug fix.
>> Currently racoon fails to work if there are too many SPDs or SAs
>> because the kernel cannot handle the dump request properly. And this
>> is what my patch fixes for pfkey. It adds no new features or
>> functionality; just makes the dumping work with large databases.
> 
> Racoon should use netlink for reasons far and beyond the
> problem you are trying to address.

Yes. But this is fairly major thing to do. One needs to create
API abstraction layer (still need to use pfkey in *BSD). Test it.
A lot of work that is not going to happen very soon.

Where as the pfkey bug fix is non-intrusive and helps all
legacy applications still using af_key by _fixing a bug in
kernel_.

> The dumping behavior of AF_KEY is just horrific, as one of
> several examples.

If af_key is all that bad and does not qualify to get maintanace
bug fixes, why not remove it complitely?

That would make userland adapt faster.

>> Then there's also the xfrm dumping changes which change the
>> algorithm from O(n^2) to O(n) with some memory overhead, but
>> that is a different story. Any comments on that?
> 
> I have no general objections to those changes although I am
> backlogged and thus have not studied them in detail.  Jamal
> is having what appears to be a healthy dialogue with you about
> the details so I'm not concerned much :)

Ok. I hope someone can also give feedback on the naming
conventions. And about the api changes to xfrm policy/state
walking.

- Timo

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread David Miller
From: Timo_Teräs <[EMAIL PROTECTED]>
Date: Thu, 17 Jan 2008 10:11:17 +0200

> I thought my patch would qualify as "life support" bug fix.
> Currently racoon fails to work if there are too many SPDs or SAs
> because the kernel cannot handle the dump request properly. And this
> is what my patch fixes for pfkey. It adds no new features or
> functionality; just makes the dumping work with large databases.

Racoon should use netlink for reasons far and beyond the
problem you are trying to address.

The dumping behavior of AF_KEY is just horrific, as one of
several examples.

> Then there's also the xfrm dumping changes which change the
> algorithm from O(n^2) to O(n) with some memory overhead, but
> that is a different story. Any comments on that?

I have no general objections to those changes although I am
backlogged and thus have not studied them in detail.  Jamal
is having what appears to be a healthy dialogue with you about
the details so I'm not concerned much :)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bugme-new] [Bug 9767] New: missing native u32 classifier for routing policy

2008-01-17 Thread Andrew Morton
On Thu, 17 Jan 2008 00:30:49 -0800 (PST) [EMAIL PROTECTED] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9767
> 
>Summary: missing native u32 classifier for routing policy
>Product: Networking
>Version: 2.5
>  KernelVersion: all since 2.2
>   Platform: All
> OS/Version: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: low
>   Priority: P1
>  Component: IPV4
> AssignedTo: [EMAIL PROTECTED]
> ReportedBy: [EMAIL PROTECTED]
> 
> 
> This is not a bug report, but a feature request.
> routing policy database management is supported since linux 2.2, but it lacks
> u32 selector (matching by IP protocols, transport ports).
> fwmark is a workaround for this missing feature, but source ip address
> selection will not work anyway: the mark value can't be used for source 
> address
> selection because at the time source address selection is performed, there is
> no packet yet and thus no mark value.
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH] Fixing SA/SP dumps on netlink/af_key

2008-01-17 Thread Timo Teräs
David Miller wrote:
> Doing anything other than "life support" bug fixes for AF_KEY is
> inappropriate.

Yes. I thought my patch would qualify as "life support" bug fix.
Currently racoon fails to work if there are too many SPDs or SAs
because the kernel cannot handle the dump request properly. And
this is what my patch fixes for pfkey. It adds no new features or
functionality; just makes the dumping work with large databases.

Then there's also the xfrm dumping changes which change the
algorithm from O(n^2) to O(n) with some memory overhead, but
that is a different story. Any comments on that?

- Timo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-17 Thread David Miller
From: Frans Pop <[EMAIL PROTECTED]>
Date: Thu, 17 Jan 2008 08:51:55 +0100

> On Thursday 17 January 2008, David Miller wrote:
> > From: "Brandeburg, Jesse" <[EMAIL PROTECTED]>
> >
> > > We spent Wednesday trying to reproduce (without the patch) these issues
> > > without much luck, and have applied the patch cleanly and will continue
> > > testing it.  Given the simplicity of the changes, and the community
> > > testing, I'll give my ack and we will continue testing.
> >
> > You need a slow CPU, and you need to make sure you do actually
> > trigger the TX limiting code there.
> 
> Hmmm. Is a dual core Pentium D 3.20GHz considered slow these days?

No of course :-)  I guess it therefore depends upon the load
as well.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html