[PATCH] IPv6 - Add missing initializations of the new nl_info.nl_net field

2008-02-25 Thread Benjamin Thery
Here is an updated version of the patch without the initializations to 
zero.


Add some more missing initializations of the new nl_info.nl_net field in 
IPv6 stack. This field will be used when network namespaces are fully 
supported.

Signed-off-by: Benjamin Thery [EMAIL PROTECTED]
---
 net/ipv6/addrconf.c |    3 +++
 net/ipv6/route.c    |    2 ++
 2 files changed, 5 insertions(+)

Index: net-2.6.26/net/ipv6/addrconf.c
===================================================================
--- net-2.6.26.orig/net/ipv6/addrconf.c
+++ net-2.6.26/net/ipv6/addrconf.c
@@ -1557,6 +1557,7 @@ addrconf_prefix_route(struct in6_addr *p
 		.fc_expires = expires,
 		.fc_dst_len = plen,
 		.fc_flags = RTF_UP | flags,
+		.fc_nlinfo.nl_net = &init_net,
 	};
 
 	ipv6_addr_copy(&cfg.fc_dst, pfx);
@@ -1583,6 +1584,7 @@ static void addrconf_add_mroute(struct n
 		.fc_ifindex = dev->ifindex,
 		.fc_dst_len = 8,
 		.fc_flags = RTF_UP,
+		.fc_nlinfo.nl_net = &init_net,
 	};
 
 	ipv6_addr_set(&cfg.fc_dst, htonl(0xFF000000), 0, 0, 0);
@@ -1599,6 +1601,7 @@ static void sit_route_add(struct net_dev
 		.fc_ifindex = dev->ifindex,
 		.fc_dst_len = 96,
 		.fc_flags = RTF_UP | RTF_NONEXTHOP,
+		.fc_nlinfo.nl_net = &init_net,
 	};
 
 	/* prefix length - 96 bits "::d.d.d.d" */
Index: net-2.6.26/net/ipv6/route.c
===================================================================
--- net-2.6.26.orig/net/ipv6/route.c
+++ net-2.6.26/net/ipv6/route.c
@@ -1719,6 +1719,8 @@ static void rtmsg_to_fib6_config(struct 
 	cfg->fc_src_len = rtmsg->rtmsg_src_len;
 	cfg->fc_flags = rtmsg->rtmsg_flags;
 
+	cfg->fc_nlinfo.nl_net = &init_net;
+
 	ipv6_addr_copy(&cfg->fc_dst, &rtmsg->rtmsg_dst);
 	ipv6_addr_copy(&cfg->fc_src, &rtmsg->rtmsg_src);
 	ipv6_addr_copy(&cfg->fc_gateway, &rtmsg->rtmsg_gateway);

-- 
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
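
A note on why the updated patch above can drop the explicit zero initializations: with C99 designated initializers, every member not named in the initializer list is zero-initialized. The following is only a userspace sketch; the struct layouts and the make_cfg() helper are simplified stand-ins invented for illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types named in the patch. */
struct net { int id; };
struct nlmsghdr;			/* opaque in this sketch */

struct nl_info {
	struct nlmsghdr *nlh;
	struct net *nl_net;
	unsigned int pid;
};

struct fib6_config {
	int fc_ifindex;
	unsigned int fc_flags;
	struct nl_info fc_nlinfo;
};

static struct net init_net = { .id = 0 };

/*
 * Build a config the way the patched functions do: name only the members
 * that need a non-zero value. Members left out of the initializer list,
 * including the rest of fc_nlinfo, are zero-initialized by the compiler.
 */
static struct fib6_config make_cfg(int ifindex)
{
	struct fib6_config cfg = {
		.fc_ifindex = ifindex,
		.fc_nlinfo.nl_net = &init_net,
	};

	assert(cfg.fc_nlinfo.pid == 0);
	assert(cfg.fc_nlinfo.nlh == NULL);
	return cfg;
}
```

Under that assumption, ".fc_nlinfo.pid = 0" and ".fc_nlinfo.nlh = NULL" add nothing when the struct is built with an initializer list. The assignment in rtmsg_to_fib6_config() is a different case, since cfg is filled in field by field there.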


[PATCH] IPv6 Add more initializations of the new nl_info.nl_net field

2008-02-21 Thread benjamin . thery
Add more missing initializations of the new nl_info.nl_net field in 
IPv6 stack. 
This field will be used when network namespaces are fully supported.

Signed-off-by: Benjamin Thery [EMAIL PROTECTED]
---
 net/ipv6/addrconf.c |    9 +++++++++
 net/ipv6/route.c    |    6 ++++++
 2 files changed, 15 insertions(+)

Index: net-2.6.26/net/ipv6/addrconf.c
===================================================================
--- net-2.6.26.orig/net/ipv6/addrconf.c
+++ net-2.6.26/net/ipv6/addrconf.c
@@ -1557,6 +1557,9 @@ addrconf_prefix_route(struct in6_addr *p
 		.fc_expires = expires,
 		.fc_dst_len = plen,
 		.fc_flags = RTF_UP | flags,
+		.fc_nlinfo.pid = 0,
+		.fc_nlinfo.nlh = NULL,
+		.fc_nlinfo.nl_net = &init_net,
 	};
 
 	ipv6_addr_copy(&cfg.fc_dst, pfx);
@@ -1583,6 +1586,9 @@ static void addrconf_add_mroute(struct n
 		.fc_ifindex = dev->ifindex,
 		.fc_dst_len = 8,
 		.fc_flags = RTF_UP,
+		.fc_nlinfo.pid = 0,
+		.fc_nlinfo.nlh = NULL,
+		.fc_nlinfo.nl_net = &init_net,
 	};
 
 	ipv6_addr_set(&cfg.fc_dst, htonl(0xFF000000), 0, 0, 0);
@@ -1599,6 +1605,9 @@ static void sit_route_add(struct net_dev
 		.fc_ifindex = dev->ifindex,
 		.fc_dst_len = 96,
 		.fc_flags = RTF_UP | RTF_NONEXTHOP,
+		.fc_nlinfo.pid = 0,
+		.fc_nlinfo.nlh = NULL,
+		.fc_nlinfo.nl_net = &init_net,
 	};
 
 	/* prefix length - 96 bits "::d.d.d.d" */
Index: net-2.6.26/net/ipv6/route.c
===================================================================
--- net-2.6.26.orig/net/ipv6/route.c
+++ net-2.6.26/net/ipv6/route.c
@@ -604,6 +604,8 @@ static int __ip6_ins_rt(struct rt6_info 
 int ip6_ins_rt(struct rt6_info *rt)
 {
 	struct nl_info info = {
+		.pid = 0,
+		.nlh = NULL,
 		.nl_net = &init_net,
 	};
 	return __ip6_ins_rt(rt, &info);
@@ -1264,6 +1266,8 @@ static int __ip6_del_rt(struct rt6_info 
 int ip6_del_rt(struct rt6_info *rt)
 {
 	struct nl_info info = {
+		.pid = 0,
+		.nlh = NULL,
 		.nl_net = &init_net,
 	};
 	return __ip6_del_rt(rt, &info);
@@ -1719,6 +1723,8 @@ static void rtmsg_to_fib6_config(struct 
 	cfg->fc_src_len = rtmsg->rtmsg_src_len;
 	cfg->fc_flags = rtmsg->rtmsg_flags;
 
+	cfg->fc_nlinfo.nl_net = &init_net;
+
 	ipv6_addr_copy(&cfg->fc_dst, &rtmsg->rtmsg_dst);
 	ipv6_addr_copy(&cfg->fc_src, &rtmsg->rtmsg_src);
 	ipv6_addr_copy(&cfg->fc_gateway, &rtmsg->rtmsg_gateway);

-- 


Re: [2.6.25-rc2] System freezes ca. 1 minute after logging into KDE

2008-02-17 Thread Benjamin Thery
On Feb 17, 2008 11:39 AM, Frans Pop [EMAIL PROTECTED] wrote:
 (resend a third time because previous attempts never reached the lists
 due to a bug in my MUA; my apologies to David for spamming his inbox)

 Linus Torvalds wrote:
  But hey, you can try to prove me wrong.  I dare you.
 Me too, me too!

 Weird issue this.
 About a minute after logging into KDE the system freezes, but only
 partially. The keyboard is completely dead in all cases (no console
 switching, no SysRq), but some tasks stay running. One time music continued
 playing, other times it stopped. One time the desktop clock continued
 ticking, other times it stopped. One time I could close a window using the
 mouse, but other windows were frozen.
 It's not just KDE that's frozen; one time I switched to VT1 before the
 freeze happened, but that became unusable too.
 Zilch in the logs.

 I've bisected it down to:
 commit 69cc64d8d92bf852f933e90c888dfff083bd4fc9
 Author: David S. Miller [EMAIL PROTECTED]
 [NDISC]: Fix race in generic address resolution

 Confirmed that this is really the culprit by reverting this commit on top
 of -rc2, which is now running fine.

 I'm using IPv6 (local network only) together with IPv4, use a bridge (br0)
 and have an NFS4 mount active.

I encountered the same issue last Thursday. Here, I can hang my machine
with ping6. I also bisected it down to the same commit.

I've sent some kernel traces which show how the soft lockup occurs. See
thread: [PATCH][RFC] race in generic address resolution
http://www.spinics.net/lists/netdev/msg55375.html

Benjamin



 Cheers,
 FJP




Re: [BUG] IPv6 recursive locking

2008-02-17 Thread Benjamin Thery
On Feb 17, 2008 7:30 PM, Daniel Lezcano [EMAIL PROTECTED] wrote:
 Kristof Provost wrote:
  -----BEGIN PGP SIGNED MESSAGE-----
  Hash: SHA1
 
  Hi,
 
  I'm running the current git (1309d4e68497184d2fd87e892ddf14076c2bda98)
  without problems. While I was toying with IPv6 on my local network I managed
  to completely hang my machine whenever it receives or sends a neighbour
  solicitation. At least, I think that's the cause. It started as soon as I
  installed radvd on the router. The included trace seems to point in the same
  direction.
 
  The machine is a Dell Latitude D505 (so x86). Network interfaces are e100 and
  ipw2200 (firmware not loaded). I'm currently using the e100.
 
  I'll try to bisect it but here's the trace already. Let me know if
  there's anything else you'd like to know.

 I think this bug was introduced by the commit:

 69cc64d8d92bf852f933e90c888dfff083bd4fc9
 [NDISC]: Fix race in generic address resolution.

I confirm this commit is the culprit.
I reported the same bug last Thursday, but it seems I made a mistake: I reported
it as a reply to the original thread that led to this commit. As that thread
was a bit old, it seems my answer went unnoticed.
See http://www.spinics.net/lists/netdev/msg55373.html

The lockup happens very quickly when you have IPv6 configured.
I think we should revert this commit for now.

Benjamin




  [  124.439831] =============================================
  [  124.443689] [ INFO: possible recursive locking detected ]
  [  124.443689] 2.6.25-rc2 #33
  [  124.443689] ---------------------------------------------
  [  124.443689] swapper/0 is trying to acquire lock:
  [  124.443689]  (&n->lock){-+-+}, at: [c0468d39] neigh_resolve_output+0x139/0x290
  [  124.443689]
  [  124.443689] but task is already holding lock:
  [  124.443689]  (&n->lock){-+-+}, at: [c0468ea4] neigh_timer_handler+0x14/0x280
  [  124.443689]
  [  124.443689] other info that might help us debug this:
  [  124.443689] 1 lock held by swapper/0:
  [  124.443689]  #0:  (&n->lock){-+-+}, at: [c0468ea4] neigh_timer_handler+0x14/0x280
  [  124.443689]
  [  124.443689] stack backtrace:
  [  124.443689] Pid: 0, comm: swapper Not tainted 2.6.25-rc2 #33
  [  124.443689]  [c014863a] __lock_acquire+0xd3a/0xf40
  [  124.443689]  [c0137ec8] __kernel_text_address+0x18/0x30
  [  124.443689]  [c01488a0] lock_acquire+0x60/0x80
  [  124.443689]  [c0468d39] neigh_resolve_output+0x139/0x290
  [  124.443689]  [c059287e] _write_lock_bh+0x2e/0x40
  [  124.443689]  [c0468d39] neigh_resolve_output+0x139/0x290
  [  124.443689]  [c0468d39] neigh_resolve_output+0x139/0x290
  [  124.443689]  [c0148805] __lock_acquire+0xf05/0xf40
  [  124.443689]  [c04e1650] ndisc_dst_alloc+0xe0/0x170
  [  124.443689]  [c04d39f4] ip6_output_finish+0xa4/0x110
  [  124.443689]  [c0147a1d] __lock_acquire+0x11d/0xf40
  [  124.443689]  [c04d4759] ip6_output+0x5b9/0xba0
  [  124.443689]  [c0456eb6] sock_alloc_send_skb+0x176/0x1d0
  [  124.443689]  [c04e4eab] __ndisc_send+0x33b/0x540
  [  124.443690]  [c04e4d6e] __ndisc_send+0x1fe/0x540
  [  124.443690]  [c04e5b69] ndisc_send_ns+0x69/0xa0
  [  124.443690]  [c04e6c8e] ndisc_solicit+0xee/0x1b0
  [  124.443690]  [c01472b5] mark_held_locks+0x35/0x80
  [  124.443690]  [c0592c65] _spin_unlock_irqrestore+0x45/0x60
  [  124.443690]  [c01473f9] trace_hardirqs_on+0x79/0x130
  [  124.443690]  [c012f99f] __mod_timer+0x9f/0xb0
  [  124.443690]  [c0468fd3] neigh_timer_handler+0x143/0x280
  [  124.443690]  [c012f2ca] run_timer_softirq+0x14a/0x1c0
  [  124.443690]  [c0468e90] neigh_timer_handler+0x0/0x280
  [  124.443690]  [c0468e90] neigh_timer_handler+0x0/0x280
  [  124.443690]  [c012b4c4] __do_softirq+0x84/0x100
  [  124.443690]  [c012b595] do_softirq+0x55/0x60
  [  124.443690]  [c012b9e5] irq_exit+0x65/0x80
  [  124.443690]  [c01073b0] do_IRQ+0x40/0x70
  [  124.443690]  [c010585e] common_interrupt+0x2e/0x34
  [  124.443690]  [c032007b] acpi_power_on+0x3b/0x104
  [  124.443690]  [c0322af6] acpi_idle_enter_simple+0x194/0x1fe
  [  124.443690]  [c0322727] acpi_idle_enter_bm+0xc1/0x2fc
  [  124.443690]  [c03fff43] cpuidle_idle_call+0x63/0xb0
  [  124.443690]  [c03ffee0] cpuidle_idle_call+0x0/0xb0
  [  124.443690]  [c010380d] cpu_idle+0x5d/0xf0
  [  124.443690]  ===
 
  Kristof
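
The lockdep report above shows neigh_timer_handler() holding the neighbour lock while the solicitation it sends ends up in neigh_resolve_output(), which wants the same lock again. A toy model of that call chain is sketched below. Everything prefixed toy_ is invented for illustration, and dropping the lock before sending is just one conventional way out, not necessarily what the kernel ended up doing.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: the holder can never acquire it a second time. */
struct toy_lock { bool held; };

static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;	/* a real write_lock_bh() would spin here forever */
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	l->held = false;
}

struct toy_neigh {
	struct toy_lock lock;
	int self_deadlocks;	/* times the output path found the lock already held */
};

/* Stand-in for the output path (neigh_resolve_output in the trace). */
static void toy_resolve_output(struct toy_neigh *n)
{
	if (!toy_trylock(&n->lock)) {
		n->self_deadlocks++;
		return;
	}
	toy_unlock(&n->lock);
}

/*
 * Stand-in for the timer handler. It sends a solicitation either while
 * still holding the lock (the failing pattern) or after dropping it.
 */
static void toy_timer_handler(struct toy_neigh *n, bool drop_lock_before_send)
{
	bool ok = toy_trylock(&n->lock);

	assert(ok);
	if (drop_lock_before_send)
		toy_unlock(&n->lock);
	toy_resolve_output(n);
	if (!drop_lock_before_send)
		toy_unlock(&n->lock);
}
```

Calling toy_timer_handler(&n, false) records one self-deadlock; calling it with true records none. On a real kernel the first case does not return at all, which matches the hang described in this thread.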
 


Re: [PATCH][RFC] race in generic address resolution

2008-02-14 Thread Benjamin Thery
Hi,

It seems this patch hangs my machine very quickly when there is some
ICMPv6 traffic.

I'm using net-2.6, pulled today (14th Feb).

I had some unexpected hangs on my SMP test machines and bisected the
problem to 69cc64d8d92bf852f933e90c888dfff083bd4fc9
[NDISC]: Fix race in generic address resolution.

Looks like a deadlock:
BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]

Here are some traces printed on the console:

Pid: 0, comm: swapper Not tainted (2.6.25-rc1-netns-00113-g69cc64d-dirty #34)
EIP: 0060:[c02eb5f6] EFLAGS: 0287 CPU: 0
EIP is at __write_lock_failed+0xa/0x20
EAX: c7b3fab4 EBX: c7b3fab4 ECX:  EDX: c0377986
ESI: c7b3fa90 EDI: c7b6f290 EBP: c03cbd24 ESP: c03cbd24
 DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
CR0: 8005003b CR2: b7f9b404 CR3: 07ac8000 CR4: 0690
DR0:  DR1:  DR2:  DR3: 
DR6:  DR7: 
 [c020e43f] _raw_write_lock+0x57/0x6c
 [c02eba95] _write_lock_bh+0x25/0x2d
 [c026b107] ? neigh_resolve_output+0x93/0x238
 [c026b107] neigh_resolve_output+0x93/0x238
 [c02a5635] ip6_output2+0x241/0x289
 [c02a61cd] ip6_output+0xa92/0xaad
 [c025ff11] ? __alloc_skb+0x4f/0xfb
 [c02b2596] ? __ndisc_send+0x1fb/0x3f5
 [c02b26a0] __ndisc_send+0x305/0x3f5
 [c02b2fb5] ndisc_send_ns+0x63/0x6e
 [c02b3f3e] ndisc_solicit+0x183/0x18d
 [c0121071] ? __mod_timer+0x96/0xa1
 [c026b81e] neigh_timer_handler+0x214/0x252
 [c0120c90] run_timer_softirq+0xfe/0x159
 [c026b60a] ? neigh_timer_handler+0x0/0x252
 [c011dbfa] __do_softirq+0x6f/0xe9
 [c011dcae] do_softirq+0x3a/0x52
 [c011dfc3] irq_exit+0x44/0x46
 [c0105273] do_IRQ+0x5a/0x73
 [c0103666] common_interrupt+0x2e/0x34
 [c0101954] ? default_idle+0x4a/0x77
 [c010190a] ? default_idle+0x0/0x77
 [c0101855] cpu_idle+0x89/0x9d
 [c02e6135] rest_init+0x49/0x4b
 ===
BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.25-rc1-netns-00113-g69cc64d-dirty #34)
EIP: 0060:[c02eb5f6] EFLAGS: 0287 CPU: 1
EIP is at __write_lock_failed+0xa/0x20
EAX: c7b3fab4 EBX: c7b3fab4 ECX:  EDX: 
ESI: c03bb9c0 EDI: c7b3fab4 EBP: c7841eb0 ESP: c7841eb0
 DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
CR0: 8005003b CR2: 08560008 CR3: 07b04000 CR4: 0690
DR0:  DR1:  DR2:  DR3: 
DR6:  DR7: 
 [c020e43f] _raw_write_lock+0x57/0x6c
 [c02eba68] _write_lock+0x20/0x28
 [c026982c] ? neigh_periodic_timer+0x99/0x142
 [c026982c] neigh_periodic_timer+0x99/0x142
 [c0120c90] run_timer_softirq+0xfe/0x159
 [c0269793] ? neigh_periodic_timer+0x0/0x142
 [c011dbfa] __do_softirq+0x6f/0xe9
 [c011dcae] do_softirq+0x3a/0x52
 [c011dfc3] irq_exit+0x44/0x46
 [c010d680] smp_apic_timer_interrupt+0x71/0x81
 [c0103747] apic_timer_interrupt+0x33/0x38
 [c0101954] ? default_idle+0x4a/0x77
 [c010190a] ? default_idle+0x0/0x77
 [c0101855] cpu_idle+0x89/0x9d
 ===
BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.25-rc1-netns-00113-g69cc64d-dirty #34)
EIP: 0060:[c02eb5f6] EFLAGS: 0287 CPU: 0
EIP is at __write_lock_failed+0xa/0x20
EAX: c7b3fab4 EBX: c7b3fab4 ECX:  EDX: c0377986
ESI: c7b3fa90 EDI: c7b6f290 EBP: c03cbd24 ESP: c03cbd24
 DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
CR0: 8005003b CR2: b7f9b404 CR3: 07ac8000 CR4: 0690
DR0:  DR1:  DR2:  DR3: 
DR6:  DR7: 
 [c020e43f] _raw_write_lock+0x57/0x6c
 [c02eba95] _write_lock_bh+0x25/0x2d
 [c026b107] ? neigh_resolve_output+0x93/0x238
 [c026b107] neigh_resolve_output+0x93/0x238
 [c02a5635] ip6_output2+0x241/0x289
 [c02a61cd] ip6_output+0xa92/0xaad
 [c025ff11] ? __alloc_skb+0x4f/0xfb
 [c02b2596] ? __ndisc_send+0x1fb/0x3f5
 [c02b26a0] __ndisc_send+0x305/0x3f5
 [c02b2fb5] ndisc_send_ns+0x63/0x6e
 [c02b3f3e] ndisc_solicit+0x183/0x18d
 [c0121071] ? __mod_timer+0x96/0xa1
 [c026b81e] neigh_timer_handler+0x214/0x252
 [c0120c90] run_timer_softirq+0xfe/0x159
 [c026b60a] ? neigh_timer_handler+0x0/0x252
 [c011dbfa] __do_softirq+0x6f/0xe9
 [c011dcae] do_softirq+0x3a/0x52
 [c011dfc3] irq_exit+0x44/0x46
 [c0105273] do_IRQ+0x5a/0x73
 [c0103666] common_interrupt+0x2e/0x34
 [c0101954] ? default_idle+0x4a/0x77
 [c010190a] ? default_idle+0x0/0x77
 [c0101855] cpu_idle+0x89/0x9d
 [c02e6135] rest_init+0x49/0x4b
 ===
BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]
 ...


Benjamin

On Tue, Feb 12, 2008 at 6:47 AM, David Miller [EMAIL PROTECTED] wrote:
 From: Frank Blaschka [EMAIL PROTECTED]
  Date: Mon, 11 Feb 2008 10:01:20 +0100


   we run your patch during the weekend on single CPU and SMP
   machines. We do not see any problems. Thanks for providing the fix.

  Thanks for testing Frank, I can now push this fix upstream.



Re: [PATCH][RFC] race in generic address resolution

2008-02-14 Thread Benjamin Thery
I ran some additional tests and these traces may also be useful.
They appear before the soft lockups are detected.

fermi:~# ping6  -c 500 -f 2007::1
PING 2007::1(2007::1) 56 data bytes
.
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.25-rc1-00113-g69cc64d-dirty #34
-------------------------------------------------------
ping6/1058 is trying to acquire lock:
 (&tbl->lock){-+-+}, at: [c02691ac] neigh_lookup+0x43/0xa2

but task is already holding lock:
 (&n->lock){-+..}, at: [c026b620] neigh_timer_handler+0x16/0x252

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&n->lock){-+..}:
   [c01330b8] __lock_acquire+0x947/0xafc
   [c026982c] neigh_periodic_timer+0x99/0x142
   [c01332d0] lock_acquire+0x63/0x80
   [c026982c] neigh_periodic_timer+0x99/0x142
   [c02eba61] _write_lock+0x19/0x28
   [c026982c] neigh_periodic_timer+0x99/0x142
   [c026982c] neigh_periodic_timer+0x99/0x142
   [c0120c90] run_timer_softirq+0xfe/0x159
   [c0269793] neigh_periodic_timer+0x0/0x142
   [c011dbfa] __do_softirq+0x6f/0xe9
   [c011dcae] do_softirq+0x3a/0x52
   [c011dfc3] irq_exit+0x44/0x46
   [c010d680] smp_apic_timer_interrupt+0x71/0x81
   [c0103747] apic_timer_interrupt+0x33/0x38
   [c014e0ce] mmap_region+0xe1/0x376
   [c014e680] arch_get_unmapped_area_topdown+0x0/0x12e
   [c014e625] do_mmap_pgoff+0x1e2/0x23d
   [c0181895] elf_map+0xd8/0x104
   [c0182072] load_elf_binary+0x5b4/0x11cd
   [c015ed73] search_binary_handler+0x74/0x164
   [c0181abe] load_elf_binary+0x0/0x11cd
   [c015ed7a] search_binary_handler+0x7b/0x164
   [c015ff2e] do_execve+0x121/0x16a
   [c01012e3] sys_execve+0x29/0x52
   [c0102c56] syscall_call+0x7/0xb
   [] 0x

-> #0 (&tbl->lock){-+-+}:
   [c0132fdf] __lock_acquire+0x86e/0xafc
   [c01332d0] lock_acquire+0x63/0x80
   [c02691ac] neigh_lookup+0x43/0xa2
   [c02ebae9] _read_lock_bh+0x1e/0x2d
   [c02691ac] neigh_lookup+0x43/0xa2
   [c02691ac] neigh_lookup+0x43/0xa2
   [c02af858] ndisc_dst_alloc+0xb5/0x155
   [c02b240d] __ndisc_send+0x72/0x3f5
   [c02a573b] ip6_output+0x0/0xaad
   [c0133225] __lock_acquire+0xab4/0xafc
   [c02b2fb5] ndisc_send_ns+0x63/0x6e
   [c02eb92c] _read_unlock_bh+0x25/0x28
   [c02b3f3e] ndisc_solicit+0x183/0x18d
   [c0121071] __mod_timer+0x96/0xa1
   [c026b81e] neigh_timer_handler+0x214/0x252
   [c0120c90] run_timer_softirq+0xfe/0x159
   [c026b60a] neigh_timer_handler+0x0/0x252
   [c011dbfa] __do_softirq+0x6f/0xe9
   [c011dcae] do_softirq+0x3a/0x52
   [c011dfc3] irq_exit+0x44/0x46
   [c0105273] do_IRQ+0x5a/0x73
   [c0103666] common_interrupt+0x2e/0x34
   [c02ebd3a] _spin_unlock_irqrestore+0x38/0x3c
   [c02188fb] tty_ldisc_deref+0x5c/0x63
   [c021a5bd] tty_write+0x1a8/0x1b9
   [c021c5e1] write_chan+0x0/0x2a9
   [c021a633] redirected_tty_write+0x65/0x72
   [c021a5ce] redirected_tty_write+0x0/0x72
   [c015be18] vfs_write+0x8c/0x108
   [c015c3a2] sys_write+0x3b/0x60
   [c0102c56] syscall_call+0x7/0xb
   [] 0x

other info that might help us debug this:

1 lock held by ping6/1058:
 #0:  (&n->lock){-+..}, at: [c026b620] neigh_timer_handler+0x16/0x252

stack backtrace:
Pid: 1058, comm: ping6 Not tainted 2.6.25-rc1-netns-00113-g69cc64d-dirty #34
 [c013176b] print_circular_bug_tail+0x5b/0x66
 [c0132fdf] __lock_acquire+0x86e/0xafc
 [c01332d0] lock_acquire+0x63/0x80
 [c02691ac] ? neigh_lookup+0x43/0xa2
 [c02ebae9] _read_lock_bh+0x1e/0x2d
 [c02691ac] ? neigh_lookup+0x43/0xa2
 [c02691ac] neigh_lookup+0x43/0xa2
 [c02af858] ndisc_dst_alloc+0xb5/0x155
 [c02b240d] __ndisc_send+0x72/0x3f5
 [c02a573b] ? ip6_output+0x0/0xaad
 [c0133225] ? __lock_acquire+0xab4/0xafc
 [c02b2fb5] ndisc_send_ns+0x63/0x6e
 [c02eb92c] ? _read_unlock_bh+0x25/0x28
 [c02b3f3e] ndisc_solicit+0x183/0x18d
 [c0121071] ? __mod_timer+0x96/0xa1
 [c026b81e] neigh_timer_handler+0x214/0x252
 [c0120c90] run_timer_softirq+0xfe/0x159
 [c026b60a] ? neigh_timer_handler+0x0/0x252
 [c011dbfa] __do_softirq+0x6f/0xe9
 [c011dcae] do_softirq+0x3a/0x52
 [c011dfc3] irq_exit+0x44/0x46
 [c0105273] do_IRQ+0x5a/0x73
 [c0103666] common_interrupt+0x2e/0x34
 [c02ebd3a] ? _spin_unlock_irqrestore+0x38/0x3c
 [c02188fb] tty_ldisc_deref+0x5c/0x63
 [c021a5bd] tty_write+0x1a8/0x1b9
 [c021c5e1] ? write_chan+0x0/0x2a9
 [c021a633] redirected_tty_write+0x65/0x72
 [c021a5ce] ? redirected_tty_write+0x0/0x72
 [c015be18] vfs_write+0x8c/0x108
 [c015c3a2] sys_write+0x3b/0x60
 [c0102c56] syscall_call+0x7/0xb
 ===
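
The report above says n->lock already depends on tbl->lock, while this path asks for tbl->lock with n->lock held. A minimal sketch of how a lock-order checker can flag such an inversion follows. It is a toy dependency graph written for illustration, with invented names (lock_graph, record_acquire), and it does not claim to mirror lockdep's implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CLASSES 8

/* dep[a][b] is true once lock class b has been acquired while a was held. */
struct lock_graph {
	bool dep[MAX_CLASSES][MAX_CLASSES];
};

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reaches(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (g->dep[from][next] && !seen[next] &&
		    reaches(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquired 'taken' while holding 'held'". Returns false, without
 * recording the edge, if that ordering would close a cycle.
 */
static bool record_acquire(struct lock_graph *g, int held, int taken)
{
	bool seen[MAX_CLASSES] = { false };

	if (reaches(g, taken, held, seen))
		return false;
	g->dep[held][taken] = true;
	return true;
}

/* Replays the two orderings: class 0 stands for tbl->lock, class 1 for n->lock. */
static void lock_order_demo(void)
{
	struct lock_graph g = { 0 };

	assert(record_acquire(&g, 0, 1));	/* n->lock taken under tbl->lock */
	assert(!record_acquire(&g, 1, 0));	/* tbl->lock wanted under n->lock */
}
```

The second record_acquire() call is the one that corresponds to the neigh_lookup() call made from the timer handler in the trace.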


On Thu, Feb 14, 2008 at 5:56 PM, Benjamin Thery [EMAIL PROTECTED] wrote:
 Hi,

  It seems this patch hangs my  machine very quickly when there are some
  ICMPv6 traffic.

  I'm using net-2.6, pulled today (14th Feb).

  I had some unexpected hangs on my SMP test machines and I bisected

[PATCH 1/1][NETNS] Add missing initialization of nl_info.nl_net in rtm_to_fib6_config()

2008-01-24 Thread Benjamin Thery
Add missing initialization of the new nl_info.nl_net field in
rtm_to_fib6_config(). This will be needed to store the network namespace
associated with the fib6_config struct.

Signed-off-by: Benjamin Thery [EMAIL PROTECTED]
---
 net/ipv6/route.c |    1 +
 1 file changed, 1 insertion(+)

Index: net-2.6.25/net/ipv6/route.c
===================================================================
--- net-2.6.25.orig/net/ipv6/route.c
+++ net-2.6.25/net/ipv6/route.c
@@ -1955,6 +1955,7 @@ static int rtm_to_fib6_config(struct sk_
 
 	cfg->fc_nlinfo.pid = NETLINK_CB(skb).pid;
 	cfg->fc_nlinfo.nlh = nlh;
+	cfg->fc_nlinfo.nl_net = skb->sk->sk_net;
 
 	if (tb[RTA_GATEWAY]) {
 		nla_memcpy(&cfg->fc_gateway, tb[RTA_GATEWAY], 16);

-- 


Re: [PROCFS] [NETNS] issue with /proc/net entries

2008-01-11 Thread Benjamin Thery

Eric W. Biederman wrote:
> Benjamin Thery [EMAIL PROTECTED] writes:
>
>> Hi Eric,
>>
>> While testing the current network namespace stuff merged in net-2.6.25,
>> I bumped into the following problem with the /proc/net/ entries.
>> It doesn't always display the actual data of the current namespace,
>> but sometimes displays data from other namespaces.
>>
>> I bisected the problem to the commit:
>> proc: remove/Fix proc generic d_revalidate
>> 3790ee4bd86396558eedd86faac1052cb782e4e1
>>
>> The problem: If a process in a particular network namespace changes
>> current directory to /proc/net, then processes in other network
>> namespaces trying to look at /proc/net entries will see data from the
>> first namespace (the one with CWD /proc/net). (See test case below).
>>
>> As your comments in the commit suggest, you seem to be aware of some
>> issues when CONFIG_NET_NS=y. Is it one of these corner cases you
>> identified? Any idea on how we can fix it?
>
> Yes.  It isn't especially hard.  I have most of it in my queue;
> I just need to get the silly patches out of there.
>
> Essentially we need to fix the caching of proc_generic entries,
> so that we can have a proper d_revalidate implementation.
>
> To get d_revalidate and the caching correct for /proc/net will take
> just a bit more work.  We need to make /proc/net a symlink
> to something like /proc/self/net so that we don't get excess
> revalidates when switching between different processes.
>
> Or else we can't properly implement the case you have described,
> where being in the directory causes the wrong version of /proc/net
> to show up.  Changing the contents of the dentry for /proc/net
> should only happen during unshare, not when we switch between
> processes, or else we get into the "d_revalidate leaks mount points"
> problem again.
>
> We also need the check to see if something is mounted on top of
> us before we call drop the dentry.  But if we don't even try until
> we know the dentry is invalid it should not be too bad.

Thanks for all the details.
I'll put this issue on my netns current limitations list until
it's solved.

Benjamin

> Eric




--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


[PROCFS] [NETNS] issue with /proc/net entries

2008-01-10 Thread Benjamin Thery

Hi Eric,

While testing the current network namespace stuff merged in net-2.6.25,
I bumped into the following problem with the /proc/net/ entries.
It doesn't always display the actual data of the current namespace,
but sometimes displays data from other namespaces.

I bisected the problem to the commit:
proc: remove/Fix proc generic d_revalidate
3790ee4bd86396558eedd86faac1052cb782e4e1

The problem: If a process in a particular network namespace changes
current directory to /proc/net, then processes in other network
namespaces trying to look at /proc/net entries will see data from the
first namespace (the one with CWD /proc/net). (See test case below).

As your comments in the commit suggest, you seem to be aware of some
issues when CONFIG_NET_NS=y. Is it one of these corner cases you
identified? Any idea on how we can fix it?

Thanks.

Benjamin


Test case:
----------
(1) Shell 1, in init namespace:
$ cat /proc/net/dev
lo ...
eth0 ...

(2) Shell 2, in another network namespace
$ cat /proc/net/dev
lo ...

(3) Shell 1
$ cd /proc/net
$ cat dev
lo ...
eth0 ...

(4) Shell 2
$ cat /proc/net/dev
lo ...
eth0 ...

Argh, lo + eth0 in the child namespace: the device list of the init netns
is displayed in /proc/net/dev of the child namespace :-(

(5) Shell 1
$ cd /

(6) Shell 2
$ cat /proc/net/dev
lo ...

Back to normality.
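
The comparison in the test case above can be automated by extracting the interface names from the /proc/net/dev text in each shell and checking the result. The helper below is a rough sketch: list_ifnames() is an invented name, and it assumes the usual layout where each device line starts with "name:" and the header lines contain no colon.

```c
#include <assert.h>
#include <string.h>

/*
 * Collect the interface names from /proc/net/dev-style text into 'out'
 * as a space-separated list and return how many were stored.
 * Lines without a ':' (the header lines) are skipped.
 */
static size_t list_ifnames(const char *text, char *out, size_t outlen)
{
	size_t count = 0;

	if (outlen == 0)
		return 0;
	out[0] = '\0';

	while (*text) {
		const char *eol = strchr(text, '\n');
		size_t linelen = eol ? (size_t)(eol - text) : strlen(text);
		const char *colon = memchr(text, ':', linelen);

		if (colon) {
			const char *name = text;
			size_t namelen, used;

			while (name < colon && (*name == ' ' || *name == '\t'))
				name++;
			namelen = (size_t)(colon - name);
			used = strlen(out);
			/* need room for a separator, the name and the NUL */
			if (namelen > 0 && used + namelen + 2 <= outlen) {
				if (used)
					out[used++] = ' ';
				memcpy(out + used, name, namelen);
				out[used + namelen] = '\0';
				count++;
			}
		}
		text += linelen;
		if (*text == '\n')
			text++;
	}
	return count;
}

/* Steps (2) and (4) of the test case, with made-up counter values. */
static void ifnames_demo(void)
{
	char buf[64];

	assert(list_ifnames("Inter-| Receive\n face |bytes\n    lo: 0 0\n",
			    buf, sizeof(buf)) == 1);
	assert(strcmp(buf, "lo") == 0);
}
```

In step (4) the child namespace would yield "lo eth0" from this helper instead of the expected "lo", which makes the regression easy to check from a script.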


--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


Re: [patch 5/9][NETNS][IPV6] make bindv6only sysctl per namespace

2008-01-07 Thread Benjamin Thery

Daniel,

The kernel fails to build with this patch applied when CONFIG_SYSCTL=n.
See comment below.

Daniel Lezcano wrote:
> This patch moves the bindv6only sysctl to the network namespace
> structure. Until the ipv6 protocol is not per namespace, the sysctl
> variable is always from the initial network namespace.
>
> Signed-off-by: Daniel Lezcano [EMAIL PROTECTED]
> ---
>  include/net/ipv6.h         |    1 -
>  include/net/netns/ipv6.h   |    1 +
>  net/ipv6/af_inet6.c        |    4 +---
>  net/ipv6/sysctl_net_ipv6.c |    6 +++++-
>  4 files changed, 7 insertions(+), 5 deletions(-)
>
> Index: net-2.6.25/include/net/ipv6.h
> ===================================================================
> --- net-2.6.25.orig/include/net/ipv6.h
> +++ net-2.6.25/include/net/ipv6.h
> @@ -109,7 +109,6 @@ struct frag_hdr {
>  #include <net/sock.h>
>  
>  /* sysctls */
> -extern int sysctl_ipv6_bindv6only;
>  extern int sysctl_mld_max_msf;
>  
>  #define _DEVINC(statname, modifier, idev, field)			\
> Index: net-2.6.25/include/net/netns/ipv6.h
> ===================================================================
> --- net-2.6.25.orig/include/net/netns/ipv6.h
> +++ net-2.6.25/include/net/netns/ipv6.h
> @@ -9,6 +9,7 @@ struct ctl_table_header;
>  
>  struct netns_sysctl_ipv6 {
>  	struct ctl_table_header *table;
> +	int bindv6only;
>  };
>  
>  struct netns_ipv6 {
> Index: net-2.6.25/net/ipv6/af_inet6.c
> ===================================================================
> --- net-2.6.25.orig/net/ipv6/af_inet6.c
> +++ net-2.6.25/net/ipv6/af_inet6.c
> @@ -66,8 +66,6 @@ MODULE_AUTHOR("Cast of dozens");
>  MODULE_DESCRIPTION("IPv6 protocol stack for Linux");
>  MODULE_LICENSE("GPL");
>  
> -int sysctl_ipv6_bindv6only __read_mostly;
> -
>  /* The inetsw6 table contains everything that inet6_create needs to
>   * build a new socket.
>   */
> @@ -193,7 +191,7 @@ lookup_protocol:
>  	np->mcast_hops	= -1;
>  	np->mc_loop	= 1;
>  	np->pmtudisc	= IPV6_PMTUDISC_WANT;
> -	np->ipv6only	= sysctl_ipv6_bindv6only;
> +	np->ipv6only	= init_net.ipv6.sysctl.bindv6only;

The problem is here:
init_net.ipv6.sysctl is not defined if CONFIG_SYSCTL=n.

Benjamin

>  
>  	/* Init the ipv4 part of the socket since we can have sockets
>  	 * using v6 API for ipv4.
> Index: net-2.6.25/net/ipv6/sysctl_net_ipv6.c
> ===================================================================
> --- net-2.6.25.orig/net/ipv6/sysctl_net_ipv6.c
> +++ net-2.6.25/net/ipv6/sysctl_net_ipv6.c
> @@ -35,7 +35,7 @@ static ctl_table ipv6_table_template[] =
>  	{
>  		.ctl_name	= NET_IPV6_BINDV6ONLY,
>  		.procname	= "bindv6only",
> -		.data		= &sysctl_ipv6_bindv6only,
> +		.data		= &init_net.ipv6.sysctl.bindv6only,
>  		.maxlen		= sizeof(int),
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec
> @@ -115,6 +115,10 @@ static int ipv6_sysctl_net_init(struct n
>  	ipv6_table[0].child = ipv6_route_table;
>  	ipv6_table[1].child = ipv6_icmp_table;
>  
> +	ipv6_table[2].data = &net->ipv6.sysctl.bindv6only;
> +
> +	net->ipv6.sysctl.bindv6only = 0;
> +
>  	net->ipv6.sysctl.table = register_net_sysctl_table(net, ipv6_ctl_path,
>  							   ipv6_table);
>  	if (!net->ipv6.sysctl.table)
>  		goto out_ipv6_icmp_table;




--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com
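
To illustrate the build failure reported above: if a struct member only exists under a config option, any unconditional access to it breaks the build when the option is off. The sketch below is an assumption-laden model, not the kernel headers. The demo_ types and the DEMO_CONFIG_SYSCTL macro are invented, and the conditional layout is inferred from the remark that init_net.ipv6.sysctl is not defined when CONFIG_SYSCTL=n.

```c
#include <assert.h>

/* Define DEMO_CONFIG_SYSCTL to model CONFIG_SYSCTL=y; leave it undefined for =n. */

struct demo_netns_sysctl_ipv6 {
	int bindv6only;
};

struct demo_netns_ipv6 {
#ifdef DEMO_CONFIG_SYSCTL
	struct demo_netns_sysctl_ipv6 sysctl;	/* only present with sysctl support */
#endif
	int placeholder;	/* keeps the struct non-empty in both configurations */
};

struct demo_net {
	struct demo_netns_ipv6 ipv6;
};

static struct demo_net demo_init_net;

/*
 * Accessor that compiles in both configurations. Writing
 * net->ipv6.sysctl.bindv6only directly at the call site would fail to
 * compile when DEMO_CONFIG_SYSCTL is not defined.
 */
static int demo_bindv6only(const struct demo_net *net)
{
#ifdef DEMO_CONFIG_SYSCTL
	return net->ipv6.sysctl.bindv6only;
#else
	(void)net;
	return 0;	/* default when the sysctl cannot exist */
#endif
}

static void bindv6only_demo(void)
{
	/* A zero-initialized namespace reports the default in either build. */
	assert(demo_bindv6only(&demo_init_net) == 0);
}
```

An accessor is only one option. Keeping the bindv6only field outside the conditional part of the structure would avoid the problem as well.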


Re: 2.6.24-rc5-mm1

2007-12-13 Thread Benjamin Thery
The problem comes from the new macro UDPX_INC_STATS_BH introduced
by Herbert. It is a nice addition that increments the correct
UDP MIB depending on the socket family, but unfortunately
using this macro from built-in kernel code (I mean code not compiled
as a module) requires IPv6 to be built in as well
(CONFIG_IPV6=y) so that udp_stats_in6 is defined at link
time.

Benjamin
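
A small sketch of why the link fails: a macro that picks the MIB by family at run time still references both MIB symbols at every call site, so both must resolve when the caller is linked. The DEMO_ names below are invented for illustration and this is not the kernel's definition of UDPX_INC_STATS_BH.

```c
#include <assert.h>

#define DEMO_AF_INET	2
#define DEMO_AF_INET6	10

struct demo_udp_mib {
	unsigned long in_datagrams;
};

static struct demo_udp_mib demo_udp_stats_in4;
/* In the real tree the IPv6 counterpart lives in IPv6 code (net/ipv6/udp.c). */
static struct demo_udp_mib demo_udp_stats_in6;

/*
 * Chooses the MIB by socket family. Both symbols appear in the expansion,
 * so a built-in caller needs both at link time, whichever family the
 * socket turns out to have at run time.
 */
#define DEMO_UDPX_INC_INDATAGRAMS(family)			\
	do {							\
		if ((family) == DEMO_AF_INET6)			\
			demo_udp_stats_in6.in_datagrams++;	\
		else						\
			demo_udp_stats_in4.in_datagrams++;	\
	} while (0)

/* Stand-in for a built-in caller such as the sunrpc data-ready callback. */
static void demo_data_ready(int family)
{
	DEMO_UDPX_INC_INDATAGRAMS(family);
}

static void udpx_demo(void)
{
	demo_data_ready(DEMO_AF_INET);
	demo_data_ready(DEMO_AF_INET6);
	assert(demo_udp_stats_in4.in_datagrams == 1);
	assert(demo_udp_stats_in6.in_datagrams == 1);
}
```

With CONFIG_SUNRPC=y and CONFIG_IPV6=m, the equivalent of demo_udp_stats_in6 would exist only inside the module, hence the undefined reference in Pierre's build log below.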

Pierre Peiffer wrote:
 Hi,
 
   My config does not link any more:
 
 ...
   CHK include/linux/compile.h
   UPD include/linux/compile.h
   CC  init/version.o
   LD  init/built-in.o
   LD  .tmp_vmlinux1
 net/built-in.o: In function `xs_udp_data_ready':
 /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:842:
 undefined reference to `udp_stats_in6'
 /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:846:
 undefined reference to `udp_stats_in6'
 make[1]: *** [.tmp_vmlinux1] Error 1
 make: *** [sub-make] Error 2
 
 After a first look, udp_stats_in6 seems to be defined in ipv6 (file
 net/ipv6/udp.c) but I have
 
 CONFIG_IPV6=m
 and
 CONFIG_SUNRPC=y
 
 So, SUNRPC uses something defined in a module in my case ?
 
 ... looking more, this dependency seems to have been introduced by the patch
 [UDP]: Restore missing inDatagrams increments
 ( http://thread.gmane.org/gmane.linux.network/79716/focus=79831 )
 
 (I cc netdev)
 
 I don't know what is the right way to fix this ... ?
 
 P.
 Andrew Morton wrote:
 ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/

 - If something goes wrong with a PCI device's probing or initialisation, try
   reverting pci-disable-decoding-during-sizing-of-bars.patch.

 - git-sched was dropped due to breaking suspend-to-RAM.

 - git-block has been restored after having had a few problems

 - git-newsetup.patch was dropped due to conflicts with git-x86

 - git-perfmon.patch is still dropped for the same reason

 - git-kgdb.patch is still dropped for the same reason

 - Please do try to cc the correct developer and mailing list when
   reporting problems - I'm just buried in bugs over here.



 Boilerplate:

 - See the `hot-fixes' directory for any important updates to this patchset.

 - To fetch an -mm tree using git, use (for example)

   git-fetch 
 git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag 
 v2.6.16-rc2-mm1
   git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

 - -mm kernel commit activity can be reviewed by subscribing to the
   mm-commits mailing list.

 echo subscribe mm-commits | mail [EMAIL PROTECTED]

 - If you hit a bug in -mm and it is not obvious which patch caused it, it is
   most valuable if you can perform a bisection search to identify which patch
   introduced the bug.  Instructions for this process are at

 
 http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

   But beware that this process takes some time (around ten rebuilds and
   reboots), so consider reporting the bug first and if we cannot immediately
   identify the faulty patch, then perform the bisection search.

 - When reporting bugs, please try to Cc: the relevant maintainer and mailing
   list on any email.

 - When reporting bugs in this kernel via email, please also rewrite the
   email Subject: in some manner to reflect the nature of the bug.  Some
   developers filter by Subject: when looking for messages to read.

 - Occasional snapshots of the -mm lineup are uploaded to
   ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
   the mm-commits list.  These probably are at least compilable.

 - More-than-daily -mm snapshots may be found at
   http://userweb.kernel.org/~akpm/mmotm/.  These are almost certainly not
   compilable.



 Changes since 2.6.24-rc4-mm1:


  origin.patch
  git-acpi.patch
  git-alsa.patch
  git-agpgart.patch
  git-arm.patch
  git-arm-master.patch
  git-avr32.patch
  git-cpufreq.patch
  git-powerpc.patch
  git-drm.patch
  git-dvb.patch
  git-hwmon.patch
  git-gfs2-nmw.patch
  git-hid.patch
  git-hrt.patch
  git-ieee1394.patch
  git-infiniband.patch
  git-input.patch
  git-jfs.patch
  git-kbuild.patch
  git-kvm.patch
  git-lblnet.patch
  git-leds.patch
  git-libata-all.patch
  git-md-accel.patch
  git-mips.patch
  git-mmc.patch
  git-mtd.patch
  git-ubi.patch
  git-net.patch
  git-netdev-all.patch
  git-battery.patch
  git-nfs.patch
  git-nfsd.patch
  git-ocfs2.patch
  git-s390.patch
  git-sh.patch
  git-scsi-misc.patch
  git-scsi-rc-fixes.patch
  git-block.patch
  git-unionfs.patch
  git-v9fs.patch
  git-watchdog.patch
  git-wireless.patch
  git-ipwireless_cs.patch
  git-x86.patch
  git-xfs.patch
  git-cryptodev.patch
  git-xtensa.patch

  git trees

 -aio-only-account-i-o-wait-time-in-read_events-if-there-are-active-requests.patch
 -fix-cloneclone_newpid.patch
 -rtc-assure-proper-memory-ordering-with-respect-to-rtc_dev_busy-flag.patch
 -ufs-fix-nexstep-dir-block-size.patch
 

Re: [PATCH 1/2] Convert /proc/net/ipv6_route to seq_file interface

2007-10-30 Thread Benjamin Thery
Alexey Dobriyan wrote:
 One proc_net_create() user less.

Funny, I was working on a similar patch.

See comment below.


 Signed-off-by: Alexey Dobriyan [EMAIL PROTECTED]
 ---
 
  net/ipv6/route.c |   70 
 +++
  1 file changed, 25 insertions(+), 45 deletions(-)
 
 --- a/net/ipv6/route.c
 +++ b/net/ipv6/route.c
 @@ -2288,71 +2288,49 @@ struct rt6_proc_arg
  
  static int rt6_info_route(struct rt6_info *rt, void *p_arg)
  {
 - struct rt6_proc_arg *arg = (struct rt6_proc_arg *) p_arg;
 + struct seq_file *m = p_arg;
  
 - if (arg-skip  arg-offset / RT6_INFO_LEN) {
 - arg-skip++;
 - return 0;
 - }
 -
 - if (arg-len = arg-length)
 - return 0;
 -
 - arg-len += sprintf(arg-buffer + arg-len,
 - NIP6_SEQFMT  %02x ,
 - NIP6(rt-rt6i_dst.addr),
 + seq_printf(m, NIP6_SEQFMT  %02x , NIP6(rt-rt6i_dst.addr),
   rt-rt6i_dst.plen);
  
  #ifdef CONFIG_IPV6_SUBTREES
 - arg-len += sprintf(arg-buffer + arg-len,
 - NIP6_SEQFMT  %02x ,
 - NIP6(rt-rt6i_src.addr),
 + seq_printf(m, NIP6_SEQFMT  %02x , NIP6(rt-rt6i_src.addr),
   rt-rt6i_src.plen);
  #else
 - arg-len += sprintf(arg-buffer + arg-len,
 -  00 );
 + seq_puts(m,  00 );
  #endif
  
   if (rt-rt6i_nexthop) {
 - arg-len += sprintf(arg-buffer + arg-len,
 - NIP6_SEQFMT,
 + seq_printf(m, NIP6_SEQFMT,
   NIP6(*((struct in6_addr 
 *)rt-rt6i_nexthop-primary_key)));
   } else {
 - arg-len += sprintf(arg-buffer + arg-len,
 - );
 + seq_puts(m, );
   }
 - arg-len += sprintf(arg-buffer + arg-len,
 -  %08x %08x %08x %08x %8s\n,
 + seq_printf(m,  %08x %08x %08x %08x %8s\n,
   rt-rt6i_metric, atomic_read(rt-u.dst.__refcnt),
   rt-u.dst.__use, rt-rt6i_flags,
   rt-rt6i_dev ? rt-rt6i_dev-name : );
   return 0;
  }
  
 -static int rt6_proc_info(char *buffer, char **start, off_t offset, int 
 length)
 +static int ipv6_route_show(struct seq_file *m, void *v)
  {
 - struct rt6_proc_arg arg = {
 - .buffer = buffer,
 - .offset = offset,
 - .length = length,
 - };
 -
 - fib6_clean_all(rt6_info_route, 0, arg);
 -
 - *start = buffer;
 - if (offset)
 - *start += offset % RT6_INFO_LEN;
 -
 - arg.len -= offset % RT6_INFO_LEN;
 -
 - if (arg.len  length)
 - arg.len = length;
 - if (arg.len  0)
 - arg.len = 0;
 + fib6_clean_all(rt6_info_route, 0, m);
 + return 0;
 +}
  
 - return arg.len;
 +static int ipv6_route_open(struct inode *inode, struct file *file)
 +{
 + return single_open(file, ipv6_route_show, NULL);
  }
  
 +static const struct file_operations ipv6_route_proc_fops = {
 + .open   = ipv6_route_open,
 + .read   = seq_read,
 + .llseek = seq_lseek,
 + .release= single_release,
 +};
 +
  static int rt6_stats_seq_show(struct seq_file *seq, void *v)
  {
   seq_printf(seq, %04x %04x %04x %04x %04x %04x %04x\n,
 @@ -2499,9 +2477,11 @@ void __init ip6_route_init(void)
  
   fib6_init();
  #ifdef   CONFIG_PROC_FS
 - p = proc_net_create(init_net, ipv6_route, 0, rt6_proc_info);
 - if (p)

 + p = create_proc_entry(ipv6_route, 0, init_net.proc_net);
 + if (p) {
   p-owner = THIS_MODULE;
 + p-proc_fops = ipv6_route_proc_fops;
 + }

You should use proc_net_fops_create() instead of the above code. 
It does the same thing.

Otherwise the patch looks fine to me.
Tested on i386.

Benjamin

   proc_net_fops_create(init_net, rt6_stats, S_IRUGO, 
 rt6_stats_seq_fops);
  #endif
 
 


-- 
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


Re: [PATCH 1/2] Convert /proc/net/ipv6_route to seq_file interface

2007-10-30 Thread Benjamin Thery
Cosmetic comment:

I forgot to say there are a few indentation errors when
I apply your patch. See below.


Benjamin Thery wrote:
 Alexey Dobriyan wrote:
 One proc_net_create() user less.
 
 Funny, I was working on a similar patch.
 
 See comment below.
 
 
 Signed-off-by: Alexey Dobriyan [EMAIL PROTECTED]
 ---

  net/ipv6/route.c |   70 
 +++
  1 file changed, 25 insertions(+), 45 deletions(-)

 --- a/net/ipv6/route.c
 +++ b/net/ipv6/route.c
 @@ -2288,71 +2288,49 @@ struct rt6_proc_arg
  
  static int rt6_info_route(struct rt6_info *rt, void *p_arg)
  {
 -struct rt6_proc_arg *arg = (struct rt6_proc_arg *) p_arg;
 +struct seq_file *m = p_arg;
  
 -if (arg-skip  arg-offset / RT6_INFO_LEN) {
 -arg-skip++;
 -return 0;
 -}
 -
 -if (arg-len = arg-length)
 -return 0;
 -
 -arg-len += sprintf(arg-buffer + arg-len,
 -NIP6_SEQFMT  %02x ,
 -NIP6(rt-rt6i_dst.addr),
 +seq_printf(m, NIP6_SEQFMT  %02x , NIP6(rt-rt6i_dst.addr),
  rt-rt6i_dst.plen);
  
  #ifdef CONFIG_IPV6_SUBTREES
 -arg-len += sprintf(arg-buffer + arg-len,
 -NIP6_SEQFMT  %02x ,
 -NIP6(rt-rt6i_src.addr),
 +seq_printf(m, NIP6_SEQFMT  %02x , NIP6(rt-rt6i_src.addr),
  rt-rt6i_src.plen);

Indent is wrong for the above line.

  #else
 -arg-len += sprintf(arg-buffer + arg-len,
 - 00 );
 +seq_puts(m,  00 );
  #endif
  
  if (rt-rt6i_nexthop) {
 -arg-len += sprintf(arg-buffer + arg-len,
 -NIP6_SEQFMT,
 +seq_printf(m, NIP6_SEQFMT,
  NIP6(*((struct in6_addr 
 *)rt-rt6i_nexthop-primary_key)));

Idem.

  } else {
 -arg-len += sprintf(arg-buffer + arg-len,
 -);
 +seq_puts(m, );
  }
 -arg-len += sprintf(arg-buffer + arg-len,
 - %08x %08x %08x %08x %8s\n,
 +seq_printf(m,  %08x %08x %08x %08x %8s\n,
  rt-rt6i_metric, atomic_read(rt-u.dst.__refcnt),
  rt-u.dst.__use, rt-rt6i_flags,
  rt-rt6i_dev ? rt-rt6i_dev-name : );

Indent of the 3 above lines.

  return 0;
  }
  
 -static int rt6_proc_info(char *buffer, char **start, off_t offset, int 
 length)
 +static int ipv6_route_show(struct seq_file *m, void *v)
  {
 -struct rt6_proc_arg arg = {
 -.buffer = buffer,
 -.offset = offset,
 -.length = length,
 -};
 -
 -fib6_clean_all(rt6_info_route, 0, arg);
 -
 -*start = buffer;
 -if (offset)
 -*start += offset % RT6_INFO_LEN;
 -
 -arg.len -= offset % RT6_INFO_LEN;
 -
 -if (arg.len  length)
 -arg.len = length;
 -if (arg.len  0)
 -arg.len = 0;
 +fib6_clean_all(rt6_info_route, 0, m);
 +return 0;
 +}
  
 -return arg.len;
 +static int ipv6_route_open(struct inode *inode, struct file *file)
 +{
 +return single_open(file, ipv6_route_show, NULL);
  }
  
 +static const struct file_operations ipv6_route_proc_fops = {
 +.open   = ipv6_route_open,
 +.read   = seq_read,
 +.llseek = seq_lseek,
 +.release= single_release,
 +};
 +
  static int rt6_stats_seq_show(struct seq_file *seq, void *v)
  {
  seq_printf(seq, %04x %04x %04x %04x %04x %04x %04x\n,
 @@ -2499,9 +2477,11 @@ void __init ip6_route_init(void)
  
  fib6_init();
  #ifdef  CONFIG_PROC_FS
 -p = proc_net_create(init_net, ipv6_route, 0, rt6_proc_info);
 -if (p)
 
 +p = create_proc_entry(ipv6_route, 0, init_net.proc_net);
 +if (p) {
  p-owner = THIS_MODULE;
 +p-proc_fops = ipv6_route_proc_fops;
 +}
 
 You should use proc_net_fops_create() instead of the above code. 
 It does the same thing.
 
 Otherwise the patch looks fine to me.
 Tested on i386.
 
 Benjamin
 
  proc_net_fops_create(init_net, rt6_stats, S_IRUGO, 
 rt6_stats_seq_fops);
  #endif


 
 


-- 
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


Re: [NETNS] Oops in register_pernet_operations() with CONFIG_NET_NS=n

2007-10-26 Thread Benjamin Thery
David Miller wrote:
 From: [EMAIL PROTECTED] (Eric W. Biederman)
 Date: Thu, 25 Oct 2007 11:21:55 -0600
 
 By the way, I think that we can in the case of undefined CONFIG_NET_NS
 reduce register to calling -init method and unregister to calling
 -exit method.

 This is a correct thing at least for now and will be welcomed by all the
 embedded/etc. people.
 I'm not fundamentally opposed.  Earlier versions of my patchset
 did that and more.   However I think the pain is greater than the
 gain right now.  Especially since this concept seems to require
 having quality inspected into it.
 
 I think the correct thing to do for now is to simply remove these
 __net_* markers and their definitions.  There are so many tricky cases
 that it is easier to just get rid of them.
 
 Could someone send me a patch which does that?

The attached patch reverts Pavel's original patch from 2.6.23-mm1. 
It should work fine with net-2.6 too.

Benjamin

-- 
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com
This patch reverts the patch sent by Pavel Emelyanov
that introduced __net_init/__net_exit/__net_initdata defines
to save some memory when CONFIG_NET_NS=n.

http://www.spinics.net/lists/netdev/msg43310.html

When CONFIG_NET_NS=n, that patch causes an oops when a 
netns-aware module is loaded after boot. During initialization, the 
module tries to register its pernet operations and add them to 
the pernet_list. Unfortunately this list is corrupted, as its 
first entries were freed at the end of the boot sequence.

Signed-off-by: Benjamin Thery [EMAIL PROTECTED]
---
 drivers/net/loopback.c  |6 +++---
 fs/proc/proc_net.c  |8 
 include/linux/init.h|1 -
 include/net/net_namespace.h |9 -
 net/core/dev.c  |   16 
 net/core/dev_mcast.c|6 +++---
 net/netlink/af_netlink.c|6 +++---
 scripts/mod/modpost.c   |1 -
 8 files changed, 21 insertions(+), 32 deletions(-)

Index: linux-2.6.23-mm1-lxc1/drivers/net/loopback.c
===
--- linux-2.6.23-mm1-lxc1.orig/drivers/net/loopback.c
+++ linux-2.6.23-mm1-lxc1/drivers/net/loopback.c
@@ -250,7 +250,7 @@ static void loopback_setup(struct net_de
 }
 
 /* Setup and register the loopback device. */
-static __net_init int loopback_net_init(struct net *net)
+static int loopback_net_init(struct net *net)
 {
 	struct net_device *dev;
 	int err;
@@ -278,14 +278,14 @@ out_free_netdev:
 	goto out;
 }
 
-static __net_exit void loopback_net_exit(struct net *net)
+static void loopback_net_exit(struct net *net)
 {
 	struct net_device *dev = net->loopback_dev;
 
 	unregister_netdev(dev);
 }
 
-static struct pernet_operations __net_initdata loopback_net_ops = {
+static struct pernet_operations loopback_net_ops = {
.init = loopback_net_init,
.exit = loopback_net_exit,
 };
Index: linux-2.6.23-mm1-lxc1/fs/proc/proc_net.c
===
--- linux-2.6.23-mm1-lxc1.orig/fs/proc/proc_net.c
+++ linux-2.6.23-mm1-lxc1/fs/proc/proc_net.c
@@ -140,7 +140,7 @@ static struct inode_operations proc_net_
 	.setattr	= proc_net_setattr,
 };
 
-static __net_init int proc_net_ns_init(struct net *net)
+static int proc_net_ns_init(struct net *net)
 {
 	struct proc_dir_entry *root, *netd, *net_statd;
 	int err;
@@ -178,19 +178,19 @@ free_root:
 	goto out;
 }
 
-static __net_exit void proc_net_ns_exit(struct net *net)
+static void proc_net_ns_exit(struct net *net)
 {
 	remove_proc_entry("stat", net->proc_net);
 	remove_proc_entry("net", net->proc_net_root);
 	kfree(net->proc_net_root);
 }
 
-struct pernet_operations __net_initdata proc_net_ns_ops = {
+struct pernet_operations proc_net_ns_ops = {
 	.init = proc_net_ns_init,
 	.exit = proc_net_ns_exit,
 };
 
-int __init proc_net_init(void)
+int proc_net_init(void)
 {
 	proc_net_shadow = proc_mkdir("net", NULL);
 	proc_net_shadow->proc_iops = &proc_net_dir_inode_operations;
Index: linux-2.6.23-mm1-lxc1/include/linux/init.h
===
--- linux-2.6.23-mm1-lxc1.orig/include/linux/init.h
+++ linux-2.6.23-mm1-lxc1/include/linux/init.h
@@ -57,7 +57,6 @@
  * The markers follow same syntax rules as __init / __initdata. */
 #define __init_refok noinline __attribute__ ((__section__ (".text.init.refok")))
 #define __initdata_refok  __attribute__ ((__section__ (".data.init.refok")))
-#define __exit_refok noinline __attribute__ ((__section__ (".exit.text.refok")))
 
 #ifdef MODULE
 #define __exit		__attribute__ ((__section__(".exit.text"))) __cold
Index: linux-2.6.23-mm1-lxc1/include/net/net_namespace.h
===
--- linux-2.6.23-mm1-lxc1.orig/include/net/net_namespace.h
+++ linux-2.6.23-mm1-lxc1/include/net/net_namespace.h
@@ -99,15 +99,6 @@ static inline void release_net(struct ne
 #define for_each_net(VAR)\
 	list_for_each_entry

[NETNS] Oops in register_pernet_operations() with CONFIG_NET_NS=n

2007-10-25 Thread Benjamin Thery
Hello Pavel,

I've found a problem with one of your patches related to netns:

* [NETNS] Move some code into __init section when CONFIG_NET_NS=n (v2)
   http://www.spinics.net/lists/netdev/msg43310.html

This patch introduces the __net_init/__net_exit/__net_initdata
defines to save some memory when CONFIG_NET_NS is not set.

Cedric Le Goater reported he had a *non-fatal* oops when booting
a 2.6.23-mm1-lxc1 kernel with CONFIG_NET_NS=n. (2.6.23-mm1-lxc1 
contains the NETNS49 patchset). The oops occurred when modules 
related to iptables were loaded after boot completed.

The problem is the following:

- Your patch adds the __net_initdata attribute to pernet_operations 
  structures.

- pernet_operations are registered via register_pernet_subsys() and 
  linked in the pernet_list during boot.

- At the end of boot, pernet_operations are freed (because of the
  __net_initdata attribute), and the pernet_list (or first_device list) 
  points to freed memory.

- After boot, network modules which are netns-aware try to register
  themselves with register_pernet_subsys() and ...KABOOM... page
  fault when accessing pernet_list (or first_device list).
  (I reproduced Cedric's oops with the command: iptables --list)

This is not a problem right now in 2.6.23-mm1 or net-2.6 because 
there are very few netns-aware network subsystems merged and they
are all initialized during boot. But it will be problematic when 
we merge netns code for subsystems which can be built as 
modules (e.g. iptables, ...). I'm not sure we can use 
__net_initdata for pernet_operations then. 
Maybe we can add some checks in register_pernet_operations
when CONFIG_NET_NS=n.

I haven't found a fix yet.

Regards,
Benjamin

-- 
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


Re: [NETNS] Oops in register_pernet_operations() with CONFIG_NET_NS=n

2007-10-25 Thread Benjamin Thery
Denis V. Lunev wrote:
 The patch attached should help. The idea is simple. The init should be
 called only once without NETNS. Period. No need for any lists.

This is the kind of idea I had but I didn't think it could be 
that simple. :) 
Thanks Denis.
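
The idea Denis describes can be sketched in userspace C roughly as follows.
This is only an illustration and not the attached patch: struct net,
struct pernet_operations and init_net are reduced stand-ins for the kernel
definitions, and the *_sketch and demo_* names are made up for this example.
The point is that, with a single namespace, registration can call ->init
directly and never has to keep a pointer to the (possibly freed) ops
structure in a list.

```c
#include <stddef.h>

/* Reduced stand-ins for the kernel types (assumption, not the real layout). */
struct net {
	int id;
};

struct pernet_operations {
	int (*init)(struct net *net);
	void (*exit)(struct net *net);
};

/* The only namespace that exists when namespaces are compiled out. */
struct net init_net = { 0 };

/* Register: call ->init once for init_net; nothing is linked into a list. */
int register_pernet_operations_sketch(struct pernet_operations *ops)
{
	if (ops->init == NULL)
		return 0;
	return ops->init(&init_net);
}

/* Unregister: call ->exit once for init_net. */
void unregister_pernet_operations_sketch(struct pernet_operations *ops)
{
	if (ops->exit != NULL)
		ops->exit(&init_net);
}

/* Hypothetical subsystem used to exercise the two functions above. */
int demo_active;

int demo_init(struct net *net)
{
	(void)net;
	demo_active++;
	return 0;
}

void demo_exit(struct net *net)
{
	(void)net;
	demo_active--;
}
```

Because no list is walked at registration time, a module loaded after boot
never touches memory that belonged to an init-section ops structure.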

 I'll resend it to Dave after the ACK.

Tested on x86_64 with CONFIG_NET_NS=n and y. 
It fixes the issue we observed.

Acked-by: Benjamin Thery [EMAIL PROTECTED]


 Regards,
   Den
 
 Benjamin Thery wrote:
 Hello Pavel,

 I've found a problem with one of your patches related to netns:

 * [NETNS] Move some code into __init section when CONFIG_NET_NS=n (v2)
http://www.spinics.net/lists/netdev/msg43310.html

 This patch introduces the __net_init/__net_exit/__net_initdata
 defines to save some memory when CONFIG_NET_NS is not set.

 Cedric Le Goater reported he had a *non-fatal* oops when booting
 a 2.6.23-mm1-lxc1 kernel with CONFIG_NET_NS=n. (2.6.23-mm1-lxc1 
 contains the NETNS49 patchset). The oops occurred when modules 
 related to iptables were loaded after boot completed.

 The problem is the following:

 - Your patch adds the __net_initdata attribute to pernet_operations 
   structures.

 - pernet_operations are registered via register_pernet_subsys() and 
   linked in the pernet_list during boot.

 - At the end of boot, pernet_operations are freed (because of the
   __net_initdata attribute), and the pernet_list (or first_device list) 
   points to freed memory.

 - After boot, network modules which are netns-aware try to register
   themselves with register_pernet_subsys() and ...KABOOM... page
   fault when accessing pernet_list (or first_device list).
   (I reproduced Cedric's oops with the command: iptables --list)

 This is not a problem right now in 2.6.23-mm1 or net-2.6 because 
 there are very few netns-aware network subsystems merged and they
 are all initialized during boot. But it will be problematic when 
 we merge netns code for subsystems which can be built as 
 modules (e.g. iptables, ...). I'm not sure we can use 
 __net_initdata for pernet_operations then. 
 Maybe we can add some checks in register_pernet_operations
 when CONFIG_NET_NS=n.

 I haven't found a fix yet.

 Regards,
 Benjamin

 


-- 
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


Re: [NETNS] Oops in register_pernet_operations() with CONFIG_NET_NS=n

2007-10-25 Thread Benjamin Thery
Eric W. Biederman wrote:
 Benjamin Thery [EMAIL PROTECTED] writes:
 
 Denis V. Lunev wrote:
 The patch attached should help. The idea is simple. The init should be
 called only once without NETNS. Period. No need for any lists.
 This is the kind of idea I had but I didn't think it could be 
 that simple. :) 
 Thanks Denis.
 
 It isn't.
 
 I'll resend it to Dave after the ACK.
 Tested on x86_64 with CONFIG_NET_NS=n and y. 
 It fixes the issue we observed.

 Acked-by: Benjamin Thery [EMAIL PROTECTED]
 
 Try rmmod.

rmmod was part of my tests and it does work.
I did:

$ iptables --list

  modules x_tables, ip_tables and iptable_filter are loaded,
  each calling register_pernet_subsys.

$ rmmod iptable_filter ip_tables x_tables

  No problem here

$ iptables --list

  To be sure I can load the modules again.


 
 Eric
 


-- 
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


[PATCH] [IPv6]: use container_of() macro in fib6_clean_node()

2007-10-08 Thread Benjamin Thery
In ip6_fib.c, fib6_clean_node() casts a fib6_walker_t pointer to
a fib6_cleaner_t pointer, assuming that the struct fib6_walker_t
(field 'w') is the first field in struct fib6_cleaner_t.

To prevent any future problems that may occur if one day a field
is inadvertently inserted before the 'w' field in struct fib6_cleaner_t,
(and to improve readability), this patch uses the container_of() macro.

Patch for net-2.6.24

Signed-off-by: Benjamin Thery [EMAIL PROTECTED]
---

 net/ipv6/ip6_fib.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 6a612a7..946cf38 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1313,7 +1313,7 @@ static int fib6_clean_node(struct fib6_walker_t *w)
 {
int res;
struct rt6_info *rt;
-   struct fib6_cleaner_t *c = (struct fib6_cleaner_t*)w;
+   struct fib6_cleaner_t *c = container_of(w, struct fib6_cleaner_t, w);
 
for (rt = w->leaf; rt; rt = rt->u.dst.rt6_next) {
res = c->func(rt, c->arg);
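
For readers unfamiliar with container_of(), here is a small userspace
sketch of why it is safer than the cast. The macro below is a simplified
equivalent of the kernel one (no type checking), the two structs are
reduced stand-ins, and the 'extra' field and the cleaner_arg_from_walker()
helper are hypothetical, added only to show that the lookup keeps working
when 'w' is no longer the first field.

```c
#include <stddef.h>

/*
 * Simplified userspace version of container_of(): given a pointer to a
 * member, step back by the member's offset to reach the enclosing struct.
 */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-ins; the real kernel structs have more fields. */
struct fib6_walker_t {
	int state;
};

struct fib6_cleaner_t {
	int extra;			/* hypothetical field placed before 'w' */
	struct fib6_walker_t w;
	int arg;
};

/* Recover the cleaner that embeds walker 'w' and return its 'arg'. */
int cleaner_arg_from_walker(struct fib6_walker_t *w)
{
	struct fib6_cleaner_t *c = container_of(w, struct fib6_cleaner_t, w);

	return c->arg;
}
```

With the plain cast, the same lookup would read 'arg' from the wrong
offset as soon as a field such as 'extra' appears in front of 'w'.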



[PATCH 1/1][NET] : Fix dev_put() and dev_hold() comments

2007-10-02 Thread Benjamin Thery
Trivial fix: Swap comments for dev_put() and dev_hold() to get them 
at the right place.
Typo introduced by 4fa57c9ea9f36f9ca852f3a88ca5d2f1aebbc960.

Signed-off-by: Benjamin Thery [EMAIL PROTECTED]
---
 include/linux/netdevice.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: net-2.6.24/include/linux/netdevice.h
===
--- net-2.6.24.orig/include/linux/netdevice.h
+++ net-2.6.24/include/linux/netdevice.h
@@ -1054,7 +1054,7 @@ extern void netdev_run_todo(void);
  * dev_put - release reference to device
  * @dev: network device
  *
- * Hold reference to device to keep it from being freed.
+ * Release reference to device to allow it to be freed.
  */
 static inline void dev_put(struct net_device *dev)
 {
@@ -1065,7 +1065,7 @@ static inline void dev_put(struct net_de
  * dev_hold - get reference to device
  * @dev: network device
  *
- * Release reference to device to allow it to be freed.
+ * Hold reference to device to keep it from being freed.
  */
 static inline void dev_hold(struct net_device *dev)
 {

-- 


Re: [PATCH] Rename struct net to struct netns

2007-09-17 Thread Benjamin Thery
Daniel Lezcano wrote:
 Pavel Emelyanov wrote:
 The name struct net is too generic. There already were
 some people who wanted to have some better name (for
 easier grep for example). I propose the struct netns one.

 The patch is (already) huge (sorry), but it's nothing but
   sed -e "s/struct net\>/struct netns/g"

 If this name is bad as well, let's select a new one
 before the struct net floods the kernel.
 
 [ SNIP ]
 
 --- a/include/linux/nsproxy.h
 +++ b/include/linux/nsproxy.h
 @@ -29,7 +29,7 @@ struct nsproxy {
  struct mnt_namespace *mnt_ns;
  struct pid_namespace *pid_ns;
  struct user_namespace *user_ns;
 -struct net  *net_ns;
 +struct netns  *net_ns;
 
 IMHO, if we want to be consistent with all the rest of the namespaces,
 that should be net_namespace.

Sure, it's a good argument. 
But I find that 'net', although it is an uber-generic name, 
represents its contents appropriately: its function is to store
all the data for a network stack, so it is what represents a network
in the kernel.

Anyway, if we want to change it, I think net_namespace is better
than netns because of the consistency argument given by Daniel.
(But it's longer :( )

Just my 2 cents.

Benjamin


 
 [ SNIP ]


 


-- 
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


Re: [PATCH 1/1] net/core: Fix crash in dev_mc_sync()/dev_mc_unsync()

2007-08-22 Thread Benjamin Thery

Oops, don't use the previous version of the patch:
the change in dev_mc_unsync() was not correct.
Sorry.

This one is a lot better (it compiles and runs). :)

Benjamin
--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com
From: [EMAIL PROTECTED]
Subject: net/core: Fix crash in dev_mc_sync()/dev_mc_unsync()

This patch fixes a crash that may occur when the routine dev_mc_sync()
deletes an address from the list it is currently going through. It 
saves the pointer to the next element before deleting the current one.
The problem may also exist in dev_mc_unsync().

Signed-off-by: Benjamin Thery [EMAIL PROTECTED]
---
 net/core/dev_mcast.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

Index: linux-2.6.23-rc2/net/core/dev_mcast.c
===
--- linux-2.6.23-rc2.orig/net/core/dev_mcast.c
+++ linux-2.6.23-rc2/net/core/dev_mcast.c
@@ -116,11 +116,13 @@ int dev_mc_add(struct net_device *dev, v
  */
 int dev_mc_sync(struct net_device *to, struct net_device *from)
 {
-	struct dev_addr_list *da;
+	struct dev_addr_list *da, *next;
 	int err = 0;
 
 	netif_tx_lock_bh(to);
-	for (da = from->mc_list; da != NULL; da = da->next) {
+	da = from->mc_list;
+	while (da != NULL) {
+		next = da->next;
 		if (!da->da_synced) {
 			err = __dev_addr_add(&to->mc_list, &to->mc_count,
 					     da->da_addr, da->da_addrlen, 0);
@@ -134,6 +136,7 @@ int dev_mc_sync(struct net_device *to, s
 			__dev_addr_delete(&from->mc_list, &from->mc_count,
 					  da->da_addr, da->da_addrlen, 0);
 		}
+		da = next;
 	}
 	if (!err)
 		__dev_set_rx_mode(to);
@@ -156,12 +159,14 @@ EXPORT_SYMBOL(dev_mc_sync);
  */
 void dev_mc_unsync(struct net_device *to, struct net_device *from)
 {
-	struct dev_addr_list *da;
+	struct dev_addr_list *da, *next;
 
 	netif_tx_lock_bh(from);
 	netif_tx_lock_bh(to);
 
-	for (da = from->mc_list; da != NULL; da = da->next) {
+	da = from->mc_list;
+	while (da != NULL) {
+		next = da->next;
 		if (!da->da_synced)
 			continue;
 		__dev_addr_delete(&to->mc_list, &to->mc_count,
@@ -169,6 +174,7 @@ void dev_mc_unsync(struct net_device *to
 		da->da_synced = 0;
 		__dev_addr_delete(&from->mc_list, &from->mc_count,
 				  da->da_addr, da->da_addrlen, 0);
+		da = next;
 	}
 	__dev_set_rx_mode(to);
 


[PATCH 0/1] net/core: Crash in dev_mc_sync() when putting macvlan interface up

2007-08-21 Thread Benjamin Thery
Hi,

My kernel crashed while testing macvlan interfaces on 2.6.23-rc2.
(See kernel panic below)

The culprit is dev_mc_sync(). In this routine, we delete 
elements from 'from->mc_list' unsafely. 
While going through the list, we may delete one of the elements 
(__dev_addr_delete(&from->mc_list,...)), and then try to continue
from that same element, which has just been freed: for(..., da = da->next).
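
The usual way around this is to read the next pointer before the current
element can be freed. Below is a minimal userspace sketch of that pattern
on a singly linked list. It is not the kernel code: struct addr_node and
the addr_* helpers are invented for the illustration. Note that once the
loop is a while loop, any 'continue' also has to advance the cursor first,
otherwise the loop spins on the same element.

```c
#include <stdlib.h>

/* Toy list node (hypothetical, loosely modelled on an address list). */
struct addr_node {
	int addr;
	int synced;
	struct addr_node *next;
};

/* Push a node at the head; returns 0 on success, -1 if malloc fails. */
int addr_push(struct addr_node **list, int addr, int synced)
{
	struct addr_node *n = malloc(sizeof(*n));

	if (n == NULL)
		return -1;
	n->addr = addr;
	n->synced = synced;
	n->next = *list;
	*list = n;
	return 0;
}

/* Unlink and free every synced node; returns how many were removed. */
int addr_remove_synced(struct addr_node **list)
{
	struct addr_node **link = list;	/* pointer that leads to 'da' */
	struct addr_node *da = *list, *next;
	int removed = 0;

	while (da != NULL) {
		next = da->next;	/* saved before 'da' may be freed */
		if (!da->synced) {
			link = &da->next;
			da = next;	/* advance before 'continue' */
			continue;
		}
		*link = next;		/* unlink, then free */
		free(da);
		removed++;
		da = next;		/* never touch the freed node again */
	}
	return removed;
}

/* Count the nodes left on the list. */
int addr_count(const struct addr_node *list)
{
	int n = 0;

	for (; list != NULL; list = list->next)
		n++;
	return n;
}
```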

It took me some time to understand why only one of my test machines
was crashing. After a while I discovered my crashing victim has 
CONFIG_DEBUG_SLAB=y set, which poisons the freed 'struct dev_addr_list'.
(Now I love poison!)
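
To show why poisoning makes the bug visible, here is a toy userspace
sketch. The slab debugging code fills freed objects with the byte 0x6b
(POISON_FREE), which is where the 6b6b6b6b in the oops below comes from.
Everything else here (the one-slot allocator and the toy_* names) is made
up for the example; the slot is static storage so that reading it after
toy_free() is well defined in this demo.

```c
#include <stdint.h>
#include <string.h>

#define POISON_FREE 0x6b	/* byte written over freed slab objects */

struct toy_node {
	struct toy_node *next;
	int value;
};

/* One-slot toy "allocator" backed by static storage. */
static struct toy_node slot;

struct toy_node *toy_alloc(void)
{
	memset(&slot, 0, sizeof(slot));
	return &slot;
}

/* "Free" the node by overwriting it with the poison byte. */
void toy_free(struct toy_node *n)
{
	memset(n, POISON_FREE, sizeof(*n));
}

/* Return the raw bits stored in the node's 'next' field. */
uintptr_t toy_next_bits(const struct toy_node *n)
{
	uintptr_t bits;

	memcpy(&bits, &n->next, sizeof(bits));
	return bits;
}
```

Without poisoning, the stale 'next' field usually still holds a plausible
pointer, so the loop appears to work; with poisoning, following it faults
immediately at an address made of 0x6b bytes.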

The crash can be reproduced by setting the option CONFIG_DEBUG_SLAB=y.
Then, add a macvlan interface and set it up.

$ ip link add link eth0 type macvlan

$ ip link set macvlan0 up

BUG: unable to handle kernel paging request at virtual address 6b6b6b6b
 printing eip:
c025e9b4
*pde = 
Oops:  [#1]
Modules linked in:
CPU:0
EIP:0060:[c025e9b4]Not tainted VLI
EFLAGS: 0282   (2.6.23-rc2-eb-netns #6)
EIP is at dev_mc_sync+0x5f/0x197
eax: 0025   ebx: c11e5dec   ecx:    edx: 0046
esi: 6b6b6b6b   edi: c1134060   ebp: c742fe6c   esp: c742fe48
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process ifconfig (pid: 937, ti=c742e000 task=c1128000 task.ti=c742e000)
Stack: c034c6dc 6b6b6b6b c1134060 c7bd2180  c1134218 c7bd2180 c7bd2338 
   1002 c742fe74 c02238a4 c742fe80 c025a9d8 c7bd2180 c742fe90 c025ab78 
   c7bd2180 1043 c742fe9c c025ce66 c7bd2180 c742fec0 c025b034 c7bd2180 
Call Trace:
 [c0102c66] show_trace_log_lvl+0x1a/0x2f
 [c0102d18] show_stack_log_lvl+0x9d/0xa5
 [c0102ede] show_registers+0x1be/0x28f
 [c0103097] die+0xe8/0x208
 [c010d555] do_page_fault+0x4ba/0x595
 [c02e3e62] error_code+0x6a/0x70
 [c02238a4] macvlan_set_multicast_list+0x15/0x17
 [c025a9d8] __dev_set_rx_mode+0x7e/0x81
 [c025ab78] dev_set_rx_mode+0x25/0x3a
 [c025ce66] dev_open+0x4b/0x6a
 [c025b034] dev_change_flags+0xa4/0x159
 [c028da20] devinet_ioctl+0x204/0x506
 [c028e082] inet_ioctl+0x86/0xa4
 [c02538f6] sock_ioctl+0x159/0x177
 [c0152ac4] do_ioctl+0x1c/0x51
 [c0152ce5] vfs_ioctl+0x1ec/0x203
 [c0152d2d] sys_ioctl+0x31/0x48
 [c01025ea] syscall_call+0x7/0xb
 ===
Code: 87 c8 01 00 00 00 00 00 00 8b b0 f8 00 00 00 c7 45 ec 00 00 00 00 e9 0a 
01 00 00 89 74 24 04 c7 04 24 dc c6 34 c0 e8 57 44 eb ff 8b 06 c7 04 24 f9 c6 
34 c0 89 44 24 04 e8 45 44 eb ff 80 7e 25 
EIP: [c025e9b4] dev_mc_sync+0x5f/0x197 SS:ESP 0068:c742fe48
Kernel panic - not syncing: Fatal exception in interrupt


I think the problem may also exist in dev_mc_unsync().

I have a patch that seems to fix the issue for me.

Hope this helps.

Regards,
Benjamin
-- 


[PATCH 1/1] net/core: Fix crash in dev_mc_sync()/dev_mc_unsync()

2007-08-21 Thread Benjamin Thery
This patch fixes a crash that may occur when the routine dev_mc_sync()
deletes an address from the list it is currently going through. It 
saves the pointer to the next element before deleting the current one.
The problem may also exist in dev_mc_unsync().

Signed-off-by: Benjamin Thery [EMAIL PROTECTED]
---
 net/core/dev_mcast.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

Index: linux-2.6.23-rc2/net/core/dev_mcast.c
===
--- linux-2.6.23-rc2.orig/net/core/dev_mcast.c
+++ linux-2.6.23-rc2/net/core/dev_mcast.c
@@ -116,11 +116,13 @@ int dev_mc_add(struct net_device *dev, v
  */
 int dev_mc_sync(struct net_device *to, struct net_device *from)
 {
-   struct dev_addr_list *da;
+   struct dev_addr_list *da, *next;
int err = 0;
 
netif_tx_lock_bh(to);
-	for (da = from->mc_list; da != NULL; da = da->next) {
+	da = from->mc_list;
+	while (da != NULL) {
+		next = da->next;
 		if (!da->da_synced) {
 			err = __dev_addr_add(&to->mc_list, &to->mc_count,
 					     da->da_addr, da->da_addrlen, 0);
@@ -134,6 +136,7 @@ int dev_mc_sync(struct net_device *to, s
 			__dev_addr_delete(&from->mc_list, &from->mc_count,
 					  da->da_addr, da->da_addrlen, 0);
}
+   da = next;
}
if (!err)
__dev_set_rx_mode(to);
@@ -156,12 +159,14 @@ EXPORT_SYMBOL(dev_mc_sync);
  */
 void dev_mc_unsync(struct net_device *to, struct net_device *from)
 {
-   struct dev_addr_list *da;
+   struct dev_addr_list *da, next;
 
netif_tx_lock_bh(from);
netif_tx_lock_bh(to);
 
-	for (da = from->mc_list; da != NULL; da = da->next) {
+	da = from->mc_list;
+	while (da != NULL) {
+		next = da->next;
 		if (!da->da_synced)
 			continue;
 		__dev_addr_delete(&to->mc_list, &to->mc_count,
@@ -169,6 +174,7 @@ void dev_mc_unsync(struct net_device *to
 		da->da_synced = 0;
 		__dev_addr_delete(&from->mc_list, &from->mc_count,
 				  da->da_addr, da->da_addrlen, 0);
+   da = next;
}
__dev_set_rx_mode(to);
 

-- 


Re: [RFC] restore netdev_priv optimization

2007-08-20 Thread Benjamin Thery

Hi,

David Miller wrote:

From: Stephen Hemminger [EMAIL PROTECTED]
Date: Fri, 17 Aug 2007 15:40:22 -0700


Compile tested only!!!


Obviously.  The first loopback transmit is guaranteed to crash.

[...]

And this also breaks loopback again, which uses a static struct netdev
in the kernel image, it doesn't use alloc_netdev(), so egress_subqueue
of loopback will be NULL.


Talking about loopback, don't you think it could be the right time
to make it behave like any other kind of net device, and allocate it
dynamically?

Having a dynamically allocated loopback could make maintenance easier
(removing special cases).
Also this is something we'll need to support multiple loopbacks for 
example for network namespaces.


Eric Biederman has written a nice patch that does this.
I'm using it on 2.6.23-rc2.

Benjamin

--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


L2 network namespaces + macvlan performances

2007-07-06 Thread Benjamin Thery

Following a discussion we had at OLS concerning L2 network namespace
performances and how the new macvlan driver could potentially improve
them, I've ported the macvlan patchset on top of Eric's net namespace
patchset on 2.6.22-rc4-mm2.

A little bit of history:

Some months ago, when we ran some performance tests (using netperf)
on net namespace, we observed the following things:

Using 'etun', the virtual ethernet tunnel driver, and IP routes
from inside a network namespace,

- The throughput is the same as the normal case(*)
  (* normal case: no namespace, using physical adapters).
  No regression. Good.

- But the CPU load increases a lot. Bad.
  The reasons are:
    - All checksums are done in software. No hardware offloading.
    - Every TCP packet going through the etun devices is
      duplicated in ip_forward() before we decrease the ttl.
      (packets are routed between both ends of etun)

We also did some testing with bridges and obtained the same results.
The CPU load increases:
- No hardware offloading
- Packets are duplicated somewhere in the bridge+netfilter
  code (can't remember where right now)


This time, I've replaced the etun interface with the new macvlan,
which should benefit from the hardware offloading capabilities of the
physical adapter and remove the forwarding step.

My test setup is:

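        Host A                           Host B
 ______________________          ______________________
|   _____________      |        |                      |
|  |  Netns 1    |     |        |                      |
|  |             |     |        |                      |
|  |  macvlan0   |     |        |                      |
|  |_____|_______|     |        |                      |
|        |             |        |                      |
|________|_____________|        |__________|___________|
         | eth0 (192.168.0.2)              | eth0 (192.168.0.1)
         |                                 |
         -----------------------------------
macvlan0 (192.168.0.3)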

- netperf runs on host A
- netserver runs on host B
- Adapter speed is 1 Gb/s

On this setup I ran the following netperf tests: TCP_STREAM, 
TCP_MAERTS, TCP_RR, UDP_STREAM, UDP_RR.


Between the normal case and the net namespace + macvlan case,
results are about the same for both the throughput and the local CPU
load for the following test types: TCP_MAERTS, TCP_RR, UDP_STREAM, UDP_RR.


macvlan looks like a very good candidate for network namespace in 
these cases.


But, with the TCP_STREAM test, I observed that the CPU load is about the
same (that's what we wanted) but the throughput decreases by about 5%:
from 850 Mb/s down to 810 Mb/s.
I haven't investigated yet why the throughput decreases in this case.
Does it come from my setup, from macvlan's additional processing, or from
something else? I don't know yet.


Attached to this email you'll find the raw netperf outputs for the 
three cases:


- netperf through a physical adapter, no namespace:
netperf-results-2.6.22-rc4-mm2-netns1-vanilla.txt   
- netperf through etun, inside a namespace:
netperf-results-2.6.22-rc4-mm2-netns1-using-etun.txt
- netperf through macvlan, inside a namespace:
netperf-results-2.6.22-rc4-mm2-netns1-using-macvlan.txt


macvlan looks promising.

Regards,
Benjamin

--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com
NETPERF RESULTS: the normal case : 

No network namespace, traffic goes through real 1 Gb/s physical adapters.


TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.76.1
(192.168.76.1) port 0 AF_INET : +/-2.5% @ 95% conf.
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384   1400    20.03       857.39   6.39     9.75     2.444   3.727


TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.76.1
(192.168.76.1) port 0 AF_INET : +/-2.5% @ 95% conf.
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  87380    20.03       763.15   4.75     10.33    2.038   4.434


TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.76.1
(192.168.76.1) port 0 AF_INET : +/-2.5% @ 95% conf.
Local /Remote
Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
Send   Recv   Size    Size   Time    Rate     local  remote local   remote
bytes  bytes  bytes   bytes  secs.

Re: [Devel] Re: [PATCH] Virtual ethernet tunnel

2007-06-07 Thread Benjamin Thery

David Miller wrote:

From: Kirill Korotaev [EMAIL PROTECTED]
Date: Thu, 07 Jun 2007 12:14:29 +0400


David Miller wrote:

From: Pavel Emelianov [EMAIL PROTECTED]
Date: Wed, 06 Jun 2007 19:11:38 +0400



Veth stands for Virtual ETHernet. It is a simple tunnel driver
that works at the link layer and looks like a pair of ethernet
devices interconnected with each other.


I would suggest choosing a different name.

'veth' is also the name of the virtualized ethernet device
found on IBM machines, driven by drivers/net/ibmveth.[ch]

AFAICS, ibmveth.c registers ethX devices, while this driver registers
vethX by default, so there is not much conflict IMHO.


If that's the case, veth is fine with me.


I like Daniel's proposals with the tunnel or pipe thing in the name.
I think it is more explicit about what the device really is.

I'm currently using etun, Eric Biederman's implementation. It would be
nice to have this kind of device merged.


-- Benjamin

--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


Re: L2 network namespace benchmarking (resend with Service Demand)

2007-04-06 Thread Benjamin Thery

Eric W. Biederman wrote:

Daniel Lezcano [EMAIL PROTECTED] writes:


Hi,

as Rick suggested, I added the Service Demand results to the matrix.


A couple of random thoughts in trying to understand the numbers you are
seeing.

- Checksum offloading?

  You have noted that with the bridge netfilter support disabled you
  are still seeing additional checksum overhead.  Just like you are
  seeing in the routing case.

  Is it possible the problem is simply that etun doesn't support
  checksum offloading, while your normal test hardware does?


Looks like you are 100% correct.
I feel a bit stupid that I didn't think about this small difference
between a real NIC and etun.


If I turn off checksum offloading on my physical NIC, the checksum
overhead (load) measured by oprofile is about the same in both cases:
when running netperf through a real NIC or through an etun tunnel.


Benjamin


- Tagged VLANs?
  
  Currently you have tested bridging and routing to get the packets to
  a network namespace.  Could you test tagged vlans?

  I'm just curious if we have anything in the network stack today that
  will multiplex a NIC without measurable overhead.

- Without NETNS?

  We should probably see if we can setup the same configuration we are
  testing without network namespaces (just multiple interfaces on the
  same machine) and see if we can still measure the same overhead.
  Just to confirm the overhead is not a network namespace related
  thing.

  I know we can configure the same case with bridging and I am fairly
  confident that we will see the same overhead without network
  namespaces.

  Off the top of my head I am insufficiently clever to think how we
  could configure the routing case without network namespaces,
  although we might be able to force it and if so it would be
  interesting to measure.

I will work to get the etun setup races fixed and to fix whatever
obvious feature deficiencies it has (like no configurable MTU support)
and see if I can get that pushed upstream.  That should make it easier
for other people to reproduce what we are seeing.

Eric
___
Containers mailing list
[EMAIL PROTECTED]
https://lists.linux-foundation.org/mailman/listinfo/containers




--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


Re: L2 network namespace benchmarking

2007-03-29 Thread Benjamin Thery

Eric W. Biederman wrote:

Daniel Lezcano [EMAIL PROTECTED] writes:


[...]


* When do you expect to have the network namespace in mainline?

My current goal is to finish my rebase against 2.6.linus_lastest in
the next couple of days after having figured out how to deal with sysfs.


Great news!
I also have some questions about this updated version:

- Have you integrated the bug fixes and cleanups(*) Daniel wrote for
your previous netns patchset (and the few glitches I reported too)?

(*) available in LXC8 patchset

- Do you already have a public git repository set up for the rebase?
- If it is private, any plan to make it public soon? (That would be great)


I have been reviewing more code than I know what to do with,
and fighting some very strange bugs during the stabilization window,
which has kept me from doing additional development.  Plus I have
had a cold.


I hope you're getting better... and you'll be able to provide us the
updated patchset very soon :)

[...]


If I read the results right it took a 32bit machine from AMD with
a gigabit interface before you could measure a throughput difference.
That isn't shabby for a non-optimized code path.


Indeed the throughput difference is not significant.
It is very good to see that it stays constant when using the container.
What I'm more worried about is the CPU load increase. But it seems
we've identified some of the culprits.

This afternoon I had a look at why the bridge setup isn't better than
the route setup (sections 2.3 and 2.4 of Daniel's report).

In the bridge case, we encounter the same problems as in the route case.
The oprofile profile is the same: the most demanding routines are
pskb_expand_head and csum_partial_copy_generic.
pskb_expand_head() is also called by skb_cow(), but this time
skb_cow() is called by netfilter's nf_bridge_copy_header().

We can avoid this copy by disabling the CONFIG_BRIDGE_NETFILTER option.
This copy is made even if netfilter is not used on the host.
Maybe some optimizations can be made in netfilter's code to prevent this.


Regards,
Benjamin

--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com



Re: [PATCH RFC 17/31] net: Factor out __dev_alloc_name from dev_alloc_name

2007-03-05 Thread Benjamin Thery

Hello Eric,

See comments about __dev_alloc_name() below.

Regards,
Benjamin

Eric W. Biederman wrote:

From: Eric W. Biederman [EMAIL PROTECTED] - unquoted

When forcibly changing the network namespace of a device
I need something that can generate a name for the device
in the new namespace without overwriting the old name.

__dev_alloc_name provides me that functionality.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 net/core/dev.c |   44 +---
 1 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 32fe905..fc0d2af 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -655,9 +655,10 @@ int dev_valid_name(const char *name)
 }
 
 /**

- * dev_alloc_name - allocate a name for a device
- * @dev: device
+ * __dev_alloc_name - allocate a name for a device
+ * @net: network namespace to allocate the device name in
  * @name: name format string
+ * @buf:  scratch buffer and result name string
  *
  * Passed a format string - eg "lt%d" it will try and find a suitable
  * id. It scans list of devices to build up a free map, then chooses
@@ -668,18 +669,13 @@ int dev_valid_name(const char *name)
  * Returns the number of the unit assigned or a negative errno code.
  */
 
-int dev_alloc_name(struct net_device *dev, const char *name)

+static int __dev_alloc_name(net_t net, const char *name, char buf[IFNAMSIZ])


IMHO the third parameter should be: char *buf
Using char buf[IFNAMSIZ] is misleading, because later in the
routine sizeof(buf) is used (with an expected result of IFNAMSIZ).
Unfortunately this is no longer the case: sizeof(buf) is now only 4
(buf is a pointer parameter).


This corrupts the registration of network devices (now I understand
why only one of my e1000 showed up after each reboot :).


Also, sizeof(buf) should be replaced by IFNAMSIZ in this new routine.
(See below)
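To illustrate the point about the parameter type, here is a small
stand-alone sketch (not the kernel code). NAME_LEN stands in for IFNAMSIZ,
and size_in_callee() and size_in_caller() are hypothetical names. The
array-style prototype and the pointer-style definition are accepted as the
same function, which shows that the compiler adjusts the array parameter to
a pointer.

```c
#include <stddef.h>

#define NAME_LEN 16	/* stands in for IFNAMSIZ; an assumption of this sketch */

/* Array-style declaration, as in the patch under review. */
static size_t size_in_callee(char buf[NAME_LEN]);

/*
 * Pointer-style definition of the very same function: the two
 * declarations are compatible because an array parameter is adjusted
 * to a pointer.
 */
static size_t size_in_callee(char *buf)
{
	return sizeof(buf);	/* size of a pointer, not NAME_LEN */
}

/* In the caller, buf is a real array, so sizeof yields NAME_LEN. */
static size_t size_in_caller(void)
{
	char buf[NAME_LEN];

	return sizeof(buf);
}
```

On a 32-bit machine size_in_callee() returns 4, which matches the value
reported above; on a 64-bit machine it would typically return 8.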


 {
int i = 0;
-   char buf[IFNAMSIZ];
const char *p;
const int max_netdevices = 8*PAGE_SIZE;
long *inuse;
struct net_device *d;
-   net_t net;
-
-   BUG_ON(null_net(dev->nd_net));
-   net = dev->nd_net;
 
 	p = strnchr(name, IFNAMSIZ-1, '%');

if (p) {
@@ -713,10 +709,8 @@ int dev_alloc_name(struct net_device *dev, const char *name)
}
 
 	snprintf(buf, sizeof(buf), name, i);


Replace this with snprintf(buf, IFNAMSIZ, name, i); or i will never be
appended to name and all your ethernet devices will try to
register the name eth.


There is another occurrence of snprintf(buf, sizeof(buf), ...) to
replace in the for loop above.
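A minimal user-space sketch of the truncation described above. The size 4
models sizeof(buf) on a pointer on a 32-bit machine; format_name() and
same_name() are hypothetical helpers, and NAME_LEN stands in for IFNAMSIZ.

```c
#include <stdio.h>
#include <string.h>

#define NAME_LEN 16	/* stands in for IFNAMSIZ; an assumption of this sketch */

/* Format a device name; 'size' is the limit handed to snprintf(). */
static void format_name(char *buf, size_t size, const char *fmt, int unit)
{
	/* snprintf() writes at most size - 1 characters plus the NUL. */
	snprintf(buf, size, fmt, unit);
}

/* Return 1 when both strings hold the same name, 0 otherwise. */
static int same_name(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}
```

With a limit of 4, format_name(buf, 4, "eth%d", 0) and
format_name(buf, 4, "eth%d", 1) both leave "eth" in buf, so every unit
collides on the same name. With NAME_LEN as the limit the unit number
survives.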



-   if (!__dev_get_by_name(net, buf)) {
-   strlcpy(dev->name, buf, IFNAMSIZ);
+   if (!__dev_get_by_name(net, buf))
return i;
-   }
 
 	/* It is possible to run out of possible slots

 * when the name is long and there isn't enough space left
@@ -725,6 +719,34 @@ int dev_alloc_name(struct net_device *dev, const char *name)
return -ENFILE;
 }
 
+/**

+ * dev_alloc_name - allocate a name for a device
+ * @dev: device
+ * @name: name format string
+ *
+ * Passed a format string - eg "lt%d" it will try and find a suitable
+ * id. It scans list of devices to build up a free map, then chooses
+ * the first empty slot. The caller must hold the dev_base or rtnl lock
+ * while allocating the name and adding the device in order to avoid
+ * duplicates.
+ * Limited to bits_per_byte * page size devices (ie 32K on most platforms).
+ * Returns the number of the unit assigned or a negative errno code.
+ */
+
+int dev_alloc_name(struct net_device *dev, const char *name)
+{
+   char buf[IFNAMSIZ];
+   net_t net;
+   int ret;
+
+   BUG_ON(null_net(dev->nd_net));
+   net = dev->nd_net;
+   ret = __dev_alloc_name(net, name, buf);
+   if (ret >= 0)
+   strlcpy(dev->name, buf, IFNAMSIZ);
+   return ret;
+}
+
 
 /**

  * dev_change_name - change name of a device



--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


Re: Lost packets after switching Wi-Fi AP

2006-10-31 Thread Benjamin Thery

Jiri Benc wrote:

On Mon, 30 Oct 2006 15:55:57 +0100, Benjamin Thery wrote:
When I switch my Mobile Node between 2 Wi-Fi Access Points, there is a 
period of time where all the packets I send are lost, although I got 
the netlink event SIOCGIWAP 'up' for the new AP. The device is 
supposed to be ready, but the packets are lost.


Which wireless card are you using? Which version of the kernel?


Hi Jiri,

The kernel version is 2.6.16.20 (the latest kernel version officially 
supported by MIPv6). I'd like to use a 2.6.19 but unfortunately not 
all the IPv6 mobility patches are in.


I reproduced the problem with an Intel Pro Wireless 2200 (latest 
driver version: 1.2.0) and a pcmcia D-Link Airplus G+ DWL-G650+ 
using the ndiswrapper version 1.25.


But I'm not sure the problem is wireless-specific.

And as I wrote in my first message I'm also surprised that when 
noop_enqueue() is used, the return code is NET_XMIT_CN, whereas the 
packet seems to be dropped.


Thanks for your help.

Benjamin


 Jiri




--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


Lost packets after switching Wi-Fi AP

2006-10-30 Thread Benjamin Thery

Hello,

I work on an extension of the mobility protocol for IPv6 
(FMIPv6-RFC4068) and I've noticed the following problem:


When I switch my Mobile Node between 2 Wi-Fi Access Points, there is a 
period of time where all the packets I send are lost, although I got 
the netlink event SIOCGIWAP 'up' for the new AP. The device is 
supposed to be ready, but the packets are lost.


By 'lost', I mean: silently discarded by the kernel, no error returned 
to the user-space application sending the packet, packets never appear 
on the network monitored with wireshark.


Here is the setup:
------------------

1. The daemon decides to switch from one AP to the other for some
reason (better link quality, ...) and sets the new ESSID, etc.


2. The daemon waits for the SIOCGIWAP 'up' netlink event.

3. SIOCGIWAP received: the daemon sends a unique Mobility Header 
packet using a raw socket to its new router to signal it has 
successfully moved. sendmsg() returned 0, no error, but the packet 
never shows up.


- The interface has an IPv6 address configured for the new network 
(previously created).

- There is a route between the node and the router.
- I set the socket option IPV6_RECVERR to get all the errors, but none 
shows up.
- The black hole period lasts for about 500ms after the SIOCGIWAP
event. Every packet sent during this period is lost.
- I tried to get the interface status before sending the packet
(ioctl(SIOCGIFFLAGS)) but I got a perfect IFF_UP|IFF_RUNNING.


What I've found in the kernel:
------------------------------

- The packet is discarded in the packet scheduler in 
net/sched/sch_generic.c::noop_enqueue() which returns NET_XMIT_CN.


- The error doesn't go up to the application because 
net/ipv6/ip6_output.c::ip6_push_pending_frames() filters this type of 
error (using net_xmit_errno(err)).


- noop_enqueue() is used to enqueue the packet because the device has 
been deactivated by link_watch_run_queue() calling dev_deactivate(). 
The device is re-activated about 500ms later.


- According to net/sched/sch_api.c, NET_XMIT_CN means "probably this
packet enqueued, but another one dropped". But it seems to me that
this packet IS actually dropped in noop_enqueue() (kfree_skb()).
Shouldn't the routine return NET_XMIT_DROP instead? The application
would then be able to get the error code when the device is
deactivated and retry later.
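The filtering described above can be modelled in user space roughly as
follows. This is a simplified sketch written from memory of kernels of that
era, not a copy of the kernel source: the constant values, the shape of the
net_xmit_errno() mapping and the recverr handling are assumptions, and the
model_* names are hypothetical.

```c
#include <errno.h>

/* Assumed to mirror the kernel's qdisc return codes of that era. */
#define NET_XMIT_SUCCESS 0
#define NET_XMIT_DROP    1
#define NET_XMIT_CN      2

/* Model of net_xmit_errno(): congestion notification is not an error. */
static int model_net_xmit_errno(int code)
{
	return code != NET_XMIT_CN ? -ENOBUFS : 0;
}

/*
 * Model of how the output path turns a positive qdisc code into the
 * value reported to the sender. 'recverr' models the socket's
 * IPV6_RECVERR setting.
 */
static int model_push_result(int xmit_code, int recverr)
{
	int err = xmit_code;

	if (err > 0)
		err = recverr ? model_net_xmit_errno(err) : 0;
	return err;
}
```

Under these assumptions, model_push_result(NET_XMIT_CN, 1) is 0 while
model_push_result(NET_XMIT_DROP, 1) is -ENOBUFS, which is why returning
NET_XMIT_DROP from noop_enqueue() would let the error reach the application.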



My questions:
-------------

- Am I doing something obviously wrong? Is there another event I 
should expect before sending my packet? An event that signals more 
reliably that the link is up and running and associated with the new AP?


- Shouldn't we change the return code in noop_enqueue()?


Thanks a lot for your help,
Benjamin

--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com