Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
From: Stephen Hemminger <[EMAIL PROTECTED]> Date: Tue, 11 Dec 2007 15:42:22 -0800 > Perhaps we should change the warning to identify the guilty device. Applied. Stephen, you often don't supply a proper signoff line for one-off changes like this and I find it very irritating. It doesn't cost you anything to hit one or two keys in your outgoing email editor to include the signoff line. So please do so, or I'll silently drop patches without them in the future. Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
From: Stephen Hemminger [EMAIL PROTECTED] Date: Tue, 11 Dec 2007 15:42:22 -0800 Perhaps we should change the warning to identify the guilty device. Applied. Stephen, you often don't supply a proper signoff line for one-off changes like this and I find it very irritating. It doesn't cost you anything to hit one or two keys in your outgoing email editor to include the signoff line. So please do so, or I'll silently drop patches without them in the future. Thanks. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
From: "Brandeburg, Jesse" <[EMAIL PROTECTED]> Date: Tue, 11 Dec 2007 16:38:37 -0800 > @@ -3933,6 +3933,10 @@ quit_polling: > e1000_set_itr(adapter); > netif_rx_complete(poll_dev, napi); > e1000_irq_enable(adapter); > + if (work_done == weight) > + return work_done - 1; > + else > + return work_done; Don't do this. If you processed "weight" worth of packets, return that exact value and do not netif_rx_complete() and do not re-enable interrupts. That is the only correct fix. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
From: Brandeburg, Jesse [EMAIL PROTECTED] Date: Tue, 11 Dec 2007 16:38:37 -0800 @@ -3933,6 +3933,10 @@ quit_polling: e1000_set_itr(adapter); netif_rx_complete(poll_dev, napi); e1000_irq_enable(adapter); + if (work_done == weight) + return work_done - 1; + else + return work_done; Don't do this. If you processed weight worth of packets, return that exact value and do not netif_rx_complete() and do not re-enable interrupts. That is the only correct fix. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
Joonwoo Park wrote: > 2007/12/12, Brandeburg, Jesse <[EMAIL PROTECTED]>: >> >> all drivers using NAPI in 2.6.24+ (NNAPI??) must return zero here, >> after calling netif_rx_complete. netif_rx_complete plus work_done >> != 0 causes a bug. >> > > Brandeburg, > Don't we need to return non-zero work_done after netif_rx_complete if > work_done != weight? Actually we just need to make sure we don't return work_done == weight as it is checked on line 2210 of dev.c. I should also note that I was wrong above, and that the requirement is that we MUST not return work_done == weight when indicating netif_rx_complete. So, returning 0 when we actually did some work, but are being removed from the poll list because something like !netif_running, is probably okay, but I'm sure someone will disagree with me. Maybe returning like this would be better diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index 724f067..76d5e3b 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -3933,6 +3933,10 @@ quit_polling: e1000_set_itr(adapter); netif_rx_complete(poll_dev, napi); e1000_irq_enable(adapter); + if (work_done == weight) + return work_done - 1; + else + return work_done; } return work_done; Jesse -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
2007/12/12, Brandeburg, Jesse <[EMAIL PROTECTED]>: > > all drivers using NAPI in 2.6.24+ (NNAPI??) must return zero here, after > calling netif_rx_complete. netif_rx_complete plus work_done != 0 causes > a bug. > Brandeburg, Don't we need to return non-zero work_done after netif_rx_complete if work_done != weight? Thanks, Joonwoo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
Perhaps we should change the warning to identify the guilty device. --- a/net/core/dev.c2007-11-19 09:09:57.0 -0800 +++ b/net/core/dev.c2007-12-07 15:54:03.0 -0800 @@ -2196,7 +2196,13 @@ static void net_rx_action(struct softirq if (test_bit(NAPI_STATE_SCHED, >state)) work = n->poll(n, weight); - WARN_ON_ONCE(work > weight); + if (unlikely(work > weight)) { + if (net_ratelimit()) + printk(KERN_WARNING + "%s: driver poll bug (work=%d weight=%d)\n", + work, weight); + work = weight; + } budget -= work; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
2007/12/12, Brandeburg, Jesse <[EMAIL PROTECTED]>: > Joonwoo Park wrote: > > /* If no Tx and not enough Rx work done, exit the polling mode */ > > if ((!tx_cleaned && (work_done == 0)) || > >!netif_running(poll_dev)) { > > quit_polling: > > if (likely(adapter->itr_setting & 3)) > > e1000_set_itr(adapter); > > netif_rx_complete(poll_dev, napi); > > e1000_irq_enable(adapter); > > all drivers using NAPI in 2.6.24+ (NNAPI??) must return zero here, after > calling netif_rx_complete. netif_rx_complete plus work_done != 0 causes > a bug. > > > } > > > > return work_done; > > } > > > I'm working another patch for drivers (maybe patches) Thanks. Joonwoo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
Joonwoo Park wrote: > /* If no Tx and not enough Rx work done, exit the polling mode */ > if ((!tx_cleaned && (work_done == 0)) || >!netif_running(poll_dev)) { > quit_polling: > if (likely(adapter->itr_setting & 3)) > e1000_set_itr(adapter); > netif_rx_complete(poll_dev, napi); > e1000_irq_enable(adapter); all drivers using NAPI in 2.6.24+ (NNAPI??) must return zero here, after calling netif_rx_complete. netif_rx_complete plus work_done != 0 causes a bug. > } > > return work_done; > } > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
2007/12/11, David Miller <[EMAIL PROTECTED]>: > From: "Joonwoo Park" <[EMAIL PROTECTED]> > Date: Tue, 11 Dec 2007 18:13:34 +0900 > > Joonwoo-ssi annyoung haseyo, Wow Great! :-) > How can the NAPI_STATE_SCHED bit be cleared externally yet we take > this list_move_tail() code path? > > If NAPI_STATE_SCHED is cleared, work will be zero which will never be > equal to 'weight', and this we'll never attempt the list_move_tail(). > > If something clears NAPI_STATE_SCHED meanwhile, we have a serious race > and your patch is an incomplete bandaid. For example, if it can > happen, then a case like: > >if (test_bit(NAPI_STATE_SCHED, >state)) >... something clears NAPI_STATE_SCHED right now ... >work = n->poll(n, weight); > > can crash too. > David, With your suggestions, I'm feeling a doubt at e1000. It's seems to me e1000_clean does call netif_rx_complete and returns non-zero work_done when !netif_running() So net_rx_action has (work == weight) as true with NAPI_STATE_SCHED cleared. Here is the e1000_clean. static int e1000_clean(struct napi_struct *napi, int budget) { /* Keep link state information with original netdev */ if (!netif_carrier_ok(poll_dev)) goto quit_polling; ... adapter->clean_rx(adapter, >rx_ring[0], _done, budget); /* If no Tx and not enough Rx work done, exit the polling mode */ if ((!tx_cleaned && (work_done == 0)) || !netif_running(poll_dev)) { quit_polling: if (likely(adapter->itr_setting & 3)) e1000_set_itr(adapter); netif_rx_complete(poll_dev, napi); e1000_irq_enable(adapter); } return work_done; } Thanks. Joonwoo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
From: Herbert Xu <[EMAIL PROTECTED]> Date: Tue, 11 Dec 2007 20:36:21 +0800 > David Miller <[EMAIL PROTECTED]> wrote: > > > > How can the NAPI_STATE_SCHED bit be cleared externally yet we take > > this list_move_tail() code path? > > His driver is probably buggy. When we had two drivers beginning > with e100 we often forgot to apply fixes to the both of them. Now > that we have three it's even more confusing. > > I just checked and indeed e1000e seems to be missing the NAPI fix > that was applied to e1000. Of course it doesn't rule out the > possibility of another NAPI bug in e1000. Thanks for checking. Indeed I stuck the huge comment there in net_rx_action() above the list move to try and explain things to people, so that if you saw a crash in the list manipulation, you're go check the driver first. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
David Miller <[EMAIL PROTECTED]> wrote: > > How can the NAPI_STATE_SCHED bit be cleared externally yet we take > this list_move_tail() code path? His driver is probably buggy. When we had two drivers beginning with e100 we often forgot to apply fixes to the both of them. Now that we have three it's even more confusing. I just checked and indeed e1000e seems to be missing the NAPI fix that was applied to e1000. Of course it doesn't rule out the possibility of another NAPI bug in e1000. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
From: "Joonwoo Park" <[EMAIL PROTECTED]> Date: Tue, 11 Dec 2007 18:13:34 +0900 Joonwoo-ssi annyoung haseyo, > [NET]: Fix Ooops of napi net_rx_action. > Before doing list_move_tail napi poll_list, it should be ensured > > Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]> > --- > diff --git a/net/core/dev.c b/net/core/dev.c > index 86d6261..74bd5ab 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -2207,7 +2207,8 @@ static void net_rx_action(struct softirq_action *h) >* still "owns" the NAPI instance and therefore can >* move the instance around on the list at-will. >*/ > - if (unlikely(work == weight)) > + if (unlikely((test_bit(NAPI_STATE_SCHED, >state)) > + && (work == weight))) > list_move_tail(>poll_list, list); > > netpoll_poll_unlock(have); How can the NAPI_STATE_SCHED bit be cleared externally yet we take this list_move_tail() code path? If NAPI_STATE_SCHED is cleared, work will be zero which will never be equal to 'weight', and this we'll never attempt the list_move_tail(). If something clears NAPI_STATE_SCHED meanwhile, we have a serious race and your patch is an incomplete bandaid. For example, if it can happen, then a case like: if (test_bit(NAPI_STATE_SCHED, >state)) ... something clears NAPI_STATE_SCHED right now ... work = n->poll(n, weight); can crash too. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] [NET]: Fix Ooops of napi net_rx_action.
[NET]: Fix Ooops of napi net_rx_action. Before doing list_move_tail napi poll_list, it should be ensured Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]> --- diff --git a/net/core/dev.c b/net/core/dev.c index 86d6261..74bd5ab 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2207,7 +2207,8 @@ static void net_rx_action(struct softirq_action *h) * still "owns" the NAPI instance and therefore can * move the instance around on the list at-will. */ - if (unlikely(work == weight)) + if (unlikely((test_bit(NAPI_STATE_SCHED, >state)) + && (work == weight))) list_move_tail(>poll_list, list); netpoll_poll_unlock(have); --- I was able to make it real oops on 2.6.24-rc4 + e1000 napi + smp. eth0 of my laptop is connected directly to a packet generator. The packet generator is making unicast udp packets to laptop (approximately 7pps). At that time, if the interface is went down, Ooops occurred. Thanks. Joonwoo [EMAIL PROTECTED]:~# ifconfig eth0 down BUG: unable to handle kernel paging request at virtual address 00100104 printing eip: c42ecc35 *pdpt = 1fc89001 *pde = Ooops: 0002 [#1] SMP Modules linked in: Pid: 4, comm:ksoftirqd/0 Not tainted (2.6.24-rc4 #27) EIP: 0060:[] EFLAGS: 00010046 CPU: 0 EIP is at net_rx_action+0x1d5/0x1f0 EAX: 00100100 EBX: f77c965c ECX: EDX: 00200200 ESI: 0040 EDI: f77c965c EBP: f7c6df94 ESP: f7c6df64 DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 Process ksoftirqd/0 (pid: 4, ti=f7c6c000 task=f7c68ab0 task.ti=f7c6c000) Stack: 0002 0001 c42ecabe c4598f40 c1d0f5cc c7da 00ac f77c965c 0040 0001 c4543b18 c4598f40 f7c6dfb0 c4032a07 0004 0246 c4032f60 f7c6dfbc c4032ad7 c459ba80 f6c6dfcc c4032fb9 Call Trace: [] show_trace_log_lvl+0x1a/0x30 [] show_stack_log_lvl+0xa9/0xd0 [] show_registers+0xca/0x1c0 [] die+0x115/0x230 [] do_page_fault+0x34c/0x7f0 [] error_code+0x72/0x78 [] __do_softirq+0x87/0x60 [] do_softirq+0x57/0xe0 [] kthread+0x42/0x70 [] kernel_thread_helper+0x7/0x14 gdb outputs (gdb) info line *net_rx_action+0x1d5 Line 157 of "include/linux/list.h" starts at address 0xc42ecc35 and ends at 0xc42ecc38 . (gdb) l include/linux/list.h:157 152 * This is only for internal list manipulation where we know 153 * the prev/next entries already! 154 */ 155 static inline void __list_del(struct list_head * prev, struct list_head * next) 156 { 157 next->prev = prev; 158 prev->next = next; 159 } 160 161 /** (gdb) info line *__do_softirq+0x87 Line 130 of "include/linux/rcupdate.h" starts at address 0xc4032a07 <__do_softirq+135> and ends at 0xc4032a0a <__do_softirq+138>. (gdb) l include/linux/rcupdate.h:130 125 struct rcu_data *rdp = _cpu(rcu_data, cpu); 126 rdp->passed_quiesc = 1; 127 } 128 static inline void rcu_bh_qsctr_inc(int cpu) 129 { 130 struct rcu_data *rdp = _cpu(rcu_bh_data, cpu); 131 rdp->passed_quiesc = 1; 132 } 133 134 extern int rcu_pending(int cpu); (gdb) info line *do_softirq+0x57 Line 269 of "kernel/softirq.c" starts at address 0xc4032ad2 and ends at 0xc4032ae0 . (gdb) l kernel/softirq.c:269 264 local_irq_save(flags); 265 266 pending = local_softirq_pending(); 267 268 if (pending) 269 __do_softirq(); 270 271 local_irq_restore(flags); 272 } 273 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] [NET]: Fix Ooops of napi net_rx_action.
[NET]: Fix Ooops of napi net_rx_action. Before doing list_move_tail napi poll_list, it should be ensured Signed-off-by: Joonwoo Park [EMAIL PROTECTED] --- diff --git a/net/core/dev.c b/net/core/dev.c index 86d6261..74bd5ab 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2207,7 +2207,8 @@ static void net_rx_action(struct softirq_action *h) * still owns the NAPI instance and therefore can * move the instance around on the list at-will. */ - if (unlikely(work == weight)) + if (unlikely((test_bit(NAPI_STATE_SCHED, n-state)) +(work == weight))) list_move_tail(n-poll_list, list); netpoll_poll_unlock(have); --- I was able to make it real oops on 2.6.24-rc4 + e1000 napi + smp. eth0 of my laptop is connected directly to a packet generator. The packet generator is making unicast udp packets to laptop (approximately 7pps). At that time, if the interface is went down, Ooops occurred. Thanks. Joonwoo [EMAIL PROTECTED]:~# ifconfig eth0 down BUG: unable to handle kernel paging request at virtual address 00100104 printing eip: c42ecc35 *pdpt = 1fc89001 *pde = Ooops: 0002 [#1] SMP Modules linked in: Pid: 4, comm:ksoftirqd/0 Not tainted (2.6.24-rc4 #27) EIP: 0060:[c42ecc35] EFLAGS: 00010046 CPU: 0 EIP is at net_rx_action+0x1d5/0x1f0 EAX: 00100100 EBX: f77c965c ECX: EDX: 00200200 ESI: 0040 EDI: f77c965c EBP: f7c6df94 ESP: f7c6df64 DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 Process ksoftirqd/0 (pid: 4, ti=f7c6c000 task=f7c68ab0 task.ti=f7c6c000) Stack: 0002 0001 c42ecabe c4598f40 c1d0f5cc c7da 00ac f77c965c 0040 0001 c4543b18 c4598f40 f7c6dfb0 c4032a07 0004 0246 c4032f60 f7c6dfbc c4032ad7 c459ba80 f6c6dfcc c4032fb9 Call Trace: [c40053ca] show_trace_log_lvl+0x1a/0x30 [c4005489] show_stack_log_lvl+0xa9/0xd0 [c400557a] show_registers+0xca/0x1c0 [c4005785] die+0x115/0x230 [c4020f0c] do_page_fault+0x34c/0x7f0 [c43c9cea] error_code+0x72/0x78 [c4032a07] __do_softirq+0x87/0x60 [c4032ad7] do_softirq+0x57/0xe0 [c4041ee2] kthread+0x42/0x70 [c4004fc3] kernel_thread_helper+0x7/0x14 gdb outputs (gdb) info line *net_rx_action+0x1d5 Line 157 of include/linux/list.h starts at address 0xc42ecc35 net_rx_action+469 and ends at 0xc42ecc38 net_rx_action+472. (gdb) l include/linux/list.h:157 152 * This is only for internal list manipulation where we know 153 * the prev/next entries already! 154 */ 155 static inline void __list_del(struct list_head * prev, struct list_head * next) 156 { 157 next-prev = prev; 158 prev-next = next; 159 } 160 161 /** (gdb) info line *__do_softirq+0x87 Line 130 of include/linux/rcupdate.h starts at address 0xc4032a07 __do_softirq+135 and ends at 0xc4032a0a __do_softirq+138. (gdb) l include/linux/rcupdate.h:130 125 struct rcu_data *rdp = per_cpu(rcu_data, cpu); 126 rdp-passed_quiesc = 1; 127 } 128 static inline void rcu_bh_qsctr_inc(int cpu) 129 { 130 struct rcu_data *rdp = per_cpu(rcu_bh_data, cpu); 131 rdp-passed_quiesc = 1; 132 } 133 134 extern int rcu_pending(int cpu); (gdb) info line *do_softirq+0x57 Line 269 of kernel/softirq.c starts at address 0xc4032ad2 do_softirq+82 and ends at 0xc4032ae0 tasklet_kill. (gdb) l kernel/softirq.c:269 264 local_irq_save(flags); 265 266 pending = local_softirq_pending(); 267 268 if (pending) 269 __do_softirq(); 270 271 local_irq_restore(flags); 272 } 273 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
From: Joonwoo Park [EMAIL PROTECTED] Date: Tue, 11 Dec 2007 18:13:34 +0900 Joonwoo-ssi annyoung haseyo, [NET]: Fix Ooops of napi net_rx_action. Before doing list_move_tail napi poll_list, it should be ensured Signed-off-by: Joonwoo Park [EMAIL PROTECTED] --- diff --git a/net/core/dev.c b/net/core/dev.c index 86d6261..74bd5ab 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2207,7 +2207,8 @@ static void net_rx_action(struct softirq_action *h) * still owns the NAPI instance and therefore can * move the instance around on the list at-will. */ - if (unlikely(work == weight)) + if (unlikely((test_bit(NAPI_STATE_SCHED, n-state)) + (work == weight))) list_move_tail(n-poll_list, list); netpoll_poll_unlock(have); How can the NAPI_STATE_SCHED bit be cleared externally yet we take this list_move_tail() code path? If NAPI_STATE_SCHED is cleared, work will be zero which will never be equal to 'weight', and this we'll never attempt the list_move_tail(). If something clears NAPI_STATE_SCHED meanwhile, we have a serious race and your patch is an incomplete bandaid. For example, if it can happen, then a case like: if (test_bit(NAPI_STATE_SCHED, n-state)) ... something clears NAPI_STATE_SCHED right now ... work = n-poll(n, weight); can crash too. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
David Miller [EMAIL PROTECTED] wrote: How can the NAPI_STATE_SCHED bit be cleared externally yet we take this list_move_tail() code path? His driver is probably buggy. When we had two drivers beginning with e100 we often forgot to apply fixes to the both of them. Now that we have three it's even more confusing. I just checked and indeed e1000e seems to be missing the NAPI fix that was applied to e1000. Of course it doesn't rule out the possibility of another NAPI bug in e1000. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED] Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
From: Herbert Xu [EMAIL PROTECTED] Date: Tue, 11 Dec 2007 20:36:21 +0800 David Miller [EMAIL PROTECTED] wrote: How can the NAPI_STATE_SCHED bit be cleared externally yet we take this list_move_tail() code path? His driver is probably buggy. When we had two drivers beginning with e100 we often forgot to apply fixes to the both of them. Now that we have three it's even more confusing. I just checked and indeed e1000e seems to be missing the NAPI fix that was applied to e1000. Of course it doesn't rule out the possibility of another NAPI bug in e1000. Thanks for checking. Indeed I stuck the huge comment there in net_rx_action() above the list move to try and explain things to people, so that if you saw a crash in the list manipulation, you're go check the driver first. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
2007/12/11, David Miller [EMAIL PROTECTED]: From: Joonwoo Park [EMAIL PROTECTED] Date: Tue, 11 Dec 2007 18:13:34 +0900 Joonwoo-ssi annyoung haseyo, Wow Great! :-) How can the NAPI_STATE_SCHED bit be cleared externally yet we take this list_move_tail() code path? If NAPI_STATE_SCHED is cleared, work will be zero which will never be equal to 'weight', and this we'll never attempt the list_move_tail(). If something clears NAPI_STATE_SCHED meanwhile, we have a serious race and your patch is an incomplete bandaid. For example, if it can happen, then a case like: if (test_bit(NAPI_STATE_SCHED, n-state)) ... something clears NAPI_STATE_SCHED right now ... work = n-poll(n, weight); can crash too. David, With your suggestions, I'm feeling a doubt at e1000. It's seems to me e1000_clean does call netif_rx_complete and returns non-zero work_done when !netif_running() So net_rx_action has (work == weight) as true with NAPI_STATE_SCHED cleared. Here is the e1000_clean. static int e1000_clean(struct napi_struct *napi, int budget) { /* Keep link state information with original netdev */ if (!netif_carrier_ok(poll_dev)) goto quit_polling; ... adapter-clean_rx(adapter, adapter-rx_ring[0], work_done, budget); /* If no Tx and not enough Rx work done, exit the polling mode */ if ((!tx_cleaned (work_done == 0)) || !netif_running(poll_dev)) { quit_polling: if (likely(adapter-itr_setting 3)) e1000_set_itr(adapter); netif_rx_complete(poll_dev, napi); e1000_irq_enable(adapter); } return work_done; } Thanks. Joonwoo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
Joonwoo Park wrote: /* If no Tx and not enough Rx work done, exit the polling mode */ if ((!tx_cleaned (work_done == 0)) || !netif_running(poll_dev)) { quit_polling: if (likely(adapter-itr_setting 3)) e1000_set_itr(adapter); netif_rx_complete(poll_dev, napi); e1000_irq_enable(adapter); all drivers using NAPI in 2.6.24+ (NNAPI??) must return zero here, after calling netif_rx_complete. netif_rx_complete plus work_done != 0 causes a bug. } return work_done; } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
2007/12/12, Brandeburg, Jesse [EMAIL PROTECTED]: Joonwoo Park wrote: /* If no Tx and not enough Rx work done, exit the polling mode */ if ((!tx_cleaned (work_done == 0)) || !netif_running(poll_dev)) { quit_polling: if (likely(adapter-itr_setting 3)) e1000_set_itr(adapter); netif_rx_complete(poll_dev, napi); e1000_irq_enable(adapter); all drivers using NAPI in 2.6.24+ (NNAPI??) must return zero here, after calling netif_rx_complete. netif_rx_complete plus work_done != 0 causes a bug. } return work_done; } I'm working another patch for drivers (maybe patches) Thanks. Joonwoo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
Perhaps we should change the warning to identify the guilty device. --- a/net/core/dev.c2007-11-19 09:09:57.0 -0800 +++ b/net/core/dev.c2007-12-07 15:54:03.0 -0800 @@ -2196,7 +2196,13 @@ static void net_rx_action(struct softirq if (test_bit(NAPI_STATE_SCHED, n-state)) work = n-poll(n, weight); - WARN_ON_ONCE(work weight); + if (unlikely(work weight)) { + if (net_ratelimit()) + printk(KERN_WARNING + %s: driver poll bug (work=%d weight=%d)\n, + work, weight); + work = weight; + } budget -= work; -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
2007/12/12, Brandeburg, Jesse [EMAIL PROTECTED]: all drivers using NAPI in 2.6.24+ (NNAPI??) must return zero here, after calling netif_rx_complete. netif_rx_complete plus work_done != 0 causes a bug. Brandeburg, Don't we need to return non-zero work_done after netif_rx_complete if work_done != weight? Thanks, Joonwoo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] [NET]: Fix Ooops of napi net_rx_action.
Joonwoo Park wrote: 2007/12/12, Brandeburg, Jesse [EMAIL PROTECTED]: all drivers using NAPI in 2.6.24+ (NNAPI??) must return zero here, after calling netif_rx_complete. netif_rx_complete plus work_done != 0 causes a bug. Brandeburg, Don't we need to return non-zero work_done after netif_rx_complete if work_done != weight? Actually we just need to make sure we don't return work_done == weight as it is checked on line 2210 of dev.c. I should also note that I was wrong above, and that the requirement is that we MUST not return work_done == weight when indicating netif_rx_complete. So, returning 0 when we actually did some work, but are being removed from the poll list because something like !netif_running, is probably okay, but I'm sure someone will disagree with me. Maybe returning like this would be better diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index 724f067..76d5e3b 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -3933,6 +3933,10 @@ quit_polling: e1000_set_itr(adapter); netif_rx_complete(poll_dev, napi); e1000_irq_enable(adapter); + if (work_done == weight) + return work_done - 1; + else + return work_done; } return work_done; Jesse -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/