On Tuesday 17 July 2007 10:57, Ingo Molnar wrote:
> i've got a new observation: changing CONFIG_HZ from 250 to 1000 makes
> the problem go away. So it's somehow also related to jiffies.

There are several "Tx Hang detected" messages in the log, which looks
a lot as if net_rx_action never runs, or at least never calls dev->poll
on the e1000 nic. Can you check whether/how often it bails out of
net_rx_action taking the softnet_break path?

My suspicion right now is that dev->quota goes way negative when
pushing out netconsole output. Normally, we bump the quota in
__net_rx_schedule. But the whole point of the patch is that netpoll
has no business removing the device from the poll_list, so it stays
there, and we don't end up calling __net_rx_schedule as often as we
would otherwise.

Can you try what happens if you change netif_rx_complete to something
like this:

	if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state)) {
		dev->quota = dev->weight;
		return;
	}

This is just a hack to make sure that we don't go to insanely negative
quotas while sending packets through netpoll.

Olaf
-- 
Olaf Kirch         |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED]  |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax