Hi Thomas,
Could you try this patch, please?
If you are running multi-threaded Click on SMP Linux, I recommend
applying the dev_watchdog patch as well:
https://pdos.csail.mit.edu/pipermail/click/2007-October/006436.html
This patch should fix the following problems (I hope):
- ifconfig down && up hangs on a polling device.
- Link state is detected incorrectly on a polling device.
- A stopped queue is never woken on a polling device.
Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]>
---
drivers/e1000-7.x/src/e1000_main.c | 7 +++++++
elements/linuxmodule/anydevice.hh | 5 +++--
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/e1000-7.x/src/e1000_main.c b/drivers/e1000-7.x/src/e1000_main.c
index 2a3c5f0..95cca7f 100644
--- a/drivers/e1000-7.x/src/e1000_main.c
+++ b/drivers/e1000-7.x/src/e1000_main.c
@@ -745,6 +745,12 @@ e1000_reinit_locked(struct e1000_adapter *adapter)
e1000_down(adapter);
e1000_up(adapter);
clear_bit(__E1000_RESETTING, &adapter->flags);
+
+ if (adapter->netdev->polling) {
+ mod_timer(&adapter->tx_fifo_stall_timer, jiffies + 1);
+ mod_timer(&adapter->watchdog_timer, jiffies + 1);
+ mod_timer(&adapter->phy_info_timer, jiffies + 1);
+ }
}
void
@@ -5780,6 +5786,7 @@ e1000_tx_pqueue(struct net_device *netdev, struct sk_buff *skb)
if(E1000_DESC_UNUSED(adapter->tx_ring) <= (txd_needed + 1)) {
adapter->net_stats.tx_dropped++;
netif_stop_queue(netdev);
+ mod_timer(&adapter->tx_fifo_stall_timer, jiffies + 1);
return -1;
}
diff --git a/elements/linuxmodule/anydevice.hh b/elements/linuxmodule/anydevice.hh
index 84eb658..16ee738 100644
--- a/elements/linuxmodule/anydevice.hh
+++ b/elements/linuxmodule/anydevice.hh
@@ -173,6 +173,7 @@ class AnyDeviceMap { public:
AnyDevice *_unknown_map;
AnyDevice *_map[MAP_SIZE];
rwlock_t _lock;
+ unsigned long _flags;
};
@@ -190,7 +191,7 @@ inline void
AnyDeviceMap::lock(bool write)
{
if (write)
- write_lock_bh(&_lock);
+ write_lock_irqsave(&_lock, _flags);
else
read_lock(&_lock);
}
@@ -199,7 +200,7 @@ inline void
AnyDeviceMap::unlock(bool write)
{
if (write)
- write_unlock_bh(&_lock);
+ write_unlock_irqrestore(&_lock, _flags);
else
read_unlock(&_lock);
}
---
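For readers without the driver source at hand, here is a toy user-space model of what the mod_timer() calls in the patch are meant to achieve. It is only a sketch, not driver code: toy_timer, toy_mod_timer, toy_del_timer and toy_simulate are made-up names, and the premise that a reset leaves the timers cancelled while the device is polling is an assumption, not something confirmed in this thread. The model shows that a watchdog which re-arms itself only from its own callback stays silent once something cancels it, unless the cancelling path re-arms it explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a one-shot kernel timer (hypothetical, user space only). */
struct toy_timer {
    bool armed;
    unsigned long expires;   /* absolute tick at which the timer fires */
};

/* Analogue of mod_timer(): arm or re-arm the timer for an absolute tick. */
static void toy_mod_timer(struct toy_timer *t, unsigned long expires)
{
    assert(t != NULL);
    t->armed = true;
    t->expires = expires;
}

/* Analogue of del_timer(): cancel a pending timer. */
static void toy_del_timer(struct toy_timer *t)
{
    assert(t != NULL);
    t->armed = false;
}

/*
 * Run a self-re-arming watchdog for `ticks` ticks. At tick `reset_at` the
 * timer is cancelled, standing in for a device reset (pass 0 for no reset).
 * If `rearm_after_reset` is true, the reset path re-arms the timer itself,
 * in the spirit of mod_timer(..., jiffies + 1) in the patch.
 * Returns the number of times the watchdog callback ran.
 */
static int toy_simulate(unsigned long ticks, unsigned long reset_at,
                        bool rearm_after_reset)
{
    struct toy_timer wd = { false, 0 };
    int calls = 0;

    toy_mod_timer(&wd, 1);
    for (unsigned long now = 1; now <= ticks; now++) {
        if (wd.armed && now >= wd.expires) {
            wd.armed = false;             /* one-shot: disarm, then run */
            calls++;
            toy_mod_timer(&wd, now + 2);  /* the callback re-arms itself */
        }
        if (now == reset_at) {
            toy_del_timer(&wd);           /* reset cancels the watchdog */
            if (rearm_after_reset)
                toy_mod_timer(&wd, now + 1);
        }
    }
    return calls;
}
```

In this model, toy_simulate(10, 4, false) stops counting at the reset, while toy_simulate(10, 4, true) keeps running callbacks until the last tick, just as a run with no reset does.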
Thanks.
Joonwoo Park (Jason Park)
On Tue, 2007-10-02 at 11:09 -0500, Paine, Thomas Asa wrote:
> Joonwoo,
> I'm not calling up and down from the OS after Click is installed.
> I'm simply disconnecting the network cables. It appears to affect only
> the first link event. Other polling NICs continue to work (even if I pull
> and reconnect their cables).
> I added some debugging to the watchdog handlers, and the watchdog
> callback continues to get called for all NICs after the first one goes down,
> but there are no more watchdog events for the first one (the first one can be
> any of the polling NICs). So whichever NIC goes down first never gets
> a chance to have its link detected in the e1000_watchdog_1() function.
>
> Example (output below): I have eth0, eth1, and eth2 polling. I pull
> the cable on eth2; you see it drop, but you also see the watchdog events stop
> for eth2. I then pull eth1 and you see it drop, but the watchdog
> callbacks continue for eth1, so it is able to sense the link coming back up.
> Why does the first link-down event cause that NIC's watchdog to stop?
>
> e1000_poll_on
> e1000_poll_on
> e1000_poll_on
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog_1: NIC Link is Down
> e1000: eth2: e1000_watchdog: called on polling adapter.
> e1000: eth2: e1000_watchdog_1: NIC Link is Down
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth2: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog_1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> e1000: eth2: e1000_watchdog: called on polling adapter.
> e1000: eth2: e1000_watchdog_1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> ToDevice eth0 rejected a packet!
> chatter: sniffQueue :: Queue: overflow
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth2: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth2: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth2: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth2: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth2: e1000_watchdog: called on polling adapter.
> ToDevice eth2 rejected a packet!
> chatter: outFlows/outputQueue :: Queue: overflow
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth2: e1000_watchdog: called on polling adapter.
> e1000: eth2: e1000_watchdog_1: NIC Link is Down
>
> < !!!! no more eth2 watchdog callbacks getting called !!!! >
>
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog_1: NIC Link is Down
>
> < eth1 goes down, but the watchdogs continue, which is good >
>
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog_1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
>
> < eth1 comes up, and processing continues >
>
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
> e1000: eth0: e1000_watchdog: called on polling adapter.
> e1000: eth1: e1000_watchdog: called on polling adapter.
>
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Thomas Paine [EMAIL PROTECTED])}
> University of Wisconsin - Eau Claire
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> From: Joonwoo Park [mailto:[EMAIL PROTECTED]
> Sent: Monday, October 01, 2007 9:12 PM
> To: Paine, Thomas Asa
> Cc: [email protected]
> Subject: Re: [Click] 7.x e1000 on click 1.6.0 w/ 2.6.19.2 kernel
>
> Hi Thomas,
> It seems you are using PollDevice, am I right?
> I couldn't reproduce that problem. Which e1000 NIC are you using (e.g. 82546GB)?
> While looking into the problem you posted, I found that ifconfig
> down & up doesn't work on an interface that is running PollDevice.
> I think it may be related to your problem.
>
> Thanks.
> Joonwoo Park (Jason Park)
> 2007/10/2, Paine, Thomas Asa < [EMAIL PROTECTED]>:
> While updating one of my Click packages to run under Click 1.6.0 (git
> pulled this morning), and after swimming through the ChangeLogs for the last
> year, I got things up and running. However, I noticed that if the link
> drops, the NIC driver will not recover unless the click module is removed.
> My unit will be handling traffic, but when I pull a network cable, I see the
> watchdog message for the link going down, and the link does not come back up
> until I remove the click kernel module.
> My production units are running a ~12/2006 CVS release of 1.5.0
> on a 2.6.16.13 kernel with, I think, a 6.x version of the e1000 driver, and
> they do not have the problem I described. So I'm not sure what changes
> prompted the problem I'm seeing.
>
> < disconnect network cable >
> e1000: eth1: e1000_watchdog_1: NIC Link is Down
> < restore cable connection, but no further dmesg occurs >
> < run click-uninstall >
> e1000_poll_off
> e1000: eth1: e1000_watchdog_1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> e1000_poll_off
> e1000_poll_off
> click: stopping router thread pid 1307
> poll e0f7b360: 3684/920934 freed, 1437/460660 allocated
> poll e0f7b580: 8779/5325393 freed, 8329/2658970 allocated
> click module exiting
> click error: 683 outstanding news
>
> Just thought I would post this in case there are some open issues
> that I'm not aware of and to get a thread started. I'll be doing some more
> digging as well.
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Thomas Paine [EMAIL PROTECTED])}
> University of Wisconsin - Eau Claire
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>
>
> _______________________________________________
> click mailing list
> [email protected]
> https://amsterdam.lcs.mit.edu/mailman/listinfo/click