Even though your code change makes the issue go away, you still don't know what the underlying problem is. Before a patch can be applied, the root cause of the issue needs to be identified, and there isn't enough information here to understand what the issue really is. The e1000 driver has been in use for a _very_ long time without this issue being reported, so it's far from clear that the driver is at fault.
So more information is needed. What kernel version is being used? What does the .config for that kernel enable? What is your test doing? What other HW is in the system? Does dmesg show other events in the system that could be causing problems? What kind of system HW is this (type of motherboard, etc.)? Exactly which e1000 device is being used? There may be more questions after you answer these.

Cheers,
John

> -----Original Message-----
> From: zhuyj [mailto:[email protected]]
> Sent: Thursday, August 15, 2013 12:09 AM
> To: [email protected]
> Subject: Re: [E1000-devel] e1000 nic hang after a long time running
>
> Hi, maintainer
>
> Would you like to comment on this patch?
> Thanks a lot.
>
> Best Regards!
> Zhu Yanjun
>
> On 08/15/2013 03:01 PM, zhuyj wrote:
> > Hi,
> >
> > After a long time networking test case running, e1000 NIC driver may
> > not work anymore. At this time, system is okay, we can execute some
> > non-network command(such as ls, cp etc.), but if we execute network
> > command(ifconfig), system will hang there, can not get response
> > anymore.
> >
> > We add some log in driver and found this was caused by mutex nest, it
> > means normaly, one mutex got and then release, another mutex was got,
> > but when issue occur, from log, the first mutex was got, did not
> > release, then got mutex again:
> >
> > /*****************************************************/
> > Jul 6 19:08:28 localhost kernel: e1000 0000:02:08.0: eth7: e1000_reinit_safe set __E1000_RESETTING
> > Jul 6 19:08:28 localhost kernel: e1000 0000:02:08.0: eth7: e1000_reinit_safe take adapter's mutex
> > Jul 6 19:08:28 localhost kernel: e1000 0000:02:08.0: eth7: e1000_watchdog take adapter's mutex
> > Jul 6 19:08:28 localhost kernel: e1000 0000:02:03.0: eth3: e1000_reinit_safe release adapter's mutex
> > Jul 6 19:08:28 localhost kernel: e1000 0000:02:03.0: eth3: e1000_reinit_safe reset __E1000_RESETTING
> > /*****************************************************/
> >
> > We made the following patch and applied this patch. This problem
> > disappeared.
> > Please comment on this patch.
> > Thanks a lot.
> >
> > /***********************************************/
> > diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
> > index 7569ebb..2878308 100644
> > --- a/drivers/net/ethernet/intel/e1000/e1000_main.c
> > +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
> > @@ -2441,7 +2441,8 @@ static void e1000_watchdog(struct work_struct *work)
> >      struct e1000_tx_ring *txdr = adapter->tx_ring;
> >      u32 link, tctl;
> >
> > -    if (test_bit(__E1000_DOWN, &adapter->flags))
> > +    if (test_bit(__E1000_DOWN, &adapter->flags) ||
> > +        test_bit(__E1000_RESETTING, &adapter->flags))
> >          return;
> >
> > /***********************************************/
> >
> > zhuyj
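
For readers following the quoted patch, here is a minimal userspace sketch of the kind of guard it adds: the watchdog returns early while a "resetting" bit is set, so it never tries to take the adapter mutex during a reset. This is not the driver code. It uses pthreads and C11 atomics in place of kernel primitives, and every name in it (`adapter_model`, `watchdog_model`, `reinit_begin_model`, `reinit_end_model`, `MODEL_DOWN`, `MODEL_RESETTING`) is hypothetical. It only models the early return; it says nothing about what actually causes the hang.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's state bits (not kernel definitions). */
#define MODEL_DOWN      0x1u
#define MODEL_RESETTING 0x2u

/* Hypothetical, heavily simplified model of the adapter state. */
struct adapter_model {
    atomic_uint flags;        /* state bits, read without holding the mutex */
    pthread_mutex_t mutex;    /* models the adapter's mutex */
    unsigned watchdog_runs;   /* how many times the watchdog body executed */
};

static void adapter_model_init(struct adapter_model *a)
{
    assert(a != NULL);
    atomic_init(&a->flags, 0u);
    pthread_mutex_init(&a->mutex, NULL);
    a->watchdog_runs = 0;
}

/* Models the patched check: bail out if the adapter is down OR resetting.
 * Returns true if the watchdog body ran, false if it returned early. */
static bool watchdog_model(struct adapter_model *a)
{
    unsigned f = atomic_load(&a->flags);

    if (f & (MODEL_DOWN | MODEL_RESETTING))
        return false;

    pthread_mutex_lock(&a->mutex);
    a->watchdog_runs++;
    pthread_mutex_unlock(&a->mutex);
    return true;
}

/* Models the start of a reset: set the flag, then take the mutex. */
static void reinit_begin_model(struct adapter_model *a)
{
    atomic_fetch_or(&a->flags, MODEL_RESETTING);
    pthread_mutex_lock(&a->mutex);
}

/* Models the end of a reset: release the mutex, then clear the flag. */
static void reinit_end_model(struct adapter_model *a)
{
    pthread_mutex_unlock(&a->mutex);
    atomic_fetch_and(&a->flags, ~MODEL_RESETTING);
}
```

In this model, a call to `watchdog_model()` made between `reinit_begin_model()` and `reinit_end_model()` returns false without touching the mutex, and calls made before or after run normally. Note that the guard only skips the watchdog's work; it hides the symptom in the model without explaining why the mutex was still held in the first place, which is why the questions at the top of this message still need answers.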
