Thanks Bernhard, valid points. I have access to one mini-pc platform using I211 only and just tested standard operation of the network. Also I might not have looked as much as you into the details and corner cases. Therefore I think it's better to revert this patch for now.
Heiner Am 09.12.2014 um 19:19 schrieb Bernhard Kaindl: > Hello Heiner and Todd, > > IMHO - I think this patch submitted by Heiner may yield to crashes on at > least some of the many chips and cards this driver needs to support - (maybe > just in unusual setups) and I think it is therefore not easy to test for that > the changes are safe, > so I've some questions regarding this patch, and I'd propose to back out this > patch again from Jeff's queue: > > Do you have proof that __none__ of the sleeps (which schedule) that this > patch introduces __can__ __never__ be called from any atomic context or other > context that has locks (e.g. spinlocks) held or otherwise is bad if it would > schedule? > > I ask this because I've looked at the igb driver especially with the need to > remove sources of latency spikes between 1ms and 10ms and was rather scared > away from attempting to replace these delay loops with calls to schedule > (which the usleeps actually are), because my impression was that these delay > loops were possibly called from context that should not schedule (or at least > were before my own patch which I mention below). > > I think some of the changes only get executed when the HW is locked for a > prolonged amount of time, I am quite certain that its not possible to trigger > them in a simple test. Or did you cove these with code coverage - gcov for > example)? > > Even if you would be able to cover all changes, IMO this does not ensure that > none of these places are never called from an atomic context or other context > (e.g. with any kind of locks or HW semaphores) that may or should not sleep. > > I've a special HW setup which caused delays to kick in. It caused latency > spikes between 1 to 10ms, which I needed to fix. But I think we need to be > very careful with this driver that is in production use, because the sleeping > functions aren't "equivalent" as you write: > > The reschedule which they cause, while desirable when always possible at the > specific place, may otherwise put running systems to a sudden grinding halt > if these sleeps trigger, which may be a long time after the system booted, in > situations like cable problems. > > Linux-3.18 includes my change in the igb driver that fixes latency spikes > which we saw only with custom I350 NICs when they were directly connected to > each other without a switch in between - this is just one example of a > strange case when such delay triggered that you likely won't be able to test > without such specific HW setup: > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7acf631889ec2ba7431a86a4c8db9698a496e964 > > That patch linked above removes a call to igb_read_phy_reg() that was called > with a spinlock held in igb_update_stats() which itself can be called thru > several routes and I guess would be bad if that would enter the scheduler > while holding the spin lock. > > In that specific setup only, igb_read_phy_reg() ended up calling one of the > delays that (I think) your patch changes to sleeps... > > That's why I would recommend that you check some more and maybe pull back the > patch from Jeff's patch queue, resubmit the individual changes with careful > review (not one big global change - that is in my book also against > Documentation/SubmittingPatches), which mandates to submit different > __logical__ changes separately: > > 3) Separate your changes. > > Separate _logical changes_ into a single patch file. > > Also, I am missing some bits that Documentation/SubmittingPatches request, > __emphasis__ mine - e.g.: > > For these reasons, the "summary" must be no more than 70-75 > characters, and it __must__ describe both what the patch changes, as well > as __why__ the patch might be necessary. It is challenging to be both > succinct and descriptive, but that is what a well-written summary > should do. > > You say in your mail that busy-wait is discouraged in non-atomic contexts, > but I think it should be stated that it is indeed a non-atomic context that > is changed by a specific patch submission, and just that it's discouraged > does not mean that it's otherwise safe to change from delays to sleeps in > such way. > > Maybe you could start with the changes to igb_configure_tx_ring() and > (separate from that, for me these are already logically different changes) to > igb_ethtool.c:igb_integrated_phy_loopback() which I guess you should be able > to make positive code coverage tests (even if you only add a printk for a > quick test). > > What was your issue with the IGB driver that prompted you to change these > logically unrelated delays? > Did you check if my change in 3.18 - linked above would fix your issue? > > Best Regards, > Bernhard > > On 2014-12-07 23:16, Heiner Kallweit wrote: >> The usage of longer busy-wait delays is discouraged in non-atomic context. >> See also Documentation/timers/timers-howto.txt >> Therefore replace them with sleeping equivalents. >> I successfully tested the patch on a device with I211. >> >> Signed-off-by: Heiner Kallweit <hkallwe...@gmail.com> >> --- >> drivers/net/ethernet/intel/igb/e1000_82575.c | 2 +- >> drivers/net/ethernet/intel/igb/e1000_i210.c | 2 +- >> drivers/net/ethernet/intel/igb/e1000_phy.c | 10 ++-------- >> drivers/net/ethernet/intel/igb/e1000_phy.h | 1 + >> drivers/net/ethernet/intel/igb/igb.h | 1 + >> drivers/net/ethernet/intel/igb/igb_ethtool.c | 2 +- >> drivers/net/ethernet/intel/igb/igb_main.c | 12 +++++++++++- >> 7 files changed, 18 insertions(+), 12 deletions(-) >> >> diff --git a/drivers/net/ethernet/intel/igb/e1000_82575.c >> b/drivers/net/ethernet/intel/igb/e1000_82575.c >> index 051ea94..d58dd7b 100644 >> --- a/drivers/net/ethernet/intel/igb/e1000_82575.c >> +++ b/drivers/net/ethernet/intel/igb/e1000_82575.c >> @@ -1141,7 +1141,7 @@ static s32 igb_acquire_swfw_sync_82575(struct e1000_hw >> *hw, u16 mask) >> * or other software thread using resource (swmask) >> */ >> igb_put_hw_semaphore(hw); >> - mdelay(5); >> + usleep_range(5000, 6000); >> i++; >> } >> >> diff --git a/drivers/net/ethernet/intel/igb/e1000_i210.c >> b/drivers/net/ethernet/intel/igb/e1000_i210.c >> index 65d9316..345bb19 100644 >> --- a/drivers/net/ethernet/intel/igb/e1000_i210.c >> +++ b/drivers/net/ethernet/intel/igb/e1000_i210.c >> @@ -154,7 +154,7 @@ s32 igb_acquire_swfw_sync_i210(struct e1000_hw *hw, u16 >> mask) >> >> /* Firmware currently using resource (fwmask) */ >> igb_put_hw_semaphore(hw); >> - mdelay(5); >> + usleep_range(5000, 6000); >> i++; >> } >> >> diff --git a/drivers/net/ethernet/intel/igb/e1000_phy.c >> b/drivers/net/ethernet/intel/igb/e1000_phy.c >> index c1bb64d..c127d9b 100644 >> --- a/drivers/net/ethernet/intel/igb/e1000_phy.c >> +++ b/drivers/net/ethernet/intel/igb/e1000_phy.c >> @@ -1652,20 +1652,14 @@ s32 igb_phy_has_link(struct e1000_hw *hw, u32 >> iterations, >> * ownership of the resources, wait and try again to >> * see if they have relinquished the resources yet. >> */ >> - if (usec_interval >= 1000) >> - mdelay(usec_interval/1000); >> - else >> - udelay(usec_interval); >> + igb_sleep_usecs(usec_interval); >> } >> ret_val = hw->phy.ops.read_reg(hw, PHY_STATUS, &phy_status); >> if (ret_val) >> break; >> if (phy_status & MII_SR_LINK_STATUS) >> break; >> - if (usec_interval >= 1000) >> - mdelay(usec_interval/1000); >> - else >> - udelay(usec_interval); >> + igb_sleep_usecs(usec_interval); >> } >> >> *success = (i < iterations) ? true : false; >> diff --git a/drivers/net/ethernet/intel/igb/e1000_phy.h >> b/drivers/net/ethernet/intel/igb/e1000_phy.h >> index 7af4ffa..e80e129 100644 >> --- a/drivers/net/ethernet/intel/igb/e1000_phy.h >> +++ b/drivers/net/ethernet/intel/igb/e1000_phy.h >> @@ -73,6 +73,7 @@ s32 igb_get_cable_length_82580(struct e1000_hw *hw); >> s32 igb_read_phy_reg_gs40g(struct e1000_hw *hw, u32 offset, u16 *data); >> s32 igb_write_phy_reg_gs40g(struct e1000_hw *hw, u32 offset, u16 data); >> s32 igb_check_polarity_m88(struct e1000_hw *hw); >> +void igb_sleep_usecs(u32 usecs); >> >> /* IGP01E1000 Specific Registers */ >> #define IGP01E1000_PHY_PORT_CONFIG 0x10 /* Port Config */ >> diff --git a/drivers/net/ethernet/intel/igb/igb.h >> b/drivers/net/ethernet/intel/igb/igb.h >> index 82d891e..d9b0902 100644 >> --- a/drivers/net/ethernet/intel/igb/igb.h >> +++ b/drivers/net/ethernet/intel/igb/igb.h >> @@ -531,6 +531,7 @@ void igb_ptp_rx_pktstamp(struct igb_q_vector *q_vector, >> unsigned char *va, >> struct sk_buff *skb); >> int igb_ptp_set_ts_config(struct net_device *netdev, struct ifreq *ifr); >> int igb_ptp_get_ts_config(struct net_device *netdev, struct ifreq *ifr); >> +void igb_sleep_usecs(u32 usecs); >> #ifdef CONFIG_IGB_HWMON >> void igb_sysfs_exit(struct igb_adapter *adapter); >> int igb_sysfs_init(struct igb_adapter *adapter); >> diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c >> b/drivers/net/ethernet/intel/igb/igb_ethtool.c >> index 02cfd3b..bf524db 100644 >> --- a/drivers/net/ethernet/intel/igb/igb_ethtool.c >> +++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c >> @@ -1645,7 +1645,7 @@ static int igb_integrated_phy_loopback(struct >> igb_adapter *adapter) >> if (hw->phy.type == e1000_phy_m88) >> igb_phy_disable_receiver(adapter); >> >> - mdelay(500); >> + msleep(500); >> return 0; >> } >> >> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c >> b/drivers/net/ethernet/intel/igb/igb_main.c >> index 3c02216..277797f 100644 >> --- a/drivers/net/ethernet/intel/igb/igb_main.c >> +++ b/drivers/net/ethernet/intel/igb/igb_main.c >> @@ -294,6 +294,16 @@ static const struct igb_reg_info igb_reg_info_tbl[] = { >> {} >> }; >> >> +void igb_sleep_usecs(u32 usecs) >> +{ >> + if (usecs > 10000) >> + msleep(usecs / 1000 + 1); >> + else if (usecs > 10) >> + usleep_range(usecs, 2 * usecs); >> + else >> + udelay(usecs); >> +} >> + >> /* igb_regdump - register printout routine */ >> static void igb_regdump(struct e1000_hw *hw, struct igb_reg_info *reginfo) >> { >> @@ -3267,7 +3277,7 @@ void igb_configure_tx_ring(struct igb_adapter *adapter, >> /* disable the queue */ >> wr32(E1000_TXDCTL(reg_idx), 0); >> wrfl(); >> - mdelay(10); >> + msleep(10); >> >> wr32(E1000_TDLEN(reg_idx), >> ring->count * sizeof(union e1000_adv_tx_desc)); > > > -- > *Bernhard Kaindl* > SW Engineer for Control Platform Systems > ---------------------------------------------------- > *Thales Austria GmbH > Handelskai 92, 1200 Vienna, Austria > * > > Tel.: +43 1 27711-5095 > Fax: +43 1 27711-3171 > Email: bernhard.kai...@thalesgroup.com > <mailto:bernhard.kai...@thalesgroup.com> > www.thalesgroup.com/austria > ------------------------------------------------------------------------------ Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server from Actuate! Instantly Supercharge Your Business Reports and Dashboards with Interactivity, Sharing, Native Excel Exports, App Integration & more Get technology previously reserved for billion-dollar corporations, FREE http://pubads.g.doubleclick.net/gampad/clk?id=164703151&iu=/4140/ostg.clktrk _______________________________________________ E1000-devel mailing list E1000-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/e1000-devel To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired