Hello Heiner and Todd,

IMHO - I think this patch submitted by Heiner may yield to crashes on at least some of the many chips and cards this driver needs to support - (maybe just in unusual setups) and I think it is therefore not easy to test for that the changes are safe, so I've some questions regarding this patch, and I'd propose to back out this patch again from Jeff's queue:

Do you have proof that __none__ of the sleeps (which schedule) that this patch introduces __can__ __never__ be called from any atomic context or other context that has locks (e.g. spinlocks) held or otherwise is bad if it would schedule?

I ask this because I've looked at the igb driver especially with the need to remove sources of latency spikes between 1ms and 10ms and was rather scared away from attempting to replace these delay loops with calls to schedule (which the usleeps actually are), because my impression was that these delay loops were possibly called from context that should not schedule (or at least were before my own patch which I mention below).

I think some of the changes only get executed when the HW is locked for a prolonged amount of time, I am quite certain that its not possible to trigger them in a simple test. Or did you cove these with code coverage - gcov for example)?

Even if you would be able to cover all changes, IMO this does not ensure that none of these places are never called from an atomic context or other context (e.g. with any kind of locks or HW semaphores) that may or should not sleep.

I've a special HW setup which caused delays to kick in. It caused latency spikes between 1 to 10ms, which I needed to fix. But I think we need to be very careful with this driver that is in production use, because the sleeping functions aren't "equivalent" as you write:

The reschedule which they cause, while desirable when always possible at the specific place, may otherwise put running systems to a sudden grinding halt if these sleeps trigger, which may be a long time after the system booted, in situations like cable problems.

Linux-3.18 includes my change in the igb driver that fixes latency spikes which we saw only with custom I350 NICs when they were directly connected to each other without a switch in between - this is just one example of a strange case when such delay triggered that you likely won't be able to test without such specific HW setup:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7acf631889ec2ba7431a86a4c8db9698a496e964

That patch linked above removes a call to igb_read_phy_reg() that was called with a spinlock held in igb_update_stats() which itself can be called thru several routes and I guess would be bad if that would enter the scheduler while holding the spin lock.

In that specific setup only, igb_read_phy_reg() ended up calling one of the delays that (I think) your patch changes to sleeps...

That's why I would recommend that you check some more and maybe pull back the patch from Jeff's patch queue, resubmit the individual changes with careful review (not one big global change - that is in my book also against Documentation/SubmittingPatches), which mandates to submit different __logical__ changes separately:

   3) Separate your changes.

   Separate _logical changes_ into a single patch file.

Also, I am missing some bits that Documentation/SubmittingPatches request, __emphasis__ mine - e.g.:

   For these reasons, the "summary" must be no more than 70-75
   characters, and it __must__ describe both what the patch changes, as
   well
   as __why__ the patch might be necessary.  It is challenging to be both
   succinct and descriptive, but that is what a well-written summary
   should do.

You say in your mail that busy-wait is discouraged in non-atomic contexts, but I think it should be stated that it is indeed a non-atomic context that is changed by a specific patch submission, and just that it's discouraged does not mean that it's otherwise safe to change from delays to sleeps in such way.

Maybe you could start with the changes to igb_configure_tx_ring() and (separate from that, for me these are already logically different changes) to igb_ethtool.c:igb_integrated_phy_loopback() which I guess you should be able to make positive code coverage tests (even if you only add a printk for a quick test).

What was your issue with the IGB driver that prompted you to change these logically unrelated delays?
Did you check if my change in 3.18 - linked above would fix your issue?

Best Regards,
Bernhard

On 2014-12-07 23:16, Heiner Kallweit wrote:
The usage of longer busy-wait delays is discouraged in non-atomic context.
See also Documentation/timers/timers-howto.txt
Therefore replace them with sleeping equivalents.
I successfully tested the patch on a device with I211.

Signed-off-by: Heiner Kallweit<hkallwe...@gmail.com>
---
  drivers/net/ethernet/intel/igb/e1000_82575.c |  2 +-
  drivers/net/ethernet/intel/igb/e1000_i210.c  |  2 +-
  drivers/net/ethernet/intel/igb/e1000_phy.c   | 10 ++--------
  drivers/net/ethernet/intel/igb/e1000_phy.h   |  1 +
  drivers/net/ethernet/intel/igb/igb.h         |  1 +
  drivers/net/ethernet/intel/igb/igb_ethtool.c |  2 +-
  drivers/net/ethernet/intel/igb/igb_main.c    | 12 +++++++++++-
  7 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/e1000_82575.c 
b/drivers/net/ethernet/intel/igb/e1000_82575.c
index 051ea94..d58dd7b 100644
--- a/drivers/net/ethernet/intel/igb/e1000_82575.c
+++ b/drivers/net/ethernet/intel/igb/e1000_82575.c
@@ -1141,7 +1141,7 @@ static s32 igb_acquire_swfw_sync_82575(struct e1000_hw 
*hw, u16 mask)
                 * or other software thread using resource (swmask)
                 */
                igb_put_hw_semaphore(hw);
-               mdelay(5);
+               usleep_range(5000, 6000);
                i++;
        }

diff --git a/drivers/net/ethernet/intel/igb/e1000_i210.c 
b/drivers/net/ethernet/intel/igb/e1000_i210.c
index 65d9316..345bb19 100644
--- a/drivers/net/ethernet/intel/igb/e1000_i210.c
+++ b/drivers/net/ethernet/intel/igb/e1000_i210.c
@@ -154,7 +154,7 @@ s32 igb_acquire_swfw_sync_i210(struct e1000_hw *hw, u16 
mask)

                /* Firmware currently using resource (fwmask) */
                igb_put_hw_semaphore(hw);
-               mdelay(5);
+               usleep_range(5000, 6000);
                i++;
        }

diff --git a/drivers/net/ethernet/intel/igb/e1000_phy.c 
b/drivers/net/ethernet/intel/igb/e1000_phy.c
index c1bb64d..c127d9b 100644
--- a/drivers/net/ethernet/intel/igb/e1000_phy.c
+++ b/drivers/net/ethernet/intel/igb/e1000_phy.c
@@ -1652,20 +1652,14 @@ s32 igb_phy_has_link(struct e1000_hw *hw, u32 
iterations,
                         * ownership of the resources, wait and try again to
                         * see if they have relinquished the resources yet.
                         */
-                       if (usec_interval>= 1000)
-                               mdelay(usec_interval/1000);
-                       else
-                               udelay(usec_interval);
+                       igb_sleep_usecs(usec_interval);
                }
                ret_val = hw->phy.ops.read_reg(hw, PHY_STATUS,&phy_status);
                if (ret_val)
                        break;
                if (phy_status&  MII_SR_LINK_STATUS)
                        break;
-               if (usec_interval>= 1000)
-                       mdelay(usec_interval/1000);
-               else
-                       udelay(usec_interval);
+               igb_sleep_usecs(usec_interval);
        }

        *success = (i<  iterations) ? true : false;
diff --git a/drivers/net/ethernet/intel/igb/e1000_phy.h 
b/drivers/net/ethernet/intel/igb/e1000_phy.h
index 7af4ffa..e80e129 100644
--- a/drivers/net/ethernet/intel/igb/e1000_phy.h
+++ b/drivers/net/ethernet/intel/igb/e1000_phy.h
@@ -73,6 +73,7 @@ s32  igb_get_cable_length_82580(struct e1000_hw *hw);
  s32  igb_read_phy_reg_gs40g(struct e1000_hw *hw, u32 offset, u16 *data);
  s32  igb_write_phy_reg_gs40g(struct e1000_hw *hw, u32 offset, u16 data);
  s32  igb_check_polarity_m88(struct e1000_hw *hw);
+void igb_sleep_usecs(u32 usecs);

  /* IGP01E1000 Specific Registers */
  #define IGP01E1000_PHY_PORT_CONFIG        0x10 /* Port Config */
diff --git a/drivers/net/ethernet/intel/igb/igb.h 
b/drivers/net/ethernet/intel/igb/igb.h
index 82d891e..d9b0902 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -531,6 +531,7 @@ void igb_ptp_rx_pktstamp(struct igb_q_vector *q_vector, 
unsigned char *va,
                         struct sk_buff *skb);
  int igb_ptp_set_ts_config(struct net_device *netdev, struct ifreq *ifr);
  int igb_ptp_get_ts_config(struct net_device *netdev, struct ifreq *ifr);
+void igb_sleep_usecs(u32 usecs);
  #ifdef CONFIG_IGB_HWMON
  void igb_sysfs_exit(struct igb_adapter *adapter);
  int igb_sysfs_init(struct igb_adapter *adapter);
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c 
b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index 02cfd3b..bf524db 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -1645,7 +1645,7 @@ static int igb_integrated_phy_loopback(struct igb_adapter 
*adapter)
        if (hw->phy.type == e1000_phy_m88)
                igb_phy_disable_receiver(adapter);

-       mdelay(500);
+       msleep(500);
        return 0;
  }

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index 3c02216..277797f 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -294,6 +294,16 @@ static const struct igb_reg_info igb_reg_info_tbl[] = {
        {}
  };

+void igb_sleep_usecs(u32 usecs)
+{
+       if (usecs>  10000)
+               msleep(usecs / 1000 + 1);
+       else if (usecs>  10)
+               usleep_range(usecs, 2 * usecs);
+       else
+               udelay(usecs);
+}
+
  /* igb_regdump - register printout routine */
  static void igb_regdump(struct e1000_hw *hw, struct igb_reg_info *reginfo)
  {
@@ -3267,7 +3277,7 @@ void igb_configure_tx_ring(struct igb_adapter *adapter,
        /* disable the queue */
        wr32(E1000_TXDCTL(reg_idx), 0);
        wrfl();
-       mdelay(10);
+       msleep(10);

        wr32(E1000_TDLEN(reg_idx),
             ring->count * sizeof(union e1000_adv_tx_desc));


--
*Bernhard Kaindl*
SW Engineer for Control Platform Systems
----------------------------------------------------
*Thales Austria GmbH
Handelskai 92, 1200 Vienna, Austria
*

Tel.: +43 1 27711-5095
Fax: +43 1 27711-3171
Email: bernhard.kai...@thalesgroup.com <mailto:bernhard.kai...@thalesgroup.com>
www.thalesgroup.com/austria

------------------------------------------------------------------------------
Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
from Actuate! Instantly Supercharge Your Business Reports and Dashboards
with Interactivity, Sharing, Native Excel Exports, App Integration & more
Get technology previously reserved for billion-dollar corporations, FREE
http://pubads.g.doubleclick.net/gampad/clk?id=164703151&iu=/4140/ostg.clktrk
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit 
http://communities.intel.com/community/wired

Reply via email to