Re: [PATCH] e1000e: Fix link status in case of error.

2018-03-01 Thread Alexander Duyck
On Wed, Feb 28, 2018 at 10:40 PM, Benjamin Poirier  wrote:
> On 2018/02/28 08:48, Alexander Duyck wrote:
>> On Tue, Feb 27, 2018 at 9:20 PM, Benjamin Poirier  wrote:
>> > Before commit 19110cfbb34d ("e1000e: Separate signaling for link check/link
>> > up"), errors which happen after "get_link_status = false" in the copper
>> > check_for_link callbacks would be ignored and the link considered up. After
>> > that commit, any error implies that the link is down. Since all
>> > combinations of link up/down and error/no error are possible, do the same
>> > thing as e1000e_phy_has_link_generic() and return the link status in a
>> > separate variable.
>> >
>> > Fixes: 19110cfbb34d ("e1000e: Separate signaling for link check/link up")
>> > Signed-off-by: Benjamin Poirier 
>>
>> This seems more like a refactor than a fix. There are valid cases
>> where errors can be ignored after we set the link to up. For example
>> if we cannot negotiate flow control we may not care as long as the
>> link is established. In such a case we may see errors establishing
>> flow control and they should be ignored.
>
> Indeed, before commit 19110cfbb34d ("e1000e: Separate signaling for link
> check/link up") a failure of e1000e_config_fc_after_link_up() in
> e1000e_check_for_copper_link() would be ignored. After that commit,
> there is an extra 2s delay before link up, like what was just fixed for
> autoneg in commit d8ca384786c2 ("e1000e: Fix check_for_link return value
> with autoneg off"). That was an unintended change in behavior. This is
> what this patch purports to fix. The same is true for other failure
> paths in e1000_check_for_copper_link_ich8lan() that happen after
> get_link_status = false.

Maybe we should just revert 19110cfbb34d ("e1000e: Separate signaling
for link check/link up"). Looking over the logic for it the whole
concept is just wrong. In the real-world it is actually possible for
us to lose link just after testing for it, and when that happens we
wouldn't want the link to be reported as up as it will appear to be a
link flap. If the LSC is actually firing that fast we would want to
stay down, and we know the original issue was we were getting false
LSC messages which should now be solved.

The reason why this behavior is the way it is was because the
assumption was that if we need to check for link then we should assume
the link is down. Getting false LSC events did cause link flaps, but
that was mostly due to misrecognized events, and now that we resolved
that it should no longer be the case. The fact that we are now
ignoring it after testing the link actually leads to undesirable
effects such as us having windows where the link is toggling up and
back down and we could possibly ignore it and report link up anyway
until we can get around to testing it again. If we make use of the RFC
you are currently working on to fix the race I pointed out, and we
revert the patch that separates signaling, then we should stay link
down during a LSC storm and come back up once the link has stabilized.
Otherwise we run the risk of reporting link up which may cause issues
if the link is electrically unstable after a cable is just connected
or pulled and we start trying to use it.

- Alex


Re: [PATCH] e1000e: Fix link status in case of error.

2018-02-28 Thread Benjamin Poirier
On 2018/02/28 08:48, Alexander Duyck wrote:
> On Tue, Feb 27, 2018 at 9:20 PM, Benjamin Poirier  wrote:
> > Before commit 19110cfbb34d ("e1000e: Separate signaling for link check/link
> > up"), errors which happen after "get_link_status = false" in the copper
> > check_for_link callbacks would be ignored and the link considered up. After
> > that commit, any error implies that the link is down. Since all
> > combinations of link up/down and error/no error are possible, do the same
> > thing as e1000e_phy_has_link_generic() and return the link status in a
> > separate variable.
> >
> > Fixes: 19110cfbb34d ("e1000e: Separate signaling for link check/link up")
> > Signed-off-by: Benjamin Poirier 
> 
> This seems more like a refactor than a fix. There are valid cases
> where errors can be ignored after we set the link to up. For example
> if we cannot negotiate flow control we may not care as long as the
> link is established. In such a case we may see errors establishing
> flow control and they should be ignored.

Indeed, before commit 19110cfbb34d ("e1000e: Separate signaling for link
check/link up") a failure of e1000e_config_fc_after_link_up() in
e1000e_check_for_copper_link() would be ignored. After that commit,
there is an extra 2s delay before link up, like what was just fixed for
autoneg in commit d8ca384786c2 ("e1000e: Fix check_for_link return value
with autoneg off"). That was an unintended change in behavior. This is
what this patch purports to fix. The same is true for other failure
paths in e1000_check_for_copper_link_ich8lan() that happen after
get_link_status = false.

> 
> If there are cases where we are setting the link as up to early we
> should address that instead of changing all the functions to make them
> look like other ones.
> 
> > ---
[...]


Re: [PATCH] e1000e: Fix link status in case of error.

2018-02-28 Thread Jeff Kirsher
On Wed, 2018-02-28 at 14:20 +0900, Benjamin Poirier wrote:
> Before commit 19110cfbb34d ("e1000e: Separate signaling for link
> check/link
> up"), errors which happen after "get_link_status = false" in the
> copper
> check_for_link callbacks would be ignored and the link considered up.
> After
> that commit, any error implies that the link is down. Since all
> combinations of link up/down and error/no error are possible, do the
> same
> thing as e1000e_phy_has_link_generic() and return the link status in
> a
> separate variable.
> 
> Fixes: 19110cfbb34d ("e1000e: Separate signaling for link check/link
> up")
> Signed-off-by: Benjamin Poirier 

A minor nitpick, almost every patch you send out, has your patch
title/subject ending in a period which is not needed or wanted.  Patch
titles/subjects should not be complete sentences, so they do not need
punctuation at the end.

signature.asc
Description: This is a digitally signed message part


Re: [PATCH] e1000e: Fix link status in case of error.

2018-02-28 Thread Alexander Duyck
On Tue, Feb 27, 2018 at 9:20 PM, Benjamin Poirier  wrote:
> Before commit 19110cfbb34d ("e1000e: Separate signaling for link check/link
> up"), errors which happen after "get_link_status = false" in the copper
> check_for_link callbacks would be ignored and the link considered up. After
> that commit, any error implies that the link is down. Since all
> combinations of link up/down and error/no error are possible, do the same
> thing as e1000e_phy_has_link_generic() and return the link status in a
> separate variable.
>
> Fixes: 19110cfbb34d ("e1000e: Separate signaling for link check/link up")
> Signed-off-by: Benjamin Poirier 

This seems more like a refactor than a fix. There are valid cases
where errors can be ignored after we set the link to up. For example
if we cannot negotiate flow control we may not care as long as the
link is established. In such a case we may see errors establishing
flow control and they should be ignored.

If there are cases where we are setting the link as up to early we
should address that instead of changing all the functions to make them
look like other ones.

> ---
>  drivers/net/ethernet/intel/e1000e/82571.c   |  6 --
>  drivers/net/ethernet/intel/e1000e/ethtool.c |  5 +++--
>  drivers/net/ethernet/intel/e1000e/hw.h  |  2 +-
>  drivers/net/ethernet/intel/e1000e/ich8lan.c | 19 +++
>  drivers/net/ethernet/intel/e1000e/mac.c | 26 +++---
>  drivers/net/ethernet/intel/e1000e/mac.h |  6 +++---
>  drivers/net/ethernet/intel/e1000e/netdev.c  | 11 +++
>  7 files changed, 40 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/82571.c 
> b/drivers/net/ethernet/intel/e1000e/82571.c
> index 6b03c8553e59..980ed89e61ea 100644
> --- a/drivers/net/ethernet/intel/e1000e/82571.c
> +++ b/drivers/net/ethernet/intel/e1000e/82571.c
> @@ -40,7 +40,8 @@
>  static s32 e1000_get_phy_id_82571(struct e1000_hw *hw);
>  static s32 e1000_setup_copper_link_82571(struct e1000_hw *hw);
>  static s32 e1000_setup_fiber_serdes_link_82571(struct e1000_hw *hw);
> -static s32 e1000_check_for_serdes_link_82571(struct e1000_hw *hw);
> +static s32 e1000_check_for_serdes_link_82571(struct e1000_hw *hw,
> +bool *unused);
>  static s32 e1000_write_nvm_eewr_82571(struct e1000_hw *hw, u16 offset,
>   u16 words, u16 *data);
>  static s32 e1000_fix_nvm_checksum_82571(struct e1000_hw *hw);
> @@ -1493,6 +1494,7 @@ static s32 e1000_setup_fiber_serdes_link_82571(struct 
> e1000_hw *hw)
>  /**
>   *  e1000_check_for_serdes_link_82571 - Check for link (Serdes)
>   *  @hw: pointer to the HW structure
> + *  @unused: unused for serdes links
>   *
>   *  Reports the link state as up or down.
>   *
> @@ -1509,7 +1511,7 @@ static s32 e1000_setup_fiber_serdes_link_82571(struct 
> e1000_hw *hw)
>   *  4) forced_up (the link has been forced up, it did not autonegotiate)
>   *
>   **/
> -static s32 e1000_check_for_serdes_link_82571(struct e1000_hw *hw)
> +static s32 e1000_check_for_serdes_link_82571(struct e1000_hw *hw, bool 
> *unused)
>  {
> struct e1000_mac_info *mac = >mac;
> u32 rxcw;
> diff --git a/drivers/net/ethernet/intel/e1000e/ethtool.c 
> b/drivers/net/ethernet/intel/e1000e/ethtool.c
> index 003cbd605799..1946ddae06c0 100644
> --- a/drivers/net/ethernet/intel/e1000e/ethtool.c
> +++ b/drivers/net/ethernet/intel/e1000e/ethtool.c
> @@ -1753,6 +1753,7 @@ static int e1000_loopback_test(struct e1000_adapter 
> *adapter, u64 *data)
>  static int e1000_link_test(struct e1000_adapter *adapter, u64 *data)
>  {
> struct e1000_hw *hw = >hw;
> +   bool link_status;
>
> *data = 0;
> if (hw->phy.media_type == e1000_media_type_internal_serdes) {
> @@ -1764,7 +1765,7 @@ static int e1000_link_test(struct e1000_adapter 
> *adapter, u64 *data)
>  * could take as long as 2-3 minutes
>  */
> do {
> -   hw->mac.ops.check_for_link(hw);
> +   hw->mac.ops.check_for_link(hw, NULL);
> if (hw->mac.serdes_has_link)
> return *data;
> msleep(20);
> @@ -1772,7 +1773,7 @@ static int e1000_link_test(struct e1000_adapter 
> *adapter, u64 *data)
>
> *data = 1;
> } else {
> -   hw->mac.ops.check_for_link(hw);
> +   hw->mac.ops.check_for_link(hw, _status);
> if (hw->mac.autoneg)
> /* On some Phy/switch combinations, link establishment
>  * can take a few seconds more than expected.
> diff --git a/drivers/net/ethernet/intel/e1000e/hw.h 
> b/drivers/net/ethernet/intel/e1000e/hw.h
> index d803b1a12349..4dff6df469bb 100644
> --- a/drivers/net/ethernet/intel/e1000e/hw.h
> +++ b/drivers/net/ethernet/intel/e1000e/hw.h
> @@ -472,7 +472,7 @@ struct 

[PATCH] e1000e: Fix link status in case of error.

2018-02-27 Thread Benjamin Poirier
Before commit 19110cfbb34d ("e1000e: Separate signaling for link check/link
up"), errors which happen after "get_link_status = false" in the copper
check_for_link callbacks would be ignored and the link considered up. After
that commit, any error implies that the link is down. Since all
combinations of link up/down and error/no error are possible, do the same
thing as e1000e_phy_has_link_generic() and return the link status in a
separate variable.

Fixes: 19110cfbb34d ("e1000e: Separate signaling for link check/link up")
Signed-off-by: Benjamin Poirier 
---
 drivers/net/ethernet/intel/e1000e/82571.c   |  6 --
 drivers/net/ethernet/intel/e1000e/ethtool.c |  5 +++--
 drivers/net/ethernet/intel/e1000e/hw.h  |  2 +-
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 19 +++
 drivers/net/ethernet/intel/e1000e/mac.c | 26 +++---
 drivers/net/ethernet/intel/e1000e/mac.h |  6 +++---
 drivers/net/ethernet/intel/e1000e/netdev.c  | 11 +++
 7 files changed, 40 insertions(+), 35 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/82571.c 
b/drivers/net/ethernet/intel/e1000e/82571.c
index 6b03c8553e59..980ed89e61ea 100644
--- a/drivers/net/ethernet/intel/e1000e/82571.c
+++ b/drivers/net/ethernet/intel/e1000e/82571.c
@@ -40,7 +40,8 @@
 static s32 e1000_get_phy_id_82571(struct e1000_hw *hw);
 static s32 e1000_setup_copper_link_82571(struct e1000_hw *hw);
 static s32 e1000_setup_fiber_serdes_link_82571(struct e1000_hw *hw);
-static s32 e1000_check_for_serdes_link_82571(struct e1000_hw *hw);
+static s32 e1000_check_for_serdes_link_82571(struct e1000_hw *hw,
+bool *unused);
 static s32 e1000_write_nvm_eewr_82571(struct e1000_hw *hw, u16 offset,
  u16 words, u16 *data);
 static s32 e1000_fix_nvm_checksum_82571(struct e1000_hw *hw);
@@ -1493,6 +1494,7 @@ static s32 e1000_setup_fiber_serdes_link_82571(struct 
e1000_hw *hw)
 /**
  *  e1000_check_for_serdes_link_82571 - Check for link (Serdes)
  *  @hw: pointer to the HW structure
+ *  @unused: unused for serdes links
  *
  *  Reports the link state as up or down.
  *
@@ -1509,7 +1511,7 @@ static s32 e1000_setup_fiber_serdes_link_82571(struct 
e1000_hw *hw)
  *  4) forced_up (the link has been forced up, it did not autonegotiate)
  *
  **/
-static s32 e1000_check_for_serdes_link_82571(struct e1000_hw *hw)
+static s32 e1000_check_for_serdes_link_82571(struct e1000_hw *hw, bool *unused)
 {
struct e1000_mac_info *mac = >mac;
u32 rxcw;
diff --git a/drivers/net/ethernet/intel/e1000e/ethtool.c 
b/drivers/net/ethernet/intel/e1000e/ethtool.c
index 003cbd605799..1946ddae06c0 100644
--- a/drivers/net/ethernet/intel/e1000e/ethtool.c
+++ b/drivers/net/ethernet/intel/e1000e/ethtool.c
@@ -1753,6 +1753,7 @@ static int e1000_loopback_test(struct e1000_adapter 
*adapter, u64 *data)
 static int e1000_link_test(struct e1000_adapter *adapter, u64 *data)
 {
struct e1000_hw *hw = >hw;
+   bool link_status;
 
*data = 0;
if (hw->phy.media_type == e1000_media_type_internal_serdes) {
@@ -1764,7 +1765,7 @@ static int e1000_link_test(struct e1000_adapter *adapter, 
u64 *data)
 * could take as long as 2-3 minutes
 */
do {
-   hw->mac.ops.check_for_link(hw);
+   hw->mac.ops.check_for_link(hw, NULL);
if (hw->mac.serdes_has_link)
return *data;
msleep(20);
@@ -1772,7 +1773,7 @@ static int e1000_link_test(struct e1000_adapter *adapter, 
u64 *data)
 
*data = 1;
} else {
-   hw->mac.ops.check_for_link(hw);
+   hw->mac.ops.check_for_link(hw, _status);
if (hw->mac.autoneg)
/* On some Phy/switch combinations, link establishment
 * can take a few seconds more than expected.
diff --git a/drivers/net/ethernet/intel/e1000e/hw.h 
b/drivers/net/ethernet/intel/e1000e/hw.h
index d803b1a12349..4dff6df469bb 100644
--- a/drivers/net/ethernet/intel/e1000e/hw.h
+++ b/drivers/net/ethernet/intel/e1000e/hw.h
@@ -472,7 +472,7 @@ struct e1000_mac_operations {
s32  (*id_led_init)(struct e1000_hw *);
s32  (*blink_led)(struct e1000_hw *);
bool (*check_mng_mode)(struct e1000_hw *);
-   s32  (*check_for_link)(struct e1000_hw *);
+   s32  (*check_for_link)(struct e1000_hw *, bool *);
s32  (*cleanup_led)(struct e1000_hw *);
void (*clear_hw_cntrs)(struct e1000_hw *);
void (*clear_vfta)(struct e1000_hw *);
diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c 
b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index ff308b05d68c..3d25255289ff 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -1363,15 +1363,14 @@ static s32 e1000_disable_ulp_lpt_lp(struct