RE: bonding reports interface up with 0 Mbps

2016-02-08 Thread Tantilov, Emil S
>-Original Message-
>From: Jay Vosburgh [mailto:jay.vosbu...@canonical.com]
>Sent: Thursday, February 04, 2016 4:37 PM
>To: Tantilov, Emil S <emil.s.tanti...@intel.com>
>Cc: netdev@vger.kernel.org; go...@cumulusnetworks.com; zhuyj
><zyjzyj2...@gmail.com>; j...@mellanox.com
>Subject: Re: bonding reports interface up with 0 Mbps
>
>Jay Vosburgh <jay.vosbu...@canonical.com> wrote:
>[...]
>   If the above analysis is correct, then I would expect this patch
>to make the problem go away:
>[...]

No issues seen over the weekend with this patch. The condition was hit 32 times.

You can add my "tested-by:" when you submit this patch.

Thanks Jay for all your help!
Emil



RE: bonding reports interface up with 0 Mbps

2016-02-05 Thread Tantilov, Emil S
>-Original Message-
>From: Jay Vosburgh [mailto:jay.vosbu...@canonical.com]
>Sent: Thursday, February 04, 2016 4:37 PM
>To: Tantilov, Emil S
>Cc: netdev@vger.kernel.org; go...@cumulusnetworks.com; zhuyj;
>j...@mellanox.com
>Subject: Re: bonding reports interface up with 0 Mbps
>
>Jay Vosburgh <jay.vosbu...@canonical.com> wrote:
>[...]
>   If the above analysis is correct, then I would expect this patch
>to make the problem go away:
>[...]
>
>
>   Emil, can you give just the above a test?

Test has been running all night and no failures so far. Looking at the logs
the condition triggering the race occurred 5 times. I will leave the test 
over the weekend just in case and post a final update on Monday.

Thanks,
Emil



Re: bonding reports interface up with 0 Mbps

2016-02-04 Thread Jay Vosburgh
Jay Vosburgh  wrote:
[...]
>   Thinking about the trace again... Emil: what happens in the
>trace before this?  Is there ever a call to the ixgbe_get_settings?
>Does a NETDEV_UP or NETDEV_CHANGE event ever hit the bond_netdev_event
>function?

Emil kindly sent me the trace offline, and I think I see what's
going on.  It looks like the sequence of events is:

bond_enslave ->
bond_update_speed_duplex (device is down, thus DUPLEX/SPEED_UNKNOWN)
[ do rest of enslavement, start miimon periodic work ]

[ time passes, device goes carrier up ]

ixgbe_service_task: eth1: NIC Link is Up 10 Gbps ->
netif_carrier_on (arranges for NETDEV_CHANGE notifier out of line)

[ a few microseconds later ]

bond_mii_monitor ->
bond_check_dev_link (carrier is now up)
bond_miimon_commit ->   (emits "0 Mbps full duplex" message)
bond_lower_state_changed ->
bond_netdev_event (NETDEV_CHANGELOWERSTATE, is ignored)
bond_3ad_handle_link_change (sees DUPLEX/SPEED_UNKNOWN)

[ a few microseconds later, in response to ixgbe's netif_carrier_on ]

notifier_call_chain ->
bond_netdev_event NETDEV_CHANGE ->
bond_update_speed_duplex (sees correct SPEED_10000/FULL) ->
bond_3ad_adapter_speed_duplex_changed (updates 802.3ad)

Basically, the race is that the periodic bond_mii_monitor is
squeezing in between the link going up and bonding's update of the speed
and duplex in response to the NETDEV_CHANGE triggered by the driver's
netif_carrier_on call.  bonding ends up using the stale duplex and speed
information obtained at enslavement time.

I think that, nowadays, the initial speed and duplex will pretty
much always be UNKNOWN, at least for real Ethernet devices, because it
will take longer to autoneg than the time between the dev_open and
bond_update_speed_duplex calls in bond_enslave.
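
As a rough sketch of why those values stay UNKNOWN (simplified from
the behavior described above, using the ethtool API of this era; not the
verbatim bond_main.c source):

static void bond_update_speed_duplex(struct slave *slave)
{
	struct ethtool_cmd ecmd;

	/* default to unknown until the device reports valid values */
	slave->speed = SPEED_UNKNOWN;
	slave->duplex = DUPLEX_UNKNOWN;

	/* while the device is down, or autoneg has not finished, this
	 * either fails or reports SPEED/DUPLEX_UNKNOWN, so the defaults
	 * above stick until someone calls this again after link up */
	if (__ethtool_get_settings(slave->dev, &ecmd) < 0)
		return;

	slave->speed = ethtool_cmd_speed(&ecmd);
	slave->duplex = ecmd.duplex;
}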

Adding a case to bond_netdev_event for CHANGELOWERSTATE works
because it's a synchronous call from bonding.  For purposes of fixing
this, it's more or less equivalent to calling bond_update_speed_duplex
from bond_miimon_commit (which is part of a test patch I posted earlier
today).

If the above analysis is correct, then I would expect this patch
to make the problem go away:

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 56b560558884..cabaeb61333d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2127,6 +2127,7 @@ static void bond_miimon_commit(struct bonding *bond)
 			continue;
 
 		case BOND_LINK_UP:
+			bond_update_speed_duplex(slave);
 			bond_set_slave_link_state(slave, BOND_LINK_UP,
 						  BOND_SLAVE_NOTIFY_NOW);
 			slave->last_link_up = jiffies;


Emil, can you give just the above a test?

I don't see evidence in the trace that ixgbe's link is rapidly
flapping, so I don't think it's necessary to do more than the above.

Now, separately, bonding really should obey the NETDEV_CHANGE /
NETDEV_UP events instead of polling for carrier state.  But if the above
patch works, it's a simple fix that is easily backported, which the
CHANGELOWERSTATE method isn't; the new (notifier driven) way can be
net-next material.
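
A minimal sketch of that notifier-driven direction might look like
the following (hypothetical handler, not the actual test patch;
bond_slave_get_rtnl and bond_update_speed_duplex as in bonding):

/* Sketch only: react to netdev events instead of polling from
 * bond_mii_monitor.  Assumes the usual bonding helpers. */
static int bond_link_notify(struct notifier_block *nb,
			    unsigned long event, void *ptr)
{
	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
	struct slave *slave;

	if (!netif_is_bond_slave(dev))
		return NOTIFY_DONE;

	slave = bond_slave_get_rtnl(dev);
	if (!slave)
		return NOTIFY_DONE;

	switch (event) {
	case NETDEV_UP:
	case NETDEV_CHANGE:
		/* carrier or settings changed: re-read speed/duplex */
		bond_update_speed_duplex(slave);
		break;
	}
	return NOTIFY_DONE;
}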

-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com


RE: bonding reports interface up with 0 Mbps

2016-02-04 Thread Tantilov, Emil S
>-Original Message-
>From: Jay Vosburgh [mailto:jay.vosbu...@canonical.com]
>Sent: Thursday, February 04, 2016 4:37 PM
>To: Tantilov, Emil S
>Cc: netdev@vger.kernel.org; go...@cumulusnetworks.com; zhuyj;
>j...@mellanox.com
>Subject: Re: bonding reports interface up with 0 Mbps
>
>Jay Vosburgh <jay.vosbu...@canonical.com> wrote:
>[...]
>   If the above analysis is correct, then I would expect this patch
>to make the problem go away:
>[...]
>
>
>   Emil, can you give just the above a test?

Sure I'll fire it up.

Thanks,
Emil



Re: bonding reports interface up with 0 Mbps

2016-02-04 Thread zhuyj

On 02/05/2016 08:37 AM, Jay Vosburgh wrote:

Jay Vosburgh  wrote:
[...]


I don't see evidence in the trace that ixgbe's link is rapidly
flapping, so I don't think it's necessary to do more than the
above.


Sure, I agree with you. I expect this will solve the problem.

Thanks a lot.
Zhu Yanjun



[...]




Re: bonding reports interface up with 0 Mbps

2016-02-04 Thread zhuyj

On 02/05/2016 08:43 AM, Tantilov, Emil S wrote:

-Original Message-
From: Jay Vosburgh [mailto:jay.vosbu...@canonical.com]
Sent: Thursday, February 04, 2016 4:37 PM
To: Tantilov, Emil S
Cc: netdev@vger.kernel.org; go...@cumulusnetworks.com; zhuyj;
j...@mellanox.com
Subject: Re: bonding reports interface up with 0 Mbps

Jay Vosburgh <jay.vosbu...@canonical.com> wrote:
[...]

Emil, can you give just the above a test?

Sure I'll fire it up.

Please let me know the test result.

Thanks a lot.
Zhu Yanjun


Thanks,
Emil





RE: bonding reports interface up with 0 Mbps

2016-02-04 Thread Tantilov, Emil S
>-Original Message-
>From: Jay Vosburgh [mailto:jay.vosbu...@canonical.com]
>Sent: Wednesday, February 03, 2016 9:57 PM
>To: Tantilov, Emil S
>Cc: netdev@vger.kernel.org; go...@cumulusnetworks.com; zhuyj;
>j...@mellanox.com
>Subject: Re: bonding reports interface up with 0 Mbps
>
>Tantilov, Emil S <emil.s.tanti...@intel.com> wrote:
>
>>We are seeing an occasional issue where the bonding driver may report
>interface up with 0 Mbps:
>>bond0: link status definitely up for interface eth0, 0 Mbps full duplex
>>
>>So far in all the failed traces I have collected this happens on 
>>NETDEV_CHANGELOWERSTATE event:
>>
>><...>-20533 [000]  81811.041241: ixgbe_service_task: eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>><...>-20533 [000]  81811.041257: ixgbe_check_vf_rate_limit <-ixgbe_service_task
>><...>-20533 [000]  81811.041272: ixgbe_ping_all_vfs <-ixgbe_service_task
>>kworker/u48:0-7503  [010]  81811.041345: ixgbe_get_stats64 <-dev_get_stats
>>kworker/u48:0-7503  [010]  81811.041393: bond_netdev_event: eth1: event: 1b
>>kworker/u48:0-7503  [010]  81811.041394: bond_netdev_event: eth1: IFF_SLAVE
>>kworker/u48:0-7503  [010]  81811.041395: bond_netdev_event: eth1: slave->speed = 
>><...>-20533 [000]  81811.041407: ixgbe_ptp_overflow_check <-ixgbe_service_task
>>kworker/u48:0-7503  [010]  81811.041407: bond_mii_monitor: bond0: link status definitely up for interface eth1, 0 Mbps full duplex
>
>   From looking at the code that prints this, the "full" duplex is
>probably actually DUPLEX_UNKNOWN, but the netdev_info uses the
>expression slave->duplex ? "full" : "half", so DUPLEX_UNKNOWN at 0xff
>would print "full."
>
>   This is what ixgbe_get_settings returns for speed and duplex if
>it is called when carrier is off.

But in this case the carrier is on, and regardless, ixgbe_get_settings() is
not called, as you can see from the trace - which is why speed and duplex
are not set.

>>As a proof of concept I added NETDEV_CHANGELOWERSTATE in
>>bond_slave_netdev_event() along with NETDEV_UP/CHANGE:
>>
>>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>index 56b5605..a9dac4c 100644
>>--- a/drivers/net/bonding/bond_main.c
>>+++ b/drivers/net/bonding/bond_main.c
>>@@ -3014,6 +3014,7 @@ static int bond_slave_netdev_event(unsigned long event,
>>  break;
>>  case NETDEV_UP:
>>  case NETDEV_CHANGE:
>>+ case NETDEV_CHANGELOWERSTATE:
>>  bond_update_speed_duplex(slave);
>>  if (BOND_MODE(bond) == BOND_MODE_8023AD)
>>  bond_3ad_adapter_speed_duplex_changed(slave);
>>
>>With this change I have not seen 0 Mbps reported by the bonding driver
>>(around 12 hour test up to this point vs. 2-3 hours otherwise). Although I
>>suppose it could also be some sort of race/timing issue with
>>bond_mii_monitor().
>
>   This change as a fix seems kind of odd, since CHANGELOWERSTATE
>is generated by bonding itself.  Perhaps the net effect is to add a
>delay and then update the speed and duplex, masking the actual problem.

I added this case to make sure bond_update_speed_duplex() is called, which
gets the speed/duplex from ixgbe_get_settings().

>   Emil, if I recall correctly, the test patch I sent that uses the
>notifiers directly instead of miimon (specify miimon=0 and have bonding
>respond to the notifiers) handled everything properly, right?  If so I

Yes, but I think the code path was also different; as I recall, the
"definitely up" message was not displayed as often.
 
>can split that up and submit it properly; it seems more like a feature
>than a straightforward bug fix, so I'm not sure it's appropriate for
>net.
>
>   As a possibly less complex alternative for the miimon > 0 case,
>could you try the following:
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 56b560558884..ac8921e65f26 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2120,6 +2120,7 @@ static void bond_miimon_commit(struct bonding *bond)
> {
>   struct list_head *iter;
>   struct slave *slave, *primary;
>+  int link_state;
>
>   bond_for_each_slave(bond, slave, iter) {
>   switch (slave->new_link) {
>@@ -2127,6 +2128,10 @@ static void bond_miimon_commit(struct bonding *bond)
>   continue;
>
>   case BOND_LINK_UP:
>+			link_state = bond_check_dev_link(bond, slave->dev, 0);
>+			if (!link_state)
>+				continue;
>+			bond_update_speed_duplex(slave);
>			bond_set_slave_link_state(slave, BOND_LINK_UP,
>						  BOND_SLAVE_NOTIFY_NOW);
>			slave->last_link_up = jiffies;

Re: bonding reports interface up with 0 Mbps

2016-02-04 Thread Jay Vosburgh
Tantilov, Emil S  wrote:
[...]
>Sure, I'll give this a try, but I'm not sure this check applies in this case
>as you can see from the trace link is up and carrier is on.

From code inspection, I see another possible race, although I'm
not sure if it's relevant for this case.  During enslavement, the speed
and duplex are initially set, and then later the link state is checked,
but if link is up, the code presumes the speed and duplex are valid,
which they may not be.

I think this patch would narrow (but not totally eliminate) this
race:

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 56b560558884..b8b8a24f92d1 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1591,6 +1591,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 	/* check for initial state */
 	if (bond->params.miimon) {
 		if (bond_check_dev_link(bond, slave_dev, 0) == BMSR_LSTATUS) {
+			bond_update_speed_duplex(new_slave);
 			if (bond->params.updelay) {
 				bond_set_slave_link_state(new_slave,
 							  BOND_LINK_BACK,


I'm not sure it's going to be really possible to completely
close all of these races, as the device can change its link state (and
thus what it reports for speed and duplex) asynchronously.  Even in a
NETDEV_UP or NETDEV_CHANGE notifier callback, the link could go down
between the time of the netif_carrier_on call that triggers the notifier
and when the callback runs.
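
In code form, the window is essentially this (illustrative only):

	if (netif_carrier_ok(slave->dev)) {
		/* the link can drop asynchronously right here ... */
		bond_update_speed_duplex(slave);
		/* ... so this call may still read back UNKNOWN values */
	}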

But, if the link is really flapping here, bonding should get a
notifier for each flap (up or down).  Once it settles down with carrier
up then the speed and duplex should be valid.

-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com


Re: bonding reports interface up with 0 Mbps

2016-02-04 Thread Jay Vosburgh
Tantilov, Emil S  wrote:

>We are seeing an occasional issue where the bonding driver may report 
>interface up with 0 Mbps:
>bond0: link status definitely up for interface eth0, 0 Mbps full duplex
>
>So far in all the failed traces I have collected this happens on 
>NETDEV_CHANGELOWERSTATE event:
>
><...>-20533 [000]  81811.041241: ixgbe_service_task: eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
><...>-20533 [000]  81811.041257: ixgbe_check_vf_rate_limit <-ixgbe_service_task
><...>-20533 [000]  81811.041272: ixgbe_ping_all_vfs <-ixgbe_service_task
>kworker/u48:0-7503  [010]  81811.041345: ixgbe_get_stats64 <-dev_get_stats
>kworker/u48:0-7503  [010]  81811.041393: bond_netdev_event: eth1: event: 1b
>kworker/u48:0-7503  [010]  81811.041394: bond_netdev_event: eth1: IFF_SLAVE
>kworker/u48:0-7503  [010]  81811.041395: bond_netdev_event: eth1: slave->speed = 
><...>-20533 [000]  81811.041407: ixgbe_ptp_overflow_check <-ixgbe_service_task
>kworker/u48:0-7503  [010]  81811.041407: bond_mii_monitor: bond0: link status definitely up for interface eth1, 0 Mbps full duplex

Thinking about the trace again... Emil: what happens in the
trace before this?  Is there ever a call to the ixgbe_get_settings?
Does a NETDEV_UP or NETDEV_CHANGE event ever hit the bond_netdev_event
function?

Could you describe your test that reproduces this?  I'd like to
see if I can set it up locally.

-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com


Re: bonding reports interface up with 0 Mbps

2016-02-03 Thread zhuyj

Hi, Emil

Thanks for your hard work.

Kernel 3.14 does not have NETDEV_CHANGELOWERSTATE, yet my user still
ran into
"bond_mii_monitor: bond0: link status definitely up for interface eth1,
0 Mbps full duplex".

How can this be explained?

Would you be willing to run tests with kernel 3.14?

Thanks a lot.

Zhu Yanjun

On 02/04/2016 07:10 AM, Tantilov, Emil S wrote:

[...]





bonding reports interface up with 0 Mbps

2016-02-03 Thread Tantilov, Emil S
We are seeing an occasional issue where the bonding driver may report interface 
up with 0 Mbps:
bond0: link status definitely up for interface eth0, 0 Mbps full duplex

So far in all the failed traces I have collected this happens on 
NETDEV_CHANGELOWERSTATE event:

<...>-20533 [000]  81811.041241: ixgbe_service_task: eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
<...>-20533 [000]  81811.041257: ixgbe_check_vf_rate_limit <-ixgbe_service_task
<...>-20533 [000]  81811.041272: ixgbe_ping_all_vfs <-ixgbe_service_task
kworker/u48:0-7503  [010]  81811.041345: ixgbe_get_stats64 <-dev_get_stats
kworker/u48:0-7503  [010]  81811.041393: bond_netdev_event: eth1: event: 1b
kworker/u48:0-7503  [010]  81811.041394: bond_netdev_event: eth1: IFF_SLAVE
kworker/u48:0-7503  [010]  81811.041395: bond_netdev_event: eth1: slave->speed = 
<...>-20533 [000]  81811.041407: ixgbe_ptp_overflow_check <-ixgbe_service_task
kworker/u48:0-7503  [010]  81811.041407: bond_mii_monitor: bond0: link status definitely up for interface eth1, 0 Mbps full duplex

As a proof of concept I added NETDEV_CHANGELOWERSTATE in 
bond_slave_netdev_event() along with NETDEV_UP/CHANGE:

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 56b5605..a9dac4c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3014,6 +3014,7 @@ static int bond_slave_netdev_event(unsigned long event,
 		break;
 	case NETDEV_UP:
 	case NETDEV_CHANGE:
+	case NETDEV_CHANGELOWERSTATE:
 		bond_update_speed_duplex(slave);
 		if (BOND_MODE(bond) == BOND_MODE_8023AD)
 			bond_3ad_adapter_speed_duplex_changed(slave);

With this change I have not seen 0 Mbps reported by the bonding driver
(around 12 hours of testing up to this point vs. 2-3 hours otherwise),
although I suppose it could also be some sort of race/timing issue with
bond_mii_monitor().

This test is with current bonding driver from net-next (top commit 03d84a5f83).

The bond is configured as such:

mode = 802.3ad
lacp_rate = fast
miimon = 100
xmit_hash_policy = layer3+4

I should note that the speed is reported correctly in /proc/net/bonding/bond0
once the bond0 interface is up, so this seems to be just an issue with the
initial detection of the speed. At least from what I have seen so far.

Thanks,
Emil



Re: bonding reports interface up with 0 Mbps

2016-02-03 Thread zhuyj

On 02/04/2016 01:57 PM, Jay Vosburgh wrote:

Tantilov, Emil S  wrote:


[...]

From looking at the code that prints this, the "full" duplex is
probably actually DUPLEX_UNKNOWN, but the netdev_info uses the
expression slave->duplex ? "full" : "half", so DUPLEX_UNKNOWN at 0xff
would print "full."

This is what ixgbe_get_settings returns for speed and duplex if
it is called when carrier is off.


I agree with you completely. I think that is the root cause.

Best Regards!
Zhu Yanjun




[...]




Re: bonding reports interface up with 0 Mbps

2016-02-03 Thread Jay Vosburgh
Tantilov, Emil S  wrote:

>We are seeing an occasional issue where the bonding driver may report 
>interface up with 0 Mbps:
>bond0: link status definitely up for interface eth0, 0 Mbps full duplex
>
>So far in all the failed traces I have collected this happens on 
>NETDEV_CHANGELOWERSTATE event:
>
><...>-20533 [000]  81811.041241: ixgbe_service_task: eth1: NIC Link is Up 
>10 Gbps, Flow Control: RX/TX
><...>-20533 [000]  81811.041257: ixgbe_check_vf_rate_limit 
><-ixgbe_service_task
><...>-20533 [000]  81811.041272: ixgbe_ping_all_vfs <-ixgbe_service_task
>kworker/u48:0-7503  [010]  81811.041345: ixgbe_get_stats64 <-dev_get_stats
>kworker/u48:0-7503  [010]  81811.041393: bond_netdev_event: eth1: event: 1b
>kworker/u48:0-7503  [010]  81811.041394: bond_netdev_event: eth1: IFF_SLAVE
>kworker/u48:0-7503  [010]  81811.041395: bond_netdev_event: eth1: 
>slave->speed = 
><...>-20533 [000]  81811.041407: ixgbe_ptp_overflow_check 
><-ixgbe_service_task
>kworker/u48:0-7503  [010]  81811.041407: bond_mii_monitor: bond0: link 
>status definitely up for interface eth1, 0 Mbps full duplex

From looking at the code that prints this, the "full" duplex is
probably actually DUPLEX_UNKNOWN, but the netdev_info uses the
expression slave->duplex ? "full" : "half", so DUPLEX_UNKNOWN at 0xff
would print "full."
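
A tiny user-space demonstration of that expression, with the values
taken from include/uapi/linux/ethtool.h:

#include <stdio.h>

#define DUPLEX_HALF	0x00
#define DUPLEX_FULL	0x01
#define DUPLEX_UNKNOWN	0xff

int main(void)
{
	unsigned char duplex = DUPLEX_UNKNOWN;

	/* the same ternary bonding's netdev_info uses */
	printf("0 Mbps %s duplex\n", duplex ? "full" : "half");
	return 0;
}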

This is what ixgbe_get_settings returns for speed and duplex if
it is called when carrier is off.
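
That is, roughly this pattern in the driver's get_settings path (a
sketch of the behavior described, not verbatim ixgbe code):

	if (netif_carrier_ok(netdev)) {
		ethtool_cmd_speed_set(ecmd, SPEED_10000);
		ecmd->duplex = DUPLEX_FULL;
	} else {
		ethtool_cmd_speed_set(ecmd, SPEED_UNKNOWN);
		ecmd->duplex = DUPLEX_UNKNOWN;
	}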

>As a proof of concept I added NETDEV_CHANGELOWERSTATE in 
>bond_slave_netdev_event() along with NETDEV_UP/CHANGE:
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 56b5605..a9dac4c 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -3014,6 +3014,7 @@ static int bond_slave_netdev_event(unsigned long event,
>   break;
>   case NETDEV_UP:
>   case NETDEV_CHANGE:
>+  case NETDEV_CHANGELOWERSTATE:
>   bond_update_speed_duplex(slave);
>   if (BOND_MODE(bond) == BOND_MODE_8023AD)
>   bond_3ad_adapter_speed_duplex_changed(slave);
>
>With this change I have not seen 0 Mbps reported by the bonding driver (around 
>12 hour test up to this point
>vs. 2-3 hours otherwise). Although I suppose it could also be some sort of 
>race/timing issue with bond_mii_monitor().

This change as a fix seems kind of odd, since CHANGELOWERSTATE
is generated by bonding itself.  Perhaps the net effect is to add a
delay and then update the speed and duplex, masking the actual problem.

Emil, if I recall correctly, the test patch I sent that uses the
notifiers directly instead of miimon (specify miimon=0 and have bonding
respond to the notifiers) handled everything properly, right?  If so I
can split that up and submit it properly; it seems more like a feature
than a straightforward bug fix, so I'm not sure it's appropriate for
net.

As a possibly less complex alternative for the miimon > 0 case,
could you try the following:

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 56b560558884..ac8921e65f26 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2120,6 +2120,7 @@ static void bond_miimon_commit(struct bonding *bond)
 {
 	struct list_head *iter;
 	struct slave *slave, *primary;
+	int link_state;
 
 	bond_for_each_slave(bond, slave, iter) {
 		switch (slave->new_link) {
@@ -2127,6 +2128,10 @@ static void bond_miimon_commit(struct bonding *bond)
 			continue;
 
 		case BOND_LINK_UP:
+			link_state = bond_check_dev_link(bond, slave->dev, 0);
+			if (!link_state)
+				continue;
+			bond_update_speed_duplex(slave);
 			bond_set_slave_link_state(slave, BOND_LINK_UP,
 						  BOND_SLAVE_NOTIFY_NOW);
 			slave->last_link_up = jiffies;


This will make bonding recheck the link state and update the
speed and duplex after it acquires RTNL to commit a link change.  This
probably still has a race, since the change of carrier state in the
device is not mutexed by anything bonding can acquire (so it can always
change as soon as it's checked).

Thanks,

-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com