On 2017-11-02 9:11 PM, Jay Vosburgh wrote:
Alex Sidorenko <[email protected]> wrote:
...> I think I see the flaw in the logic.
1) bond_miimon_inspect finds link_state = 0, then makes a call to bond_propose_link_state(BOND_LINK_FAIL), setting link_new_state to BOND_LINK_FAIL. _inspect then sets slave->new_link = BOND_LINK_DOWN and returns non-zero. 2) bond_mii_monitor rtnl_trylock fails, it reschedules. 3) bond_mii_monitor runs again, and calls bond_miimon_inspect. 4) the slave's link has recovered, so link_state != 0. slave->link is still BOND_LINK_UP. The slave's link_new_state remains set to BOND_LINK_FAIL, but new_link is reset to NOCHANGE. bond_miimon_inspect returns 0, so nothing is committed. 5) step 4 can repeat indefinitely. 6) eventually, the other slave does something that causes commit++, making bond_mii_monitor call bond_commit_link_state and then bond_miimon_commit. The slave in question from steps 1-4 still has link_new_state as BOND_LINK_FAIL, but new_link is NOCHANGE, so it ends up in BOND_LINK_FAIL state. I think step 6 could also occur concurrently with the initial pass through step 4 to induce the problem. It looks like Mahesh mostly fixed this in commit fb9eb899a6dc663e4a2deed9af2ac28f507d0ffb Author: Mahesh Bandewar <[email protected]> Date: Tue Apr 11 22:36:00 2017 -0700 bonding: handle link transition from FAIL to UP correctly but the window still exists, and requires the slave link state to change between the failed rtnl_trylock and the second pass through _inspect. The problem is that a state transition has been kept between invocations to _inspect, but the condition that induced the transition has changed. I haven't tested these, but I suspect the solution is either to clear link_new_state on entry to the loop in bond_miimon_inspect, or merge new_state and link_new_state as a single "next state" (which is cleared on entry to the loop). The first of these is a pretty simple patch: diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 18b58e1376f1..6f89f9981a6c 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -2046,6 +2046,7 @@ static int bond_miimon_inspect(struct bonding *bond)bond_for_each_slave_rcu(bond, slave, iter) {slave->new_link = BOND_LINK_NOCHANGE; + slave->link_new_state = slave->link;link_state = bond_check_dev_link(bond, slave->dev, 0);Alex / Jarod, could you check my logic, and would you be able to test this patch if my analysis appears sound?
This patch looks good, the original reproducing setup successfully recovers after the original active slave goes down, even with NetworkManager in the mix.
Reviewed-by: Jarod Wilson <[email protected]> -- Jarod Wilson [email protected]
