Acked-by: Jon Maloy <jon.ma...@ericsson.com>

Also see my comments below.

///jon

> -----Original Message-----
> From: Tuong Lien <tuong.t.l...@dektech.com.au>
> Sent: 24-Apr-19 00:41
> To: Jon Maloy <jon.ma...@ericsson.com>; ma...@donjonn.com;
> ying....@windriver.com; tipc-discussion@lists.sourceforge.net
> Subject: [PATCH RFC] tipc: fix link "half-failover" issue
> 
> TIPC link can temporarily fall into "half-establish" that only one of the link
> endpoints is ESTABLISHED and starts to send traffic, PROTOCOL messages,
> whereas the other link endpoint is not up (e.g. immediately when the
> endpoint receives ACTIVATE_MSG, the network interface goes down...).
> 
> This is a normal situation and will be settled because the link endpoint will 
> be
> eventually brought down after the link tolerance time.
> 
> However, the situation will become worse when the second link is
> established before the first link endpoint goes down, For example:
> 
>    1. Both links <1A-1B>, <2A-2B> down

Confusing terminology here. How can there be a link <1A-1B>?
I think you want to say "Both link endpoints" 1A,1B down etc.

>    2. Link endpoint 1B up, but 1A still down (e.g. due to network
>       disturbance, wrong session, etc.)

>    3. Link <2A-2B> up
Same here.

>    4. Link endpoint 1B down (e.g. due to link tolerance timeout)
>    5. Node B starts failover onto link <2A-2B>
> 
>    ==> Node A does never start link failover.
> 
> When the "half-failover" situation happens, two consequences have been
> observed:
> 
> a) Peer link/node gets stuck in FAILINGOVER state;
> b) Traffic or user messages that peer node is trying to failover onto the
> second link can be partially or completely dropped by this node.
> 
> The consequence a) was actually solved by commit c140eb166d68 ("tipc:
> fix failover problem"), but that commit didn't cover the b). It's due to the 
> fact
> that the tunnel link endpoint has never been prepared for a failover, so the
> 'l->drop_point' (and the other data...) is not set correctly. When a
> TUNNLE_MSG 

s/TUNNLE_MSG/TUNNEL_MSG

from peer node arrives on the link, depending on the inner
> message's seqno and the current 'l->drop_point'
> value, the message can be dropped (- treated as a duplicate message) or
> processed.
> At this early stage, the traffic messages from peer are likely to be
> NAME_DISTRIBUTORs, this means some name table entries will be missed on
> the node forever!
> 
> The commit resolves the issue by starting the FAILOVER process on this node
> as well. Another benefit from this solution is that we ensure the link will 
> not
> be re-established until the failover ends.
> 
> Signed-off-by: Tuong Lien <tuong.t.l...@dektech.com.au>
> ---
>  net/tipc/link.c | 35 +++++++++++++++++++++++++++++++++++
>  net/tipc/link.h |  2 ++
>  net/tipc/node.c | 54
> +++++++++++++++++++++++++++++++++++++++++++++++-------
>  3 files changed, 84 insertions(+), 7 deletions(-)
> 
> diff --git a/net/tipc/link.c b/net/tipc/link.c index 
> 6053489c8063..fa639054329d
> 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -1705,6 +1705,41 @@ void tipc_link_tnl_prepare(struct tipc_link *l,
> struct tipc_link *tnl,
>       }
>  }
> 
> +/**
> + * tipc_link_failover_prepare() - prepare tnl for link failover
> + *
> + * This is a special version of the precursor -
> +tipc_link_tnl_prepare(),
> + * see the __tipc_node_link_failover() for details
> + *
> + * @l: failover link
> + * @tnl: tunnel link
> + * @xmitq: queue for messages to be xmited  */ void
> +tipc_link_failover_prepare(struct tipc_link *l, struct tipc_link *tnl,
> +                             struct sk_buff_head *xmitq)
> +{
> +     struct sk_buff_head *fdefq = &tnl->failover_deferdq;
> +
> +     tipc_link_create_dummy_tnl_msg(tnl, xmitq);
> +
> +     /* This failover link enpoint should never be established

s/should never be established before, so did not../was never established, so it 
has not ...

> +      * before, so did not receive anything from peer.
> +      * Otherwise, it must be a normal failover situation or the
> +      * node has entered SELF_DOWN_PEER_LEAVING and both peer
> nodes
> +      * would have to start over from scratch instead.
> +      */
> +     WARN_ON(l && tipc_link_is_up(l));
> +     tnl->drop_point = 1;
> +     tnl->failover_reasm_skb = NULL;
> +
> +     /* Initiate the link's failover deferdq */
> +     if (unlikely(!skb_queue_empty(fdefq))) {
> +             pr_warn("Link failover deferdq not empty: %d!\n",
> +                     skb_queue_len(fdefq));
> +             __skb_queue_purge(fdefq);
> +     }
> +}
> +
>  /* tipc_link_validate_msg(): validate message against current link state
>   * Returns true if message should be accepted, otherwise false
>   */
> diff --git a/net/tipc/link.h b/net/tipc/link.h index
> 8439e0ee53a8..adcad65e761c 100644
> --- a/net/tipc/link.h
> +++ b/net/tipc/link.h
> @@ -90,6 +90,8 @@ void tipc_link_tnl_prepare(struct tipc_link *l, struct
> tipc_link *tnl,
>                          int mtyp, struct sk_buff_head *xmitq);  void
> tipc_link_create_dummy_tnl_msg(struct tipc_link *tnl,
>                                   struct sk_buff_head *xmitq);
> +void tipc_link_failover_prepare(struct tipc_link *l, struct tipc_link *tnl,
> +                             struct sk_buff_head *xmitq);
>  void tipc_link_build_reset_msg(struct tipc_link *l, struct sk_buff_head
> *xmitq);  int tipc_link_fsm_evt(struct tipc_link *l, int evt);  bool
> tipc_link_is_up(struct tipc_link *l); diff --git a/net/tipc/node.c
> b/net/tipc/node.c index 7478e2d4ec02..be07cf327d2d 100644
> --- a/net/tipc/node.c
> +++ b/net/tipc/node.c
> @@ -714,7 +714,6 @@ static void __tipc_node_link_up(struct tipc_node *n,
> int bearer_id,
>               *slot0 = bearer_id;
>               *slot1 = bearer_id;
>               tipc_node_fsm_evt(n, SELF_ESTABL_CONTACT_EVT);
> -             n->failover_sent = false;
>               n->action_flags |= TIPC_NOTIFY_NODE_UP;
>               tipc_link_set_active(nl, true);
>               tipc_bcast_add_peer(n->net, nl, xmitq); @@ -757,6 +756,45
> @@ static void tipc_node_link_up(struct tipc_node *n, int bearer_id,  }
> 
>  /**
> + * __tipc_node_link_failover() - start failover in case "half-failover"

Don't see any reason for the "__" prefix. This is normally used when we need 
two functions with the same name, one locked and one unlocked.

> + *
> + * This function is only called in a very special situation where link
> + * failover can be already started on peer node but not on this node.
> + * This can happen when e.g.
> + *   1. Both links <1A-1B>, <2A-2B> down
> + *   2. Link endpoint 1B up, but 1A still down (e.g. due to network
> + *      disturbance, wrong session, etc.)
> + *   3. Link <2A-2B> up
> + *   4. Link endpoint 1B down (e.g. due to link tolerance timeout)
> + *   5. Node B starts failover onto link <2A-2B>
> + *
> + *   ==> Node A does never start link/node failover!
> + *
> + * @n: tipc node structure
> + * @l: link peer endpoint failingover (- can be NULL)
> + * @tnl: tunnel link
> + * @xmitq: queue for messages to be xmited on tnl link later  */ static
> +void __tipc_node_link_failover(struct tipc_node *n, struct tipc_link *l,
> +                                   struct tipc_link *tnl,
> +                                   struct sk_buff_head *xmitq)
> +{
> +     /* Avoid to be "self-failover" that can never end */
> +     if (!tipc_link_is_up(tnl))
> +             return;
> +
> +     tipc_link_fsm_evt(tnl, LINK_SYNCH_END_EVT);
> +     tipc_node_fsm_evt(n, NODE_SYNCH_END_EVT);
> +
> +     n->sync_point = tipc_link_rcv_nxt(tnl) + (U16_MAX / 2 - 1);
> +     tipc_link_failover_prepare(l, tnl, xmitq);
> +
> +     if (l)
> +             tipc_link_fsm_evt(l, LINK_FAILOVER_BEGIN_EVT);
> +     tipc_node_fsm_evt(n, NODE_FAILOVER_BEGIN_EVT); }
> +
> +/**
>   * __tipc_node_link_down - handle loss of link
>   */
>  static void __tipc_node_link_down(struct tipc_node *n, int *bearer_id, @@
> -1675,14 +1713,16 @@ static bool tipc_node_check_state(struct tipc_node
> *n, struct sk_buff *skb,
>                       tipc_skb_queue_splice_tail_init(tipc_link_inputq(pl),
>                                                       tipc_link_inputq(l));
>               }
> +
>               /* If parallel link was already down, and this happened
> before
> -              * the tunnel link came up, FAILOVER was never sent. Ensure
> that
> -              * FAILOVER is sent to get peer out of NODE_FAILINGOVER
> state.
> +              * the tunnel link came up, node failover was never started.
> +              * Ensure that a FAILOVER_MSG is sent to get peer out of
> +              * NODE_FAILINGOVER state, also this node must accept
> +              * TUNNLE_MSGs from peer.

s/TUNNLE_MSG/TUNEL_MSG

>                */
> -             if (n->state != NODE_FAILINGOVER && !n->failover_sent) {
> -                     tipc_link_create_dummy_tnl_msg(l, xmitq);
> -                     n->failover_sent = true;
> -             }
> +             if (n->state != NODE_FAILINGOVER)
> +                     __tipc_node_link_failover(n, pl, l, xmitq);
> +
>               /* If pkts arrive out of order, use lowest calculated syncpt */
>               if (less(syncpt, n->sync_point))
>                       n->sync_point = syncpt;
> --
> 2.13.7



_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

Reply via email to