-----Original Message----- From: Jon Maloy <jma...@redhat.com> Sent: Tuesday, June 2, 2020 8:37 PM To: Hoang Huu Le <hoang.h...@dektech.com.au>; tipc-discussion@lists.sourceforge.net Cc: Tung Quang Nguyen <tung.q.ngu...@dektech.com.au>; Tuong Tong Lien <tuong.t.l...@dektech.com.au>; ma...@donjonn.com; x...@redhat.com; ying....@windriver.com; parthasarathy.bhuvara...@gmail.com Subject: Re: [ ] tipc: update a binding service via broadcast
On 6/1/20 7:45 PM, Hoang Huu Le wrote: > Hi Jon, > > See my comment inline. > > Hoang > -----Original Message----- > From: Jon Maloy <jma...@redhat.com> > Sent: Monday, June 1, 2020 7:40 PM > To: Hoang Huu Le <hoang.h...@dektech.com.au>; > tipc-discussion@lists.sourceforge.net > Cc: Tung Quang Nguyen <tung.q.ngu...@dektech.com.au>; Tuong Tong Lien > <tuong.t.l...@dektech.com.au>; ma...@donjonn.com; x...@redhat.com; > ying....@windriver.com; parthasarathy.bhuvara...@gmail.com > Subject: Re: [ ] tipc: update a binding service via broadcast > > Hi Hoang. > See below. > > On 6/1/20 5:33 AM, Hoang Huu Le wrote: >> Hi Jon, >> >> There is a concern in function tipc_node_broadcast. >> See my comment inline. >> >> Regards, >> Hoang >> -----Original Message----- >> From: jma...@redhat.com <jma...@redhat.com> >> Sent: Friday, May 29, 2020 10:34 PM >> To: tipc-discussion@lists.sourceforge.net >> Cc: Tung Quang Nguyen <tung.q.ngu...@dektech.com.au>; Hoang Huu Le >> <hoang.h...@dektech.com.au>; Tuong Tong Lien <tuong.t.l...@dektech.com.au>; >> jma...@redhat.com; ma...@donjonn.com; x...@redhat.com; >> ying....@windriver.com; parthasarathy.bhuvara...@gmail.com >> Subject: [ ] tipc: update a binding service via broadcast >> >> From: Hoang Huu Le <hoang.h...@dektech.com.au> >> >> Currently, updating binding table (add service binding to >> name table/withdraw a service binding) is being sent over replicast. >> However, if we are scaling up clusters to > 100 nodes/containers this >> method is less affection because of looping through nodes in a cluster one >> by one. > [...] >> >> + /* Use broadcast if all nodes support it */ >> + if (!rc_dests) { >> + __skb_queue_head_init(&xmitq); >> + __skb_queue_tail(&xmitq, skb); >> + tipc_bcast_xmit(net, &xmitq, &dummy); >> + return; >> + } >> + >> [Hoang] >> We could not use 'rc_dests' to send as broadcast mode because of it is not >> fully broadcast supporting in the cluster. >> As stated, if there is a node in the cluster not supporting broadcast mode, >> we need to use replicast instead. >> That's why we introduced the feature "Smooth switch between >> replicast/broadcast in bcast.c" and a new command line to configure the >> broadcast link. >> If we assume base on 'rc_dests' (i.e capability flags TIPC_NAMED_BCAST), >> probably 'forced replicast' configuration on broadcast link becomes useless. >> Then, destination nodes will be missed the publication item. > You misunderstand this function. rc_dests is a *counter*, not a flag. It > counts the number of peer nodes that don't support TIPC_NAMED_BCAST, and > if this is non-zero, we use replicast. So this is safe. > > [Hoang] Yes, I know about this counter. What I'm considering about the L2 > interface (e.g VXLAN) pseudo support broadcast (we discussed the topic a year > ago), then, the sending side must be switching to replicast > (method->force_bcast = true). But above likely forcing to use broadcast > careless setting from broadcast/replicast from L2. Yes you are right. I only checked this for UDP, where it seems to be ok, but forgot about VXLAN. Still, we can avoid building up 'dest' lists by making a small change to tipc_node_bcast(): /* Use broadcast if bearer and all nodes support it */ if (!rc_dests && tipc_bcast_get_broadcast_mode(net) != BCLINK_MODE_RCAST) { u16 dummy; __skb_queue_head_init(&xmitq); __skb_queue_tail(&xmitq, skb); tipc_bcast_xmit(net, &xmitq, &dummy); return; } [Hoang] It sounds fine with this way. I really believe sequence numbering and these three new bits solve our problem in a simple and robust way. Strictly, even the 'non_legacy' bit could be omitted, since the receiver already has this info from the capability map, but I found it simpler to do it that way. Also, I realized that tipc_named_dequeue() function can be simplified further, since there can be no duplicate messages. [Hoang] I will do further testing your solution and improve this one. BR ///jon > > ///jon > >> + /* Otherwise use legacy replicast method */ >> rcu_read_lock(); >> list_for_each_entry_rcu(n, tipc_nodes(net), list) { >> dst = n->addr; >> @@ -1749,7 +1762,6 @@ void tipc_node_broadcast(struct net *net, struct >> sk_buff *skb) >> tipc_node_xmit_skb(net, txskb, dst, 0); >> } >> rcu_read_unlock(); >> - >> kfree_skb(skb); >> } >> >> @@ -1844,7 +1856,9 @@ static void tipc_node_bc_rcv(struct net *net, struct >> sk_buff *skb, int bearer_id >> >> /* Handle NAME_DISTRIBUTOR messages sent from 1.7 nodes */ >> if (!skb_queue_empty(&n->bc_entry.namedq)) >> - tipc_named_rcv(net, &n->bc_entry.namedq); >> + tipc_named_rcv(net, &n->bc_entry.namedq, >> + &n->bc_entry.named_rcv_nxt, >> + &n->bc_entry.named_open); >> >> /* If reassembly or retransmission failure => reset all links to peer */ >> if (rc & TIPC_LINK_DOWN_EVT) >> @@ -2109,7 +2123,9 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, >> struct tipc_bearer *b) >> tipc_node_link_down(n, bearer_id, false); >> >> if (unlikely(!skb_queue_empty(&n->bc_entry.namedq))) >> - tipc_named_rcv(net, &n->bc_entry.namedq); >> + tipc_named_rcv(net, &n->bc_entry.namedq, >> + &n->bc_entry.named_rcv_nxt, >> + &n->bc_entry.named_open); >> >> if (unlikely(!skb_queue_empty(&n->bc_entry.inputq1))) >> tipc_node_mcast_rcv(n); >> diff --git a/net/tipc/node.h b/net/tipc/node.h >> index a6803b449a2c..9f6f13f1604f 100644 >> --- a/net/tipc/node.h >> +++ b/net/tipc/node.h >> @@ -55,7 +55,8 @@ enum { >> TIPC_MCAST_RBCTL = (1 << 7), >> TIPC_GAP_ACK_BLOCK = (1 << 8), >> TIPC_TUNNEL_ENHANCED = (1 << 9), >> - TIPC_NAGLE = (1 << 10) >> + TIPC_NAGLE = (1 << 10), >> + TIPC_NAMED_BCAST = (1 << 11) >> }; >> >> #define TIPC_NODE_CAPABILITIES (TIPC_SYN_BIT | \ >> @@ -68,7 +69,8 @@ enum { >> TIPC_MCAST_RBCTL | \ >> TIPC_GAP_ACK_BLOCK | \ >> TIPC_TUNNEL_ENHANCED | \ >> - TIPC_NAGLE) >> + TIPC_NAGLE | \ >> + TIPC_NAMED_BCAST) >> >> #define INVALID_BEARER_ID -1 >> >> @@ -101,7 +103,7 @@ int tipc_node_xmit_skb(struct net *net, struct sk_buff >> *skb, u32 dest, >> u32 selector); >> void tipc_node_subscribe(struct net *net, struct list_head *subscr, u32 >> addr); >> void tipc_node_unsubscribe(struct net *net, struct list_head *subscr, u32 >> addr); >> -void tipc_node_broadcast(struct net *net, struct sk_buff *skb); >> +void tipc_node_broadcast(struct net *net, struct sk_buff *skb, int >> rc_dests); >> int tipc_node_add_conn(struct net *net, u32 dnode, u32 port, u32 >> peer_port); >> void tipc_node_remove_conn(struct net *net, u32 dnode, u32 port); >> int tipc_node_get_mtu(struct net *net, u32 addr, u32 sel, bool connected); _______________________________________________ tipc-discussion mailing list tipc-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/tipc-discussion