-----Original Message-----
From: Jon Maloy <jma...@redhat.com> 
Sent: Tuesday, June 2, 2020 8:37 PM
To: Hoang Huu Le <hoang.h...@dektech.com.au>; 
tipc-discussion@lists.sourceforge.net
Cc: Tung Quang Nguyen <tung.q.ngu...@dektech.com.au>; Tuong Tong Lien 
<tuong.t.l...@dektech.com.au>; ma...@donjonn.com; x...@redhat.com; 
ying....@windriver.com; parthasarathy.bhuvara...@gmail.com
Subject: Re: [ ] tipc: update a binding service via broadcast



On 6/1/20 7:45 PM, Hoang Huu Le wrote:
> Hi Jon,
>
> See my comment inline.
>
> Hoang
> -----Original Message-----
> From: Jon Maloy <jma...@redhat.com>
> Sent: Monday, June 1, 2020 7:40 PM
> To: Hoang Huu Le <hoang.h...@dektech.com.au>; 
> tipc-discussion@lists.sourceforge.net
> Cc: Tung Quang Nguyen <tung.q.ngu...@dektech.com.au>; Tuong Tong Lien 
> <tuong.t.l...@dektech.com.au>; ma...@donjonn.com; x...@redhat.com; 
> ying....@windriver.com; parthasarathy.bhuvara...@gmail.com
> Subject: Re: [ ] tipc: update a binding service via broadcast
>
> Hi Hoang.
> See below.
>
> On 6/1/20 5:33 AM, Hoang Huu Le wrote:
>> Hi Jon,
>>
>> There is a concern in function tipc_node_broadcast.
>> See my comment inline.
>>
>> Regards,
>> Hoang
>> -----Original Message-----
>> From: jma...@redhat.com <jma...@redhat.com>
>> Sent: Friday, May 29, 2020 10:34 PM
>> To: tipc-discussion@lists.sourceforge.net
>> Cc: Tung Quang Nguyen <tung.q.ngu...@dektech.com.au>; Hoang Huu Le 
>> <hoang.h...@dektech.com.au>; Tuong Tong Lien <tuong.t.l...@dektech.com.au>; 
>> jma...@redhat.com; ma...@donjonn.com; x...@redhat.com; 
>> ying....@windriver.com; parthasarathy.bhuvara...@gmail.com
>> Subject: [ ] tipc: update a binding service via broadcast
>>
>> From: Hoang Huu Le <hoang.h...@dektech.com.au>
>>
>> Currently, updating binding table (add service binding to
>> name table/withdraw a service binding) is being sent over replicast.
>> However, if we are scaling up clusters to > 100 nodes/containers this
>> method is less affection because of looping through nodes in a cluster one
>> by one.
> [...]
>>    
>> +    /* Use broadcast if all nodes support it */
>> +    if (!rc_dests) {
>> +            __skb_queue_head_init(&xmitq);
>> +            __skb_queue_tail(&xmitq, skb);
>> +            tipc_bcast_xmit(net, &xmitq, &dummy);
>> +            return;
>> +    }
>> +
>> [Hoang]
>> We could not use 'rc_dests' to send as broadcast mode because of it is not 
>> fully broadcast supporting in the cluster.
>> As stated, if there is a node in the cluster not supporting broadcast mode, 
>> we need to use replicast instead.
>> That's why we introduced the feature "Smooth switch between 
>> replicast/broadcast in bcast.c" and a new command line to configure the 
>> broadcast link.
>> If we assume base on 'rc_dests' (i.e capability flags TIPC_NAMED_BCAST), 
>> probably 'forced replicast' configuration on broadcast link becomes useless. 
>> Then, destination nodes will be missed the publication item.
> You misunderstand this function. rc_dests is a *counter*, not a flag. It
> counts the number of peer nodes that don't support TIPC_NAMED_BCAST, and
> if this is non-zero, we use replicast. So this is safe.
>
> [Hoang] Yes, I know about this counter. What I'm considering about the L2 
> interface (e.g VXLAN) pseudo support broadcast (we discussed the topic a year 
> ago), then, the sending side must be switching to replicast 
> (method->force_bcast = true). But above likely forcing to use broadcast 
> careless setting from broadcast/replicast from L2.
Yes you are right. I only checked this for UDP, where it seems to be ok, 
but forgot about VXLAN.
Still, we can avoid building up 'dest' lists by making a small change to 
tipc_node_bcast():

/* Use broadcast if bearer and all nodes support it */
if (!rc_dests && tipc_bcast_get_broadcast_mode(net) != BCLINK_MODE_RCAST) {
          u16 dummy;
          __skb_queue_head_init(&xmitq);
          __skb_queue_tail(&xmitq, skb);
          tipc_bcast_xmit(net, &xmitq, &dummy);
          return;
}
[Hoang] It sounds fine with this way.

I really believe sequence numbering and these three new bits solve our 
problem in a simple and robust way.
Strictly, even the 'non_legacy' bit could be omitted, since the receiver 
already has this info from the capability map, but I found it simpler to 
do it that way. Also, I realized that tipc_named_dequeue() function can 
be simplified further, since there can be no duplicate messages.

[Hoang] I will do further testing your solution and improve this one. 

BR
///jon

>
> ///jon
>
>> +    /* Otherwise use legacy replicast method */
>>      rcu_read_lock();
>>      list_for_each_entry_rcu(n, tipc_nodes(net), list) {
>>              dst = n->addr;
>> @@ -1749,7 +1762,6 @@ void tipc_node_broadcast(struct net *net, struct 
>> sk_buff *skb)
>>              tipc_node_xmit_skb(net, txskb, dst, 0);
>>      }
>>      rcu_read_unlock();
>> -
>>      kfree_skb(skb);
>>    }
>>    
>> @@ -1844,7 +1856,9 @@ static void tipc_node_bc_rcv(struct net *net, struct 
>> sk_buff *skb, int bearer_id
>>    
>>      /* Handle NAME_DISTRIBUTOR messages sent from 1.7 nodes */
>>      if (!skb_queue_empty(&n->bc_entry.namedq))
>> -            tipc_named_rcv(net, &n->bc_entry.namedq);
>> +            tipc_named_rcv(net, &n->bc_entry.namedq,
>> +                           &n->bc_entry.named_rcv_nxt,
>> +                           &n->bc_entry.named_open);
>>    
>>      /* If reassembly or retransmission failure => reset all links to peer */
>>      if (rc & TIPC_LINK_DOWN_EVT)
>> @@ -2109,7 +2123,9 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, 
>> struct tipc_bearer *b)
>>              tipc_node_link_down(n, bearer_id, false);
>>    
>>      if (unlikely(!skb_queue_empty(&n->bc_entry.namedq)))
>> -            tipc_named_rcv(net, &n->bc_entry.namedq);
>> +            tipc_named_rcv(net, &n->bc_entry.namedq,
>> +                           &n->bc_entry.named_rcv_nxt,
>> +                           &n->bc_entry.named_open);
>>    
>>      if (unlikely(!skb_queue_empty(&n->bc_entry.inputq1)))
>>              tipc_node_mcast_rcv(n);
>> diff --git a/net/tipc/node.h b/net/tipc/node.h
>> index a6803b449a2c..9f6f13f1604f 100644
>> --- a/net/tipc/node.h
>> +++ b/net/tipc/node.h
>> @@ -55,7 +55,8 @@ enum {
>>      TIPC_MCAST_RBCTL      = (1 << 7),
>>      TIPC_GAP_ACK_BLOCK    = (1 << 8),
>>      TIPC_TUNNEL_ENHANCED  = (1 << 9),
>> -    TIPC_NAGLE            = (1 << 10)
>> +    TIPC_NAGLE            = (1 << 10),
>> +    TIPC_NAMED_BCAST      = (1 << 11)
>>    };
>>    
>>    #define TIPC_NODE_CAPABILITIES (TIPC_SYN_BIT           |  \
>> @@ -68,7 +69,8 @@ enum {
>>                              TIPC_MCAST_RBCTL       |   \
>>                              TIPC_GAP_ACK_BLOCK     |   \
>>                              TIPC_TUNNEL_ENHANCED   |   \
>> -                            TIPC_NAGLE)
>> +                            TIPC_NAGLE             |   \
>> +                            TIPC_NAMED_BCAST)
>>    
>>    #define INVALID_BEARER_ID -1
>>    
>> @@ -101,7 +103,7 @@ int tipc_node_xmit_skb(struct net *net, struct sk_buff 
>> *skb, u32 dest,
>>                     u32 selector);
>>    void tipc_node_subscribe(struct net *net, struct list_head *subscr, u32 
>> addr);
>>    void tipc_node_unsubscribe(struct net *net, struct list_head *subscr, u32 
>> addr);
>> -void tipc_node_broadcast(struct net *net, struct sk_buff *skb);
>> +void tipc_node_broadcast(struct net *net, struct sk_buff *skb, int 
>> rc_dests);
>>    int tipc_node_add_conn(struct net *net, u32 dnode, u32 port, u32 
>> peer_port);
>>    void tipc_node_remove_conn(struct net *net, u32 dnode, u32 port);
>>    int tipc_node_get_mtu(struct net *net, u32 addr, u32 sel, bool connected);


_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

Reply via email to