On 10/7/20 10:22 PM, Hoang Huu Le wrote:
In commit cad2929dc432 ("tipc: update a binding service via broadcast"),
we use broadcast to update a binding service for a large cluster.
However, if we try to publish thousands of services at the same time,
we may get a "link overflow" because the queue limit has been reached.
We now introduce a smooth fallback to replicast if the broadcast link
has reached its queue limit.
To me this defeats the whole purpose of using broadcast distribution in
the first place.
We wanted to save CPU and network resources by using broadcast, and
then, when things get tough, we fall back to the supposedly less
efficient replicast method. Not good.
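In other words, the patch effectively turns the send path into the
following (a simplified user-space sketch with stubbed-out helpers, just
to make the fallback explicit; it is not the actual kernel code):

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for the kernel helpers (assumptions for illustration only) */
static bool bc_backlog_full;

static int bcast_xmit(const char *msg)
{
	if (bc_backlog_full)
		return -EOVERFLOW;	/* broadcast send link hit its backlog limit */
	printf("broadcast: %s\n", msg);
	return 0;
}

static void rcast_xmit(const char *msg)
{
	/* one unicast copy per peer node - the costly path */
	printf("replicast: %s\n", msg);
}

static void node_broadcast(const char *msg)
{
	/* try broadcast first, fall back to replicast on overflow */
	if (bcast_xmit(msg) == -EOVERFLOW)
		rcast_xmit(msg);
}

int main(void)
{
	node_broadcast("publication #1");	/* sent as broadcast */
	bc_backlog_full = true;
	node_broadcast("publication #2");	/* degrades to replicast */
	return 0;
}

I.e., exactly when the system is busiest we switch over to the method
that sends one copy per destination node.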
I wonder what is really happening when this overflow situation occurs.
First, the reset limit is dimensioned so that it should be possible to
publish MAX_PUBLICATIONS (65k) publications in one shot.
With full bundling, which is what I expect here, there are 1460/20 = 73
publication items in each buffer, so the reset limit (== max_bulk) should
be 65k/73 ≈ 898 buffers.
My figures are just from the top of my head, so you should double check
them, but I find it unlikely that we hit this limit unless there is a
lot of other broadcast traffic going on at the same time, and even then
it seems improbable.
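For reference, a back-of-envelope sketch of that arithmetic (the
1460-byte bundle payload, the 20-byte publication item size and the
65536 publication count are assumptions here and should be checked
against the real TIPC definitions):

#include <stdio.h>

/* Assumed figures - double check against the actual TIPC headers */
#define BUNDLE_PAYLOAD	1460	/* usable bytes per bundled buffer (assumed) */
#define ITEM_SIZE	20	/* bytes per publication item (assumed) */
#define MAX_PUBL	65536	/* "65k" publications in one bulk (assumed) */

int main(void)
{
	int items_per_buf = BUNDLE_PAYLOAD / ITEM_SIZE;
	int bufs_needed = (MAX_PUBL + items_per_buf - 1) / items_per_buf;

	printf("publication items per buffer: %d\n", items_per_buf);	/* 73 */
	printf("buffers for a full bulk:      %d\n", bufs_needed);	/* 898 */
	return 0;
}

So a pure bulk of publications should need on the order of 900 buffers,
which the limit is supposed to accommodate.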
I suggest you try to find out what is really going on when we reach this
situation.
- What exactly is in the backlog queue?
- Only publications?
- How many?
- A mixture of publications and other traffic?
- Has bundling really worked as intended?
- Do we still have some issue with the broadcast link that stops buffers
  being acked and released in a timely manner?
- Have you been able to dump out such info when this problem occurs?
- Are you able to reproduce it in your own system?
In the end it might be as simple as increasing the reset limit, but we
really should try to understand what is happening first.
Regards
///jon
Signed-off-by: Hoang Huu Le <hoang.h...@dektech.com.au>
---
net/tipc/link.c | 5 ++++-
net/tipc/node.c | 12 ++++++++++--
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 06b880da2a8e..ca908ead753a 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1022,7 +1022,10 @@ int tipc_link_xmit(struct tipc_link *l, struct sk_buff_head *list,
 	/* Allow oversubscription of one data msg per source at congestion */
 	if (unlikely(l->backlog[imp].len >= l->backlog[imp].limit)) {
 		if (imp == TIPC_SYSTEM_IMPORTANCE) {
-			pr_warn("%s<%s>, link overflow", link_rst_msg, l->name);
+			pr_warn_ratelimited("%s<%s>, link overflow",
+					    link_rst_msg, l->name);
+			if (link_is_bc_sndlink(l))
+				return -EOVERFLOW;
 			return -ENOBUFS;
 		}
 		rc = link_schedule_user(l, hdr);
rc = link_schedule_user(l, hdr);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index d269ebe382e1..a37976610367 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1750,15 +1750,23 @@ void tipc_node_broadcast(struct net *net, struct sk_buff *skb, int rc_dests)
 	struct tipc_node *n;
 	u16 dummy;
 	u32 dst;
+	int rc = 0;
 
 	/* Use broadcast if all nodes support it */
 	if (!rc_dests && tipc_bcast_get_mode(net) != BCLINK_MODE_RCAST) {
+		txskb = pskb_copy(skb, GFP_ATOMIC);
+		if (!txskb)
+			goto rcast;
 		__skb_queue_head_init(&xmitq);
-		__skb_queue_tail(&xmitq, skb);
-		tipc_bcast_xmit(net, &xmitq, &dummy);
+		__skb_queue_tail(&xmitq, txskb);
+		rc = tipc_bcast_xmit(net, &xmitq, &dummy);
+		if (rc == -EOVERFLOW)
+			goto rcast;
+		kfree_skb(skb);
 		return;
 	}
 
+rcast:
 	/* Otherwise use legacy replicast method */
 	rcu_read_lock();
 	list_for_each_entry_rcu(n, tipc_nodes(net), list) {