Re: [tipc-discussion] [net] tipc: fix incorrect setting window for bcast link

2020-10-12 Thread Hoang Huu Le
> -----Original Message-----
> From: Jon Maloy 
> Sent: Tuesday, October 13, 2020 6:39 AM
> To: Hoang Huu Le ; 
> tipc-discussion@lists.sourceforge.net; ma...@donjonn.com;
> ying@windriver.com
> Subject: Re: [net] tipc: fix incorrect setting window for bcast link
> 
> Hi Hoang,
> I apologize for not paying enough attention to this problem until now.
> See below.
> 
> 
> On 9/30/20 9:43 PM, Hoang Huu Le wrote:
> > In commit 16ad3f4022bb
> > ("tipc: introduce variable window congestion control"), we applied
> > the Reno algorithm to select a window size between the minimum window
> > and the configured maximum window for unicast links.
> >
> > However, when setting the window variable we did not restrict the change
> > to unicast links. As a result, the broadcast link window was set to an
> > unexpected value (i.e. the window size was constantly 32).
> >
> > We fix this by updating the broadcast link window according to its
> > configuration.
> >
> > Signed-off-by: Hoang Huu Le 
> > ---
> >   net/tipc/bcast.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> > index 940d176e0e87..abac9443b4d9 100644
> > --- a/net/tipc/bcast.c
> > +++ b/net/tipc/bcast.c
> > @@ -585,7 +585,7 @@ static int tipc_bc_link_set_queue_limits(struct net *net, u32 max_win)
> >     if (max_win > TIPC_MAX_LINK_WIN)
> >             return -EINVAL;
> >     tipc_bcast_lock(net);
> > -   tipc_link_set_queue_limits(l, BCLINK_WIN_MIN, max_win);
> > +   tipc_link_set_queue_limits(l, max_win, max_win);
> I think this is dangerous. The broadcast link puts a much higher stress
> on the switch, and the risk of massive packet loss with correspondingly
> massive retransmissions is very high.
> A safer bet is to stay with a fixed window of 50 for now, to solve the
> problem you have at sites, and then you can possibly experiment with a
> variable window later.
So, should we return 'not supported' when the user changes the window for the
broadcast link, or just silently ignore the setting?

> If it gives significantly higher throughput it might be worthwhile trying,
> but I am pretty sure that e.g. the base for calculating ssthresh (300)
> is too big.
> 
> ///jon
> > tipc_bcast_unlock(net);
> > return 0;
> >   }
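For readers following the ssthresh remark above, here is a simplified,
illustrative sketch of the Reno-style window adaptation introduced by
16ad3f4022bb. The names and details are illustrative only, not the actual
net/tipc/link.c code:

/* Illustrative sketch of Reno-style variable window adaptation;
 * simplified from memory, NOT the actual kernel implementation. */
struct win_state {
	unsigned int window;    /* current send window */
	unsigned int min_win;   /* configured minimum */
	unsigned int max_win;   /* configured maximum */
	unsigned int ssthresh;  /* slow-start threshold */
};

static void on_advance(struct win_state *w, unsigned int acked)
{
	if (w->window < w->ssthresh)
		w->window += acked;   /* slow start: exponential growth */
	else
		w->window += 1;       /* congestion avoidance: linear growth */
	if (w->window > w->max_win)
		w->window = w->max_win;
}

static void on_retransmit(struct win_state *w)
{
	w->ssthresh = w->window / 2;  /* halve the threshold on loss */
	if (w->ssthresh < w->min_win)
		w->ssthresh = w->min_win;
	w->window = w->min_win;       /* restart from the minimum window */
}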




Re: [tipc-discussion] [net-next] tipc: introduce smooth change to replicast

2020-10-12 Thread Jon Maloy



On 10/8/20 11:36 PM, Hoang Huu Le wrote:

Hi Jon,

See my inline comment.

Thanks,
Hoang

-----Original Message-----
From: Jon Maloy 
Sent: Friday, October 9, 2020 1:27 AM
To: Hoang Huu Le ; ma...@donjonn.com; ying@windriver.com;
tipc-discuss...@lists.sourceforge.net
Subject: Re: [net-next] tipc: introduce smooth change to replicast



On 10/7/20 10:22 PM, Hoang Huu Le wrote:

In commit cad2929dc432 ("tipc: update a binding service via broadcast"),
we use broadcast to update a binding service for a large cluster.
However, if we try to publish thousands of services at the same time,
we may see "link overflow" happen because the queue limit has been
reached.

We now introduce a smooth change to replicast if the broadcast link has
reached its queue limit.

To me this beats the whole purpose of using broadcast distribution in
the first place.
We wanted to save CPU and network resources by using broadcast, and
then, when things get tough, we fall back to the supposedly less
efficient replicast method. Not good.

I wonder what is really happening when this overflow situation occurs.
First, the reset limit is dimensioned so that it should be possible to
publish MAX_PUBLICATIONS (65k) publications in one shot.
With full bundling, which is what I expect here, there are 1460/20 = 73
publication items in each buffer, so the reset limit (==max_bulk) should
be 65k/73 = 897 buffers.

[Hoang] No, that's not right!
As I stated in another commit, the reset limit is 350 buffers (65k/(3744/20))
=> #1.
where:
FB_MTU = 3744
and if a user sets the window size to 50, we fall back to a window size of
32 => #2.
So, if the broadcast link is under high traffic load, we will reach the
limit easily (#1 + #2).
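For reference, a quick standalone sanity check of both estimates (assuming
20 bytes per bundled publication item, as used in the figures above; plain
userspace C, not kernel code):

#include <stdio.h>

int main(void)
{
	const int max_publ = 65536;         /* MAX_PUBLICATIONS (65k) */
	const int item_size = 20;           /* bytes per publication item */
	const int mtus[] = { 1460, 3744 };  /* Jon's figure vs. FB_MTU */
	int i;

	for (i = 0; i < 2; i++) {
		int items = mtus[i] / item_size;
		printf("mtu %4d: %3d items/buffer -> reset limit %d buffers\n",
		       mtus[i], items, max_publ / items);
	}
	return 0;
}

This prints 73 items/buffer and a limit of 897 buffers for the 1460-byte
assumption, versus 187 items/buffer and 350 buffers for FB_MTU.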

Yes. I didn't review your previous patches before making this comment.



My figures are just from the top of my head, so you should double check
them, but I find it unlikely that we hit this limit unless there is a
lot of other broadcast going on at the same time, and even then I find
it unlikely.

[Hoang]
I just implemented a simple application:
[...]
num_service = 10k
for (i = 0; i < num_service; i++)
[...]
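A minimal, self-contained reproducer along these lines might look as follows
(a hypothetical sketch, not the actual app; the SOCK_RDM socket type and the
service type values are assumptions):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/tipc.h>

int main(void)
{
	struct sockaddr_tipc addr;
	int i, num_service = 10000;
	int sd = socket(AF_TIPC, SOCK_RDM, 0);

	if (sd < 0) {
		perror("socket");
		return 1;
	}
	/* Each bind() publishes one more service from the same socket */
	for (i = 0; i < num_service; i++) {
		memset(&addr, 0, sizeof(addr));
		addr.family = AF_TIPC;
		addr.addrtype = TIPC_ADDR_NAMESEQ;
		addr.scope = TIPC_CLUSTER_SCOPE;
		addr.addr.nameseq.type = 1000 + i;  /* arbitrary user types */
		addr.addr.nameseq.lower = 0;
		addr.addr.nameseq.upper = 0;
		if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
			perror("bind");
	}
	pause();  /* SIGINT kills the app and withdraws all publications */
	return 0;
}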

Then run this app; sleep 2; kill the app with SIGINT. The problem is then
reproducible.

I suggest you try to find out what is really going on when we reach this
situation.
-What exactly is in the backlog queue?
-Only publications?
-How many?
-A mixture of publications and other traffic?

Only publication/withdrawal buffers, several thousand of them.


-Has bundling really worked as intended?
-Do we still have some issue with the broadcast link that stops buffers
being acked and released in a timely manner?

I suspect you mean NO in this case ;-)

- Have you been able to dump out such info when this problem occurs?
- Are you able to reproduce it in your own system?

The answer to both is YES


In the end it might be as simple as increasing the reset limit, but we
really should try to understand what is happening first.

Yes, I think so. In a previous patch I changed the code to raise the queue
backlog limit to 873 buffers.
But if I increase the number of services in my app to 20k (not realistic?),
the issue is still reproducible.
Why does it still happen? The new limit is calculated to be safe for 65k
publications, so why does it fail already at 20k?
According to the loop you show above you only do publications, no
withdrawals, so you should not even theoretically be able to reach the
reset limit.

What is happening?
Let's discuss this tomorrow.

///jon


That is the reason why I propose this new solution, combined with the two
previous patches, to solve the problem completely.

Regards
///jon



Signed-off-by: Hoang Huu Le 
---
   net/tipc/link.c |  5 ++++-
   net/tipc/node.c | 12 ++++++++++--
   2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 06b880da2a8e..ca908ead753a 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1022,7 +1022,10 @@ int tipc_link_xmit(struct tipc_link *l, struct sk_buff_head *list,
 	/* Allow oversubscription of one data msg per source at congestion */
 	if (unlikely(l->backlog[imp].len >= l->backlog[imp].limit)) {
 		if (imp == TIPC_SYSTEM_IMPORTANCE) {
-			pr_warn("%s<%s>, link overflow", link_rst_msg, l->name);
+			pr_warn_ratelimited("%s<%s>, link overflow",
+					    link_rst_msg, l->name);
+			if (link_is_bc_sndlink(l))
+				return -EOVERFLOW;
 			return -ENOBUFS;
 		}
 		rc = link_schedule_user(l, hdr);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index d269ebe382e1..a37976610367 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1750,15 +1750,23 @@ void tipc_node_broadcast(struct net *net, struct sk_buff *skb, int rc_dests)
 	struct tipc_node *n;
 	u16 dummy;
 	u32 dst;
+	int rc = 0;
 
 	/* Use broadcast if all nodes 

Re: [tipc-discussion] [net] tipc: fix incorrect setting window for bcast link

2020-10-12 Thread Jon Maloy

Hi Hoang,
I apologize for not paying enough attention to this problem until now.
See below.


On 9/30/20 9:43 PM, Hoang Huu Le wrote:

In commit 16ad3f4022bb
("tipc: introduce variable window congestion control"), we applied
the Reno algorithm to select a window size between the minimum window
and the configured maximum window for unicast links.

However, when setting the window variable we did not restrict the change
to unicast links. As a result, the broadcast link window was set to an
unexpected value (i.e. the window size was constantly 32).

We fix this by updating the broadcast link window according to its
configuration.

Signed-off-by: Hoang Huu Le 
---
  net/tipc/bcast.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 940d176e0e87..abac9443b4d9 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -585,7 +585,7 @@ static int tipc_bc_link_set_queue_limits(struct net *net, u32 max_win)
 	if (max_win > TIPC_MAX_LINK_WIN)
 		return -EINVAL;
 	tipc_bcast_lock(net);
-	tipc_link_set_queue_limits(l, BCLINK_WIN_MIN, max_win);
+	tipc_link_set_queue_limits(l, max_win, max_win);
I think this is dangerous. The broadcast link puts a much higher stress
on the switch, and the risk of massive packet loss with correspondingly
massive retransmissions is very high.
A safer bet is to stay with a fixed window of 50 for now, to solve the
problem you have at sites, and then you can possibly experiment with a
variable window later.
If it gives significantly higher throughput it might be worthwhile trying,
but I am pretty sure that e.g. the base for calculating ssthresh (300)
is too big.


///jon

tipc_bcast_unlock(net);
return 0;
  }






Re: [tipc-discussion] [net-next] tipc: re-configure queue limit for broadcast link

2020-10-12 Thread Jon Maloy




On 10/1/20 2:51 AM, Hoang Huu Le wrote:

The queue limit of the broadcast link is calculated based on the initial
MTU. However, when the MTU value changes (e.g. manually changing the MTU
on a NIC device, MTU negotiation, etc.), we do not re-calculate the queue
limit. As a result, throughput does not reflect the change.

So fix it by calling the function to re-calculate the queue limit of the
broadcast link.

Signed-off-by: Hoang Huu Le 
---
  net/tipc/bcast.c | 6 +++++-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index abac9443b4d9..bc566b304571 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -108,6 +108,7 @@ static void tipc_bcbase_select_primary(struct net *net)
 {
 	struct tipc_bc_base *bb = tipc_bc_base(net);
 	int all_dests =  tipc_link_bc_peers(bb->link);
+	int max_win = tipc_link_max_win(bb->link);
 	int i, mtu, prim;
 
 	bb->primary_bearer = INVALID_BEARER_ID;
 
@@ -121,8 +122,11 @@ static void tipc_bcbase_select_primary(struct net *net)
 			continue;
 
 		mtu = tipc_bearer_mtu(net, i);
-		if (mtu < tipc_link_mtu(bb->link))
+		if (mtu < tipc_link_mtu(bb->link)) {
 			tipc_link_set_mtu(bb->link, mtu);
+			tipc_link_set_queue_limits(bb->link, max_win,
+						   max_win);
+		}
 		bb->bcast_support &= tipc_bearer_bcast_support(net, i);
 		if (bb->dests[i] < all_dests)
 			continue;

An obvious bug that explains a lot, and needed to be fixed. Thanks.
Acked-by: Jon Maloy 
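For context, the limits being re-calculated here depend on the link MTU
through max_bulk, which is why an MTU change must trigger a re-calculation.
tipc_link_set_queue_limits() as of 16ad3f4022bb looks roughly like this
(reproduced from memory as a reference sketch, so details may differ):

void tipc_link_set_queue_limits(struct tipc_link *l, u32 min_win, u32 max_win)
{
	/* max_bulk is derived from the current link MTU */
	int max_bulk = TIPC_MAX_PUBL / (l->mtu / ITEM_SIZE);

	l->min_win = min_win;
	l->ssthresh = max_win;
	l->max_win = max_win;
	l->window = min_win;
	l->backlog[TIPC_LOW_IMPORTANCE].limit      = min_win * 2;
	l->backlog[TIPC_MEDIUM_IMPORTANCE].limit   = min_win * 4;
	l->backlog[TIPC_HIGH_IMPORTANCE].limit     = min_win * 6;
	l->backlog[TIPC_CRITICAL_IMPORTANCE].limit = min_win * 8;
	l->backlog[TIPC_SYSTEM_IMPORTANCE].limit   = max_bulk;
}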





Re: [tipc-discussion] [iproute2-next 0/2] tipc: add new options for TIPC encryption

2020-10-12 Thread Jon Maloy




On 10/11/20 5:13 AM, Tuong Lien wrote:

This series adds two new options to the 'iproute2/tipc' command, enabling users
to use the new TIPC encryption features, i.e. the master key and rekeying,
which have recently been merged into the kernel.

The help menu of the "tipc node set key" command is also updated accordingly:

# tipc node set key --help
Usage: tipc node set key KEY [algname ALGNAME] [PROPERTIES]
       tipc node set key rekeying REKEYING

KEY
   Symmetric KEY & SALT as a composite ASCII or hex string (0x...) in form:
   [KEY: 16, 24 or 32 octets][SALT: 4 octets]

ALGNAME
   Cipher algorithm [default: "gcm(aes)"]

PROPERTIES
   master        - Set KEY as a cluster master key
                 - Set KEY as a cluster key
   nodeid NODEID - Set KEY as a per-node key for own or peer

REKEYING
   INTERVAL  - Set rekeying interval (in minutes) [0: disable]
   now   - Trigger one (first) rekeying immediately

EXAMPLES
   tipc node set key this_is_a_master_key master
   tipc node set key 0x746869735F69735F615F6B657931365F73616C74
   tipc node set key this_is_a_key16_salt algname "gcm(aes)" nodeid 1001002
   tipc node set key rekeying 600

Tuong Lien (2):
   tipc: add option to set master key for encryption
   tipc: add option to set rekeying for encryption

  tipc/cmdl.c |  2 +-
  tipc/cmdl.h |  1 +
  tipc/node.c | 81 +++--
  3 files changed, 62 insertions(+), 22 deletions(-)


Looks good to me.
Series
Acked-by: Jon Maloy 





Re: [tipc-discussion] [net ] tipc: add stricter control of reserved service types

2020-10-12 Thread Rune Torgersen
I think I see now.
This would block anybody from calling bind() on the topology server (except
the first user, in the kernel).
Using connect() should be fine.

Sorry, did not read carefully enough.



Re: [tipc-discussion] [net ] tipc: add stricter control of reserved service types

2020-10-12 Thread Rune Torgersen
Hi.

We use the TIPC topology server extensively in our system to keep track of
other instances of an app.
Does this change mean we cannot use the topology server anymore?

This is the userland code we use
(this one would check for open sockets on 200:0 to 200:255):

void * TipcTopologyServerThread(void * context)
{
    int tipcStatusSocket;
    struct tipc_event event;
    struct sockaddr_tipc topsrv;

    struct tipc_subscr subscr = {{200, 0, 255},
                                 TIPC_WAIT_FOREVER,
                                 TIPC_SUB_SERVICE,
                                 {5,5,5,5,5,5,5,5}};


    memset(&topsrv, 0, sizeof(topsrv));
    topsrv.family = AF_TIPC;
    topsrv.addrtype = TIPC_ADDR_NAME;
    topsrv.addr.name.name.type = TIPC_TOP_SRV;
    topsrv.addr.name.name.instance = TIPC_TOP_SRV;

    tipcStatusSocket = socket(AF_TIPC, SOCK_SEQPACKET, 0);
    if (tipcStatusSocket < 0)
    {
        perror("TipcTopologyThread: Could not make TIPC socket");
    }

    // Connect to topology server:
    if (0 > connect(tipcStatusSocket, (struct sockaddr *)&topsrv, sizeof(topsrv)))
    {
        perror("TipcTopologyThread: Could not connect to TIPC topology server");
        return NULL;
    }

    if (send(tipcStatusSocket, &subscr, sizeof(subscr), 0) != sizeof(subscr))
    {
        printf("TipcTopologyThread: Failed to send topology server subscription");
        return NULL;
    }

    // Now wait for the subscriptions to fire:
    while (true)
    {
        int ret = recv(tipcStatusSocket, &event, sizeof(event), 0);
        if (ret == sizeof(event))
        {
            printf("Received an event\n");
            if (event.event == TIPC_PUBLISHED)
            {
                printf("received a TIPC_PUBLISH msg, type: %u, found_lower: %u, found_upper: %u\n",
                       event.s.seq.type, event.found_lower, event.found_upper);
                // do something
            }
            else if (event.event == TIPC_WITHDRAWN)
            {
                printf("received a TIPC_WITHDRAWN msg, type: %u, found_lower: %u, found_upper: %u\n",
                       event.s.seq.type, event.found_lower, event.found_upper);
                // do something
            }
            else if (event.event == TIPC_SUBSCR_TIMEOUT)
            {
                printf("received a TIPC_SUBSCR_TIMEOUT msg, type: %u, found_lower: %u, found_upper: %u\n",
                       event.s.seq.type, event.found_lower, event.found_upper);
            }
        }
    }

    return NULL;
}





From: jma...@redhat.com 
Sent: Saturday, October 10, 2020 9:56 AM
To: tipc-discussion@lists.sourceforge.net 

Cc: x...@redhat.com 
Subject: [tipc-discussion] [net ] tipc: add stricter control of reserved 
service types



From: Jon Maloy 

TIPC reserves 64 service types for current and future internal use.
Therefore, the bind() function is meant to block regular user sockets
from being bound to these values, while it should let through such
bindings from internal users.

However, since at the design moment we saw no way to distinguish
between regular and internal users, the filter function ended up
allowing all bindings of the types which were really in use
([0,1]) and blocking all the rest ([2,63]).

This is dangerous, since a regular user may bind to the service type
representing the topology server (TIPC_TOP_SRV == 1) or the one used
for indicating neighboring node status (TIPC_CFG_SRV == 0), and wreak
havoc for users of those services, i.e., practically all users.

The reality, however, is that TIPC_CFG_SRV is never bound through the
bind() function, since it doesn't represent a regular socket, and
TIPC_TOP_SRV can easily be filtered out, since it is the very first
binding performed when the system is starting.

We can hence block TIPC_CFG_SRV completely, and only allow TIPC_TOP_SRV
to be bound once, and the correct behavior is achieved. This is what we
do in this commit.

It should be noted that, although this is a change of the API semantics,
there is no risk we will break any currently working applications by
doing this. Any application trying to bind to the values in question
would be badly broken from the outset, so there is no chance we would
find any such applications in real-world production systems.

Signed-off-by: Jon Maloy 
---
 net/tipc/socket.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index e795a8a2955b..67875a5761d0 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -665,6 +665,7 @@ static int tipc_bind(struct socket *sock, struct sockaddr *uaddr,
 	struct sockaddr_tipc *addr = (struct sockaddr_tipc *)uaddr;
 	struct tipc_sock *tsk = tipc_sk(sk);
 	int res = -EINVAL;
+	u32 stype, dnode;
 
 	lock_sock(sk);
 	if (unlikely(!uaddr_len)) {
@@ -691,9 +692,10 @@ static int tipc_bind(struct socket *sock, struct sockaddr *uaddr,
 		goto exit;
 	}
 
-	if
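To make the intended semantics concrete, here is a hedged, self-contained
illustration. It is NOT the actual patch (which is truncated above in this
archive); 'topsrv_bound' is a hypothetical flag standing in for however the
kernel remembers that the topology server has already bound:

#include <stdbool.h>

#define TIPC_CFG_SRV        0   /* neighbor node status service */
#define TIPC_TOP_SRV        1   /* topology server */
#define TIPC_RESERVED_TYPES 64  /* types [0,63] are reserved */

static bool topsrv_bound;

static bool bind_type_allowed(unsigned int stype)
{
	if (stype == TIPC_CFG_SRV)
		return false;                /* never bound via bind() */
	if (stype == TIPC_TOP_SRV) {
		if (topsrv_bound)
			return false;        /* only the very first binding */
		topsrv_bound = true;
		return true;
	}
	return stype >= TIPC_RESERVED_TYPES; /* [2,63] remain blocked */
}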