Re: [tipc-discussion] TIPC Communication

2022-06-15 Thread Hoang Huu Le
Hi,

Please take a look at the example at:
https://sourceforge.net/p/tipc/tipcutils/ci/master/tree/demos/connection_demo/
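
For a quick feel of the API before digging into the demo, a minimal client could
look roughly like the sketch below. This is an illustration only, not code taken
from the demo; the service type/instance values (18888/17) are made up and must
match whatever the server binds to. Note that, unlike TCP, TIPC sets up the
connection implicitly as part of the first message exchange, so the application
never sees a separate three-way handshake:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/tipc.h>

int main(void)
{
	struct sockaddr_tipc server;
	char buf[64];
	int sd;

	sd = socket(AF_TIPC, SOCK_STREAM, 0);
	if (sd < 0)
		return 1;

	memset(&server, 0, sizeof(server));
	server.family = AF_TIPC;
	server.addrtype = TIPC_SERVICE_ADDR;   /* TIPC_ADDR_NAME on older headers */
	server.addr.name.name.type = 18888;    /* made-up service type */
	server.addr.name.name.instance = 17;   /* made-up service instance */
	server.addr.name.domain = 0;           /* look up in the whole cluster */

	/* Connection setup happens inside this call; there is no
	 * application-visible TCP-style three-way handshake. */
	if (connect(sd, (struct sockaddr *)&server, sizeof(server)) < 0) {
		perror("connect");
		return 1;
	}
	write(sd, "hello", 5);
	if (read(sd, buf, sizeof(buf)) > 0)
		printf("got a reply\n");
	close(sd);
	return 0;
}

The connection_demo above covers both the client and the server side of this
flow, so it is a good reference for drawing the sequence diagram.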

Regards,
Hoang
> -Original Message-
> From: Røysland, Jonas Gjendem via tipc-discussion 
> 
> Sent: Tuesday, June 14, 2022 8:22 PM
> To: tipc-discussion@lists.sourceforge.net
> Subject: [tipc-discussion] TIPC Communication
> 
> Hey,
> 
> We are some summer students who are working with the TIPC protocol for a
> project. We would like to make a sequence diagram of TIPC to
> better understand how the protocol communicates from the client to the server.
> For example, TCP uses a 3-way handshake before the two peers
> start sending data to one another. We would really appreciate any
> help we can get to better understand the protocol.
> 
> Sincerely,
> 
> Jonas Gjendem Røysland
> 


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [PATCH] tipc: fix use-after-free Read in tipc_named_reinit

2022-06-12 Thread Hoang Huu Le
Hi Jon, Ying,

Just a reminder in case you guys missed this email thread.

Thanks,
Hoang
> -Original Message-
> From: Hoang Le 
> Sent: Tuesday, June 7, 2022 2:35 PM
> To: jma...@redhat.com; ma...@donjonn.com; ying@windriver.com; Tung Quang 
> Nguyen ;
> tipc-discussion@lists.sourceforge.net
> Cc: syzbot+47af19f3307fc9c5c...@syzkaller.appspotmail.com
> Subject: [tipc-discussion] [PATCH] tipc: fix use-after-free Read in 
> tipc_named_reinit
> 
> syzbot found the following issue on:
> ==
> BUG: KASAN: use-after-free in tipc_named_reinit+0x94f/0x9b0
> net/tipc/name_distr.c:413
> Read of size 8 at addr 88805299a000 by task kworker/1:9/23764
> 
> CPU: 1 PID: 23764 Comm: kworker/1:9 Not tainted
> 5.18.0-rc4-syzkaller-00878-g17d49e6e8012 #0
> Hardware name: Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Workqueue: events tipc_net_finalize_work
> Call Trace:
>  
>  __dump_stack lib/dump_stack.c:88 [inline]
>  dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
>  print_address_description.constprop.0.cold+0xeb/0x495
> mm/kasan/report.c:313
>  print_report mm/kasan/report.c:429 [inline]
>  kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491
>  tipc_named_reinit+0x94f/0x9b0 net/tipc/name_distr.c:413
>  tipc_net_finalize+0x234/0x3d0 net/tipc/net.c:138
>  process_one_work+0x996/0x1610 kernel/workqueue.c:2289
>  worker_thread+0x665/0x1080 kernel/workqueue.c:2436
>  kthread+0x2e9/0x3a0 kernel/kthread.c:376
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
>  
> [...]
> ==
> 
> In commit
> d966ddcc3821 ("tipc: fix a deadlock when flushing scheduled work"),
> cancel_work_sync() was added only to make sure that any
> tipc_net_finalize_work() work executing or pending on any CPU had completed
> before the tipc namespace is destroyed through tipc_exit_net(). But this call
> does not guarantee that the work is the last one queued. So, the destroyed
> instance may still be accessed by a work item that is enqueued later.
> 
> In order to fix this completely, we re-order the call to cancel_work_sync()
> to make sure tipc_net_finalize_work() is the last work queued and that it
> has completed when cancel_work_sync() returns.
> 
> Reported-by: syzbot+47af19f3307fc9c5c...@syzkaller.appspotmail.com
> Fixes: d966ddcc3821 ("tipc: fix a deadlock when flushing scheduled work")
> Signed-off-by: Ying Xue 
> Signed-off-by: Hoang Le 
> ---
>  net/tipc/core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/net/tipc/core.c b/net/tipc/core.c
> index 3f4542e0f065..434e70eabe08 100644
> --- a/net/tipc/core.c
> +++ b/net/tipc/core.c
> @@ -109,10 +109,9 @@ static void __net_exit tipc_exit_net(struct net *net)
>   struct tipc_net *tn = tipc_net(net);
> 
>   tipc_detach_loopback(net);
> + tipc_net_stop(net);
>   /* Make sure the tipc_net_finalize_work() finished */
>   cancel_work_sync(&tn->work);
> - tipc_net_stop(net);
> -
>   tipc_bcast_stop(net);
>   tipc_nametbl_stop(net);
>   tipc_sk_rht_destroy(net);
> --
> 2.30.2
> 
> 
> 


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [PATCH net-next] tipc: remove inputq from tipc_bc_base

2022-06-06 Thread Hoang Huu Le
> -Original Message-
> From: Xin Long 
> Sent: Tuesday, June 7, 2022 12:57 AM
> To: tipc-discussion@lists.sourceforge.net
> Subject: Re: [tipc-discussion] [PATCH net-next] tipc: remove inputq from 
> tipc_bc_base
> 
> fix Jon's email address.
> 
> On Mon, Jun 6, 2022 at 1:52 PM Xin Long  wrote:
> >
> > After Commit 2af5ae372a4b ("tipc: clean up unused code and structures"),
> > there is no place really using tn->bcbase->inputq. This patch is to
> > delete this member from struct tipc_bc_base.
> >
> > Signed-off-by: Xin Long 
> > ---
> >  net/tipc/bcast.c | 22 +-
> >  1 file changed, 1 insertion(+), 21 deletions(-)
> >
> > diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> > index 593846d25214..2293f6caa682 100644
> > --- a/net/tipc/bcast.c
> > +++ b/net/tipc/bcast.c
> > @@ -63,7 +63,6 @@ unsigned long sysctl_tipc_bc_retruni __read_mostly;
> >   */
> >  struct tipc_bc_base {
> > struct tipc_link *link;
> > -   struct sk_buff_head inputq;
> > int dests[MAX_BEARERS];
> > int primary_bearer;
> > bool bcast_support;
> > @@ -436,7 +435,6 @@ int tipc_mcast_xmit(struct net *net, struct 
> > sk_buff_head *pkts,
> >  int tipc_bcast_rcv(struct net *net, struct tipc_link *l, struct sk_buff 
> > *skb)
> >  {
> > struct tipc_msg *hdr = buf_msg(skb);
> > -   struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq;
> > struct sk_buff_head xmitq;
> > int rc;
> >
> > @@ -456,10 +454,6 @@ int tipc_bcast_rcv(struct net *net, struct tipc_link 
> > *l, struct sk_buff *skb)
> >
> > tipc_bcbase_xmit(net, &xmitq);
> >
> > -   /* Any socket wakeup messages ? */
> > -   if (!skb_queue_empty(inputq))
> > -   tipc_sk_rcv(net, inputq);
> > -
> > return rc;
> >  }
> >
> > @@ -470,7 +464,6 @@ int tipc_bcast_rcv(struct net *net, struct tipc_link 
> > *l, struct sk_buff *skb)
> >  void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
> > struct tipc_msg *hdr)
> >  {
> > -   struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq;
> > u16 acked = msg_bcast_ack(hdr);
> > struct sk_buff_head xmitq;
> >
> > @@ -485,10 +478,6 @@ void tipc_bcast_ack_rcv(struct net *net, struct 
> > tipc_link *l,
> > tipc_bcast_unlock(net);
> >
> > tipc_bcbase_xmit(net, &xmitq);
> > -
> > -   /* Any socket wakeup messages ? */
> > -   if (!skb_queue_empty(inputq))
> > -   tipc_sk_rcv(net, inputq);
> >  }
> >
> >  /* tipc_bcast_synch_rcv -  check and update rcv link with peer's send state
> > @@ -499,7 +488,6 @@ int tipc_bcast_sync_rcv(struct net *net, struct 
> > tipc_link *l,
> > struct tipc_msg *hdr,
> > struct sk_buff_head *retrq)
> >  {
> > -   struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq;
> > struct tipc_gap_ack_blks *ga;
> > struct sk_buff_head xmitq;
> > int rc = 0;
> > @@ -522,9 +510,6 @@ int tipc_bcast_sync_rcv(struct net *net, struct 
> > tipc_link *l,
> >
> > tipc_bcbase_xmit(net, &xmitq);
> >
> > -   /* Any socket wakeup messages ? */
> > -   if (!skb_queue_empty(inputq))
> > -   tipc_sk_rcv(net, inputq);
> > return rc;
> >  }
> >
> > @@ -551,7 +536,6 @@ void tipc_bcast_add_peer(struct net *net, struct 
> > tipc_link *uc_l,
> >  void tipc_bcast_remove_peer(struct net *net, struct tipc_link *rcv_l)
> >  {
> > struct tipc_link *snd_l = tipc_bc_sndlink(net);
> > -   struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq;
> > struct sk_buff_head xmitq;
> >
> > __skb_queue_head_init(&xmitq);
> > @@ -563,10 +547,6 @@ void tipc_bcast_remove_peer(struct net *net, struct 
> > tipc_link *rcv_l)
> > tipc_bcast_unlock(net);
> >
> > tipc_bcbase_xmit(net, &xmitq);
> > -
> > -   /* Any socket wakeup messages ? */
> > -   if (!skb_queue_empty(inputq))
> > -   tipc_sk_rcv(net, inputq);
> >  }
> >
> >  int tipc_bclink_reset_stats(struct net *net, struct tipc_link *l)
> > @@ -703,7 +683,7 @@ int tipc_bcast_init(struct net *net)
> >  BCLINK_WIN_DEFAULT,
> >  BCLINK_WIN_DEFAULT,
> >  0,
> > -&bb->inputq,
> > +NULL,
> >  NULL,
> >  NULL,
> >  &l))
> > --
> > 2.31.1
> >
> 
> 

Please also remove the kernel-doc comment for this member.
Regards,
Hoang


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [PATCH net] tipc: use a write lock for keepalive_intv instead of a read lock

2022-03-29 Thread Hoang Huu Le
Hi Niels,

I did consider this function, however I guess it is safe to use
tipc_node_read_lock()/unlock() since this value is being applied in this
callback function.

BTW, you should use tipc_node_write_unlock_fast() instead of
tipc_node_write_unlock().
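
For clarity, the variant suggested above would look roughly like this in
tipc_node_timeout() (just a sketch of the suggestion, not an applied patch):

	tipc_node_write_lock(n);
	n->keepalive_intv = 1;
	/* No link state changed here, so the _fast variant, which skips the
	 * event/flag processing done by tipc_node_write_unlock(), is enough. */
	tipc_node_write_unlock_fast(n);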
Regards,
Hoang
> -Original Message-
> From: Niels Dossche 
> Sent: Tuesday, March 29, 2022 11:12 PM
> To: tipc-discussion@lists.sourceforge.net
> Cc: net...@vger.kernel.org; Jon Maloy ; Ying Xue 
> ; David S. Miller
> ; Jakub Kicinski ; Paolo Abeni 
> ; Hoang Huu Le
> ; Niels Dossche 
> Subject: [PATCH net] tipc: use a write lock for keepalive_intv instead of a 
> read lock
> 
> Currently, n->keepalive_intv is written to while n is locked by a read
> lock instead of a write lock. This seems to me to break the atomicity
> against other readers.
> Change this to a write lock instead to solve the issue.
> 
> Note:
> I am currently working on a static analyser to detect missing locks
> using type-based static analysis as my master's thesis
> in order to obtain my master's degree.
> If you would like to have more details, please let me know.
> This was a reported case. I manually verified the report by looking
> at the code, so that I do not send wrong information or patches.
> After concluding that this seems to be a true positive, I created
> this patch. I have both compile-tested this patch and runtime-tested
> this patch on x86_64. The effect on a running system could be a
> potential race condition in exceptional cases.
> This issue was found on Linux v5.17.
> 
> Fixes: f5d6c3e5a359 ("tipc: fix node keep alive interval calculation")
> Signed-off-by: Niels Dossche 
> ---
>  net/tipc/node.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/tipc/node.c b/net/tipc/node.c
> index 6ef95ce565bd..da867ddb93f5 100644
> --- a/net/tipc/node.c
> +++ b/net/tipc/node.c
> @@ -806,9 +806,9 @@ static void tipc_node_timeout(struct timer_list *t)
>   /* Initial node interval to value larger (10 seconds), then it will be
>* recalculated with link lowest tolerance
>*/
> - tipc_node_read_lock(n);
> + tipc_node_write_lock(n);
>   n->keepalive_intv = 1;
> - tipc_node_read_unlock(n);
> + tipc_node_write_unlock(n);
>   for (bearer_id = 0; remains && (bearer_id < MAX_BEARERS); bearer_id++) {
>   tipc_node_read_lock(n);
>   le = &n->links[bearer_id];
> --
> 2.35.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] Setting Node Address

2021-09-22 Thread Hoang Huu Le


From: Andy Stec 
Sent: Thursday, September 23, 2021 10:14 AM
To: Hoang Huu Le ; 
tipc-discussion@lists.sourceforge.net
Subject: Re: Setting Node Address

Which command is still supported, is it 'tipc node set address'?  I'm getting an
'operation not permitted' error when I use 'tipc node set address'.
[Hoang] I don't know exactly for redhat 8, but in the upstream kernel (i.e.
v5.14.x) I'm able to use that one to set a node address.

Does 'tipc node set identity' set the node address?
[Hoang] Yuh, I think so. Please give it a try on your node.

Sorry about all these questions.  Needless to say I'm not a tipc expert.

From: Hoang Huu Le <hoang.h...@dektech.com.au>
Sent: Wednesday, September 22, 2021 9:05 PM
To: tipc-discussion@lists.sourceforge.net
Subject: Re: [tipc-discussion] Setting Node Address

Let's use 'tipc node set identity' instead.
However, that command is still supported as legacy, which means you can continue
using it in your application.

> -Original Message-
> From: Andy Stec via tipc-discussion <tipc-discussion@lists.sourceforge.net>
> Sent: Thursday, September 23, 2021 8:16 AM
> To: tipc-discussion@lists.sourceforge.net
> Subject: [tipc-discussion] Setting Node Address
>
> We are porting an application from redhat 7 to redhat 8, that is, from kernel
> 3.10 to kernel 4.18.  It appears that the "tipc node set address"
> command has been removed in kernel 4.17.  Is there another way of setting 
> node address?  We are trying to limit the changes in the
> application.
>

___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] Setting Node Address

2021-09-22 Thread Hoang Huu Le
Let's use 'tipc node set identity' instead.
However, that command is still supported as legacy, which means you can continue
using it in your application.

> -Original Message-
> From: Andy Stec via tipc-discussion 
> Sent: Thursday, September 23, 2021 8:16 AM
> To: tipc-discussion@lists.sourceforge.net
> Subject: [tipc-discussion] Setting Node Address
> 
> We are porting an application from redhat 7 to redhat 8, that is, from kernel
> 3.10 to kernel 4.18.  It appears that the "tipc node set address"
> command has been removed in kernel 4.17.  Is there another way of setting 
> node address?  We are trying to limit the changes in the
> application.
> 


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] Strange behavior in socket.c::tipc_sk_enqueue()

2021-09-08 Thread Hoang Huu Le



> -Original Message-
> From: Jon Maloy 
> Sent: Thursday, September 9, 2021 5:42 AM
> To: Hoang Huu Le ; 
> tipc-discussion@lists.sourceforge.net; Tung Quang Nguyen
> ; Xin Long ; Ying Xue 
> 
> Cc: Huy Xuan Nhat Hoang 
> Subject: Re: Strange behavior in socket.c::tipc_sk_enqueue()
> 
> 
> 
> On 06/09/2021 05:02, Hoang Huu Le wrote:
> > Hi Jon,  all,
> >
> > I did a test varying two conditions over a range:
> > - time limit: 2 msecs ... unlimited
> > - search depth limit (socket skbs): 2 skbs ... unlimited
> >
> > With the above range of settings, a maximum of around 12 skbs can be enqueued
> > per socket, regardless of the time and search depth limits.
> > I also combined the test with iperf-generated TCP traffic and the result
> > looks the same.
> >
> > So, I don't think we need to apply the search depth limit condition and/or a
> > longer timer in this function; just 2 msecs is enough.
> > I guess this result depends on kernel scheduling. What are your views?
> 
> I assume your test was done with many, e.g. 100 connections?

Yes, I did the test with 1 to 150 connections, combined with and without other
generated traffic (i.e. TCP).

> 
> ///jon
> 
> >
> > Regards,
> > Hoang
> >> -Original Message-
> >> From: Jon Maloy 
> >> Sent: Wednesday, September 1, 2021 7:39 AM
> >> To: Hoang Huu Le ; 
> >> tipc-discussion@lists.sourceforge.net; Tung Quang Nguyen
> >> ; Xin Long ; Ying Xue 
> >> 
> >> Cc: Huy Xuan Nhat Hoang 
> >> Subject: Re: Strange behavior in socket.c::tipc_sk_enqueue()
> >>
> >> Guys,
> >> After our discussion this morning regarding this problem I gave it some
> >> more thought.
> >>
> >> What if we simply limit the search depth in the receive queue to some
> >> fixed number, 10, 20, 50 or something, and return NULL if nothing is found
> >> within this range? This would be a simple stack counter inside
> >> tipc_skb_dequeue(), and would cost almost nothing.
> >>
> >> If you experiment with this, of course in combination with a max limit
> >> of some milliseconds as we also discussed, we might obtain acceptable
> >> results.
> >>
> >> What do you think?
> >>
> >> ///jon
> >>
> >>
> >> On 28/07/2021 04:04, Hoang Huu Le wrote:
> >>> Hi Jon,
> >>>
> >>> Let's enjoy your vacation.
> >>> Our new team member (CCed) will take a look at it.
> >>>
> >>> Regards,
> >>> Hoang
> >>>> -Original Message-
> >>>> From: Jon Maloy 
> >>>> Sent: Wednesday, July 28, 2021 6:20 AM
> >>>> To: tipc-discussion@lists.sourceforge.net; Tung Quang Nguyen 
> >>>> ; Hoang Huu Le
> >>>> ; Xin Long ; Ying Xue 
> >>>> 
> >>>> Subject: Strange behavior in socket.c::tipc_sk_enqueue()
> >>>>
> >>>> I did by accident discover a strange behavior in the function
> >>>> tipc_sk_enqueue:
> >>>>
> >>>>
> >>>> static void tipc_sk_enqueue(struct sk_buff_head *inputq,
> >>>>struct sock *sk, u32 dport,
> >>>>struct sk_buff_head *xmitq)
> >>>> {
> >>>>struct tipc_sock *tsk = tipc_sk(sk);
> >>>>unsigned long time_limit = jiffies + 2;
> >>>>struct sk_buff *skb;
> >>>>unsigned int lim;
> >>>>atomic_t *dcnt;
> >>>>u32 onode;
> >>>>
> >>>>while (skb_queue_len(inputq)) {
> >>>>if (unlikely(time_after_eq(jiffies, time_limit)))
> >>>>  return;
> >>>>[...]
> >>>>}
> >>>> }
> >>>>
> >>>> At the moment we call time_after_eq() the two jiffies often
> >>>> have already passed, and the skb is not dequeued.
> >>>> I noticed that tipc_sk_rcv() may call tipc_sk_enqueue()
> >>>> with the same skb dozens of times before the buffer can
> >>>> be delivered further upwards in the stack.
> >>>>
> >>>> Needless to say that this cannot be good for performance.
> >>>>
> >>>> I believe the value of 2 jiffies was hard coded at a time
> >>>> when machines were slower, and a jiffy represented a much
> >>>> longer time interval.
> >>>>
> >>>> Now it is clearly too short, and should be replaced with something
> >>>> longer and more consistent, e.g. msecs_to_jiffies(2).
> >>>>
> >>>> Can anybody look into this?
> >>>>
> >>>> Also, I will be on vacation the next two weeks, which means we
> >>>> should cancel the bi-weekly meeting next Tuesday.
> >>>>
> >>>> ///jon
> >>>>
> >>>
> >


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] Strange behavior in socket.c::tipc_sk_enqueue()

2021-09-06 Thread Hoang Huu Le
Hi Jon,  all,

I did a test varying two conditions over a range:
- time limit: 2 msecs ... unlimited
- search depth limit (socket skbs): 2 skbs ... unlimited

With the above range of settings, a maximum of around 12 skbs can be enqueued
per socket, regardless of the time and search depth limits.
I also combined the test with iperf-generated TCP traffic and the result looks
the same.

So, I don't think we need to apply the search depth limit condition and/or a
longer timer in this function; just 2 msecs is enough.
I guess this result depends on kernel scheduling. What are your views?
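
For reference, the two knobs exercised in this test would look roughly like the
sketch below inside tipc_sk_enqueue(). This is an illustration only, not an
applied patch; the constant TIPC_SK_ENQ_MAX_DEPTH is hypothetical, and Jon's
suggestion was to keep the depth counter inside tipc_skb_dequeue() rather than
in the caller as shown here:

	unsigned long time_limit = jiffies + msecs_to_jiffies(2); /* was: jiffies + 2 */
	unsigned int depth = 0;
	struct sk_buff *skb;

	while (skb_queue_len(inputq)) {
		if (unlikely(time_after_eq(jiffies, time_limit)))
			return;
		if (unlikely(++depth > TIPC_SK_ENQ_MAX_DEPTH)) /* hypothetical limit */
			return;
		skb = tipc_skb_dequeue(inputq, dport);
		if (unlikely(!skb))
			return;
		/* ... filter and forward the skb towards the socket as before ... */
	}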

Regards,
Hoang
> -Original Message-
> From: Jon Maloy 
> Sent: Wednesday, September 1, 2021 7:39 AM
> To: Hoang Huu Le ; 
> tipc-discussion@lists.sourceforge.net; Tung Quang Nguyen
> ; Xin Long ; Ying Xue 
> 
> Cc: Huy Xuan Nhat Hoang 
> Subject: Re: Strange behavior in socket.c::tipc_sk_enqueue()
> 
> Guys,
> After our discussion this morning regarding this problem I gave it some
> more thought.
> 
> What if we simply limit the search depth in the receive queue to some
> fixed number, 10, 20, 50 or something, and return NULL if nothing is found
> within this range? This would be a simple stack counter inside
> tipc_skb_dequeue(), and would cost almost nothing.
> 
> If you experiment with this, of course in combination with a max limit
> of some milliseconds as we also discussed, we might obtain acceptable
> results.
> 
> What do you think?
> 
> ///jon
> 
> 
> On 28/07/2021 04:04, Hoang Huu Le wrote:
> > Hi Jon,
> >
> > Let's enjoy your vacation.
> > Our new team member (CCed) will take a look at it.
> >
> > Regards,
> > Hoang
> >> -Original Message-
> >> From: Jon Maloy 
> >> Sent: Wednesday, July 28, 2021 6:20 AM
> >> To: tipc-discussion@lists.sourceforge.net; Tung Quang Nguyen 
> >> ; Hoang Huu Le
> >> ; Xin Long ; Ying Xue 
> >> 
> >> Subject: Strange behavior in socket.c::tipc_sk_enqueue()
> >>
> >> I did by accident discover a strange behavior in the function
> >> tipc_sk_enqueue:
> >>
> >>
> >> static void tipc_sk_enqueue(struct sk_buff_head *inputq,
> >>   struct sock *sk, u32 dport,
> >>   struct sk_buff_head *xmitq)
> >> {
> >>   struct tipc_sock *tsk = tipc_sk(sk);
> >>   unsigned long time_limit = jiffies + 2;
> >>   struct sk_buff *skb;
> >>   unsigned int lim;
> >>   atomic_t *dcnt;
> >>   u32 onode;
> >>
> >>   while (skb_queue_len(inputq)) {
> >>   if (unlikely(time_after_eq(jiffies, time_limit)))
> >> return;
> >>   [...]
> >>   }
> >> }
> >>
> >> At the moment we call time_after_eq() the two jiffies often
> >> have already passed, and the skb is not dequeued.
> >> I noticed that tipc_sk_rcv() may call tipc_sk_enqueue()
> >> with the same skb dozens of times before the buffer can
> >> be delivered further upwards in the stack.
> >>
> >> Needless to say that this cannot be good for performance.
> >>
> >> I believe the value of 2 jiffies was hard coded at a time
> >> when machines were slower, and a jiffy represented a much
> >> longer time interval.
> >>
> >> Now it is clearly too short, and should be replaced with something
> >> longer and more consistent, e.g. msecs_to_jiffies(2).
> >>
> >> Can anybody look into this?
> >>
> >> Also, I will be on vacation the next two weeks, which means we
> >> should cancel the bi-weekly meeting next Tuesday.
> >>
> >> ///jon
> >>
> >


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [PATCH net-next] tipc: Return the correct errno code

2021-08-02 Thread Hoang Huu Le
Hi Zheng,

The patch was merged by accident. Are you planning to revert it?
We need to do this ASAP since the calling path tipc_node_xmit() -> tipc_link_xmit()
is broken as a side effect.

Thanks,
hoang
> -Original Message-
> From: zhengyongjun 
> Sent: Friday, June 4, 2021 8:35 AM
> To: jma...@redhat.com; ying@windriver.com; da...@davemloft.net; 
> k...@kernel.org; net...@vger.kernel.org; tipc-
> discuss...@lists.sourceforge.net; linux-ker...@vger.kernel.org
> Subject: Re: [PATCH net-next] tipc: Return the correct errno code
> 
> Sorry, this patch is wrong, please ignore it, thanks :)
> 
> -Original Message-
> From: zhengyongjun
> Sent: June 4, 2021 9:47
> To: jma...@redhat.com; ying@windriver.com; da...@davemloft.net; 
> k...@kernel.org; net...@vger.kernel.org; tipc-
> discuss...@lists.sourceforge.net; linux-ker...@vger.kernel.org
> Cc: zhengyongjun 
> Subject: [PATCH net-next] tipc: Return the correct errno code
> 
> When kalloc or kmemdup fails, we should return ENOMEM rather than ENOBUFS.
> 
> Signed-off-by: Zheng Yongjun 
> ---
>  net/tipc/link.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/net/tipc/link.c b/net/tipc/link.c index 
> c44b4bfaaee6..5b6181277cc5 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -912,7 +912,7 @@ static int link_schedule_user(struct tipc_link *l, struct 
> tipc_msg *hdr)
>   skb = tipc_msg_create(SOCK_WAKEUP, 0, INT_H_SIZE, 0,
> dnode, l->addr, dport, 0, 0);
>   if (!skb)
> - return -ENOBUFS;
> + return -ENOMEM;
>   msg_set_dest_droppable(buf_msg(skb), true);
>   TIPC_SKB_CB(skb)->chain_imp = msg_importance(hdr);
>   skb_queue_tail(&l->wakeupq, skb);
> @@ -1030,7 +1030,7 @@ void tipc_link_reset(struct tipc_link *l)
>   *
>   * Consumes the buffer chain.
>   * Messages at TIPC_SYSTEM_IMPORTANCE are always accepted
> - * Return: 0 if success, or errno: -ELINKCONG, -EMSGSIZE or -ENOBUFS
> + * Return: 0 if success, or errno: -ELINKCONG, -EMSGSIZE or -ENOBUFS or -ENOMEM
>   */
>  int tipc_link_xmit(struct tipc_link *l, struct sk_buff_head *list,
>  struct sk_buff_head *xmitq)
> @@ -1088,7 +1088,7 @@ int tipc_link_xmit(struct tipc_link *l, struct 
> sk_buff_head *list,
>   if (!_skb) {
>   kfree_skb(skb);
>   __skb_queue_purge(list);
> - return -ENOBUFS;
> + return -ENOMEM;
>   }
>   __skb_queue_tail(transmq, skb);
>   tipc_link_set_skb_retransmit_time(skb, l);
> --
> 2.25.1


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] Strange behavior in socket.c::tipc_sk_enqueue()

2021-07-28 Thread Hoang Huu Le
Hi Jon,

Please enjoy your vacation.
Our new team member (CCed) will take a look at it. 

Regards,
Hoang
> -Original Message-
> From: Jon Maloy 
> Sent: Wednesday, July 28, 2021 6:20 AM
> To: tipc-discussion@lists.sourceforge.net; Tung Quang Nguyen 
> ; Hoang Huu Le
> ; Xin Long ; Ying Xue 
> 
> Subject: Strange behavior in socket.c::tipc_sk_enqueue()
> 
> I did by accident discover a strange behavior in the function
> tipc_sk_enqueue:
> 
> 
> static void tipc_sk_enqueue(struct sk_buff_head *inputq,
>  struct sock *sk, u32 dport,
>  struct sk_buff_head *xmitq)
> {
>  struct tipc_sock *tsk = tipc_sk(sk);
>  unsigned long time_limit = jiffies + 2;
>  struct sk_buff *skb;
>  unsigned int lim;
>  atomic_t *dcnt;
>  u32 onode;
> 
>  while (skb_queue_len(inputq)) {
>  if (unlikely(time_after_eq(jiffies, time_limit)))
>return;
>  [...]
>  }
> }
> 
> At the moment we call time_after_eq() the two jiffies often
> have already passed, and the skb is not dequeued.
> I noticed that tipc_sk_rcv() may call tipc_sk_enqueue()
> with the same skb dozens of times before the buffer can
> be delivered further upwards in the stack.
> 
> Needless to say that this cannot be good for performance.
> 
> I believe the value of 2 jiffies was hard coded at a time
> when machines were slower, and a jiffy represented a much
> longer time interval.
> 
> Now it is clearly too short, and should be replaced with something
> longer and more consistent, e.g. msecs_to_jiffies(2).
> 
> Can anybody look into this?
> 
> Also, I will be on vacation the next two weeks, which means we
> should cancel the bi-weekly meeting next Tuesday.
> 
> ///jon
> 


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] FW: [syzbot] KASAN: use-after-free Read in tipc_recvmsg

2021-07-22 Thread Hoang Huu Le
Hi Xin,

I think the issue is caused by your patch:

f4919ff59c282 ("tipc: keep the skb in rcv queue until the whole data is read")

1976 if (!skb_cb->bytes_read)
1977 tsk_advance_rx_queue(sk);   <-- skb free-ed here
1978
1979 if (likely(!connected) || skb_cb->bytes_read) <- use-after-free
1980 goto exit;


Can you take a look at the issue?
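
One possible shape of a fix, sketched here only to illustrate the problem (this
is not the actual applied patch): cache bytes_read in a local variable before
the skb can be freed, e.g.

	bool more_to_read = skb_cb->bytes_read; /* read before the skb may be freed */

	if (!more_to_read)
		tsk_advance_rx_queue(sk); /* may free the skb */

	if (likely(!connected) || more_to_read)
		goto exit;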

Thanks,
Hoang
-Original Message-
From: syzbot  
Sent: Monday, July 19, 2021 12:15 AM
To: da...@davemloft.net; devicet...@vger.kernel.org; frowand.l...@gmail.com; 
gre...@linuxfoundation.org; jma...@redhat.com; k...@kernel.org; 
linux-ker...@vger.kernel.org; net...@vger.kernel.org; raf...@kernel.org; 
robh...@kernel.org; r...@kernel.org; syzkaller-b...@googlegroups.com; 
tipc-discussion@lists.sourceforge.net; ying@windriver.com
Subject: [syzbot] KASAN: use-after-free Read in tipc_recvmsg

Hello,

syzbot found the following issue on:

HEAD commit:ab0441b4a920 Merge branch 'vmxnet3-version-6'
git tree:   net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1744ac6a30
kernel config:  https://syzkaller.appspot.com/x/.config?x=da140227e4f25b17
dashboard link: https://syzkaller.appspot.com/bug?extid=e6741b97d5552f97c24d
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=13973a7430
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17ffc90230

The issue was bisected to:

commit 67a3156453859ceb40dc4448b7a6a99ea0ad27c7
Author: Rob Herring 
Date:   Thu May 27 19:45:47 2021 +

of: Merge of_address_to_resource() and of_pci_address_to_resource() 
implementations

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=129b043830
final oops: https://syzkaller.appspot.com/x/report.txt?x=119b043830
console output: https://syzkaller.appspot.com/x/log.txt?x=169b043830

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e6741b97d5552f97c...@syzkaller.appspotmail.com
Fixes: 67a315645385 ("of: Merge of_address_to_resource() and 
of_pci_address_to_resource() implementations")

==
BUG: KASAN: use-after-free in tipc_recvmsg+0xf77/0xf90 net/tipc/socket.c:1979
Read of size 4 at addr 8880328cf1c0 by task kworker/u4:0/8

CPU: 1 PID: 8 Comm: kworker/u4:0 Not tainted 5.14.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: tipc_rcv tipc_conn_recv_work
Call Trace:
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
 print_address_description.constprop.0.cold+0x6c/0x309 mm/kasan/report.c:233
 __kasan_report mm/kasan/report.c:419 [inline]
 kasan_report.cold+0x83/0xdf mm/kasan/report.c:436
 tipc_recvmsg+0xf77/0xf90 net/tipc/socket.c:1979
 sock_recvmsg_nosec net/socket.c:943 [inline]
 sock_recvmsg net/socket.c:961 [inline]
 sock_recvmsg+0xca/0x110 net/socket.c:957
 tipc_conn_rcv_from_sock+0x162/0x2f0 net/tipc/topsrv.c:398
 tipc_conn_recv_work+0xeb/0x190 net/tipc/topsrv.c:421
 process_one_work+0x98d/0x1630 kernel/workqueue.c:2276
 worker_thread+0x658/0x11f0 kernel/workqueue.c:2422
 kthread+0x3e5/0x4d0 kernel/kthread.c:319
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Allocated by task 8446:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:434 [inline]
 __kasan_slab_alloc+0x84/0xa0 mm/kasan/common.c:467
 kasan_slab_alloc include/linux/kasan.h:253 [inline]
 slab_post_alloc_hook mm/slab.h:512 [inline]
 slab_alloc_node mm/slub.c:2981 [inline]
 kmem_cache_alloc_node+0x266/0x3e0 mm/slub.c:3017
 __alloc_skb+0x20b/0x340 net/core/skbuff.c:414
 alloc_skb_fclone include/linux/skbuff.h:1162 [inline]
 tipc_buf_acquire+0x25/0xe0 net/tipc/msg.c:72
 tipc_msg_build+0xf7/0x10a0 net/tipc/msg.c:386
 __tipc_sendstream+0x6d0/0x1150 net/tipc/socket.c:1610
 tipc_sendstream+0x4c/0x70 net/tipc/socket.c:1541
 sock_sendmsg_nosec net/socket.c:703 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:723
 sock_write_iter+0x289/0x3c0 net/socket.c:1056
 call_write_iter include/linux/fs.h:2114 [inline]
 new_sync_write+0x426/0x650 fs/read_write.c:518
 vfs_write+0x75a/0xa40 fs/read_write.c:605
 ksys_write+0x1ee/0x250 fs/read_write.c:658
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 8:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:360
 kasan_slab_free mm/kasan/common.c:366 [inline]
 kasan_slab_free mm/kasan/common.c:328 [inline]
 __kasan_slab_free+0xfb/0x130 mm/kasan/common.c:374
 kasan_slab_free include/linux/kasan.h:229 [inline]
 slab_free_hook mm/slub.c:1650 [inline]
 slab_free_freelist_hook+0xdf/0x240 mm/slub.c:1675
 slab_free mm/slub.c:3235 [inline]
 kmem_cache_free+0x8e/0x5a0 

Re: [tipc-discussion] [net-next v2 0/3] tipc: some small cleanups

2021-04-14 Thread Hoang Huu Le
Series 
Tested-by: Hoang Le 

> -Original Message-
> From: jma...@redhat.com 
> Sent: Thursday, April 8, 2021 3:59 AM
> To: tipc-discussion@lists.sourceforge.net
> Cc: Tung Quang Nguyen ; Hoang Huu Le 
> ; Tuong Tong Lien
> ; jma...@redhat.com; ma...@donjonn.com; 
> l...@redhat.com; ying@windriver.com;
> parthasarathy.bhuvara...@gmail.com
> Subject: [net-next v2 0/3] tipc: some small cleanups
> 
> From: Jon Maloy 
> 
> We make some minor code cleanups and improvements.
> 
> ---
> v2: - Removed patch #1 from v1, which has now been applied upstream
> - Fixed memory leak in patch #2 as identified by Hoang
> 
> Jon Maloy (3):
>   tipc: eliminate redundant fields in struct tipc_sock
>   tipc: refactor function tipc_sk_anc_data_recv()
>   tipc: simplify handling of lookup scope during multicast message
> reception
> 
>  net/tipc/name_table.c |   6 +-
>  net/tipc/name_table.h |   4 +-
>  net/tipc/socket.c | 149 +++---
>  3 files changed, 74 insertions(+), 85 deletions(-)
> 
> --
> 2.29.2



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [net-next 3/4] tipc: refactor function tipc_sk_anc_data_recv()

2021-03-31 Thread Hoang Huu Le


> -Original Message-
> From: jma...@redhat.com 
> Sent: Thursday, March 25, 2021 10:56 PM
> To: tipc-discussion@lists.sourceforge.net
> Cc: Tung Quang Nguyen ; Hoang Huu Le 
> ; Tuong Tong Lien
> ; jma...@redhat.com; ma...@donjonn.com; 
> l...@redhat.com; ying@windriver.com;
> parthasarathy.bhuvara...@gmail.com
> Subject: [net-next 3/4] tipc: refactor function tipc_sk_anc_data_recv()
> 
> From: Jon Maloy 
> 
> We refactor tipc_sk_anc_data_recv() to make it slightly more
> comprehensible, but also to facilitate application of some additions
> to the code in a future commit.
> 
> Signed-off-by: Jon Maloy 
> ---
>  net/tipc/socket.c | 85 +--
>  1 file changed, 38 insertions(+), 47 deletions(-)
> 
> diff --git a/net/tipc/socket.c b/net/tipc/socket.c
> index 12a97755bc80..358d1f2494a7 100644
> --- a/net/tipc/socket.c
> +++ b/net/tipc/socket.c
> @@ -1730,67 +1730,58 @@ static void tipc_sk_set_orig_addr(struct msghdr *m, 
> struct sk_buff *skb)
>  static int tipc_sk_anc_data_recv(struct msghdr *m, struct sk_buff *skb,
>struct tipc_sock *tsk)
>  {
> - struct tipc_msg *msg;
> - u32 anc_data[3];
> - u32 err;
> - u32 dest_type;
> - int has_name;
> - int res;
> + struct tipc_msg *hdr;
> + bool has_addr;
> + int data[12];

Potential memory leak here!

> + int dlen, rc;
> 
>   if (likely(m->msg_controllen == 0))
>   return 0;
> - msg = buf_msg(skb);
> 
> - /* Optionally capture errored message object(s) */
> - err = msg ? msg_errcode(msg) : 0;
> - if (unlikely(err)) {
> - anc_data[0] = err;
> - anc_data[1] = msg_data_sz(msg);
> - res = put_cmsg(m, SOL_TIPC, TIPC_ERRINFO, 8, anc_data);
> - if (res)
> - return res;
> - if (anc_data[1]) {
> - if (skb_linearize(skb))
> - return -ENOMEM;
> - msg = buf_msg(skb);
> - res = put_cmsg(m, SOL_TIPC, TIPC_RETDATA, anc_data[1],
> -msg_data(msg));
> - if (res)
> - return res;
> - }
> + hdr = buf_msg(skb);
> + dlen = msg_data_sz(hdr);
> +
> + /* Capture errored message object, if any */
> + if (msg_errcode(hdr)) {
> + if (skb_linearize(skb))
> + return -ENOMEM;
> + hdr = buf_msg(skb);
> + data[0] = msg_errcode(hdr);
> + data[1] = dlen;
> + rc = put_cmsg(m, SOL_TIPC, TIPC_ERRINFO, 8, data);
> + if (rc || !dlen)
> + return rc;
> + rc = put_cmsg(m, SOL_TIPC, TIPC_RETDATA, dlen, msg_data(hdr));
> + if (rc)
> + return rc;
>   }
> 
> - /* Optionally capture message destination object */
> - dest_type = msg ? msg_type(msg) : TIPC_DIRECT_MSG;
> - switch (dest_type) {
> + /* Capture TIPC_SERVICE_ADDR/RANGE destination address, if any */
> + switch (msg_type(hdr)) {
>   case TIPC_NAMED_MSG:
> - has_name = 1;
> - anc_data[0] = msg_nametype(msg);
> - anc_data[1] = msg_namelower(msg);
> - anc_data[2] = msg_namelower(msg);
> + has_addr = true;
> + data[0] = msg_nametype(hdr);
> + data[1] = msg_namelower(hdr);
> + data[2] = data[1];
>   break;
>   case TIPC_MCAST_MSG:
> - has_name = 1;
> - anc_data[0] = msg_nametype(msg);
> - anc_data[1] = msg_namelower(msg);
> - anc_data[2] = msg_nameupper(msg);
> + has_addr = true;
> + data[0] = msg_nametype(hdr);
> + data[1] = msg_namelower(hdr);
> + data[2] = msg_nameupper(hdr);
>   break;
>   case TIPC_CONN_MSG:
> - has_name = !!tsk->conn_addrtype;
> - anc_data[0] = msg_nametype(&tsk->phdr);
> - anc_data[1] = msg_nameinst(&tsk->phdr);
> - anc_data[2] = anc_data[1];
> + has_addr = !!tsk->conn_addrtype;
> + data[0] = msg_nametype(&tsk->phdr);
> + data[1] = msg_nameinst(&tsk->phdr);
> + data[2] = data[1];
>   break;
>   default:
> - has_name = 0;
> - }
> - if (has_name) {
> - res = put_cmsg(m, SOL_TIPC, TIPC_DESTNAME, 12, anc_data);
> - if (res)
> - return res;
> + has_addr = 

Re: [tipc-discussion] [iproute2-next] tipc: add support for the netlink extack

2021-03-25 Thread Hoang Huu Le
> -Original Message-
> From: David Ahern 
> Sent: Thursday, March 25, 2021 10:08 AM
> To: Hoang Huu Le ; net...@vger.kernel.org; 
> tipc-discussion@lists.sourceforge.net;
> jma...@redhat.com; ma...@donjonn.com; ying@windriver.com; Tuan Anh Vo 
> ; Tung Quang
> Nguyen 
> Subject: Re: [iproute2-next] tipc: add support for the netlink extack
> 
> On 3/24/21 7:56 PM, Hoang Le wrote:
> > Add support for extack in tipc to dump the netlink extack error messages
> > (i.e. -EINVAL) sent from the kernel.
> >
> > Acked-by: Jon Maloy 
> > Signed-off-by: Hoang Le 
> > ---
> >  tipc/msg.c | 29 ++---
> >  1 file changed, 22 insertions(+), 7 deletions(-)
> >
> 
> tipc should be converted to use the library functions in lib/mnl_utils.c.

It's really a big change, so we will do that in a later commit.

___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next v2 2/2] tipc: clean up warnings detected by sparse

2021-03-11 Thread Hoang Huu Le
This patch fixes the following warning from sparse:

net/tipc/monitor.c:263:35: warning: incorrect type in assignment (different 
base types)
net/tipc/monitor.c:263:35:expected unsigned int
net/tipc/monitor.c:263:35:got restricted __be32 [usertype]
[...]
net/tipc/node.c:374:13: warning: context imbalance in 'tipc_node_read_lock' - 
wrong count at exit
net/tipc/node.c:379:13: warning: context imbalance in 'tipc_node_read_unlock' - 
unexpected unlock
net/tipc/node.c:384:13: warning: context imbalance in 'tipc_node_write_lock' - 
wrong count at exit
net/tipc/node.c:389:13: warning: context imbalance in 
'tipc_node_write_unlock_fast' - unexpected unlock
net/tipc/node.c:404:17: warning: context imbalance in 'tipc_node_write_unlock' 
- unexpected unlock
[...]
net/tipc/crypto.c:1201:9: warning: incorrect type in initializer (different 
address spaces)
net/tipc/crypto.c:1201:9:expected struct tipc_aead [noderef] __rcu *__tmp
net/tipc/crypto.c:1201:9:got struct tipc_aead *
[...]

v2: switch to use the keyword "__always_inline" for inline function

Acked-by: Jon Maloy 
Signed-off-by: Hoang Huu Le 
---
 net/tipc/crypto.c  | 13 +-
 net/tipc/monitor.c | 63 ++
 net/tipc/node.c|  5 
 3 files changed, 59 insertions(+), 22 deletions(-)

diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
index f4fca8f7f63f..b428fa1c3241 100644
--- a/net/tipc/crypto.c
+++ b/net/tipc/crypto.c
@@ -317,7 +317,7 @@ static int tipc_aead_key_generate(struct tipc_aead_key 
*skey);
 
 #define tipc_aead_rcu_replace(rcu_ptr, ptr, lock)  \
 do {   \
-   typeof(rcu_ptr) __tmp = rcu_dereference_protected((rcu_ptr),\
+   struct tipc_aead *__tmp = rcu_dereference_protected((rcu_ptr),  \
lockdep_is_held(lock)); \
rcu_assign_pointer((rcu_ptr), (ptr));   \
tipc_aead_put(__tmp);   \
@@ -798,7 +798,7 @@ static int tipc_aead_encrypt(struct tipc_aead *aead, struct 
sk_buff *skb,
ehdr = (struct tipc_ehdr *)skb->data;
salt = aead->salt;
if (aead->mode == CLUSTER_KEY)
-   salt ^= ehdr->addr; /* __be32 */
+   salt ^= __be32_to_cpu(ehdr->addr);
else if (__dnode)
salt ^= tipc_node_get_addr(__dnode);
memcpy(iv, &salt, 4);
@@ -929,7 +929,7 @@ static int tipc_aead_decrypt(struct net *net, struct 
tipc_aead *aead,
ehdr = (struct tipc_ehdr *)skb->data;
salt = aead->salt;
if (aead->mode == CLUSTER_KEY)
-   salt ^= ehdr->addr; /* __be32 */
+   salt ^= __be32_to_cpu(ehdr->addr);
else if (ehdr->destined)
salt ^= tipc_own_addr(net);
memcpy(iv, &salt, 4);
@@ -1946,16 +1946,17 @@ static void tipc_crypto_rcv_complete(struct net *net, 
struct tipc_aead *aead,
goto rcv;
}
tipc_aead_put(aead);
-   aead = tipc_aead_get(tmp);
+   aead = tipc_aead_get((struct tipc_aead __force __rcu *)tmp);
}
 
if (unlikely(err)) {
-   tipc_aead_users_dec(aead, INT_MIN);
+   tipc_aead_users_dec((struct tipc_aead __force __rcu *)aead,
+   INT_MIN);
goto free_skb;
}
 
/* Set the RX key's user */
-   tipc_aead_users_set(aead, 1);
+   tipc_aead_users_set((struct tipc_aead __force __rcu *)aead, 1);
 
/* Mark this point, RX works */
rx->timer1 = jiffies;
diff --git a/net/tipc/monitor.c b/net/tipc/monitor.c
index 48fac3b17e40..1b514cb7a7d3 100644
--- a/net/tipc/monitor.c
+++ b/net/tipc/monitor.c
@@ -104,6 +104,36 @@ static struct tipc_monitor *tipc_monitor(struct net *net, 
int bearer_id)
 
 const int tipc_max_domain_size = sizeof(struct tipc_mon_domain);
 
+static __always_inline u16 mon_cpu_to_le16(u16 val)
+{
+   return (__force __u16)htons(val);
+}
+
+static __always_inline u32 mon_cpu_to_le32(u32 val)
+{
+   return (__force __u32)htonl(val);
+}
+
+static __always_inline u64 mon_cpu_to_le64(u64 val)
+{
+   return (__force __u64)cpu_to_be64(val);
+}
+
+static __always_inline u16 mon_le16_to_cpu(u16 val)
+{
+   return ntohs((__force __be16)val);
+}
+
+static __always_inline u32 mon_le32_to_cpu(u32 val)
+{
+   return ntohl((__force __be32)val);
+}
+
+static __always_inline u64 mon_le64_to_cpu(u64 val)
+{
+   return be64_to_cpu((__force __be64)val);
+}
+
 /* dom_rec_len(): actual length of domain record for transport
  */
 static int dom_rec_len(struct tipc_mon_domain *dom, u16 mcnt)
@@ -260,16 +290,16 @@ static void mon_update_local_domain(struct tipc_monitor 
*mon)
diff |= dom->members[i] != peer->addr;
dom->members[i] = peer->addr;
 

[tipc-discussion] [net-next v2 1/2] tipc: convert dest node's address to network order

2021-03-11 Thread Hoang Huu Le
From: Hoang Le 

(struct tipc_link_info)->dest is in network order (__be32), so we must
convert the value to network order before assigning. The problem detected
by sparse:

net/tipc/netlink_compat.c:699:24: warning: incorrect type in assignment 
(different base types)
net/tipc/netlink_compat.c:699:24:expected restricted __be32 [usertype] dest
net/tipc/netlink_compat.c:699:24:got int

Acked-by: Jon Maloy 
Signed-off-by: Hoang Le 
---
 net/tipc/netlink_compat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index 5a1ce64039f7..0749df80454d 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -696,7 +696,7 @@ static int tipc_nl_compat_link_dump(struct 
tipc_nl_compat_msg *msg,
if (err)
return err;
 
-   link_info.dest = nla_get_flag(link[TIPC_NLA_LINK_DEST]);
+   link_info.dest = htonl(nla_get_flag(link[TIPC_NLA_LINK_DEST]));
link_info.up = htonl(nla_get_flag(link[TIPC_NLA_LINK_UP]));
nla_strscpy(link_info.str, link[TIPC_NLA_LINK_NAME],
TIPC_MAX_LINK_NAME);
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next 2/2] tipc: clean up warnings detected by sparse

2021-03-10 Thread Hoang Huu Le
This patch fixes the following warning from sparse:

net/tipc/monitor.c:263:35: warning: incorrect type in assignment (different 
base types)
net/tipc/monitor.c:263:35:expected unsigned int
net/tipc/monitor.c:263:35:got restricted __be32 [usertype]
[...]
net/tipc/node.c:374:13: warning: context imbalance in 'tipc_node_read_lock' - 
wrong count at exit
net/tipc/node.c:379:13: warning: context imbalance in 'tipc_node_read_unlock' - 
unexpected unlock
net/tipc/node.c:384:13: warning: context imbalance in 'tipc_node_write_lock' - 
wrong count at exit
net/tipc/node.c:389:13: warning: context imbalance in 
'tipc_node_write_unlock_fast' - unexpected unlock
net/tipc/node.c:404:17: warning: context imbalance in 'tipc_node_write_unlock' 
- unexpected unlock
[...]
net/tipc/crypto.c:1201:9: warning: incorrect type in initializer (different 
address spaces)
net/tipc/crypto.c:1201:9:expected struct tipc_aead [noderef] __rcu *__tmp
net/tipc/crypto.c:1201:9:got struct tipc_aead *
[...]

Acked-by: Jon Maloy 
Signed-off-by: Hoang Huu Le 
---
 net/tipc/crypto.c  | 12 -
 net/tipc/monitor.c | 63 ++
 net/tipc/node.c|  5 
 3 files changed, 58 insertions(+), 22 deletions(-)

diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
index f4fca8f7f63f..6f64acef73dc 100644
--- a/net/tipc/crypto.c
+++ b/net/tipc/crypto.c
@@ -317,7 +317,7 @@ static int tipc_aead_key_generate(struct tipc_aead_key 
*skey);
 
 #define tipc_aead_rcu_replace(rcu_ptr, ptr, lock)  \
 do {   \
-   typeof(rcu_ptr) __tmp = rcu_dereference_protected((rcu_ptr),\
+   struct tipc_aead *__tmp = rcu_dereference_protected((rcu_ptr),  \
lockdep_is_held(lock)); \
rcu_assign_pointer((rcu_ptr), (ptr));   \
tipc_aead_put(__tmp);   \
@@ -798,7 +798,7 @@ static int tipc_aead_encrypt(struct tipc_aead *aead, struct 
sk_buff *skb,
ehdr = (struct tipc_ehdr *)skb->data;
salt = aead->salt;
if (aead->mode == CLUSTER_KEY)
-   salt ^= ehdr->addr; /* __be32 */
+   salt ^= __be32_to_cpu(ehdr->addr);
else if (__dnode)
salt ^= tipc_node_get_addr(__dnode);
memcpy(iv, &salt, 4);
@@ -929,7 +929,7 @@ static int tipc_aead_decrypt(struct net *net, struct 
tipc_aead *aead,
ehdr = (struct tipc_ehdr *)skb->data;
salt = aead->salt;
if (aead->mode == CLUSTER_KEY)
-   salt ^= ehdr->addr; /* __be32 */
+   salt ^= __be32_to_cpu(ehdr->addr);
else if (ehdr->destined)
salt ^= tipc_own_addr(net);
memcpy(iv, &salt, 4);
@@ -1946,16 +1946,16 @@ static void tipc_crypto_rcv_complete(struct net *net, 
struct tipc_aead *aead,
goto rcv;
}
tipc_aead_put(aead);
-   aead = tipc_aead_get(tmp);
+   aead = tipc_aead_get((struct tipc_aead __force __rcu *)tmp);
}
 
if (unlikely(err)) {
-   tipc_aead_users_dec(aead, INT_MIN);
+   tipc_aead_users_dec((struct tipc_aead __force __rcu *)aead, 
INT_MIN);
goto free_skb;
}
 
/* Set the RX key's user */
-   tipc_aead_users_set(aead, 1);
+   tipc_aead_users_set((struct tipc_aead __force __rcu *)aead, 1);
 
/* Mark this point, RX works */
rx->timer1 = jiffies;
diff --git a/net/tipc/monitor.c b/net/tipc/monitor.c
index 48fac3b17e40..407619697292 100644
--- a/net/tipc/monitor.c
+++ b/net/tipc/monitor.c
@@ -104,6 +104,36 @@ static struct tipc_monitor *tipc_monitor(struct net *net, 
int bearer_id)
 
 const int tipc_max_domain_size = sizeof(struct tipc_mon_domain);
 
+static inline u16 mon_cpu_to_le16(u16 val)
+{
+   return (__force __u16)htons(val);
+}
+
+static inline u32 mon_cpu_to_le32(u32 val)
+{
+   return (__force __u32)htonl(val);
+}
+
+static inline u64 mon_cpu_to_le64(u64 val)
+{
+   return (__force __u64)cpu_to_be64(val);
+}
+
+static inline u16 mon_le16_to_cpu(u16 val)
+{
+   return ntohs((__force __be16)val);
+}
+
+static inline u32 mon_le32_to_cpu(u32 val)
+{
+   return ntohl((__force __be32)val);
+}
+
+static inline u64 mon_le64_to_cpu(u64 val)
+{
+   return be64_to_cpu((__force __be64)val);
+}
+
 /* dom_rec_len(): actual length of domain record for transport
  */
 static int dom_rec_len(struct tipc_mon_domain *dom, u16 mcnt)
@@ -260,16 +290,16 @@ static void mon_update_local_domain(struct tipc_monitor 
*mon)
diff |= dom->members[i] != peer->addr;
dom->members[i] = peer->addr;
map_set(>up_map, i, peer->is_up);
-   cache->members[i] = htonl(peer->addr);
+   cache->members[i] = mon_cpu_to_le32(pe

[tipc-discussion] [net-next 1/2] tipc: convert dest node's address to network order

2021-03-10 Thread Hoang Huu Le
From: Hoang Le 

(struct tipc_link_info)->dest is in network order (__be32), so we must
convert the value to network order before assigning. The problem detected
by sparse:

net/tipc/netlink_compat.c:699:24: warning: incorrect type in assignment 
(different base types)
net/tipc/netlink_compat.c:699:24:expected restricted __be32 [usertype] dest
net/tipc/netlink_compat.c:699:24:got int

Acked-by: Jon Maloy 
Signed-off-by: Hoang Le 
---
 net/tipc/netlink_compat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index 5a1ce64039f7..0749df80454d 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -696,7 +696,7 @@ static int tipc_nl_compat_link_dump(struct 
tipc_nl_compat_msg *msg,
if (err)
return err;
 
-   link_info.dest = nla_get_flag(link[TIPC_NLA_LINK_DEST]);
+   link_info.dest = htonl(nla_get_flag(link[TIPC_NLA_LINK_DEST]));
link_info.up = htonl(nla_get_flag(link[TIPC_NLA_LINK_UP]));
nla_strscpy(link_info.str, link[TIPC_NLA_LINK_NAME],
TIPC_MAX_LINK_NAME);
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next] tipc: remove duplicated code in tipc_msg_create

2021-01-25 Thread Hoang Huu Le
Remove duplicated code checking for header size in tipc_msg_create(), as
it's already being done in tipc_msg_init().
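
For reference, tipc_msg_init() already contains the equivalent logic
(paraphrased from net/tipc/msg.c, shown here only for context):

	if (hsize > SHORT_H_SIZE) {
		msg_set_orignode(m, own_node);
		msg_set_destnode(m, dnode);
	}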

Signed-off-by: Hoang Huu Le 
---
 net/tipc/msg.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index 2aca86021df5..e9263280a2d4 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -117,10 +117,6 @@ struct sk_buff *tipc_msg_create(uint user, uint type,
msg_set_origport(msg, oport);
msg_set_destport(msg, dport);
msg_set_errcode(msg, errcode);
-   if (hdr_sz > SHORT_H_SIZE) {
-   msg_set_orignode(msg, onode);
-   msg_set_destnode(msg, dnode);
-   }
return buf;
 }
 
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next v2 1/2] tipc: convert dest node's address to network order

2021-01-18 Thread Hoang Huu Le
From: Hoang Le 

(struct tipc_link_info)->dest is in network order (__be32), so we must
convert the value to network order before assigning. The problem detected
by sparse:

net/tipc/netlink_compat.c:699:24: warning: incorrect type in assignment 
(different base types)
net/tipc/netlink_compat.c:699:24:expected restricted __be32 [usertype] dest
net/tipc/netlink_compat.c:699:24:got int

Acked-by: Jon Maloy 
Signed-off-by: Hoang Le 
---
 net/tipc/netlink_compat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index 5a1ce64039f7..0749df80454d 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -696,7 +696,7 @@ static int tipc_nl_compat_link_dump(struct 
tipc_nl_compat_msg *msg,
if (err)
return err;
 
-   link_info.dest = nla_get_flag(link[TIPC_NLA_LINK_DEST]);
+   link_info.dest = htonl(nla_get_flag(link[TIPC_NLA_LINK_DEST]));
link_info.up = htonl(nla_get_flag(link[TIPC_NLA_LINK_UP]));
nla_strscpy(link_info.str, link[TIPC_NLA_LINK_NAME],
TIPC_MAX_LINK_NAME);
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next v2 2/2] tipc: clean up warnings detected by sparse

2021-01-18 Thread Hoang Huu Le
This patch fixes the following warning from sparse:

net/tipc/monitor.c:263:35: warning: incorrect type in assignment (different 
base types)
net/tipc/monitor.c:263:35:expected unsigned int
net/tipc/monitor.c:263:35:got restricted __be32 [usertype]
[...]
net/tipc/node.c:374:13: warning: context imbalance in 'tipc_node_read_lock' - 
wrong count at exit
net/tipc/node.c:379:13: warning: context imbalance in 'tipc_node_read_unlock' - 
unexpected unlock
net/tipc/node.c:384:13: warning: context imbalance in 'tipc_node_write_lock' - 
wrong count at exit
net/tipc/node.c:389:13: warning: context imbalance in 
'tipc_node_write_unlock_fast' - unexpected unlock
net/tipc/node.c:404:17: warning: context imbalance in 'tipc_node_write_unlock' 
- unexpected unlock
[...]
net/tipc/crypto.c:1201:9: warning: incorrect type in initializer (different 
address spaces)
net/tipc/crypto.c:1201:9:expected struct tipc_aead [noderef] __rcu *__tmp
net/tipc/crypto.c:1201:9:got struct tipc_aead *
[...]

Signed-off-by: Hoang Huu Le 
---
 net/tipc/crypto.c  | 12 -
 net/tipc/monitor.c | 63 ++
 net/tipc/node.c|  5 
 3 files changed, 58 insertions(+), 22 deletions(-)

diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
index f4fca8f7f63f..6f64acef73dc 100644
--- a/net/tipc/crypto.c
+++ b/net/tipc/crypto.c
@@ -317,7 +317,7 @@ static int tipc_aead_key_generate(struct tipc_aead_key 
*skey);
 
 #define tipc_aead_rcu_replace(rcu_ptr, ptr, lock)  \
 do {   \
-   typeof(rcu_ptr) __tmp = rcu_dereference_protected((rcu_ptr),\
+   struct tipc_aead *__tmp = rcu_dereference_protected((rcu_ptr),  \
lockdep_is_held(lock)); \
rcu_assign_pointer((rcu_ptr), (ptr));   \
tipc_aead_put(__tmp);   \
@@ -798,7 +798,7 @@ static int tipc_aead_encrypt(struct tipc_aead *aead, struct 
sk_buff *skb,
ehdr = (struct tipc_ehdr *)skb->data;
salt = aead->salt;
if (aead->mode == CLUSTER_KEY)
-   salt ^= ehdr->addr; /* __be32 */
+   salt ^= __be32_to_cpu(ehdr->addr);
else if (__dnode)
salt ^= tipc_node_get_addr(__dnode);
memcpy(iv, &salt, 4);
@@ -929,7 +929,7 @@ static int tipc_aead_decrypt(struct net *net, struct 
tipc_aead *aead,
ehdr = (struct tipc_ehdr *)skb->data;
salt = aead->salt;
if (aead->mode == CLUSTER_KEY)
-   salt ^= ehdr->addr; /* __be32 */
+   salt ^= __be32_to_cpu(ehdr->addr);
else if (ehdr->destined)
salt ^= tipc_own_addr(net);
memcpy(iv, &salt, 4);
@@ -1946,16 +1946,16 @@ static void tipc_crypto_rcv_complete(struct net *net, 
struct tipc_aead *aead,
goto rcv;
}
tipc_aead_put(aead);
-   aead = tipc_aead_get(tmp);
+   aead = tipc_aead_get((struct tipc_aead __force __rcu *)tmp);
}
 
if (unlikely(err)) {
-   tipc_aead_users_dec(aead, INT_MIN);
+   tipc_aead_users_dec((struct tipc_aead __force __rcu *)aead, 
INT_MIN);
goto free_skb;
}
 
/* Set the RX key's user */
-   tipc_aead_users_set(aead, 1);
+   tipc_aead_users_set((struct tipc_aead __force __rcu *)aead, 1);
 
/* Mark this point, RX works */
rx->timer1 = jiffies;
diff --git a/net/tipc/monitor.c b/net/tipc/monitor.c
index 48fac3b17e40..407619697292 100644
--- a/net/tipc/monitor.c
+++ b/net/tipc/monitor.c
@@ -104,6 +104,36 @@ static struct tipc_monitor *tipc_monitor(struct net *net, 
int bearer_id)
 
 const int tipc_max_domain_size = sizeof(struct tipc_mon_domain);
 
+static inline u16 mon_cpu_to_le16(u16 val)
+{
+   return (__force __u16)htons(val);
+}
+
+static inline u32 mon_cpu_to_le32(u32 val)
+{
+   return (__force __u32)htonl(val);
+}
+
+static inline u64 mon_cpu_to_le64(u64 val)
+{
+   return (__force __u64)cpu_to_be64(val);
+}
+
+static inline u16 mon_le16_to_cpu(u16 val)
+{
+   return ntohs((__force __be16)val);
+}
+
+static inline u32 mon_le32_to_cpu(u32 val)
+{
+   return ntohl((__force __be32)val);
+}
+
+static inline u64 mon_le64_to_cpu(u64 val)
+{
+   return be64_to_cpu((__force __be64)val);
+}
+
 /* dom_rec_len(): actual length of domain record for transport
  */
 static int dom_rec_len(struct tipc_mon_domain *dom, u16 mcnt)
@@ -260,16 +290,16 @@ static void mon_update_local_domain(struct tipc_monitor 
*mon)
diff |= dom->members[i] != peer->addr;
dom->members[i] = peer->addr;
map_set(>up_map, i, peer->is_up);
-   cache->members[i] = htonl(peer->addr);
+   cache->members[i] = mon_cpu_to_le32(peer->addr);
 

[tipc-discussion] [net-next] tipc: clean up warnings detected by sparse

2021-01-14 Thread Hoang Huu Le
This patch fixes the following warning from sparse:
---
net/tipc/monitor.c:263:35: sparse: warning: incorrect type in assignment..
net/tipc/monitor.c:263:35: sparse:expected unsigned int
net/tipc/monitor.c:263:35: sparse:got restricted __be32 [usertype]
[...]
net/tipc/monitor.c:522:35: sparse: warning: cast to restricted __be32
[...]
---
Signed-off-by: Hoang Huu Le 
---
 net/tipc/crypto.c | 12 ++--
 net/tipc/monitor.c| 32 
 net/tipc/netlink_compat.c |  2 +-
 net/tipc/node.c   |  5 +
 4 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
index f4fca8f7f63f..096e1c903e3a 100644
--- a/net/tipc/crypto.c
+++ b/net/tipc/crypto.c
@@ -317,7 +317,7 @@ static int tipc_aead_key_generate(struct tipc_aead_key 
*skey);
 
 #define tipc_aead_rcu_replace(rcu_ptr, ptr, lock)  \
 do {   \
-   typeof(rcu_ptr) __tmp = rcu_dereference_protected((rcu_ptr),\
+   struct tipc_aead *__tmp = rcu_dereference_protected((rcu_ptr),  \
lockdep_is_held(lock)); \
rcu_assign_pointer((rcu_ptr), (ptr));   \
tipc_aead_put(__tmp);   \
@@ -798,7 +798,7 @@ static int tipc_aead_encrypt(struct tipc_aead *aead, struct 
sk_buff *skb,
ehdr = (struct tipc_ehdr *)skb->data;
salt = aead->salt;
if (aead->mode == CLUSTER_KEY)
-   salt ^= ehdr->addr; /* __be32 */
+   salt ^= __be32_to_cpu(ehdr->addr); /* __be32 */
else if (__dnode)
salt ^= tipc_node_get_addr(__dnode);
memcpy(iv, , 4);
@@ -929,7 +929,7 @@ static int tipc_aead_decrypt(struct net *net, struct 
tipc_aead *aead,
ehdr = (struct tipc_ehdr *)skb->data;
salt = aead->salt;
if (aead->mode == CLUSTER_KEY)
-   salt ^= ehdr->addr; /* __be32 */
+   salt ^= __be32_to_cpu(ehdr->addr); /* __be32 */
else if (ehdr->destined)
salt ^= tipc_own_addr(net);
memcpy(iv, , 4);
@@ -1946,16 +1946,16 @@ static void tipc_crypto_rcv_complete(struct net *net, 
struct tipc_aead *aead,
goto rcv;
}
tipc_aead_put(aead);
-   aead = tipc_aead_get(tmp);
+   aead = tipc_aead_get((struct tipc_aead __force __rcu *)tmp);
}
 
if (unlikely(err)) {
-   tipc_aead_users_dec(aead, INT_MIN);
+   tipc_aead_users_dec((struct tipc_aead __force __rcu *)aead, 
INT_MIN);
goto free_skb;
}
 
/* Set the RX key's user */
-   tipc_aead_users_set(aead, 1);
+   tipc_aead_users_set((struct tipc_aead __force __rcu *)aead, 1);
 
/* Mark this point, RX works */
rx->timer1 = jiffies;
diff --git a/net/tipc/monitor.c b/net/tipc/monitor.c
index 48fac3b17e40..c9e753a1c98e 100644
--- a/net/tipc/monitor.c
+++ b/net/tipc/monitor.c
@@ -260,16 +260,16 @@ static void mon_update_local_domain(struct tipc_monitor 
*mon)
diff |= dom->members[i] != peer->addr;
dom->members[i] = peer->addr;
map_set(>up_map, i, peer->is_up);
-   cache->members[i] = htonl(peer->addr);
+   cache->members[i] = (__force __u32)htonl(peer->addr);
}
diff |= dom->up_map != prev_up_map;
if (!diff)
return;
dom->gen = ++mon->dom_gen;
-   cache->len = htons(dom->len);
-   cache->gen = htons(dom->gen);
-   cache->member_cnt = htons(member_cnt);
-   cache->up_map = cpu_to_be64(dom->up_map);
+   cache->len = (__force __u32)htons(dom->len);
+   cache->gen = (__force __u16)htons(dom->gen);
+   cache->member_cnt = (__force __u32)htons(member_cnt);
+   cache->up_map = (__force __u64)cpu_to_be64(dom->up_map);
mon_apply_domain(mon, self);
 }
 
@@ -455,10 +455,10 @@ void tipc_mon_rcv(struct net *net, void *data, u16 dlen, 
u32 addr,
struct tipc_mon_domain dom_bef;
struct tipc_mon_domain *dom;
struct tipc_peer *peer;
-   u16 new_member_cnt = ntohs(arrv_dom->member_cnt);
+   u16 new_member_cnt = ntohs((__force __be16)arrv_dom->member_cnt);
int new_dlen = dom_rec_len(arrv_dom, new_member_cnt);
-   u16 new_gen = ntohs(arrv_dom->gen);
-   u16 acked_gen = ntohs(arrv_dom->ack_gen);
+   u16 new_gen = ntohs((__force __be16)arrv_dom->gen);
+   u16 acked_gen = ntohs((__force __be16)arrv_dom->ack_gen);
bool probing = state->probing;
int i, applied_bef;
 
@@ -469,7 +469,7 @@ void tipc_mon_rcv(struct net *net, void *data, u16 dlen, 
u32 addr,
return;
if

[tipc-discussion] [net] tipc: fix NULL deref in tipc_link_xmit()

2021-01-05 Thread Hoang Huu Le
From: Hoang Le 

The buffer list can have zero skbs in the following path:
tipc_named_node_up()->tipc_node_xmit()->tipc_link_xmit(), so
we need to check the list before peeking at the first sk_buff.

Fault report:
 [] tipc: Bulk publication failure
 [] general protection fault, probably for non-canonical [#1] PREEMPT [...]
 [] KASAN: null-ptr-deref in range [0x00c8-0x00cf]
 [] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 5.10.0-rc4+ #2
 [] Hardware name: Bochs ..., BIOS Bochs 01/01/2011
 [] RIP: 0010:tipc_link_xmit+0xc1/0x2180
 [] Code: 24 b8 00 00 00 00 4d 39 ec 4c 0f 44 e8 e8 d7 0a 10 f9 48 [...]
 [] RSP: 0018:c9006ea0 EFLAGS: 00010202
 [] RAX: dc00 RBX: 8880224da000 RCX: 111003d3cc0d
 [] RDX: 0019 RSI: 886007b9 RDI: 00c8
 [] RBP: c9007018 R08: 0001 R09: f5200ded
 [] R10: 0003 R11: f5200dec R12: c9007148
 [] R13:  R14:  R15: c9007018
 [] FS:  () GS:88803740() knlGS:000[...]
 [] CS:  0010 DS:  ES:  CR0: 80050033
 [] CR2: 7fffd2db5000 CR3: 2b08f000 CR4: 06f0

Signed-off-by: Hoang Le 
---
 net/tipc/link.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 6ae2140eb4f7..a6a694b78927 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1030,7 +1030,6 @@ void tipc_link_reset(struct tipc_link *l)
 int tipc_link_xmit(struct tipc_link *l, struct sk_buff_head *list,
   struct sk_buff_head *xmitq)
 {
-   struct tipc_msg *hdr = buf_msg(skb_peek(list));
struct sk_buff_head *backlogq = >backlogq;
struct sk_buff_head *transmq = >transmq;
struct sk_buff *skb, *_skb;
@@ -1038,13 +1037,18 @@ int tipc_link_xmit(struct tipc_link *l, struct 
sk_buff_head *list,
u16 ack = l->rcv_nxt - 1;
u16 seqno = l->snd_nxt;
int pkt_cnt = skb_queue_len(list);
-   int imp = msg_importance(hdr);
unsigned int mss = tipc_link_mss(l);
unsigned int cwin = l->window;
unsigned int mtu = l->mtu;
+   struct tipc_msg *hdr;
bool new_bundle;
int rc = 0;
+   int imp;
+
+   if (pkt_cnt <= 0)
+   return 0;
 
+   hdr = buf_msg(skb_peek(list));
if (unlikely(msg_size(hdr) > mtu)) {
pr_warn("Too large msg, purging xmit list %d %d %d %d %d!\n",
skb_queue_len(list), msg_user(hdr),
@@ -1053,6 +1057,7 @@ int tipc_link_xmit(struct tipc_link *l, struct 
sk_buff_head *list,
return -EMSGSIZE;
}
 
+   imp = msg_importance(hdr);
/* Allow oversubscription of one data msg per source at congestion */
if (unlikely(l->backlog[imp].len >= l->backlog[imp].limit)) {
if (imp == TIPC_SYSTEM_IMPORTANCE) {
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next] tipc: do sanity check payload of a netlink message

2020-12-11 Thread Hoang Huu Le
From: Hoang Le 

We initialize nlmsghdr without any payload in tipc_nl_compat_dumpit(),
so parsing the attributes always fails and returns an '-EINVAL' error.

To avoid returning this bogus error when parsing the attributes of a
netlink message, we first do a sanity check on the length of the
message payload.

Signed-off-by: Hoang Le 
---
 net/tipc/netlink_compat.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index 82f154989418..5a1ce64039f7 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -213,12 +213,14 @@ static int __tipc_nl_compat_dumpit(struct 
tipc_nl_compat_cmd_dump *cmd,
}
 
info.attrs = attrbuf;
-   err = nlmsg_parse_deprecated(cb.nlh, GENL_HDRLEN, attrbuf,
-tipc_genl_family.maxattr,
-tipc_genl_family.policy, NULL);
-   if (err)
-   goto err_out;
 
+   if (nlmsg_len(cb.nlh) > 0) {
+   err = nlmsg_parse_deprecated(cb.nlh, GENL_HDRLEN, attrbuf,
+tipc_genl_family.maxattr,
+tipc_genl_family.policy, NULL);
+   if (err)
+   goto err_out;
+   }
do {
int rem;
 
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] BUG: rwlock bad magic on CPU, kworker/0:LINE/NUM, ADDR

2020-12-09 Thread Hoang Huu Le
Hi Jon,

See my inline comment …

Regards,
Hoang
From: Jon Maloy 
Sent: Wednesday, December 9, 2020 5:22 PM
To: Hoang Huu Le 
Cc: tipc-discussion@lists.sourceforge.net; tipc-dek 
Subject: Fwd: BUG: rwlock bad magic on CPU, kworker/0:LINE/NUM, ADDR

Hi Hoang,
This was the one I had in mind. To me it looks like we still have a problem.

///jon


 Forwarded Message 
Subject:
Re: BUG: rwlock bad magic on CPU, kworker/0:LINE/NUM, ADDR
Date:
Mon, 30 Nov 2020 12:35:30 +0100
From:
Dmitry Vyukov <mailto:dvyu...@google.com>
To:
syzbot 
<mailto:syzbot+cb987a9c796abc570...@syzkaller.appspotmail.com>
CC:
David Miller <mailto:da...@davemloft.net>, 
jma...@redhat.com<mailto:jma...@redhat.com>, Jakub Kicinski 
<mailto:k...@kernel.org>, LKML 
<mailto:linux-ker...@vger.kernel.org>, netdev 
<mailto:net...@vger.kernel.org>, syzkaller-bugs 
<mailto:syzkaller-b...@googlegroups.com>, 
tipc-discussion@lists.sourceforge.net<mailto:tipc-discussion@lists.sourceforge.net>,
 Ying Xue <mailto:ying@windriver.com>


On Mon, Nov 30, 2020 at 12:33 PM syzbot
<mailto:syzbot+cb987a9c796abc570...@syzkaller.appspotmail.com>
 wrote:




Hello,



syzbot found the following issue on:



HEAD commit:90cf87d1 enetc: Let the hardware auto-advance the taprio b..

git tree:   net

console output: https://syzkaller.appspot.com/x/log.txt?x=135479b350

kernel config:  https://syzkaller.appspot.com/x/.config?x=5720c06118e6c4cc

dashboard link: https://syzkaller.appspot.com/bug?extid=cb987a9c796abc570b47

compiler:   gcc (GCC) 10.1.0-syz 20200507



Unfortunately, I don't have any reproducer for this issue yet.



IMPORTANT: if you fix the issue, please add the following tag to the commit:

Reported-by: 
syzbot+cb987a9c796abc570...@syzkaller.appspotmail.com<mailto:syzbot+cb987a9c796abc570...@syzkaller.appspotmail.com>



tipc: 32-bit node address hash set to aa1414ac

BUG: rwlock bad magic on CPU#0, kworker/0:18/18158, 859f2a8d

CPU: 0 PID: 18158 Comm: kworker/0:18 Not tainted 5.10.0-rc4-syzkaller #0

Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011

Workqueue: events tipc_net_finalize_work

Call Trace:

 __dump_stack lib/dump_stack.c:77 [inline]

 dump_stack+0x107/0x163 lib/dump_stack.c:118

 rwlock_bug kernel/locking/spinlock_debug.c:144 [inline]

 debug_write_lock_before kernel/locking/spinlock_debug.c:182 [inline]

 do_raw_write_lock+0x1ef/0x280 kernel/locking/spinlock_debug.c:206

 tipc_mon_reinit_self+0x1f7/0x630 net/tipc/monitor.c:685

[Hoang] This is new to me. I will take a look.


There was also "general protection fault in tipc_mon_reinit_self":
https://syzkaller.appspot.com/bug?id=dc141b9a05cb48d3d9b46837bc2fdc9e7d95dbe9
which also happened once. Smells like an intricate race condition.

[Hoang] I guess the race condition is already fixed in v5.10.


 tipc_net_finalize net/tipc/net.c:134 [inline]

 tipc_net_finalize+0x1df/0x310 net/tipc/net.c:125

 process_one_work+0x933/0x15a0 kernel/workqueue.c:2272

 worker_thread+0x64c/0x1120 kernel/workqueue.c:2418

 kthread+0x3af/0x4a0 kernel/kthread.c:292

 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296





---

This report is generated by a bot. It may contain errors.

See https://goo.gl/tpsmEJ for more information about syzbot.

syzbot engineers can be reached at 
syzkal...@googlegroups.com<mailto:syzkal...@googlegroups.com>.



syzbot will keep track of this issue. See:

https://goo.gl/tpsmEJ#status for how to communicate with syzbot.



--

You received this message because you are subscribed to the Google Groups 
"syzkaller-bugs" group.

To unsubscribe from this group and stop receiving emails from it, send an email 
to 
syzkaller-bugs+unsubscr...@googlegroups.com<mailto:syzkaller-bugs+unsubscr...@googlegroups.com>.

To view this discussion on the web visit 
https://groups.google.com/d/msgid/syzkaller-bugs/4e5bdb05b5516009%40google.com.


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next] tipc: support 128bit node identity for peer removing

2020-12-02 Thread Hoang Huu Le
From: Hoang Le 

We add support for removing a specific node by its 128-bit
node identifier, as an alternative to the legacy 32-bit node address.

example:
$tipc peer remove identity <1001002|1677>

Acked-by: Jon Maloy 
Signed-off-by: Hoang Le 
---
 net/tipc/node.c | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/net/tipc/node.c b/net/tipc/node.c
index 032d8fc09894..81d69779016c 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -2220,6 +2220,9 @@ int tipc_nl_peer_rm(struct sk_buff *skb, struct genl_info 
*info)
struct tipc_net *tn = net_generic(net, tipc_net_id);
struct nlattr *attrs[TIPC_NLA_NET_MAX + 1];
struct tipc_node *peer, *temp_node;
+   u8 node_id[NODE_ID_LEN];
+   u64 *w0 = (u64 *)_id[0];
+   u64 *w1 = (u64 *)_id[8];
u32 addr;
int err;
 
@@ -2233,10 +2236,22 @@ int tipc_nl_peer_rm(struct sk_buff *skb, struct 
genl_info *info)
if (err)
return err;
 
-   if (!attrs[TIPC_NLA_NET_ADDR])
-   return -EINVAL;
+   /* attrs[TIPC_NLA_NET_NODEID] and attrs[TIPC_NLA_NET_ADDR] are
+* mutually exclusive cases
+*/
+   if (attrs[TIPC_NLA_NET_ADDR]) {
+   addr = nla_get_u32(attrs[TIPC_NLA_NET_ADDR]);
+   if (!addr)
+   return -EINVAL;
+   }
 
-   addr = nla_get_u32(attrs[TIPC_NLA_NET_ADDR]);
+   if (attrs[TIPC_NLA_NET_NODEID]) {
+   if (!attrs[TIPC_NLA_NET_NODEID_W1])
+   return -EINVAL;
+   *w0 = nla_get_u64(attrs[TIPC_NLA_NET_NODEID]);
+   *w1 = nla_get_u64(attrs[TIPC_NLA_NET_NODEID_W1]);
+   addr = hash128to32(node_id);
+   }
 
if (in_own_node(net, addr))
return -ENOTSUPP;
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next] tipc: support 128bit node identity for peer removing

2020-12-01 Thread Hoang Huu Le
From: Hoang Le 

We add support for removing a specific node by its 128-bit
node identifier, as an alternative to the legacy 32-bit node address.

example:
$tipc peer remove identity <1001002|1677>

Signed-off-by: Hoang Le 
---
 net/tipc/node.c | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/net/tipc/node.c b/net/tipc/node.c
index cd67b7d5169f..a7479f68a146 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -2195,6 +2195,9 @@ int tipc_nl_peer_rm(struct sk_buff *skb, struct genl_info 
*info)
struct tipc_net *tn = net_generic(net, tipc_net_id);
struct nlattr *attrs[TIPC_NLA_NET_MAX + 1];
struct tipc_node *peer, *temp_node;
+   u8 node_id[NODE_ID_LEN];
+   u64 *w0 = (u64 *)_id[0];
+   u64 *w1 = (u64 *)_id[8];
u32 addr;
int err;
 
@@ -2208,10 +2211,22 @@ int tipc_nl_peer_rm(struct sk_buff *skb, struct 
genl_info *info)
if (err)
return err;
 
-   if (!attrs[TIPC_NLA_NET_ADDR])
-   return -EINVAL;
+   /* attrs[TIPC_NLA_NET_NODEID] and attrs[TIPC_NLA_NET_ADDR] are
+* mutually exclusive case
+*/
+   if (attrs[TIPC_NLA_NET_ADDR]) {
+   addr = nla_get_u32(attrs[TIPC_NLA_NET_ADDR]);
+   if (!addr)
+   return -EINVAL;
+   }
 
-   addr = nla_get_u32(attrs[TIPC_NLA_NET_ADDR]);
+   if (attrs[TIPC_NLA_NET_NODEID]) {
+   if (!attrs[TIPC_NLA_NET_NODEID_W1])
+   return -EINVAL;
+   *w0 = nla_get_u64(attrs[TIPC_NLA_NET_NODEID]);
+   *w1 = nla_get_u64(attrs[TIPC_NLA_NET_NODEID_W1]);
+   addr = hash128to32(node_id);
+   }
 
if (in_own_node(net, addr))
return -ENOTSUPP;
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net] tipc: fix incompatible mtu of transmission

2020-11-26 Thread Hoang Huu Le
From: Hoang Le 

In commit 682cd3cf946b6
("tipc: confgiure and apply UDP bearer MTU on running links"), we
introduced a function to change the UDP bearer MTU and applied the new
value to existing links. However, we did not apply the new MTU value at
node level. This leads to packets being dropped at link level if their
size is greater than the new MTU value.

To fix this issue, we also apply the new MTU value at node level.

Fixes: 682cd3cf946b6 ("tipc: confgiure and apply UDP bearer MTU on running 
links")
Signed-off-by: Hoang Le 
---
 net/tipc/node.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/tipc/node.c b/net/tipc/node.c
index cd67b7d5169f..9f6975dd7873 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -2182,6 +2182,8 @@ void tipc_node_apply_property(struct net *net, struct 
tipc_bearer *b,
else if (prop == TIPC_NLA_PROP_MTU)
tipc_link_set_mtu(e->link, b->mtu);
}
+   /* Update MTU for node link entry */
+   e->mtu = tipc_link_mss(e->link);
tipc_node_write_unlock(n);
tipc_bearer_xmit(net, bearer_id, , >maddr, NULL);
}
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [net] tipc: re-configure queue limit for broadcast link

2020-10-14 Thread Hoang Huu Le
Thanks for your review.
Yes, in this commit we intend to fix the queue limit calculation, and we're
planning to address both issues in another fix. However, the default value
(i.e. BCLINK_WIN_DEFAULT) should be used here, since we keep a fixed window
size for the broadcast link.

Regards,
Hoang
> -Original Message-
> From: Jakub Kicinski 
> Sent: Thursday, October 15, 2020 7:47 AM
> To: Hoang Huu Le 
> Cc: tipc-discussion@lists.sourceforge.net; jma...@redhat.com; 
> ma...@donjonn.com; ying@windriver.com;
> net...@vger.kernel.org
> Subject: Re: [net] tipc: re-configure queue limit for broadcast link
> 
> On Tue, 13 Oct 2020 13:18:10 +0700 Hoang Huu Le wrote:
> > The queue limit of the broadcast link is being calculated base on initial
> > MTU. However, when MTU value changed (e.g manual changing MTU on NIC
> > device, MTU negotiation etc.,) we do not re-calculate queue limit.
> > This gives throughput does not reflect with the change.
> >
> > So fix it by calling the function to re-calculate queue limit of the
> > broadcast link.
> >
> > Acked-by: Jon Maloy 
> > Signed-off-by: Hoang Huu Le 
> > ---
> >  net/tipc/bcast.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> > index 940d176e0e87..c77fd13e2777 100644
> > --- a/net/tipc/bcast.c
> > +++ b/net/tipc/bcast.c
> > @@ -108,6 +108,7 @@ static void tipc_bcbase_select_primary(struct net *net)
> >  {
> > struct tipc_bc_base *bb = tipc_bc_base(net);
> > int all_dests =  tipc_link_bc_peers(bb->link);
> > +   int max_win = tipc_link_max_win(bb->link);
> > int i, mtu, prim;
> >
> > bb->primary_bearer = INVALID_BEARER_ID;
> > @@ -121,8 +122,11 @@ static void tipc_bcbase_select_primary(struct net *net)
> > continue;
> >
> > mtu = tipc_bearer_mtu(net, i);
> > -   if (mtu < tipc_link_mtu(bb->link))
> > +   if (mtu < tipc_link_mtu(bb->link)) {
> > tipc_link_set_mtu(bb->link, mtu);
> > +   tipc_link_set_queue_limits(bb->link, max_win,
> > +  max_win);
> 
> Is max/max okay here? Other places seem to use BCLINK_WIN_MIN.
> 
> > +   }
> > bb->bcast_support &= tipc_bearer_bcast_support(net, i);
> > if (bb->dests[i] < all_dests)
> > continue;



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [next v2] tipc: fix incorrect setting window for bcast link

2020-10-14 Thread Hoang Huu Le
In commit 16ad3f4022bb
("tipc: introduce variable window congestion control"), we applied
an algorithm to select the window size, from the minimum window up to
the configured maximum window, for unicast links, while we chose to
keep the window size for the broadcast link unchanged and fixed (i.e.
a fixed window of 50).

However, when the maximum window is set via command, the window
variable is re-initialized to an unexpected value (i.e. 32).

We fix this by keeping the fixed window for broadcast as stated above.

Fixes: 16ad3f4022bb ("tipc: introduce variable window congestion control")
Signed-off-by: Hoang Huu Le 
---
 net/tipc/bcast.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index c77fd13e2777..bc88f21ec0b2 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -589,7 +589,7 @@ static int tipc_bc_link_set_queue_limits(struct net *net, 
u32 max_win)
if (max_win > TIPC_MAX_LINK_WIN)
return -EINVAL;
tipc_bcast_lock(net);
-   tipc_link_set_queue_limits(l, BCLINK_WIN_MIN, max_win);
+   tipc_link_set_queue_limits(l, tipc_link_min_win(l), max_win);
tipc_bcast_unlock(net);
return 0;
 }
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net] tipc: re-configure queue limit for broadcast link

2020-10-13 Thread Hoang Huu Le
The queue limit of the broadcast link is calculated based on the initial
MTU. However, when the MTU value changes (e.g. a manual MTU change on the
NIC device, MTU negotiation, etc.) we do not re-calculate the queue limit.
As a result, throughput does not reflect the change.

So fix it by calling the function to re-calculate the queue limit of the
broadcast link.

Acked-by: Jon Maloy 
Signed-off-by: Hoang Huu Le 
---
 net/tipc/bcast.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 940d176e0e87..c77fd13e2777 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -108,6 +108,7 @@ static void tipc_bcbase_select_primary(struct net *net)
 {
struct tipc_bc_base *bb = tipc_bc_base(net);
int all_dests =  tipc_link_bc_peers(bb->link);
+   int max_win = tipc_link_max_win(bb->link);
int i, mtu, prim;
 
bb->primary_bearer = INVALID_BEARER_ID;
@@ -121,8 +122,11 @@ static void tipc_bcbase_select_primary(struct net *net)
continue;
 
mtu = tipc_bearer_mtu(net, i);
-   if (mtu < tipc_link_mtu(bb->link))
+   if (mtu < tipc_link_mtu(bb->link)) {
tipc_link_set_mtu(bb->link, mtu);
+   tipc_link_set_queue_limits(bb->link, max_win,
+  max_win);
+   }
bb->bcast_support &= tipc_bearer_bcast_support(net, i);
if (bb->dests[i] < all_dests)
continue;
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [net] tipc: fix incorrect setting window for bcast link

2020-10-12 Thread Hoang Huu Le
> -Original Message-
> From: Jon Maloy 
> Sent: Tuesday, October 13, 2020 6:39 AM
> To: Hoang Huu Le ; 
> tipc-discussion@lists.sourceforge.net; ma...@donjonn.com;
> ying@windriver.com
> Subject: Re: [net] tipc: fix incorrect setting window for bcast link
> 
> Hi Hoang,
> I apologize for not paying enough attention to this problem until now.
> See below.
> 
> 
> On 9/30/20 9:43 PM, Hoang Huu Le wrote:
> > In commit 16ad3f4022bb
> > ("tipc: introduce variable window congestion control"), we applied
> > the Reno algorithm to select window size from minimum window to the
> > configured maximum window for unicast link.
> >
> > However, when setting window variable we do not specific to unicast link.
> > This lead to the window for broadcast link had effect as unexpect value
> > (i.e the window size is constantly 32).
> >
> > We fix this by updating the window for broadcast as its configuration.
> >
> > Signed-off-by: Hoang Huu Le 
> > ---
> >   net/tipc/bcast.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> > index 940d176e0e87..abac9443b4d9 100644
> > --- a/net/tipc/bcast.c
> > +++ b/net/tipc/bcast.c
> > @@ -585,7 +585,7 @@ static int tipc_bc_link_set_queue_limits(struct net 
> > *net, u32 max_win)
> > if (max_win > TIPC_MAX_LINK_WIN)
> > return -EINVAL;
> > tipc_bcast_lock(net);
> > -   tipc_link_set_queue_limits(l, BCLINK_WIN_MIN, max_win);
> > +   tipc_link_set_queue_limits(l, max_win, max_win);
> I think this is dangerous. The broadcast link puts a much higher stress
> on the switch, and the risk of massive packet loss with ditto
> retransmissions is very high.
> A safer bet to stay with a fix window of 50 for now, to solve the
> problem you have at sites, and then you can possibly experiment with a
> variable window later.
So, should we return 'not supported' when the user changes the window for the
broadcast link, or just silently ignore the setting?

> If it gives significant higher throughput it might be worthwhile trying,
> but I am pretty sure that e.g. the base for calculating ssthresh (300)
> is too big.
> 
> ///jon
> > tipc_bcast_unlock(net);
> > return 0;
> >   }


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next] tipc: remove dead code in tipc_net and relatives

2020-10-08 Thread Hoang Huu Le
dist_queue is no longer used since commit 37922ea4a310
("tipc: permit overlapping service ranges in name table")

Signed-off-by: Hoang Huu Le 
---
 net/tipc/core.c   |  2 --
 net/tipc/core.h   |  3 ---
 net/tipc/name_distr.c | 19 ---
 3 files changed, 24 deletions(-)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index c2ff42900b53..5cc1f0307215 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -81,8 +81,6 @@ static int __net_init tipc_init_net(struct net *net)
if (err)
goto out_nametbl;
 
-   INIT_LIST_HEAD(>dist_queue);
-
err = tipc_bcast_init(net);
if (err)
goto out_bclink;
diff --git a/net/tipc/core.h b/net/tipc/core.h
index 1d57a4d3b05e..df34dcdd0607 100644
--- a/net/tipc/core.h
+++ b/net/tipc/core.h
@@ -132,9 +132,6 @@ struct tipc_net {
spinlock_t nametbl_lock;
struct name_table *nametbl;
 
-   /* Name dist queue */
-   struct list_head dist_queue;
-
/* Topology subscription server */
struct tipc_topsrv *topsrv;
atomic_t subscription_count;
diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
index 2f9c148f17e2..4d50798fe36c 100644
--- a/net/tipc/name_distr.c
+++ b/net/tipc/name_distr.c
@@ -244,24 +244,6 @@ static void tipc_publ_purge(struct net *net, struct 
publication *publ, u32 addr)
kfree_rcu(p, rcu);
 }
 
-/**
- * tipc_dist_queue_purge - remove deferred updates from a node that went down
- */
-static void tipc_dist_queue_purge(struct net *net, u32 addr)
-{
-   struct tipc_net *tn = net_generic(net, tipc_net_id);
-   struct distr_queue_item *e, *tmp;
-
-   spin_lock_bh(>nametbl_lock);
-   list_for_each_entry_safe(e, tmp, >dist_queue, next) {
-   if (e->node != addr)
-   continue;
-   list_del(>next);
-   kfree(e);
-   }
-   spin_unlock_bh(>nametbl_lock);
-}
-
 void tipc_publ_notify(struct net *net, struct list_head *nsub_list,
  u32 addr, u16 capabilities)
 {
@@ -272,7 +254,6 @@ void tipc_publ_notify(struct net *net, struct list_head 
*nsub_list,
 
list_for_each_entry_safe(publ, tmp, nsub_list, binding_node)
tipc_publ_purge(net, publ, addr);
-   tipc_dist_queue_purge(net, addr);
spin_lock_bh(>nametbl_lock);
if (!(capabilities & TIPC_NAMED_BCAST))
nt->rc_dests--;
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [net] tipc: fix NULL pointer dereference in tipc_named_rcv

2020-10-08 Thread Hoang Huu Le
Hi Jon,  Jakub,

I tried it as you suggested, but it looks like we run into circular locking,
and a deadlock could happen like this:
CPU0CPU1

   lock(>lock#2);
lock(>nametbl_lock);
lock(>lock#2);
   lock(>nametbl_lock);

  *** DEADLOCK ***
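
For reference, the inversion above is the classic AB-BA pattern. A standalone
userspace illustration (generic pthreads code, not the actual TIPC paths):

/* Two threads take the same two locks in opposite order; run this and the
 * threads usually wedge against each other almost immediately.
 * build: cc -pthread abba.c
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER; /* think nametbl_lock */
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER; /* think list lock */

static void *take_a_then_b(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock_a);
	usleep(1000);			/* widen the race window */
	pthread_mutex_lock(&lock_b);
	pthread_mutex_unlock(&lock_b);
	pthread_mutex_unlock(&lock_a);
	return NULL;
}

static void *take_b_then_a(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock_b);
	usleep(1000);
	pthread_mutex_lock(&lock_a);
	pthread_mutex_unlock(&lock_a);
	pthread_mutex_unlock(&lock_b);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, take_a_then_b, NULL);
	pthread_create(&t2, NULL, take_b_then_a, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	puts("no deadlock on this run");
	return 0;
}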

Regards,
Hoang
> -Original Message-
> From: Jon Maloy 
> Sent: Friday, October 9, 2020 1:01 AM
> To: Jakub Kicinski ; Hoang Huu Le 
> Cc: ma...@donjonn.com; ying@windriver.com; 
> tipc-discussion@lists.sourceforge.net; net...@vger.kernel.org
> Subject: Re: [net] tipc: fix NULL pointer dereference in tipc_named_rcv
> 
> 
> 
> On 10/8/20 1:25 PM, Jakub Kicinski wrote:
> > On Thu,  8 Oct 2020 14:31:56 +0700 Hoang Huu Le wrote:
> >> diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
> >> index 2f9c148f17e2..fe4edce459ad 100644
> >> --- a/net/tipc/name_distr.c
> >> +++ b/net/tipc/name_distr.c
> >> @@ -327,8 +327,13 @@ static struct sk_buff *tipc_named_dequeue(struct 
> >> sk_buff_head *namedq,
> >>struct tipc_msg *hdr;
> >>u16 seqno;
> >>
> >> +  spin_lock_bh(>lock);
> >>skb_queue_walk_safe(namedq, skb, tmp) {
> >> -  skb_linearize(skb);
> >> +  if (unlikely(skb_linearize(skb))) {
> >> +  __skb_unlink(skb, namedq);
> >> +  kfree_skb(skb);
> >> +  continue;
> >> +  }
> >>hdr = buf_msg(skb);
> >>seqno = msg_named_seqno(hdr);
> >>if (msg_is_last_bulk(hdr)) {
> >> @@ -338,12 +343,14 @@ static struct sk_buff *tipc_named_dequeue(struct 
> >> sk_buff_head *namedq,
> >>
> >>if (msg_is_bulk(hdr) || msg_is_legacy(hdr)) {
> >>__skb_unlink(skb, namedq);
> >> +  spin_unlock_bh(>lock);
> >>return skb;
> >>}
> >>
> >>if (*open && (*rcv_nxt == seqno)) {
> >>(*rcv_nxt)++;
> >>__skb_unlink(skb, namedq);
> >> +  spin_unlock_bh(>lock);
> >>return skb;
> >>}
> >>
> >> @@ -353,6 +360,7 @@ static struct sk_buff *tipc_named_dequeue(struct 
> >> sk_buff_head *namedq,
> >>continue;
> >>}
> >>}
> >> +  spin_unlock_bh(>lock);
> >>return NULL;
> >>   }
> >>
> >> diff --git a/net/tipc/node.c b/net/tipc/node.c
> >> index cf4b239fc569..d269ebe382e1 100644
> >> --- a/net/tipc/node.c
> >> +++ b/net/tipc/node.c
> >> @@ -1496,7 +1496,7 @@ static void node_lost_contact(struct tipc_node *n,
> >>
> >>/* Clean up broadcast state */
> >>tipc_bcast_remove_peer(n->net, n->bc_entry.link);
> >> -  __skb_queue_purge(>bc_entry.namedq);
> >> +  skb_queue_purge(>bc_entry.namedq);
> > Patch looks fine, but I'm not sure why not hold
> > spin_unlock_bh(>nametbl_lock) here instead?
> >
> > Seems like node_lost_contact() should be relatively rare,
> > so adding another lock to tipc_named_dequeue() is not the
> > right trade off.
> Actually, I agree with previous speaker here. We already have the
> nametbl_lock when tipc_named_dequeue() is called, and the same lock is
> accessible from no.c where node_lost_contact() is executed. The patch
> and the code becomes simpler.
> I suggest you post a v2 of this one.
> 
> ///jon
> 
> >>/* Abort any ongoing link failover */
> >>for (i = 0; i < MAX_BEARERS; i++) {


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [net-next] tipc: introduce smooth change to replicast

2020-10-08 Thread Hoang Huu Le
Hi Jon,

See my inline comment.

Thanks,
Hoang
> -Original Message-
> From: Jon Maloy 
> Sent: Friday, October 9, 2020 1:27 AM
> To: Hoang Huu Le ; ma...@donjonn.com; 
> ying@windriver.com; tipc-
> discuss...@lists.sourceforge.net
> Subject: Re: [net-next] tipc: introduce smooth change to replicast
> 
> 
> 
> On 10/7/20 10:22 PM, Hoang Huu Le wrote:
> > In commit cad2929dc432 ("tipc: update a binding service via broadcast"),
> > We use broadcast to update a binding service for a large cluster.
> > However, if we try to publish a thousands of services at the
> > same time, we may to get "link overflow" happen because of queue limit
> > had reached.
> >
> > We now introduce a smooth change to replicast if the broadcast link has
> > reach to the limit of queue.
> To me this beats the whole purpose of using broadcast distribution in
> the first place.
> We wanted to save CPU and network resources by using  broadcast, and
> then, when things get tough, we fall back to the supposedly less
> efficient replicast method. Not good.
> 
> I wonder what is really happening when this overflow situation occurs.
> First, the reset limit is dimensioned so that it should be possible to
> publish MAX_PUBLICATIONS (65k) publications in one shot.
> With full bundling, which is what I expect here, there are 1460/20 = 73
> publication items in each buffer, so the reset limit (==max_bulk) should
> be 65k/73 = 897 buffers.

[Hoang] No, that's not right!
As I stated in another commit, the reset limit is 350 buffers
(65k/(3744/20)) => #1,
where:
FB_MTU = 3744
and if a user sets the window size to 50, we fall back to a 32-slot window => #2.
So, if the broadcast link is under high traffic load, we will reach the
limit easily (#1 + #2).
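
For reference, a quick back-of-envelope check (taking "65k" as 65536 and 20
bytes per publication item - assumptions, not the exact kernel constants)
reproduces both figures:

/* Rough estimate of how many bundled buffers one full name-table bulk needs. */
#include <stdio.h>

int main(void)
{
	const int item_size = 20;		/* bytes per publication item */
	const int max_publ  = 65536;		/* "65k" publications in one bulk */
	const int mtus[]    = { 1460, 3744 };	/* Ethernet payload vs FB_MTU */
	int i;

	for (i = 0; i < 2; i++) {
		int per_buf = mtus[i] / item_size;	/* items bundled per buffer */
		int bufs    = max_publ / per_buf;	/* buffers needed for the bulk */

		printf("mtu %4d: %3d items/buffer -> ~%d buffers\n",
		       mtus[i], per_buf, bufs);
	}
	return 0;
}

So the two estimates (897 vs 350) differ only in which MTU is assumed for the
bundled bulk buffers.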

> My figures are just from the top of my head, so you should double check
> them, but I find it unlikely that we hit this limit unless there is a
> lot of other broadcast going on at the same time, and even then I find
> it unlikely.
[Hoang]
I just implemented a simple application:
[...]
num_service = 10k
for (i=0;i);
[...]

Then run the app, sleep 2 seconds, and kill it with SIGINT; the problem is then
reproducible. A rough sketch of such a reproducer is shown below.
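
A minimal sketch (the service type base 18888 and the 10k count are arbitrary
placeholders, not the exact test code):

/* Bind 10k distinct service names on one TIPC socket; killing the process
 * then withdraws them all at once, flooding the name distributor. */
#include <linux/tipc.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>

int main(void)
{
	struct sockaddr_tipc addr;
	int sd = socket(AF_TIPC, SOCK_RDM, 0);
	int i;

	if (sd < 0) {
		perror("socket");
		return 1;
	}
	for (i = 0; i < 10000; i++) {
		memset(&addr, 0, sizeof(addr));
		addr.family = AF_TIPC;
		addr.addrtype = TIPC_ADDR_NAMESEQ;
		addr.scope = TIPC_CLUSTER_SCOPE;
		addr.addr.nameseq.type = 18888 + i;	/* each bind() adds one publication */
		addr.addr.nameseq.lower = 0;
		addr.addr.nameseq.upper = 0;
		if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
			perror("bind");
			break;
		}
	}
	pause();	/* wait here; SIGINT terminates and withdraws everything */
	return 0;
}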
 
> I suggest you try to find out what is really going on when we reach this
> situation.
> -What exactly is in the backlog queue?
> -Only publications?
> -How many?
> -A mixture of publications and other traffic?
Only publication/withdrawal buffers, several thousand of them.

> -Has bundling really worked as supposed?
> -Do we still have some issue with the broadcast link that stops buffers
> being acked and released in a timely manner?
> - Have you been able to dump out such info when this problem occurs?
> - Are you able to re-produce it in your own system?
These answers are YES

> In the end it might be as simple as increasing the reset limit, but we
> really should try to understand what is happening first.
Yes, I think so. In a previous patch I made a code change to raise the queue
backlog limit to 873 buffers.
But if I increase the number of services in my app up to 20k (not realistic??),
the issue is reproducible as well.
That is why I propose this new solution, combined with the two previous
patches, to solve the problem completely.
> 
> Regards
> ///jon
> 
> 
> >
> > Signed-off-by: Hoang Huu Le 
> > ---
> >   net/tipc/link.c |  5 -
> >   net/tipc/node.c | 12 ++--
> >   2 files changed, 14 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/tipc/link.c b/net/tipc/link.c
> > index 06b880da2a8e..ca908ead753a 100644
> > --- a/net/tipc/link.c
> > +++ b/net/tipc/link.c
> > @@ -1022,7 +1022,10 @@ int tipc_link_xmit(struct tipc_link *l, struct 
> > sk_buff_head *list,
> > /* Allow oversubscription of one data msg per source at congestion */
> > if (unlikely(l->backlog[imp].len >= l->backlog[imp].limit)) {
> > if (imp == TIPC_SYSTEM_IMPORTANCE) {
> > -   pr_warn("%s<%s>, link overflow", link_rst_msg, l->name);
> > +   pr_warn_ratelimited("%s<%s>, link overflow",
> > +   link_rst_msg, l->name);
> > +   if (link_is_bc_sndlink(l))
> > +   return -EOVERFLOW;
> > return -ENOBUFS;
> > }
> > rc = link_schedule_user(l, hdr);
> > diff --git a/net/tipc/node.c b/net/tipc/node.c
> > index d269ebe382e1..a37976610367 100644
> > --- a/net/tipc/node.c
> > +++ b/net/tipc/node.c
> > @@ -1750,15 +1750,23 @@ void tipc_node_broadcast(struct net *net, struct 
> >

Re: [tipc-discussion] tipc.py

2020-10-08 Thread Hoang Huu Le
Hi Jon,

I've done it here:
https://sourceforge.net/p/tipc/tipcutils/merge-requests/8/

Please take time to review it.

Regards,
Hoang
> -Original Message-
> From: Jon Maloy 
> Sent: Monday, October 5, 2020 10:44 PM
> To: Hoang Huu Le ; 
> tipc-discussion@lists.sourceforge.net; tipc-dek ; Xin
> Long ; Ying Xue 
> Subject: Re: tipc.py
> 
> 
> 
> On 10/5/20 12:35 AM, Hoang Huu Le wrote:
> > Hi Jon,
> >
> > I will make an effort on this. Do you think we need to keep these APIs 
> > compatibility work with Python2 or completely remove out?
> Great! To me you can just update the current code and don't worry about
> compatibility.
> If anybody has used the current module they will have their own copy,
> and I cannot imagine that anybody will write new programs for Puthon 2
> by now.
> 
> Regards
> ///jon
> 
> >
> > Regards,
> > Hoang
> >> -----Original Message-
> >> From: Jon Maloy 
> >> Sent: Saturday, October 3, 2020 5:59 AM
> >> To: Hoang Huu Le ; 
> >> tipc-discussion@lists.sourceforge.net; tipc-dek ;
> Xin
> >> Long ; Ying Xue 
> >> Subject: Re: tipc.py
> >>
> >>
> >>
> >> On 10/1/20 11:04 PM, Hoang Huu Le wrote:
> >>> Hi Jon,
> >>>
> >>> I've done this a long time ago:
> >>> 5057f8bb4de0 tipcutils: introduce python api
> >>>
> >>> Basically, it works with Python 2.
> >> Do you think it you would have time to do this for Python 3?
> >> Python 2 is practically dead now, as we all know.
> >>
> >> Regards
> >> ///jon
> >>> Regards,
> >>> Hoang
> >>>> -Original Message-
> >>>> From: Jon Maloy 
> >>>> Sent: Friday, October 2, 2020 4:56 AM
> >>>> To: tipc-discussion@lists.sourceforge.net; tipc-dek 
> >>>> ; Xin Long ; Ying Xue
> >>>> 
> >>>> Subject: tipc.py
> >>>>
> >>>> I am updating the programmer's manual, and realized that the Python API
> >>>> is missing.
> >>>> Since there are so many programmers knowing Python nowadays, but not C,
> >>>> I think it would
> >>>> be very useful to have this API in the manual, so those programmers can
> >>>> get a feeling
> >>>> for how simple it to use TIPC.
> >>>>
> >>>> Tuong started development of an API based on the libtipc C-API, but it
> >>>> seems to me it was never finished.
> >>>> However, Python does since a long time have native TIPC support,
> >>>> allegedly both in Python 2 and Pyton 3.
> >>>> I had never seen that API until now, but after some googling I came over
> >>>> the following link, that seems to contain
> >>>> that native implemenation:
> >>>>
> >>>> https://blitiri.com.ar/p/pytipc/
> >>>>
> >>>> I wonder if anybody has the time to try this one, and verify, using the
> >>>> examples, that it really works.
> >>>> It would be embarrassing to add this to the manual if it turns out it
> >>>> doesn't work.
> >>>>
> >>>> Regards
> >>>> ///jon
> >>>>
> >>>> PS. Does anybody volunteer to be become co-maintainer of the home page
> >>>> and project page
> >>>>  at SourceForge? I think we should even consider moving it to
> >>>> GitLab or GitHub.
> >>>>  Since we have our own domain (tipc.io) that could easily be
> >>>> re-steered to a different
> >>>>  host system.


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net] tipc: fix NULL pointer dereference in tipc_named_rcv

2020-10-08 Thread Hoang Huu Le
In the function node_lost_contact(), we call __skb_queue_purge() without
grabbing the list->lock. This can cause a race condition while processing
the list 'namedq' in the calling path tipc_named_rcv()->tipc_named_dequeue().

[] BUG: kernel NULL pointer dereference, address: 
[] #PF: supervisor read access in kernel mode
[] #PF: error_code(0x) - not-present page
[] PGD 7ca63067 P4D 7ca63067 PUD 6c553067 PMD 0
[] Oops:  [#1] SMP NOPTI
[] CPU: 1 PID: 15 Comm: ksoftirqd/1 Tainted: G  O  5.9.0-rc6+ #2
[] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS [...]
[] RIP: 0010:tipc_named_rcv+0x103/0x320 [tipc]
[] Code: 41 89 44 24 10 49 8b 16 49 8b 46 08 49 c7 06 00 00 00 [...]
[] RSP: 0018:c90a7c58 EFLAGS: 0282
[] RAX: 12ec RBX:  RCX: 88807bde1270
[] RDX: 2c7c RSI: 2c7c RDI: 88807b38f1a8
[] RBP: 88807b006288 R08: 88806a367800 R09: 88806a367900
[] R10: 88806a367a00 R11: 88806a367b00 R12: 88807b006258
[] R13: 88807b00628a R14: 888069334d00 R15: 88806a434600
[] FS:  () GS:88807948() knlGS:0[...]
[] CS:  0010 DS:  ES:  CR0: 80050033
[] CR2:  CR3: 7732 CR4: 06e0
[] Call Trace:
[]  ? tipc_bcast_rcv+0x9a/0x1a0 [tipc]
[]  tipc_rcv+0x40d/0x670 [tipc]
[]  ? _raw_spin_unlock+0xa/0x20
[]  tipc_l2_rcv_msg+0x55/0x80 [tipc]
[]  __netif_receive_skb_one_core+0x8c/0xa0
[]  process_backlog+0x98/0x140
[]  net_rx_action+0x13a/0x420
[]  __do_softirq+0xdb/0x316
[]  ? smpboot_thread_fn+0x2f/0x1e0
[]  ? smpboot_thread_fn+0x74/0x1e0
[]  ? smpboot_thread_fn+0x14e/0x1e0
[]  run_ksoftirqd+0x1a/0x40
[]  smpboot_thread_fn+0x149/0x1e0
[]  ? sort_range+0x20/0x20
[]  kthread+0x131/0x150
[]  ? kthread_unuse_mm+0xa0/0xa0
[]  ret_from_fork+0x22/0x30
[] Modules linked in: veth tipc(O) ip6_udp_tunnel udp_tunnel [...]
[] CR2: 
[] ---[ end trace 65c276a8e2e2f310 ]---

To fix this, we need to grab the lock of the 'namedq' list on both
calling paths.

Fixes: cad2929dc432 ("tipc: update a binding service via broadcast")
Acked-by: Jon Maloy 
Signed-off-by: Hoang Huu Le 
---
 net/tipc/name_distr.c | 10 +-
 net/tipc/node.c   |  2 +-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
index 2f9c148f17e2..fe4edce459ad 100644
--- a/net/tipc/name_distr.c
+++ b/net/tipc/name_distr.c
@@ -327,8 +327,13 @@ static struct sk_buff *tipc_named_dequeue(struct 
sk_buff_head *namedq,
struct tipc_msg *hdr;
u16 seqno;
 
+   spin_lock_bh(>lock);
skb_queue_walk_safe(namedq, skb, tmp) {
-   skb_linearize(skb);
+   if (unlikely(skb_linearize(skb))) {
+   __skb_unlink(skb, namedq);
+   kfree_skb(skb);
+   continue;
+   }
hdr = buf_msg(skb);
seqno = msg_named_seqno(hdr);
if (msg_is_last_bulk(hdr)) {
@@ -338,12 +343,14 @@ static struct sk_buff *tipc_named_dequeue(struct 
sk_buff_head *namedq,
 
if (msg_is_bulk(hdr) || msg_is_legacy(hdr)) {
__skb_unlink(skb, namedq);
+   spin_unlock_bh(>lock);
return skb;
}
 
if (*open && (*rcv_nxt == seqno)) {
(*rcv_nxt)++;
__skb_unlink(skb, namedq);
+   spin_unlock_bh(>lock);
return skb;
}
 
@@ -353,6 +360,7 @@ static struct sk_buff *tipc_named_dequeue(struct 
sk_buff_head *namedq,
continue;
}
}
+   spin_unlock_bh(>lock);
return NULL;
 }
 
diff --git a/net/tipc/node.c b/net/tipc/node.c
index cf4b239fc569..d269ebe382e1 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1496,7 +1496,7 @@ static void node_lost_contact(struct tipc_node *n,
 
/* Clean up broadcast state */
tipc_bcast_remove_peer(n->net, n->bc_entry.link);
-   __skb_queue_purge(>bc_entry.namedq);
+   skb_queue_purge(>bc_entry.namedq);
 
/* Abort any ongoing link failover */
for (i = 0; i < MAX_BEARERS; i++) {
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next] tipc: introduce smooth change to replicast

2020-10-07 Thread Hoang Huu Le
In commit cad2929dc432 ("tipc: update a binding service via broadcast"),
we use broadcast to update a binding service for a large cluster.
However, if we try to publish thousands of services at the
same time, we may get a "link overflow" because the queue limit
has been reached.

We now introduce a smooth fallback to replicast if the broadcast link has
reached its queue limit.

Signed-off-by: Hoang Huu Le 
---
 net/tipc/link.c |  5 -
 net/tipc/node.c | 12 ++--
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 06b880da2a8e..ca908ead753a 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1022,7 +1022,10 @@ int tipc_link_xmit(struct tipc_link *l, struct 
sk_buff_head *list,
/* Allow oversubscription of one data msg per source at congestion */
if (unlikely(l->backlog[imp].len >= l->backlog[imp].limit)) {
if (imp == TIPC_SYSTEM_IMPORTANCE) {
-   pr_warn("%s<%s>, link overflow", link_rst_msg, l->name);
+   pr_warn_ratelimited("%s<%s>, link overflow",
+   link_rst_msg, l->name);
+   if (link_is_bc_sndlink(l))
+   return -EOVERFLOW;
return -ENOBUFS;
}
rc = link_schedule_user(l, hdr);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index d269ebe382e1..a37976610367 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1750,15 +1750,23 @@ void tipc_node_broadcast(struct net *net, struct 
sk_buff *skb, int rc_dests)
struct tipc_node *n;
u16 dummy;
u32 dst;
+   int rc = 0;
 
/* Use broadcast if all nodes support it */
if (!rc_dests && tipc_bcast_get_mode(net) != BCLINK_MODE_RCAST) {
+   txskb = pskb_copy(skb, GFP_ATOMIC);
+   if (!txskb)
+   goto rcast;
__skb_queue_head_init();
-   __skb_queue_tail(, skb);
-   tipc_bcast_xmit(net, , );
+   __skb_queue_tail(, txskb);
+   rc = tipc_bcast_xmit(net, , );
+   if (rc == -EOVERFLOW)
+   goto rcast;
+   kfree_skb(skb);
return;
}
 
+rcast:
/* Otherwise use legacy replicast method */
rcu_read_lock();
list_for_each_entry_rcu(n, tipc_nodes(net), list) {
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next] tipc: fix NULL pointer dereference in tipc_named_rcv

2020-10-07 Thread Hoang Huu Le
In the function node_lost_contact(), we call __skb_queue_purge() without
grabbing the list->lock. This can cause a race condition while processing
the list 'namedq' in the calling path tipc_named_rcv()->tipc_named_dequeue().

[] BUG: kernel NULL pointer dereference, address: 
[] #PF: supervisor read access in kernel mode
[] #PF: error_code(0x) - not-present page
[] PGD 7ca63067 P4D 7ca63067 PUD 6c553067 PMD 0
[] Oops:  [#1] SMP NOPTI
[] CPU: 1 PID: 15 Comm: ksoftirqd/1 Tainted: G  O  5.9.0-rc6+ #2
[] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS [...]
[] RIP: 0010:tipc_named_rcv+0x103/0x320 [tipc]
[] Code: 41 89 44 24 10 49 8b 16 49 8b 46 08 49 c7 06 00 00 00 [...]
[] RSP: 0018:c90a7c58 EFLAGS: 0282
[] RAX: 12ec RBX:  RCX: 88807bde1270
[] RDX: 2c7c RSI: 2c7c RDI: 88807b38f1a8
[] RBP: 88807b006288 R08: 88806a367800 R09: 88806a367900
[] R10: 88806a367a00 R11: 88806a367b00 R12: 88807b006258
[] R13: 88807b00628a R14: 888069334d00 R15: 88806a434600
[] FS:  () GS:88807948() knlGS:0[...]
[] CS:  0010 DS:  ES:  CR0: 80050033
[] CR2:  CR3: 7732 CR4: 06e0
[] Call Trace:
[]  ? tipc_bcast_rcv+0x9a/0x1a0 [tipc]
[]  tipc_rcv+0x40d/0x670 [tipc]
[]  ? _raw_spin_unlock+0xa/0x20
[]  tipc_l2_rcv_msg+0x55/0x80 [tipc]
[]  __netif_receive_skb_one_core+0x8c/0xa0
[]  process_backlog+0x98/0x140
[]  net_rx_action+0x13a/0x420
[]  __do_softirq+0xdb/0x316
[]  ? smpboot_thread_fn+0x2f/0x1e0
[]  ? smpboot_thread_fn+0x74/0x1e0
[]  ? smpboot_thread_fn+0x14e/0x1e0
[]  run_ksoftirqd+0x1a/0x40
[]  smpboot_thread_fn+0x149/0x1e0
[]  ? sort_range+0x20/0x20
[]  kthread+0x131/0x150
[]  ? kthread_unuse_mm+0xa0/0xa0
[]  ret_from_fork+0x22/0x30
[] Modules linked in: veth tipc(O) ip6_udp_tunnel udp_tunnel [...]
[] CR2: 
[] ---[ end trace 65c276a8e2e2f310 ]---

To fix this, we need to grab the lock of the 'namedq' list on both
calling paths.

Fixes: cad2929dc432 ("tipc: update a binding service via broadcast")
Signed-off-by: Hoang Huu Le 
---
 net/tipc/name_distr.c | 10 +-
 net/tipc/node.c   |  2 +-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
index 2f9c148f17e2..fe4edce459ad 100644
--- a/net/tipc/name_distr.c
+++ b/net/tipc/name_distr.c
@@ -327,8 +327,13 @@ static struct sk_buff *tipc_named_dequeue(struct 
sk_buff_head *namedq,
struct tipc_msg *hdr;
u16 seqno;
 
+   spin_lock_bh(>lock);
skb_queue_walk_safe(namedq, skb, tmp) {
-   skb_linearize(skb);
+   if (unlikely(skb_linearize(skb))) {
+   __skb_unlink(skb, namedq);
+   kfree_skb(skb);
+   continue;
+   }
hdr = buf_msg(skb);
seqno = msg_named_seqno(hdr);
if (msg_is_last_bulk(hdr)) {
@@ -338,12 +343,14 @@ static struct sk_buff *tipc_named_dequeue(struct 
sk_buff_head *namedq,
 
if (msg_is_bulk(hdr) || msg_is_legacy(hdr)) {
__skb_unlink(skb, namedq);
+   spin_unlock_bh(>lock);
return skb;
}
 
if (*open && (*rcv_nxt == seqno)) {
(*rcv_nxt)++;
__skb_unlink(skb, namedq);
+   spin_unlock_bh(>lock);
return skb;
}
 
@@ -353,6 +360,7 @@ static struct sk_buff *tipc_named_dequeue(struct 
sk_buff_head *namedq,
continue;
}
}
+   spin_unlock_bh(>lock);
return NULL;
 }
 
diff --git a/net/tipc/node.c b/net/tipc/node.c
index cf4b239fc569..d269ebe382e1 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1496,7 +1496,7 @@ static void node_lost_contact(struct tipc_node *n,
 
/* Clean up broadcast state */
tipc_bcast_remove_peer(n->net, n->bc_entry.link);
-   __skb_queue_purge(>bc_entry.namedq);
+   skb_queue_purge(>bc_entry.namedq);
 
/* Abort any ongoing link failover */
for (i = 0; i < MAX_BEARERS; i++) {
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] tipc.py

2020-10-04 Thread Hoang Huu Le
Hi Jon,

I will make an effort on this. Do you think we need to keep these APIs
compatible with Python 2, or remove that support completely?

Regards,
Hoang
> -Original Message-
> From: Jon Maloy 
> Sent: Saturday, October 3, 2020 5:59 AM
> To: Hoang Huu Le ; 
> tipc-discussion@lists.sourceforge.net; tipc-dek ; Xin
> Long ; Ying Xue 
> Subject: Re: tipc.py
> 
> 
> 
> On 10/1/20 11:04 PM, Hoang Huu Le wrote:
> > Hi Jon,
> >
> > I've done this a long time ago:
> > 5057f8bb4de0 tipcutils: introduce python api
> >
> > Basically, it works with Python 2.
> Do you think it you would have time to do this for Python 3?
> Python 2 is practically dead now, as we all know.
> 
> Regards
> ///jon
> >
> > Regards,
> > Hoang
> >> -Original Message-
> >> From: Jon Maloy 
> >> Sent: Friday, October 2, 2020 4:56 AM
> >> To: tipc-discussion@lists.sourceforge.net; tipc-dek 
> >> ; Xin Long ; Ying Xue
> >> 
> >> Subject: tipc.py
> >>
> >> I am updating the programmer's manual, and realized that the Python API
> >> is missing.
> >> Since there are so many programmers knowing Python nowadays, but not C,
> >> I think it would
> >> be very useful to have this API in the manual, so those programmers can
> >> get a feeling
> >> for how simple it to use TIPC.
> >>
> >> Tuong started development of an API based on the libtipc C-API, but it
> >> seems to me it was never finished.
> >> However, Python does since a long time have native TIPC support,
> >> allegedly both in Python 2 and Pyton 3.
> >> I had never seen that API until now, but after some googling I came over
> >> the following link, that seems to contain
> >> that native implemenation:
> >>
> >> https://blitiri.com.ar/p/pytipc/
> >>
> >> I wonder if anybody has the time to try this one, and verify, using the
> >> examples, that it really works.
> >> It would be embarrassing to add this to the manual if it turns out it
> >> doesn't work.
> >>
> >> Regards
> >> ///jon
> >>
> >> PS. Does anybody volunteer to be become co-maintainer of the home page
> >> and project page
> >>     at SourceForge? I think we should even consider moving it to
> >> GitLab or GitHub.
> >>     Since we have our own domain (tipc.io) that could easily be
> >> re-steered to a different
> >>     host system.


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] tipc.py

2020-10-01 Thread Hoang Huu Le
Hi Jon,

I've done this a long time ago:
5057f8bb4de0 tipcutils: introduce python api

Basically, it works with Python 2.

Regards,
Hoang
> -Original Message-
> From: Jon Maloy 
> Sent: Friday, October 2, 2020 4:56 AM
> To: tipc-discussion@lists.sourceforge.net; tipc-dek 
> ; Xin Long ; Ying Xue
> 
> Subject: tipc.py
> 
> I am updating the programmer's manual, and realized that the Python API
> is missing.
> Since there are so many programmers knowing Python nowadays, but not C,
> I think it would
> be very useful to have this API in the manual, so those programmers can
> get a feeling
> for how simple it to use TIPC.
> 
> Tuong started development of an API based on the libtipc C-API, but it
> seems to me it was never finished.
> However, Python does since a long time have native TIPC support,
> allegedly both in Python 2 and Pyton 3.
> I had never seen that API until now, but after some googling I came over
> the following link, that seems to contain
> that native implemenation:
> 
> https://blitiri.com.ar/p/pytipc/
> 
> I wonder if anybody has the time to try this one, and verify, using the
> examples, that it really works.
> It would be embarrassing to add this to the manual if it turns out it
> doesn't work.
> 
> Regards
> ///jon
> 
> PS. Does anybody volunteer to be become co-maintainer of the home page
> and project page
>    at SourceForge? I think we should even consider moving it to
> GitLab or GitHub.
>    Since we have our own domain (tipc.io) that could easily be
> re-steered to a different
>    host system.


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [net] tipc: fix incorrect setting window for bcast link

2020-10-01 Thread Hoang Huu Le
Hi Jon,

See my inline comment.

Regards,
Hoang
> -Original Message-
> From: Jon Maloy 
> Sent: Friday, October 2, 2020 4:40 AM
> To: Hoang Huu Le ; 
> tipc-discussion@lists.sourceforge.net; ma...@donjonn.com;
> ying@windriver.com; Xin Long 
> Subject: Re: [net] tipc: fix incorrect setting window for bcast link
> 
> 
> 
> On 9/30/20 9:43 PM, Hoang Huu Le wrote:
> > In commit 16ad3f4022bb
> > ("tipc: introduce variable window congestion control"), we applied
> > the Reno algorithm to select window size from minimum window to the
> > configured maximum window for unicast link.
> >
> > However, when setting window variable we do not specific to unicast link.
> > This lead to the window for broadcast link had effect as unexpect value
> > (i.e the window size is constantly 32).
> This was intentional, although I thought the value was 50, not 32.
> In my experience we cannot afford a generous variable window
> in the broadcast link the same way we do with the unicast link.
> There will be too many losses and retransmissions because of the
> way switches work.

[Hoang]
When it is created, the value is 50. However, if we use the tipc command,
e.g. $ tipc link set win 50 link broadcast
the window is set to 32, and it stays at that value from then on.
Does that count as a bug?

> 
> >
> > We fix this by updating the window for broadcast as its configuration.
> >
> > Signed-off-by: Hoang Huu Le 
> > ---
> >   net/tipc/bcast.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> > index 940d176e0e87..abac9443b4d9 100644
> > --- a/net/tipc/bcast.c
> > +++ b/net/tipc/bcast.c
> > @@ -585,7 +585,7 @@ static int tipc_bc_link_set_queue_limits(struct net 
> > *net, u32 max_win)
> > if (max_win > TIPC_MAX_LINK_WIN)
> > return -EINVAL;
> > tipc_bcast_lock(net);
> > -   tipc_link_set_queue_limits(l, BCLINK_WIN_MIN, max_win);
> > +   tipc_link_set_queue_limits(l, max_win, max_win);
> > tipc_bcast_unlock(net);
> > return 0;
> >   }
> What is the effect of this change? Do we still have a fix window?
[Hoang]
No, now the window can be configured as the user intends.
If you'd like to keep the fixed window of 50, I will update the code change.

> What happens when we have buffer overflow? The broadcast
> send link can never be reset I rember correctly.
> Did you test this with high load, e.g. using the multicast_blast test
> program?
[Hoang] I tried to publish hundreds of services that use SYSTEM_IMPORTANCE -
the link was not reset when it overflowed.
For other importance levels, the overflow issue will not happen because
oversubscription is allowed. I will give the test a try.
> 
> Regards
> ///jon


___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next] tipc: re-configure queue limit for broadcast link

2020-10-01 Thread Hoang Huu Le
The queue limit of the broadcast link is calculated based on the initial
MTU. However, when the MTU value changes (e.g. a manual MTU change on the NIC
device, MTU negotiation, etc.) we do not re-calculate the queue limit. As a
result, throughput does not reflect the change.

So fix it by calling the function to re-calculate the queue limit of the
broadcast link.

Signed-off-by: Hoang Huu Le 
---
 net/tipc/bcast.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index abac9443b4d9..bc566b304571 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -108,6 +108,7 @@ static void tipc_bcbase_select_primary(struct net *net)
 {
struct tipc_bc_base *bb = tipc_bc_base(net);
int all_dests =  tipc_link_bc_peers(bb->link);
+   int max_win = tipc_link_max_win(bb->link);
int i, mtu, prim;
 
bb->primary_bearer = INVALID_BEARER_ID;
@@ -121,8 +122,11 @@ static void tipc_bcbase_select_primary(struct net *net)
continue;
 
mtu = tipc_bearer_mtu(net, i);
-   if (mtu < tipc_link_mtu(bb->link))
+   if (mtu < tipc_link_mtu(bb->link)) {
tipc_link_set_mtu(bb->link, mtu);
+   tipc_link_set_queue_limits(bb->link, max_win,
+  max_win);
+   }
bb->bcast_support &= tipc_bearer_bcast_support(net, i);
if (bb->dests[i] < all_dests)
continue;
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net] tipc: fix incorrect setting window for bcast link

2020-09-30 Thread Hoang Huu Le
In commit 16ad3f4022bb
("tipc: introduce variable window congestion control"), we applied
the Reno algorithm to select the window size, from the minimum window up
to the configured maximum window, for unicast links.

However, when setting the window variable we did not restrict this to
unicast links. This leaves the broadcast link window with an unexpected
value (i.e. the window size is constantly 32).

We fix this by updating the window for the broadcast link according to its
configuration.

Signed-off-by: Hoang Huu Le 
---
 net/tipc/bcast.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 940d176e0e87..abac9443b4d9 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -585,7 +585,7 @@ static int tipc_bc_link_set_queue_limits(struct net *net, 
u32 max_win)
if (max_win > TIPC_MAX_LINK_WIN)
return -EINVAL;
tipc_bcast_lock(net);
-   tipc_link_set_queue_limits(l, BCLINK_WIN_MIN, max_win);
+   tipc_link_set_queue_limits(l, max_win, max_win);
tipc_bcast_unlock(net);
return 0;
 }
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next v3] tipc: fix a deadlock when flushing scheduled work

2020-09-04 Thread Hoang Huu Le
In commit fdeba99b1e58
("tipc: fix use-after-free in tipc_bcast_get_mode"), we try
to make sure the tipc_net_finalize_work work item has finished if it
was enqueued. But flush_scheduled_work() waits not only for the above
work item but for any scheduled work. This has turned out to be
overkill and caused a deadlock, as syzbot reported:

==
WARNING: possible circular locking dependency detected
5.9.0-rc2-next-20200828-syzkaller #0 Not tainted
--
kworker/u4:6/349 is trying to acquire lock:
8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: 
flush_workqueue+0xe1/0x13e0 kernel/workqueue.c:2777

but task is already holding lock:
8a879430 (pernet_ops_rwsem){}-{3:3}, at: cleanup_net+0x9b/0xb10 
net/core/net_namespace.c:565

[...]
 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(pernet_ops_rwsem);
                               lock(&sb->s_type->i_mutex_key#13);
                               lock(pernet_ops_rwsem);
  lock((wq_completion)events);

 *** DEADLOCK ***
[...]

To fix the original issue, we replace the above call by introducing
a bit flag. When a namespace is being cleaned up, the bit flag is cleared:
- the tipc_net_finalize function just returns immediately.
- tipc_net_finalize_work is not enqueued into the scheduled work queue.

Reported-by: syzbot+d5aa7e0385f6a5d0f...@syzkaller.appspotmail.com
Fixes: fdeba99b1e58 ("tipc: fix use-after-free in tipc_bcast_get_mode")
Signed-off-by: Hoang Huu Le 
---
 net/tipc/core.c |  8 
 net/tipc/core.h |  1 +
 net/tipc/net.c  | 10 +-
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index 37d8695548cf..5e7bb768f45c 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -60,6 +60,7 @@ static int __net_init tipc_init_net(struct net *net)
tn->trial_addr = 0;
tn->addr_trial_end = 0;
tn->capabilities = TIPC_NODE_CAPABILITIES;
+   test_and_set_bit_lock(0, &tn->net_exit_flag);
memset(tn->node_id, 0, sizeof(tn->node_id));
memset(tn->node_id_string, 0, sizeof(tn->node_id_string));
tn->mon_threshold = TIPC_DEF_MON_THRESHOLD;
@@ -110,10 +111,6 @@ static void __net_exit tipc_exit_net(struct net *net)
tipc_detach_loopback(net);
tipc_net_stop(net);
 
-   /* Make sure the tipc_net_finalize_work stopped
-* before releasing the resources.
-*/
-   flush_scheduled_work();
tipc_bcast_stop(net);
tipc_nametbl_stop(net);
tipc_sk_rht_destroy(net);
@@ -124,6 +121,9 @@ static void __net_exit tipc_exit_net(struct net *net)
 
 static void __net_exit tipc_pernet_pre_exit(struct net *net)
 {
+   struct tipc_net *tn = tipc_net(net);
+
+   clear_bit_unlock(0, &tn->net_exit_flag);
tipc_node_pre_cleanup_net(net);
 }
 
diff --git a/net/tipc/core.h b/net/tipc/core.h
index 631d83c9705f..aa75882dd932 100644
--- a/net/tipc/core.h
+++ b/net/tipc/core.h
@@ -143,6 +143,7 @@ struct tipc_net {
/* TX crypto handler */
struct tipc_crypto *crypto_tx;
 #endif
+   unsigned long net_exit_flag;
 };
 
 static inline struct tipc_net *tipc_net(struct net *net)
diff --git a/net/tipc/net.c b/net/tipc/net.c
index 85400e4242de..8ad5b9ad89c0 100644
--- a/net/tipc/net.c
+++ b/net/tipc/net.c
@@ -132,6 +132,9 @@ static void tipc_net_finalize(struct net *net, u32 addr)
 {
struct tipc_net *tn = tipc_net(net);
 
+   if (unlikely(!test_bit(0, &tn->net_exit_flag)))
+   return;
+
if (cmpxchg(&tn->node_addr, 0, addr))
return;
tipc_set_node_addr(net, addr);
@@ -153,8 +156,13 @@ static void tipc_net_finalize_work(struct work_struct 
*work)
 
 void tipc_sched_net_finalize(struct net *net, u32 addr)
 {
-   struct tipc_net_work *fwork = kzalloc(sizeof(*fwork), GFP_ATOMIC);
+   struct tipc_net *tn = tipc_net(net);
+   struct tipc_net_work *fwork;
+
+   if (unlikely(!test_bit(0, &tn->net_exit_flag)))
+   return;
 
+   fwork = kzalloc(sizeof(*fwork), GFP_ATOMIC);
if (!fwork)
return;
INIT_WORK(&fwork->work, tipc_net_finalize_work);
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next v2 2/2] tipc: fix use-after-free in tipc_bcast_get_mode

2020-09-03 Thread Hoang Huu Le
Syzbot has reported those issues as:

==
BUG: KASAN: use-after-free in tipc_bcast_get_mode+0x3ab/0x400 
net/tipc/bcast.c:759
Read of size 1 at addr 88805e6b3571 by task kworker/0:6/3850

CPU: 0 PID: 3850 Comm: kworker/0:6 Not tainted 5.8.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events tipc_net_finalize_work

Thread 1's call trace:
[...]
  kfree+0x103/0x2c0 mm/slab.c:3757 <- bcbase releasing
  tipc_bcast_stop+0x1b0/0x2f0 net/tipc/bcast.c:721
  tipc_exit_net+0x24/0x270 net/tipc/core.c:112
[...]

Thread 2's call trace:
[...]
  tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759 <- bcbase
has already been freed by Thread 1

  tipc_node_broadcast+0x9e/0xcc0 net/tipc/node.c:1744
  tipc_nametbl_publish+0x60b/0x970 net/tipc/name_table.c:752
  tipc_net_finalize net/tipc/net.c:141 [inline]
  tipc_net_finalize+0x1fa/0x310 net/tipc/net.c:131
  tipc_net_finalize_work+0x55/0x80 net/tipc/net.c:150
[...]

==
BUG: KASAN: use-after-free in tipc_named_reinit+0xef/0x290 
net/tipc/name_distr.c:344
Read of size 8 at addr 888052ab2000 by task kworker/0:13/30628
CPU: 0 PID: 30628 Comm: kworker/0:13 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events tipc_net_finalize_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1f0/0x31e lib/dump_stack.c:118
 print_address_description+0x66/0x5a0 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report+0x132/0x1d0 mm/kasan/report.c:530
 tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344
 tipc_net_finalize+0x85/0xe0 net/tipc/net.c:138
 tipc_net_finalize_work+0x50/0x70 net/tipc/net.c:150
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
[...]
Freed by task 14058:
 save_stack mm/kasan/common.c:48 [inline]
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x10a/0x220 mm/slab.c:3757
 tipc_exit_net+0x29/0x50 net/tipc/core.c:113
 ops_exit_list net/core/net_namespace.c:186 [inline]
 cleanup_net+0x708/0xba0 net/core/net_namespace.c:603
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293

Fix it by calling cancel_work_sync() to make sure the
tipc_net_finalize_work() work item has finished before releasing the bcbase
object.

Reported-by: syzbot+6ea1f7a8df64596ef...@syzkaller.appspotmail.com
Reported-by: syzbot+e9cc557752ab126c1...@syzkaller.appspotmail.com
Signed-off-by: Hoang Huu Le 
---
 net/tipc/core.c |  6 ++
 net/tipc/core.h | 10 ++
 net/tipc/net.c  | 21 +
 3 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index 4f6dc74adf45..f81b80e7 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -60,6 +60,7 @@ static int __net_init tipc_init_net(struct net *net)
tn->trial_addr = 0;
tn->addr_trial_end = 0;
tn->capabilities = TIPC_NODE_CAPABILITIES;
+   clear_bit_unlock(0, &tn->final_work_flag);
memset(tn->node_id, 0, sizeof(tn->node_id));
memset(tn->node_id_string, 0, sizeof(tn->node_id_string));
tn->mon_threshold = TIPC_DEF_MON_THRESHOLD;
@@ -119,7 +120,12 @@ static void __net_exit tipc_exit_net(struct net *net)
 
 static void __net_exit tipc_pernet_pre_exit(struct net *net)
 {
+   struct tipc_net *tn = tipc_net(net);
+
tipc_node_pre_cleanup_net(net);
+   /* Make sure the tipc_net_finalize_work() finished */
+   if (likely(test_bit(0, &tn->final_work_flag)))
+   cancel_work_sync(&tn->final_work.work);
 }
 
 static struct pernet_operations tipc_pernet_pre_exit_ops = {
diff --git a/net/tipc/core.h b/net/tipc/core.h
index 631d83c9705f..396105f17bc9 100644
--- a/net/tipc/core.h
+++ b/net/tipc/core.h
@@ -90,6 +90,12 @@ extern unsigned int tipc_net_id __read_mostly;
 extern int sysctl_tipc_rmem[3] __read_mostly;
 extern int sysctl_tipc_named_timeout __read_mostly;
 
+struct tipc_net_work {
+   struct work_struct work;
+   struct net *net;
+   u32 addr;
+};
+
 struct tipc_net {
u8  node_id[NODE_ID_LEN];
u32 node_addr;
@@ -143,6 +149,10 @@ struct tipc_net {
/* TX crypto handler */
struct tipc_crypto *crypto_tx;
 #endif
+   /* Work item for finalize */
+   struct tipc_net_work final_work;
+   /* Flag to indicate final_work queued */
+   unsigned long final_work_flag;

[tipc-discussion] [net-next v2 1/2] Revert "tipc: fix use-after-free in tipc_bcast_get_mode"

2020-09-03 Thread Hoang Huu Le
This reverts commit fdeba99b1e58ecd18c2940c453e19e4ef20ff591.

The commit caused another deadlock, as syzbot reported:
==
WARNING: possible circular locking dependency detected
5.9.0-rc2-next-20200828-syzkaller #0 Not tainted
--
kworker/u4:6/349 is trying to acquire lock:
8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: 
flush_workqueue+0xe1/0x13e0 kernel/workqueue.c:2777

but task is already holding lock:
8a879430 (pernet_ops_rwsem){}-{3:3}, at: cleanup_net+0x9b/0xb10 
net/core/net_namespace.c:565

[...]
 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(pernet_ops_rwsem);
                               lock(&sb->s_type->i_mutex_key#13);
                               lock(pernet_ops_rwsem);
  lock((wq_completion)events);

 *** DEADLOCK ***
[...]

We need to find another solution to fix the original issue.

Reported-by: syzbot+d5aa7e0385f6a5d0f...@syzkaller.appspotmail.com
Fixes: fdeba99b1e58 ("tipc: fix use-after-free in tipc_bcast_get_mode")
Signed-off-by: Hoang Huu Le 
---
 net/tipc/core.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index 37d8695548cf..4f6dc74adf45 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -109,11 +109,6 @@ static void __net_exit tipc_exit_net(struct net *net)
 {
tipc_detach_loopback(net);
tipc_net_stop(net);
-
-   /* Make sure the tipc_net_finalize_work stopped
-* before releasing the resources.
-*/
-   flush_scheduled_work();
tipc_bcast_stop(net);
tipc_nametbl_stop(net);
tipc_sk_rht_destroy(net);
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [PATCH] tipc: fix use-after-free in tipc_bcast_get_mode

2020-09-02 Thread Hoang Huu Le
Hi Jon,

The solution below does not fully resolve the issue. I am working on another
solution.

Regards,
Hoang
-Original Message-
From: Jon Maloy  
Sent: Tuesday, September 1, 2020 8:53 PM
To: Hoang Huu Le ; ma...@donjonn.com; 
ying@windriver.com; tipc-discussion@lists.sourceforge.net
Cc: syzbot+6ea1f7a8df64596ef...@syzkaller.appspotmail.com; 
syzbot+e9cc557752ab126c1...@syzkaller.appspotmail.com
Subject: Re: [PATCH] tipc: fix use-after-free in tipc_bcast_get_mode



On 8/25/20 11:52 PM, Hoang Huu Le wrote:
> Syzbot has reported those issues as:
>
> ==
> BUG: KASAN: use-after-free in tipc_bcast_get_mode+0x3ab/0x400 
> net/tipc/bcast.c:759
> Read of size 1 at addr 88805e6b3571 by task kworker/0:6/3850
>
> CPU: 0 PID: 3850 Comm: kworker/0:6 Not tainted 5.8.0-rc7-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Workqueue: events tipc_net_finalize_work
>
> Thread 1's call trace:
> [...]
>kfree+0x103/0x2c0 mm/slab.c:3757 <- bcbase releasing
>tipc_bcast_stop+0x1b0/0x2f0 net/tipc/bcast.c:721
>tipc_exit_net+0x24/0x270 net/tipc/core.c:112
> [...]
>
> Thread 2's call trace:
> [...]
>tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759 <- bcbase
> has already been freed by Thread 1
>
>tipc_node_broadcast+0x9e/0xcc0 net/tipc/node.c:1744
>tipc_nametbl_publish+0x60b/0x970 net/tipc/name_table.c:752
>tipc_net_finalize net/tipc/net.c:141 [inline]
>tipc_net_finalize+0x1fa/0x310 net/tipc/net.c:131
>tipc_net_finalize_work+0x55/0x80 net/tipc/net.c:150
> [...]
>
> ==
> BUG: KASAN: use-after-free in tipc_named_reinit+0xef/0x290 
> net/tipc/name_distr.c:344
> Read of size 8 at addr 888052ab2000 by task kworker/0:13/30628
> CPU: 0 PID: 30628 Comm: kworker/0:13 Not tainted 5.8.0-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Workqueue: events tipc_net_finalize_work
> Call Trace:
>   __dump_stack lib/dump_stack.c:77 [inline]
>   dump_stack+0x1f0/0x31e lib/dump_stack.c:118
>   print_address_description+0x66/0x5a0 mm/kasan/report.c:383
>   __kasan_report mm/kasan/report.c:513 [inline]
>   kasan_report+0x132/0x1d0 mm/kasan/report.c:530
>   tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344
>   tipc_net_finalize+0x85/0xe0 net/tipc/net.c:138
>   tipc_net_finalize_work+0x50/0x70 net/tipc/net.c:150
>   process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
>   worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
>   kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
>   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
> [...]
> Freed by task 14058:
>   save_stack mm/kasan/common.c:48 [inline]
>   set_track mm/kasan/common.c:56 [inline]
>   kasan_set_free_info mm/kasan/common.c:316 [inline]
>   __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455
>   __cache_free mm/slab.c:3426 [inline]
>   kfree+0x10a/0x220 mm/slab.c:3757
>   tipc_exit_net+0x29/0x50 net/tipc/core.c:113
>   ops_exit_list net/core/net_namespace.c:186 [inline]
>   cleanup_net+0x708/0xba0 net/core/net_namespace.c:603
>   process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
>   worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
>   kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
>   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
>
> Solution 1 (preferred):
> Fix it by calling flush_scheduled_work() to make sure
> tipc_net_finalize_work() has stopped before releasing the bcbase object.
>
> Solution 2:
> Fix it by introducing a bit flag and returning if flag is zero
> (object had already been freed)
>
> Reported-by: syzbot+6ea1f7a8df64596ef...@syzkaller.appspotmail.com
> Reported-by: syzbot+e9cc557752ab126c1...@syzkaller.appspotmail.com
> Signed-off-by: Hoang Huu Le 
> ---
>   net/tipc/bcast.c | 1 +
>   net/tipc/core.c  | 1 +
>   net/tipc/core.h  | 1 +
>   net/tipc/net.c   | 3 +++
>   4 files changed, 6 insertions(+)
>
> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> index 940d176e0e87..56b624c8b6d4 100644
> --- a/net/tipc/bcast.c
> +++ b/net/tipc/bcast.c
> @@ -718,6 +718,7 @@ void tipc_bcast_stop(struct net *net)
>   struct tipc_net *tn = net_generic(net, tipc_net_id);
>   
>   synchronize_net();
> + clear_bit_unlock(0, &tn->net_exit_flag);
>   kfree(tn->bcbase);
>   kfree(tn->bcl);
>   }
> diff --git a/net/tipc/core.c b/net/tipc/core.c
> index 4f6dc74adf45..93ea7dc05bf2 100644
> --- a/net/tipc/core.c
> +++ b/net/tipc/core.c
> @@ -60,6 +60,7 @@ static int __net_init tipc_init_net(struct net *net)
>   tn->trial_addr = 0

[tipc-discussion] [net-next] Revert "tipc: fix use-after-free in tipc_bcast_get_mode"

2020-08-31 Thread Hoang Huu Le
This reverts commit fdeba99b1e58ecd18c2940c453e19e4ef20ff591.

The commit caused another deadlock, as syzbot reported:
==
WARNING: possible circular locking dependency detected
5.9.0-rc2-next-20200828-syzkaller #0 Not tainted
--
kworker/u4:6/349 is trying to acquire lock:
8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: 
flush_workqueue+0xe1/0x13e0 kernel/workqueue.c:2777

but task is already holding lock:
8a879430 (pernet_ops_rwsem){}-{3:3}, at: cleanup_net+0x9b/0xb10 
net/core/net_namespace.c:565

[...]
 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(pernet_ops_rwsem);
                               lock(&sb->s_type->i_mutex_key#13);
                               lock(pernet_ops_rwsem);
  lock((wq_completion)events);

 *** DEADLOCK ***
[...]

We need to find another solution to fix the original issue.

Reported-by: syzbot+d5aa7e0385f6a5d0f...@syzkaller.appspotmail.com
Fixes: fdeba99b1e58 ("tipc: fix use-after-free in tipc_bcast_get_mode")
Signed-off-by: Hoang Huu Le 
---
 net/tipc/core.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index 37d8695548cf..4f6dc74adf45 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -109,11 +109,6 @@ static void __net_exit tipc_exit_net(struct net *net)
 {
tipc_detach_loopback(net);
tipc_net_stop(net);
-
-   /* Make sure the tipc_net_finalize_work stopped
-* before releasing the resources.
-*/
-   flush_scheduled_work();
tipc_bcast_stop(net);
tipc_nametbl_stop(net);
tipc_sk_rht_destroy(net);
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [net-next] tipc: fix use-after-free in tipc_bcast_get_mode

2020-08-31 Thread Hoang Huu Le
Thanks for pointing that out. I will revert the patch and submit another 
solution.

-Original Message-
From: Eric Dumazet  
Sent: Tuesday, September 1, 2020 12:08 AM
To: Hoang Huu Le ; jma...@redhat.com; 
ma...@donjonn.com; net...@vger.kernel.org; tipc-discussion@lists.sourceforge.net
Cc: syzbot+6ea1f7a8df64596ef...@syzkaller.appspotmail.com; 
syzbot+e9cc557752ab126c1...@syzkaller.appspotmail.com
Subject: Re: [net-next] tipc: fix use-after-free in tipc_bcast_get_mode



On 8/26/20 7:56 PM, Hoang Huu Le wrote:
> Syzbot has reported those issues as:
> 
> ==
> BUG: KASAN: use-after-free in tipc_bcast_get_mode+0x3ab/0x400 
> net/tipc/bcast.c:759
> Read of size 1 at addr 88805e6b3571 by task kworker/0:6/3850
> 
> CPU: 0 PID: 3850 Comm: kworker/0:6 Not tainted 5.8.0-rc7-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Workqueue: events tipc_net_finalize_work
> 
> Thread 1's call trace:
> [...]
>   kfree+0x103/0x2c0 mm/slab.c:3757 <- bcbase releasing
>   tipc_bcast_stop+0x1b0/0x2f0 net/tipc/bcast.c:721
>   tipc_exit_net+0x24/0x270 net/tipc/core.c:112
> [...]
> 
> Thread 2's call trace:
> [...]
>   tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759 <- bcbase
> has already been freed by Thread 1
> 
>   tipc_node_broadcast+0x9e/0xcc0 net/tipc/node.c:1744
>   tipc_nametbl_publish+0x60b/0x970 net/tipc/name_table.c:752
>   tipc_net_finalize net/tipc/net.c:141 [inline]
>   tipc_net_finalize+0x1fa/0x310 net/tipc/net.c:131
>   tipc_net_finalize_work+0x55/0x80 net/tipc/net.c:150
> [...]
> 
> ==
> BUG: KASAN: use-after-free in tipc_named_reinit+0xef/0x290 
> net/tipc/name_distr.c:344
> Read of size 8 at addr 888052ab2000 by task kworker/0:13/30628
> CPU: 0 PID: 30628 Comm: kworker/0:13 Not tainted 5.8.0-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Workqueue: events tipc_net_finalize_work
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1f0/0x31e lib/dump_stack.c:118
>  print_address_description+0x66/0x5a0 mm/kasan/report.c:383
>  __kasan_report mm/kasan/report.c:513 [inline]
>  kasan_report+0x132/0x1d0 mm/kasan/report.c:530
>  tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344
>  tipc_net_finalize+0x85/0xe0 net/tipc/net.c:138
>  tipc_net_finalize_work+0x50/0x70 net/tipc/net.c:150
>  process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
>  worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
>  kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
> [...]
> Freed by task 14058:
>  save_stack mm/kasan/common.c:48 [inline]
>  set_track mm/kasan/common.c:56 [inline]
>  kasan_set_free_info mm/kasan/common.c:316 [inline]
>  __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455
>  __cache_free mm/slab.c:3426 [inline]
>  kfree+0x10a/0x220 mm/slab.c:3757
>  tipc_exit_net+0x29/0x50 net/tipc/core.c:113
>  ops_exit_list net/core/net_namespace.c:186 [inline]
>  cleanup_net+0x708/0xba0 net/core/net_namespace.c:603
>  process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
>  worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
>  kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
> 
> Fix it by calling flush_scheduled_work() to make sure the
> tipc_net_finalize_work() stopped before releasing bcbase object.
> 
> Reported-by: syzbot+6ea1f7a8df64596ef...@syzkaller.appspotmail.com
> Reported-by: syzbot+e9cc557752ab126c1...@syzkaller.appspotmail.com
> Acked-by: Jon Maloy 
> Signed-off-by: Hoang Huu Le 
> ---
>  net/tipc/core.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/net/tipc/core.c b/net/tipc/core.c
> index 4f6dc74adf45..37d8695548cf 100644
> --- a/net/tipc/core.c
> +++ b/net/tipc/core.c
> @@ -109,6 +109,11 @@ static void __net_exit tipc_exit_net(struct net *net)
>  {
>   tipc_detach_loopback(net);
>   tipc_net_stop(net);
> +
> + /* Make sure the tipc_net_finalize_work stopped
> +  * before releasing the resources.
> +  */
> + flush_scheduled_work();
>   tipc_bcast_stop(net);
>   tipc_nametbl_stop(net);
>   tipc_sk_rht_destroy(net);
> 

Lockdep disagrees with this change.

==
WARNING: possible circular locking dependency detected
5.9.0-rc2-next-20200828-syzkaller #0 Not tainted
--
kworker/u4:5/197 is trying to acquire lock:
8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: 
flush

[tipc-discussion] [net-next] tipc: fix use-after-free in tipc_bcast_get_mode

2020-08-26 Thread Hoang Huu Le
Syzbot has reported those issues as:

==
BUG: KASAN: use-after-free in tipc_bcast_get_mode+0x3ab/0x400 
net/tipc/bcast.c:759
Read of size 1 at addr 88805e6b3571 by task kworker/0:6/3850

CPU: 0 PID: 3850 Comm: kworker/0:6 Not tainted 5.8.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events tipc_net_finalize_work

Thread 1's call trace:
[...]
  kfree+0x103/0x2c0 mm/slab.c:3757 <- bcbase releasing
  tipc_bcast_stop+0x1b0/0x2f0 net/tipc/bcast.c:721
  tipc_exit_net+0x24/0x270 net/tipc/core.c:112
[...]

Thread 2's call trace:
[...]
  tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759 <- bcbase
has already been freed by Thread 1

  tipc_node_broadcast+0x9e/0xcc0 net/tipc/node.c:1744
  tipc_nametbl_publish+0x60b/0x970 net/tipc/name_table.c:752
  tipc_net_finalize net/tipc/net.c:141 [inline]
  tipc_net_finalize+0x1fa/0x310 net/tipc/net.c:131
  tipc_net_finalize_work+0x55/0x80 net/tipc/net.c:150
[...]

==
BUG: KASAN: use-after-free in tipc_named_reinit+0xef/0x290 
net/tipc/name_distr.c:344
Read of size 8 at addr 888052ab2000 by task kworker/0:13/30628
CPU: 0 PID: 30628 Comm: kworker/0:13 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events tipc_net_finalize_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1f0/0x31e lib/dump_stack.c:118
 print_address_description+0x66/0x5a0 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report+0x132/0x1d0 mm/kasan/report.c:530
 tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344
 tipc_net_finalize+0x85/0xe0 net/tipc/net.c:138
 tipc_net_finalize_work+0x50/0x70 net/tipc/net.c:150
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
[...]
Freed by task 14058:
 save_stack mm/kasan/common.c:48 [inline]
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x10a/0x220 mm/slab.c:3757
 tipc_exit_net+0x29/0x50 net/tipc/core.c:113
 ops_exit_list net/core/net_namespace.c:186 [inline]
 cleanup_net+0x708/0xba0 net/core/net_namespace.c:603
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293

Fix it by calling flush_scheduled_work() to make sure
tipc_net_finalize_work() has stopped before releasing the bcbase object.

Reported-by: syzbot+6ea1f7a8df64596ef...@syzkaller.appspotmail.com
Reported-by: syzbot+e9cc557752ab126c1...@syzkaller.appspotmail.com
Acked-by: Jon Maloy 
Signed-off-by: Hoang Huu Le 
---
 net/tipc/core.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index 4f6dc74adf45..37d8695548cf 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -109,6 +109,11 @@ static void __net_exit tipc_exit_net(struct net *net)
 {
tipc_detach_loopback(net);
tipc_net_stop(net);
+
+   /* Make sure the tipc_net_finalize_work stopped
+* before releasing the resources.
+*/
+   flush_scheduled_work();
tipc_bcast_stop(net);
tipc_nametbl_stop(net);
tipc_sk_rht_destroy(net);
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [net-next] tipc: fix use-after-free in tipc_bcast_get_mode

2020-08-26 Thread Hoang Huu Le
Syzbot has reported those issues as:

==
BUG: KASAN: use-after-free in tipc_bcast_get_mode+0x3ab/0x400 
net/tipc/bcast.c:759
Read of size 1 at addr 88805e6b3571 by task kworker/0:6/3850

CPU: 0 PID: 3850 Comm: kworker/0:6 Not tainted 5.8.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events tipc_net_finalize_work

Thread 1's call trace:
[...]
  kfree+0x103/0x2c0 mm/slab.c:3757 <- bcbase releasing
  tipc_bcast_stop+0x1b0/0x2f0 net/tipc/bcast.c:721
  tipc_exit_net+0x24/0x270 net/tipc/core.c:112
[...]

Thread 2's call trace:
[...]
  tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759 <- bcbase
has already been freed by Thread 1

  tipc_node_broadcast+0x9e/0xcc0 net/tipc/node.c:1744
  tipc_nametbl_publish+0x60b/0x970 net/tipc/name_table.c:752
  tipc_net_finalize net/tipc/net.c:141 [inline]
  tipc_net_finalize+0x1fa/0x310 net/tipc/net.c:131
  tipc_net_finalize_work+0x55/0x80 net/tipc/net.c:150
[...]

==
BUG: KASAN: use-after-free in tipc_named_reinit+0xef/0x290 
net/tipc/name_distr.c:344
Read of size 8 at addr 888052ab2000 by task kworker/0:13/30628
CPU: 0 PID: 30628 Comm: kworker/0:13 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events tipc_net_finalize_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1f0/0x31e lib/dump_stack.c:118
 print_address_description+0x66/0x5a0 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report+0x132/0x1d0 mm/kasan/report.c:530
 tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344
 tipc_net_finalize+0x85/0xe0 net/tipc/net.c:138
 tipc_net_finalize_work+0x50/0x70 net/tipc/net.c:150
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
[...]
Freed by task 14058:
 save_stack mm/kasan/common.c:48 [inline]
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x10a/0x220 mm/slab.c:3757
 tipc_exit_net+0x29/0x50 net/tipc/core.c:113
 ops_exit_list net/core/net_namespace.c:186 [inline]
 cleanup_net+0x708/0xba0 net/core/net_namespace.c:603
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293

Fix it by calling flush_scheduled_work() to make sure
tipc_net_finalize_work() has stopped before releasing the bcbase object.

Reported-by: syzbot+6ea1f7a8df64596ef...@syzkaller.appspotmail.com
Reported-by: syzbot+e9cc557752ab126c1...@syzkaller.appspotmail.com
Acked-by: Jon Maloy 
Signed-off-by: Hoang Huu Le 
---
 net/tipc/core.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index 4f6dc74adf45..37d8695548cf 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -109,6 +109,11 @@ static void __net_exit tipc_exit_net(struct net *net)
 {
tipc_detach_loopback(net);
tipc_net_stop(net);
+
+   /* Make sure the tipc_net_finalize_work stopped
+* before releasing the resources.
+*/
+   flush_scheduled_work();
tipc_bcast_stop(net);
tipc_nametbl_stop(net);
tipc_sk_rht_destroy(net);
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [iproute2-next] tipc: support 128bit node identity for peer removing

2020-08-26 Thread Hoang Huu Le
From: Hoang Le 

Problem:
The upstream kernel added support for setting the node identity with
128 bits. However, we are still using the legacy format in the command
'tipc peer remove'. As a result, we hit a problem when trying to remove
an offline node, e.g.:

$ tipc node list
Node Identity                    Hash     State
d6babc1c1c6d 1cbcd7ca down

$ tipc peer remove address d6babc1c1c6d
invalid network address, syntax: Z.C.N
error: No such device or address

Solution:
We add support for removing a specific down node by its 128-bit
node identifier, as an alternative to the legacy 32-bit node address.

Acked-by: Jon Maloy 
Signed-off-by: Hoang Huu Le 
---
 tipc/peer.c | 53 -
 1 file changed, 52 insertions(+), 1 deletion(-)

diff --git a/tipc/peer.c b/tipc/peer.c
index f6380777033d..f14ec35e6f71 100644
--- a/tipc/peer.c
+++ b/tipc/peer.c
@@ -59,17 +59,68 @@ static int cmd_peer_rm_addr(struct nlmsghdr *nlh, const 
struct cmd *cmd,
return msg_doit(nlh, NULL, NULL);
 }
 
+static int cmd_peer_rm_nodeid(struct nlmsghdr *nlh, const struct cmd *cmd,
+ struct cmdl *cmdl, void *data)
+{
+   char buf[MNL_SOCKET_BUFFER_SIZE];
+   __u8 id[16] = {0,};
+   __u64 *w0 = (__u64 *)&id[0];
+   __u64 *w1 = (__u64 *)&id[8];
+   struct nlattr *nest;
+   char *str;
+
+   if (cmdl->argc != cmdl->optind + 1) {
+   fprintf(stderr, "Usage: %s peer remove identity NODEID\n",
+   cmdl->argv[0]);
+   return -EINVAL;
+   }
+
+   str = shift_cmdl(cmdl);
+   if (str2nodeid(str, id)) {
+   fprintf(stderr, "Invalid node identity\n");
+   return -EINVAL;
+   }
+
+   nlh = msg_init(buf, TIPC_NL_PEER_REMOVE);
+   if (!nlh) {
+   fprintf(stderr, "error, message initialisation failed\n");
+   return -1;
+   }
+
+   nest = mnl_attr_nest_start(nlh, TIPC_NLA_NET);
+   mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID, *w0);
+   mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID_W1, *w1);
+   mnl_attr_nest_end(nlh, nest);
+
+   return msg_doit(nlh, NULL, NULL);
+}
+
 static void cmd_peer_rm_help(struct cmdl *cmdl)
+{
+   fprintf(stderr, "Usage: %s peer remove PROPERTY\n\n"
+   "PROPERTIES\n"
+   " identity NODEID - Remove peer node identity\n",
+   cmdl->argv[0]);
+}
+
+static void cmd_peer_rm_addr_help(struct cmdl *cmdl)
 {
fprintf(stderr, "Usage: %s peer remove address ADDRESS\n",
cmdl->argv[0]);
 }
 
+static void cmd_peer_rm_nodeid_help(struct cmdl *cmdl)
+{
+   fprintf(stderr, "Usage: %s peer remove identity NODEID\n",
+   cmdl->argv[0]);
+}
+
 static int cmd_peer_rm(struct nlmsghdr *nlh, const struct cmd *cmd,
struct cmdl *cmdl, void *data)
 {
const struct cmd cmds[] = {
-   { "address",cmd_peer_rm_addr,   cmd_peer_rm_help },
+   { "address",  cmd_peer_rm_addr,   cmd_peer_rm_addr_help },
+   { "identity", cmd_peer_rm_nodeid, cmd_peer_rm_nodeid_help },
{ NULL }
};
 
-- 
2.25.1
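
With this change applied, the failing example from the problem description
should work by addressing the down node through its 128-bit identity, along
the lines of:

  $ tipc peer remove identity d6babc1c1c6d

(the node identity value here is just the one from the example above).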



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [PATCH] tipc: fix use-after-free in tipc_bcast_get_mode

2020-08-25 Thread Hoang Huu Le
Syzbot has reported those issues as:

==
BUG: KASAN: use-after-free in tipc_bcast_get_mode+0x3ab/0x400 
net/tipc/bcast.c:759
Read of size 1 at addr 88805e6b3571 by task kworker/0:6/3850

CPU: 0 PID: 3850 Comm: kworker/0:6 Not tainted 5.8.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events tipc_net_finalize_work

Thread 1's call trace:
[...]
  kfree+0x103/0x2c0 mm/slab.c:3757 <- bcbase releasing
  tipc_bcast_stop+0x1b0/0x2f0 net/tipc/bcast.c:721
  tipc_exit_net+0x24/0x270 net/tipc/core.c:112
[...]

Thread 2's call trace:
[...]
  tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759 <- bcbase
has already been freed by Thread 1

  tipc_node_broadcast+0x9e/0xcc0 net/tipc/node.c:1744
  tipc_nametbl_publish+0x60b/0x970 net/tipc/name_table.c:752
  tipc_net_finalize net/tipc/net.c:141 [inline]
  tipc_net_finalize+0x1fa/0x310 net/tipc/net.c:131
  tipc_net_finalize_work+0x55/0x80 net/tipc/net.c:150
[...]

==
BUG: KASAN: use-after-free in tipc_named_reinit+0xef/0x290 
net/tipc/name_distr.c:344
Read of size 8 at addr 888052ab2000 by task kworker/0:13/30628
CPU: 0 PID: 30628 Comm: kworker/0:13 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events tipc_net_finalize_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1f0/0x31e lib/dump_stack.c:118
 print_address_description+0x66/0x5a0 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report+0x132/0x1d0 mm/kasan/report.c:530
 tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344
 tipc_net_finalize+0x85/0xe0 net/tipc/net.c:138
 tipc_net_finalize_work+0x50/0x70 net/tipc/net.c:150
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
[...]
Freed by task 14058:
 save_stack mm/kasan/common.c:48 [inline]
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x10a/0x220 mm/slab.c:3757
 tipc_exit_net+0x29/0x50 net/tipc/core.c:113
 ops_exit_list net/core/net_namespace.c:186 [inline]
 cleanup_net+0x708/0xba0 net/core/net_namespace.c:603
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293

Solution 1 (preferred):
Fix it by calling flush_scheduled_work() to make sure
tipc_net_finalize_work() has stopped before releasing the bcbase object.

Solution 2:
Fix it by introducing a bit flag and returning if flag is zero
(object had already been freed)

Reported-by: syzbot+6ea1f7a8df64596ef...@syzkaller.appspotmail.com
Reported-by: syzbot+e9cc557752ab126c1...@syzkaller.appspotmail.com
Signed-off-by: Hoang Huu Le 
---
 net/tipc/core.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index 4f6dc74adf45..37d8695548cf 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -109,6 +109,11 @@ static void __net_exit tipc_exit_net(struct net *net)
 {
tipc_detach_loopback(net);
tipc_net_stop(net);
+
+   /* Make sure the tipc_net_finalize_work stopped
+* before releasing the resources.
+*/
+   flush_scheduled_work();
tipc_bcast_stop(net);
tipc_nametbl_stop(net);
tipc_sk_rht_destroy(net);
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [PATCH] tipc: fix use-after-free in tipc_bcast_get_mode

2020-08-25 Thread Hoang Huu Le
Syzbot has reported those issues as:

==
BUG: KASAN: use-after-free in tipc_bcast_get_mode+0x3ab/0x400 
net/tipc/bcast.c:759
Read of size 1 at addr 88805e6b3571 by task kworker/0:6/3850

CPU: 0 PID: 3850 Comm: kworker/0:6 Not tainted 5.8.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events tipc_net_finalize_work

Thread 1's call trace:
[...]
  kfree+0x103/0x2c0 mm/slab.c:3757 <- bcbase releasing
  tipc_bcast_stop+0x1b0/0x2f0 net/tipc/bcast.c:721
  tipc_exit_net+0x24/0x270 net/tipc/core.c:112
[...]

Thread 2's call trace:
[...]
  tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759 <- bcbase
has already been freed by Thread 1

  tipc_node_broadcast+0x9e/0xcc0 net/tipc/node.c:1744
  tipc_nametbl_publish+0x60b/0x970 net/tipc/name_table.c:752
  tipc_net_finalize net/tipc/net.c:141 [inline]
  tipc_net_finalize+0x1fa/0x310 net/tipc/net.c:131
  tipc_net_finalize_work+0x55/0x80 net/tipc/net.c:150
[...]

==
BUG: KASAN: use-after-free in tipc_named_reinit+0xef/0x290 
net/tipc/name_distr.c:344
Read of size 8 at addr 888052ab2000 by task kworker/0:13/30628
CPU: 0 PID: 30628 Comm: kworker/0:13 Not tainted 5.8.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events tipc_net_finalize_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1f0/0x31e lib/dump_stack.c:118
 print_address_description+0x66/0x5a0 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report+0x132/0x1d0 mm/kasan/report.c:530
 tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344
 tipc_net_finalize+0x85/0xe0 net/tipc/net.c:138
 tipc_net_finalize_work+0x50/0x70 net/tipc/net.c:150
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
[...]
Freed by task 14058:
 save_stack mm/kasan/common.c:48 [inline]
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x10a/0x220 mm/slab.c:3757
 tipc_exit_net+0x29/0x50 net/tipc/core.c:113
 ops_exit_list net/core/net_namespace.c:186 [inline]
 cleanup_net+0x708/0xba0 net/core/net_namespace.c:603
 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293

Solution 1 (preferred):
Fix it by calling flush_scheduled_work() to make sure
tipc_net_finalize_work() has stopped before releasing the bcbase object.

Solution 2:
Fix it by introducing a bit flag and returning if flag is zero
(object had already been freed)

Reported-by: syzbot+6ea1f7a8df64596ef...@syzkaller.appspotmail.com
Reported-by: syzbot+e9cc557752ab126c1...@syzkaller.appspotmail.com
Signed-off-by: Hoang Huu Le 
---
 net/tipc/bcast.c | 1 +
 net/tipc/core.c  | 1 +
 net/tipc/core.h  | 1 +
 net/tipc/net.c   | 3 +++
 4 files changed, 6 insertions(+)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 940d176e0e87..56b624c8b6d4 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -718,6 +718,7 @@ void tipc_bcast_stop(struct net *net)
struct tipc_net *tn = net_generic(net, tipc_net_id);
 
synchronize_net();
+   clear_bit_unlock(0, &tn->net_exit_flag);
kfree(tn->bcbase);
kfree(tn->bcl);
 }
diff --git a/net/tipc/core.c b/net/tipc/core.c
index 4f6dc74adf45..93ea7dc05bf2 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -60,6 +60,7 @@ static int __net_init tipc_init_net(struct net *net)
tn->trial_addr = 0;
tn->addr_trial_end = 0;
tn->capabilities = TIPC_NODE_CAPABILITIES;
+   test_and_set_bit_lock(0, &tn->net_exit_flag);
memset(tn->node_id, 0, sizeof(tn->node_id));
memset(tn->node_id_string, 0, sizeof(tn->node_id_string));
tn->mon_threshold = TIPC_DEF_MON_THRESHOLD;
diff --git a/net/tipc/core.h b/net/tipc/core.h
index 631d83c9705f..aa75882dd932 100644
--- a/net/tipc/core.h
+++ b/net/tipc/core.h
@@ -143,6 +143,7 @@ struct tipc_net {
/* TX crypto handler */
struct tipc_crypto *crypto_tx;
 #endif
+   unsigned long net_exit_flag;
 };
 
 static inline struct tipc_net *tipc_net(struct net *net)
diff --git a/net/tipc/net.c b/net/tipc/net.c
index 85400e4242de..0dcbfcff5ad3 100644
--- a/net/tipc/net.c
+++ b/net/tipc/net.c
@@ -132,6 +132,9 @@ static void tipc_net_finalize(struct net *net, u32 addr)
 {
struct tipc_net *tn = tipc_net(net);
 
+   if (unlikely(!test_bit(0, &tn->net_exit_flag)))
+   return;
+
if (cmpxchg(&tn->node_addr, 0, addr))
  

[tipc-discussion] [iproute2-next v2] tipc: support 128bit node identity for peer removing

2020-08-24 Thread Hoang Huu Le
From: Hoang Le 

Problem:
The upstream kernel added support for setting the node identity with
128 bits. However, we are still using the legacy format in the command
'tipc peer remove'. As a result, we hit a problem when trying to remove
an offline node, e.g.:

$ tipc node list
Node Identity                    Hash     State
d6babc1c1c6d 1cbcd7ca down

$ tipc peer remove address d6babc1c1c6d
invalid network address, syntax: Z.C.N
error: No such device or address

Solution:
We add support for removing a specific down node by its 128-bit
node identifier, as an alternative to the legacy 32-bit node address.

Signed-off-by: Hoang Le 
Signed-off-by: Hoang Huu Le 
---
 tipc/peer.c | 53 -
 1 file changed, 52 insertions(+), 1 deletion(-)

diff --git a/tipc/peer.c b/tipc/peer.c
index f6380777033d..f14ec35e6f71 100644
--- a/tipc/peer.c
+++ b/tipc/peer.c
@@ -59,17 +59,68 @@ static int cmd_peer_rm_addr(struct nlmsghdr *nlh, const 
struct cmd *cmd,
return msg_doit(nlh, NULL, NULL);
 }
 
+static int cmd_peer_rm_nodeid(struct nlmsghdr *nlh, const struct cmd *cmd,
+ struct cmdl *cmdl, void *data)
+{
+   char buf[MNL_SOCKET_BUFFER_SIZE];
+   __u8 id[16] = {0,};
+   __u64 *w0 = (__u64 *)&id[0];
+   __u64 *w1 = (__u64 *)&id[8];
+   struct nlattr *nest;
+   char *str;
+
+   if (cmdl->argc != cmdl->optind + 1) {
+   fprintf(stderr, "Usage: %s peer remove identity NODEID\n",
+   cmdl->argv[0]);
+   return -EINVAL;
+   }
+
+   str = shift_cmdl(cmdl);
+   if (str2nodeid(str, id)) {
+   fprintf(stderr, "Invalid node identity\n");
+   return -EINVAL;
+   }
+
+   nlh = msg_init(buf, TIPC_NL_PEER_REMOVE);
+   if (!nlh) {
+   fprintf(stderr, "error, message initialisation failed\n");
+   return -1;
+   }
+
+   nest = mnl_attr_nest_start(nlh, TIPC_NLA_NET);
+   mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID, *w0);
+   mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID_W1, *w1);
+   mnl_attr_nest_end(nlh, nest);
+
+   return msg_doit(nlh, NULL, NULL);
+}
+
 static void cmd_peer_rm_help(struct cmdl *cmdl)
+{
+   fprintf(stderr, "Usage: %s peer remove PROPERTY\n\n"
+   "PROPERTIES\n"
+   " identity NODEID - Remove peer node identity\n",
+   cmdl->argv[0]);
+}
+
+static void cmd_peer_rm_addr_help(struct cmdl *cmdl)
 {
fprintf(stderr, "Usage: %s peer remove address ADDRESS\n",
cmdl->argv[0]);
 }
 
+static void cmd_peer_rm_nodeid_help(struct cmdl *cmdl)
+{
+   fprintf(stderr, "Usage: %s peer remove identity NODEID\n",
+   cmdl->argv[0]);
+}
+
 static int cmd_peer_rm(struct nlmsghdr *nlh, const struct cmd *cmd,
struct cmdl *cmdl, void *data)
 {
const struct cmd cmds[] = {
-   { "address",cmd_peer_rm_addr,   cmd_peer_rm_help },
+   { "address",  cmd_peer_rm_addr,   cmd_peer_rm_addr_help },
+   { "identity", cmd_peer_rm_nodeid, cmd_peer_rm_nodeid_help },
{ NULL }
};
 
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [iproute2-next] tipc: support 128bit node identity for peer removing

2020-08-24 Thread Hoang Huu Le
From: Hoang Le 

Problem:
The upstream kernel added support for setting the node identity with
128 bits. However, we are still using the legacy format in the command
'tipc peer remove'. As a result, we hit a problem when trying to remove
an offline node, e.g.:

Node Identity                    Hash     State
d6babc1c1c6d 1cbcd7ca down

invalid network address, syntax: Z.C.N
error: No such device or address

Solution:
We add support for removing a specific down node by its 128-bit
node identifier, as an alternative to the legacy 32-bit node address.

Signed-off-by: Hoang Le 
Signed-off-by: Hoang Huu Le 
---
 tipc/peer.c | 53 -
 1 file changed, 52 insertions(+), 1 deletion(-)

diff --git a/tipc/peer.c b/tipc/peer.c
index f6380777033d..f14ec35e6f71 100644
--- a/tipc/peer.c
+++ b/tipc/peer.c
@@ -59,17 +59,68 @@ static int cmd_peer_rm_addr(struct nlmsghdr *nlh, const 
struct cmd *cmd,
return msg_doit(nlh, NULL, NULL);
 }
 
+static int cmd_peer_rm_nodeid(struct nlmsghdr *nlh, const struct cmd *cmd,
+ struct cmdl *cmdl, void *data)
+{
+   char buf[MNL_SOCKET_BUFFER_SIZE];
+   __u8 id[16] = {0,};
+   __u64 *w0 = (__u64 *)&id[0];
+   __u64 *w1 = (__u64 *)&id[8];
+   struct nlattr *nest;
+   char *str;
+
+   if (cmdl->argc != cmdl->optind + 1) {
+   fprintf(stderr, "Usage: %s peer remove identity NODEID\n",
+   cmdl->argv[0]);
+   return -EINVAL;
+   }
+
+   str = shift_cmdl(cmdl);
+   if (str2nodeid(str, id)) {
+   fprintf(stderr, "Invalid node identity\n");
+   return -EINVAL;
+   }
+
+   nlh = msg_init(buf, TIPC_NL_PEER_REMOVE);
+   if (!nlh) {
+   fprintf(stderr, "error, message initialisation failed\n");
+   return -1;
+   }
+
+   nest = mnl_attr_nest_start(nlh, TIPC_NLA_NET);
+   mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID, *w0);
+   mnl_attr_put_u64(nlh, TIPC_NLA_NET_NODEID_W1, *w1);
+   mnl_attr_nest_end(nlh, nest);
+
+   return msg_doit(nlh, NULL, NULL);
+}
+
 static void cmd_peer_rm_help(struct cmdl *cmdl)
+{
+   fprintf(stderr, "Usage: %s peer remove PROPERTY\n\n"
+   "PROPERTIES\n"
+   " identity NODEID - Remove peer node identity\n",
+   cmdl->argv[0]);
+}
+
+static void cmd_peer_rm_addr_help(struct cmdl *cmdl)
 {
fprintf(stderr, "Usage: %s peer remove address ADDRESS\n",
cmdl->argv[0]);
 }
 
+static void cmd_peer_rm_nodeid_help(struct cmdl *cmdl)
+{
+   fprintf(stderr, "Usage: %s peer remove identity NODEID\n",
+   cmdl->argv[0]);
+}
+
 static int cmd_peer_rm(struct nlmsghdr *nlh, const struct cmd *cmd,
struct cmdl *cmdl, void *data)
 {
const struct cmd cmds[] = {
-   { "address",cmd_peer_rm_addr,   cmd_peer_rm_help },
+   { "address",  cmd_peer_rm_addr,   cmd_peer_rm_addr_help },
+   { "identity", cmd_peer_rm_nodeid, cmd_peer_rm_nodeid_help },
{ NULL }
};
 
-- 
2.25.1



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


[tipc-discussion] [iproute2-next] tipc: fixed a compile warning in tipc/link.c

2020-07-08 Thread Hoang Huu Le
Fixes: 5027f233e35b ("tipc: add link broadcast get")
Signed-off-by: Hoang Huu Le 
---
 tipc/link.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tipc/link.c b/tipc/link.c
index ba77a20152ea..192736eaa154 100644
--- a/tipc/link.c
+++ b/tipc/link.c
@@ -217,7 +217,7 @@ static int cmd_link_get_bcast_cb(const struct nlmsghdr 
*nlh, void *data)
print_string(PRINT_ANY, "method", "%s", "AUTOSELECT");
close_json_object();
open_json_object(NULL);
-   print_uint(PRINT_ANY, "ratio", " ratio:%u%\n",
+   print_uint(PRINT_ANY, "ratio", " ratio:%u\n",
   mnl_attr_get_u32(props[prop_ratio]));
break;
default:
-- 
2.25.1
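
Side note on the warning itself: in " ratio:%u%\n" the trailing '%' starts a
new conversion specification, which gcc flags with -Wformat. If the intent was
to actually print a literal percent sign after the value, the conversion-safe
spelling would be something like:

  print_uint(PRINT_ANY, "ratio", " ratio:%u%%\n",
             mnl_attr_get_u32(props[prop_ratio]));

but simply dropping the stray '%', as this patch does, also silences the
warning.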



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [next-net v6] tipc: update a binding service via broadcast

2020-06-09 Thread Hoang Huu Le



-Original Message-
From: Jon Maloy  
Sent: Monday, June 8, 2020 8:33 PM
To: Hoang Huu Le ; ma...@donjonn.com; 
ying@windriver.com; tipc-discussion@lists.sourceforge.net
Subject: Re: [next-net v6] tipc: update a binding service via broadcast



On 6/6/20 11:10 PM, Hoang Huu Le wrote:
> -Original Message-
> From: Jon Maloy 
> Sent: Friday, June 5, 2020 8:03 PM
> To: Hoang Huu Le ; ma...@donjonn.com; 
> ying@windriver.com; tipc-discussion@lists.sourceforge.net
> Subject: Re: [next-net v6] tipc: update a binding service via broadcast
>
>
>
> On 6/5/20 3:52 AM, Hoang Huu Le wrote:
>> Currently, updating binding table (add service binding to
>> name table/withdraw a service binding) is being sent over replicast.
>> However, if we are scaling up clusters to > 100 nodes/containers this
>> method is less efficient because it loops through the nodes in the cluster
>> one by one.
[...]
>> +if (*open && (*rcv_nxt == seqno)) {
>> +(*rcv_nxt)++;
>> +__skb_unlink(skb, namedq);
>> +return skb;
>> +}
>> +
>> +if (less(seqno, *rcv_nxt)) {
>> +__skb_unlink(skb, namedq);
>> +kfree_skb(skb);
>> +continue;
> Still not needed. This queue should be flushed in
> tipc_node_lost_contact(), which I now see we don't do.
> [Hoang] Yes, that's right. I will verify and send it out.
>
> This has to be fixed too.
> ///jon
I hate to admit it, but we might actually need this test after all. 
Imagine that somebody has done 'publish' just after the broadcast link 
came up (in tipc_bcast_add_peer()), but before tipc_named_node_up() is
called. The context of those two calls is not atomic, so I think it is 
possible that this publication might end up both in the bcast_link 
backlog queue and in the bulk distribution.
This publication message will have a named_seqno that is lower than the 
agreed synch point, and should be dropped at reception.

Given the crucial role of the binding table for the overall TIPC
functionality, I think it is better to be safe than sorry here and keep
this test.
[Hoang] Finally, I was able to reproduce the problem in exactly the scenario described above:

357 if (less(seqno, *rcv_nxt)) {
358 pr_info("DROP[%x->%x]: %s blk %d lblk %d nxt %d 
legacy %d seqno %u bc %u hsz %u dsz %u qlen %u\n",
359 msg_orignode(hdr), tipc_own_addr(net),
360 msg_type(hdr) == PUBLICATION ? 
"PUBL":"DRAW",
361 msg_is_bulk(hdr), msg_is_last_bulk(hdr),
362 *rcv_nxt, msg_is_legacy(hdr),
363 msg_named_seqno(hdr), msg_non_seq(hdr),
364 msg_hdr_sz(hdr), msg_data_sz(hdr),
365 skb_queue_len(namedq));
366
367 __skb_unlink(skb, namedq);
368 kfree_skb(skb);
369 continue;
370 }

---
[12528.036895] tipc: Established link <1001024:eth0-1001001:brtipc> on network 
plane A
[12528.043857] tipc: Established link <1001002:brtipc-1001001:brtipc> on 
network plane A
[12528.136462] tipc: DROP[1001001->1001002]: DRAW blk 0 lblk 0 nxt 3895 legacy 
0 seqno 3878 bc 0 hsz 40 dsz 20 qlen 23
[12528.140864] tipc: DROP[1001001->1001002]: DRAW blk 0 lblk 0 nxt 3895 legacy 
0 seqno 3879 bc 0 hsz 40 dsz 20 qlen 22
[...]
[12528.210959] tipc: DROP[1001001->1001002]: DRAW blk 0 lblk 0 nxt 3895 legacy 
0 seqno 3893 bc 0 hsz 40 dsz 20 qlen 8
[12528.218903] tipc: DROP[1001001->1001002]: DRAW blk 0 lblk 0 nxt 3895 legacy 
0 seqno 3894 bc 0 hsz 40 dsz 20 qlen 7
[12528.227214] tipc: DROP[1001001->1001024]: DRAW blk 0 lblk 0 nxt 3895 legacy 
0 seqno 3878 bc 0 hsz 40 dsz 20 qlen 23
[12528.231285] tipc: DROP[1001001->1001024]: DRAW blk 0 lblk 0 nxt 3895 legacy 
0 seqno 3879 bc 0 hsz 40 dsz 20 qlen 22
[...]
[12528.277445] tipc: DROP[1001001->1001024]: DRAW blk 0 lblk 0 nxt 3895 legacy 
0 seqno 3893 bc 0 hsz 40 dsz 20 qlen 8
[12528.280847] tipc: DROP[1001001->1001024]: DRAW blk 0 lblk 0 nxt 3895 legacy 
0 seqno 3894 bc 0 hsz 40 dsz 20 qlen 7 
---  
I will re-post the patch including the test as well.

///jon

>> +}
>> +}
>> +return NULL;
>> +}
>> +
>>/**
>> * tipc_named_rcv - process name table update messages sent by another 
>> node
>> */
>> -void tipc_named_rcv(struct net *net, struct sk_buff_head *inputq)
>> +void tipc_named_rcv(struct net *net, struct sk_buff_head *namedq,
>> +u16 *rcv_nxt, bool *open)
>>{
>> -st

Re: [tipc-discussion] [net-next] tipc: update a binding service via broadcast

2020-06-07 Thread Hoang Huu Le
Thanks Jon!
Hoang
-Original Message-
From: Jon Maloy  
Sent: Monday, June 8, 2020 8:57 AM
To: Hoang Huu Le ; 
tipc-discussion@lists.sourceforge.net; tipc-dek 
Subject: Re: [tipc-discussion] [net-next] tipc: update a binding service via 
broadcast

In case you are not aware of this, check this link before you send 
anything to net-next.

http://vger.kernel.org/~davem/net-next.html

///jon



On 6/7/20 9:50 PM, Hoang Huu Le wrote:
>
> -Original Message-
> From: Jon Maloy 
> Sent: Monday, June 8, 2020 2:14 AM
> To: tipc-discussion@lists.sourceforge.net
> Subject: Re: [tipc-discussion] [net-next] tipc: update a binding service via 
> broadcast
>
>
>
> On 6/7/20 3:03 PM, Jon Maloy wrote:
>>
>> On 6/7/20 12:24 AM, Hoang Huu Le wrote:
>>> Currently, updating binding table (add service binding to
>>> name table/withdraw a service binding) is being sent over replicast.
>>> However, if we are scaling up clusters to > 100 nodes/containers this
>>> method is less efficient because it loops through the nodes in the
>>> cluster one by one.
>>>
>>> It is worth using broadcast to update a binding service. This way, the
>>> binding table can be updated on all peer nodes in one shot.
>>>
>>> Broadcast is used when all peer nodes, as indicated by a new capability
>>> flag TIPC_NAMED_BCAST, support reception of this message type.
>>>
>>> Four problems need to be considered when introducing this feature.
>>> 1) When establishing a link to a new peer node we still update this by a
>>> unicast 'bulk' update. This may lead to race conditions, where a later
>>> broadcast publication/withdrawal bypasses the 'bulk', resulting in
>>> disordered publications, or even that a withdrawal may arrive before the
>>> corresponding publication. We solve this by adding an 'is_last_bulk' bit
>>> in the last bulk messages so that it can be distinguished from all other
>>> messages. Only when this message has arrived do we open up for reception
>>> of broadcast publications/withdrawals.
>> Add a line feed between these paragraphs before you send the patch.
>> Otherwise, still acked by me.
>>
>> ///jon
> Oh, already posted... Just ignore my comment above.
> [Hoang] net-next is closed. I will re-post the patch later with your 
> suggestion.
>
> ///jon
>>> 2) When a first legacy node is added to the cluster all distribution
>>> will switch over to use the legacy 'replicast' method, while the
>>> opposite happens when the last legacy node leaves the cluster. This
>>> entails another risk of message disordering that has to be handled. We
>>> solve this by adding a sequence number to the broadcast/replicast
>>> messages, so that disordering can be discovered and corrected. Note
>>> however that we don't need to consider potential message loss or
>>> duplication at this protocol level.
>>> 3) Bulk messages don't contain any sequence numbers, and will always
>>> arrive in order. Hence we must exempt those from the sequence number
>>> control and deliver them unconditionally. We solve this by adding a new
>>> 'is_bulk' bit in those messages so that they can be recognized.
>>> 4) Legacy messages, which don't contain any new bits or sequence
>>> numbers, but neither can arrive out of order, also need to be exempt
>>> from the initial synchronization and sequence number check, and
>>> delivered unconditionally. Therefore, we add another 'is_not_legacy' bit
>>> to all new messages so that those can be distinguished from legacy
>>> messages and the latter delivered directly.
>>>
>>> Signed-off-by: Hoang Huu Le 
>>> Acked-by: Jon Maloy 
>>> ---
>>>    net/tipc/bcast.c  |   6 +--
>>>    net/tipc/bcast.h  |   4 +-
>>>    net/tipc/link.c   |   2 +-
>>>    net/tipc/msg.h    |  40 
>>>    net/tipc/name_distr.c | 109 +++---
>>>    net/tipc/name_distr.h |   9 ++--
>>>    net/tipc/name_table.c |   9 +++-
>>>    net/tipc/name_table.h |   2 +
>>>    net/tipc/node.c   |  29 ---
>>>    net/tipc/node.h   |   8 ++--
>>>    10 files changed, 170 insertions(+), 48 deletions(-)
>>>
>>> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
>>> index 383f87bc1061..940d176e0e87 100644
>>> --- a/net/tipc/bcast.c
>>> +++ b/net/tipc/bcast.c
>>> @@ -250,8 +250,8 @@ static void tipc_bcast_select_xmit_method(struct
>>> net *net, int dests,
>>>

Re: [tipc-discussion] [net-next] tipc: update a binding service via broadcast

2020-06-07 Thread Hoang Huu Le


-Original Message-
From: Jon Maloy  
Sent: Monday, June 8, 2020 2:14 AM
To: tipc-discussion@lists.sourceforge.net
Subject: Re: [tipc-discussion] [net-next] tipc: update a binding service via 
broadcast



On 6/7/20 3:03 PM, Jon Maloy wrote:
>
>
> On 6/7/20 12:24 AM, Hoang Huu Le wrote:
>> Currently, updating binding table (add service binding to
>> name table/withdraw a service binding) is being sent over replicast.
>> However, if we are scaling up clusters to > 100 nodes/containers this
>> method is less efficient because it loops through the nodes in the
>> cluster one by one.
>>
>> It is worth using broadcast to update a binding service. This way, the
>> binding table can be updated on all peer nodes in one shot.
>>
>> Broadcast is used when all peer nodes, as indicated by a new capability
>> flag TIPC_NAMED_BCAST, support reception of this message type.
>>
>> Four problems need to be considered when introducing this feature.
>> 1) When establishing a link to a new peer node we still update this by a
>> unicast 'bulk' update. This may lead to race conditions, where a later
>> broadcast publication/withdrawal bypass the 'bulk', resulting in
>> disordered publications, or even that a withdrawal may arrive before the
>> corresponding publication. We solve this by adding an 'is_last_bulk' bit
>> in the last bulk messages so that it can be distinguished from all other
>> messages. Only when this message has arrived do we open up for reception
>> of broadcast publications/withdrawals.
> Add a line feed between these paragraphs before you send the patch.
> Otherwise, still acked by me.
>
> ///jon
Oh, already posted... Just ignore my comment above.
[Hoang] net-next is closed. I will re-post the patch later with your suggestion.

///jon
>
>> 2) When a first legacy node is added to the cluster all distribution
>> will switch over to use the legacy 'replicast' method, while the
>> opposite happens when the last legacy node leaves the cluster. This
>> entails another risk of message disordering that has to be handled. We
>> solve this by adding a sequence number to the broadcast/replicast
>> messages, so that disordering can be discovered and corrected. Note
>> however that we don't need to consider potential message loss or
>> duplication at this protocol level.
>> 3) Bulk messages don't contain any sequence numbers, and will always
>> arrive in order. Hence we must exempt those from the sequence number
>> control and deliver them unconditionally. We solve this by adding a new
>> 'is_bulk' bit in those messages so that they can be recognized.
>> 4) Legacy messages, which don't contain any new bits or sequence
>> numbers, but neither can arrive out of order, also need to be exempt
>> from the initial synchronization and sequence number check, and
>> delivered unconditionally. Therefore, we add another 'is_not_legacy' bit
>> to all new messages so that those can be distinguished from legacy
>> messages and the latter delivered directly.
>>
>> Signed-off-by: Hoang Huu Le 
>> Acked-by: Jon Maloy 
>> ---
>>   net/tipc/bcast.c  |   6 +--
>>   net/tipc/bcast.h  |   4 +-
>>   net/tipc/link.c   |   2 +-
>>   net/tipc/msg.h    |  40 
>>   net/tipc/name_distr.c | 109 +++---
>>   net/tipc/name_distr.h |   9 ++--
>>   net/tipc/name_table.c |   9 +++-
>>   net/tipc/name_table.h |   2 +
>>   net/tipc/node.c   |  29 ---
>>   net/tipc/node.h   |   8 ++--
>>   10 files changed, 170 insertions(+), 48 deletions(-)
>>
>> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
>> index 383f87bc1061..940d176e0e87 100644
>> --- a/net/tipc/bcast.c
>> +++ b/net/tipc/bcast.c
>> @@ -250,8 +250,8 @@ static void tipc_bcast_select_xmit_method(struct 
>> net *net, int dests,
>>    * Consumes the buffer chain.
>>    * Returns 0 if success, otherwise errno: -EHOSTUNREACH,-EMSGSIZE
>>    */
>> -static int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
>> -   u16 *cong_link_cnt)
>> +int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
>> +    u16 *cong_link_cnt)
>>   {
>>   struct tipc_link *l = tipc_bc_sndlink(net);
>>   struct sk_buff_head xmitq;
>> @@ -752,7 +752,7 @@ void tipc_nlist_purge(struct tipc_nlist *nl)
>>   nl->local = false;
>>   }
>>   -u32 tipc_bcast_get_broadcast_mode(struct net *net)
>> +u32 tipc_bcast_get_mode(struct net *net)
>>   {
>>   struct tipc_bc_base *bb = tipc_bc_base(ne

Re: [tipc-discussion] [next-net v6] tipc: update a binding service via broadcast

2020-06-07 Thread Hoang Huu Le



-Original Message-
From: Jon Maloy  
Sent: Monday, June 8, 2020 2:12 AM
To: Hoang Huu Le ; ma...@donjonn.com; 
ying@windriver.com; tipc-discussion@lists.sourceforge.net
Subject: Re: [next-net v6] tipc: update a binding service via broadcast



On 6/6/20 11:10 PM, Hoang Huu Le wrote:
> -Original Message-
> From: Jon Maloy 
> Sent: Friday, June 5, 2020 8:03 PM
> To: Hoang Huu Le ; ma...@donjonn.com; 
> ying@windriver.com; tipc-discussion@lists.sourceforge.net
> Subject: Re: [next-net v6] tipc: update a binding service via broadcast
>
>
>
> On 6/5/20 3:52 AM, Hoang Huu Le wrote:
>> Currently, updating binding table (add service binding to
>> name table/withdraw a service binding) is being sent over replicast.
>> However, if we are scaling up clusters to > 100 nodes/containers this
>> method is less affection because of looping through nodes in a cluster one
>> by one.
>>
[...]
> Still not needed. This queue should be flushed in
> tipc_node_lost_contact(), which I now see we don't do.
> [Hoang] Yes, that's right. I will verify and send it out.
Actually, this might explain the mysterious "Failed to withdraw" 
printouts you observed during testing earlier.
Those withdraw items might be from a previous session just lingering in 
the queue.
On the other hand, such a bug is so obvious and would have such grave 
consequences (what if there are old 'publish' items in the queue?) that 
I find it hard to believe that it could have remained unnoticed all this time.
Are you sure we are not cleaning up this queue somewhere else?

If it really is so we must also issue a correction patch to 'net' for 
this issue.
[Hoang] Yes, I checked, and the cleanup is already done in v2-3 of this feature 
... So we should apply this to 'net' too.
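
A minimal sketch of that flush, assuming the peer's deferred queue
(n->bc_entry.namedq) can simply be purged when contact is lost; the helper
name is illustrative and locking is left out:

/* Sketch: drop all deferred binding-table updates from a peer that has
 * lost contact; the bulk sent when the link comes back up rebuilds the
 * state.  Would be called from node_lost_contact(), next to the existing
 * broadcast-state cleanup.
 */
static void tipc_node_flush_namedq(struct tipc_node *n)
{
        skb_queue_purge(&n->bc_entry.namedq);
}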

///jon


>
> This has to be fixed too.
> ///jon
>> +}
>> +}
>> +return NULL;
>> +}
>> +
>>/**
>> * tipc_named_rcv - process name table update messages sent by another 
>> node
>> */
>> -void tipc_named_rcv(struct net *net, struct sk_buff_head *inputq)
>> +void tipc_named_rcv(struct net *net, struct sk_buff_head *namedq,
>> +u16 *rcv_nxt, bool *open)
>>{
>> -struct tipc_net *tn = net_generic(net, tipc_net_id);
>> -struct tipc_msg *msg;
>> +struct tipc_net *tn = tipc_net(net);
>>  struct distr_item *item;
>> -uint count;
>> -u32 node;
>> +struct tipc_msg *hdr;
>>  struct sk_buff *skb;
>> -int mtype;
>> +u32 count, node = 0;
>>
>>  spin_lock_bh(>nametbl_lock);
>> -for (skb = skb_dequeue(inputq); skb; skb = skb_dequeue(inputq)) {
>> -skb_linearize(skb);
>> -msg = buf_msg(skb);
>> -mtype = msg_type(msg);
>> -item = (struct distr_item *)msg_data(msg);
>> -count = msg_data_sz(msg) / ITEM_SIZE;
>> -node = msg_orignode(msg);
>> +while ((skb = tipc_named_dequeue(namedq, rcv_nxt, open))) {
>> +hdr = buf_msg(skb);
>> +node = msg_orignode(hdr);
>> +item = (struct distr_item *)msg_data(hdr);
>> +count = msg_data_sz(hdr) / ITEM_SIZE;
>>  while (count--) {
>> -tipc_update_nametbl(net, item, node, mtype);
>> +tipc_update_nametbl(net, item, node, msg_type(hdr));
>>  item++;
>>  }
>>  kfree_skb(skb);
>> @@ -345,6 +402,6 @@ void tipc_named_reinit(struct net *net)
>>  publ->node = self;
>>  list_for_each_entry_rcu(publ, >cluster_scope, binding_node)
>>  publ->node = self;
>> -
>> +nt->rc_dests = 0;
>>  spin_unlock_bh(>nametbl_lock);
>>}
>> diff --git a/net/tipc/name_distr.h b/net/tipc/name_distr.h
>> index 63fc73e0fa6c..092323158f06 100644
>> --- a/net/tipc/name_distr.h
>> +++ b/net/tipc/name_distr.h
>> @@ -67,11 +67,14 @@ struct distr_item {
>>  __be32 key;
>>};
>>
>> +void tipc_named_bcast(struct net *net, struct sk_buff *skb);
>>struct sk_buff *tipc_named_publish(struct net *net, struct publication 
>> *publ);
>>struct sk_buff *tipc_named_withdraw(struct net *net, struct publication 
>> *publ);
>> -void tipc_named_node_up(struct net *net, u32 dnode);
>> -void tipc_named_rcv(struct net *net, struct sk_buff_head *msg_queue);
>> +void tipc_named_node_up(struct net *net, u32 dnode, u16 capabilities);
>> +void tipc_named_rcv(struct net *net, struct sk_buff_head *namedq,
>> +u16 *

[tipc-discussion] [net-next] tipc: update a binding service via broadcast

2020-06-06 Thread Hoang Huu Le
Currently, binding table updates (adding a service binding to the
name table/withdrawing a service binding) are sent over replicast.
However, if we are scaling up clusters to > 100 nodes/containers this
method is less efficient, because it loops through the nodes in a cluster
one by one.

It is worth using broadcast to update a binding service. This way, the
binding table can be updated on all peer nodes in one shot.

Broadcast is used when all peer nodes, as indicated by a new capability
flag TIPC_NAMED_BCAST, support reception of this message type.

Four problems need to be considered when introducing this feature.
1) When establishing a link to a new peer node we still update this by a
unicast 'bulk' update. This may lead to race conditions, where a later
broadcast publication/withdrawal bypass the 'bulk', resulting in
disordered publications, or even that a withdrawal may arrive before the
corresponding publication. We solve this by adding an 'is_last_bulk' bit
in the last bulk messages so that it can be distinguished from all other
messages. Only when this message has arrived do we open up for reception
of broadcast publications/withdrawals.
2) When a first legacy node is added to the cluster all distribution
will switch over to use the legacy 'replicast' method, while the
opposite happens when the last legacy node leaves the cluster. This
entails another risk of message disordering that has to be handled. We
solve this by adding a sequence number to the broadcast/replicast
messages, so that disordering can be discovered and corrected. Note
however that we don't need to consider potential message loss or
duplication at this protocol level.
3) Bulk messages don't contain any sequence numbers, and will always
arrive in order. Hence we must exempt those from the sequence number
control and deliver them unconditionally. We solve this by adding a new
'is_bulk' bit in those messages so that they can be recognized.
4) Legacy messages, which don't contain any new bits or sequence
numbers, but neither can arrive out of order, also need to be exempt
from the initial synchronization and sequence number check, and
delivered unconditionally. Therefore, we add another 'is_not_legacy' bit
to all new messages so that those can be distinguished from legacy
messages and the latter delivered directly.
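
A condensed sketch of the receive-side gating that rules 1)-4) above imply,
using the msg_*() helpers added below; it is simplified (no queue locking or
skb linearization) and the function name is illustrative rather than the
patch's own:

static struct sk_buff *named_dequeue_sketch(struct sk_buff_head *namedq,
                                            u16 *rcv_nxt, bool *open)
{
        struct sk_buff *skb, *tmp;
        struct tipc_msg *hdr;
        u16 seqno;

        skb_queue_walk_safe(namedq, skb, tmp) {
                hdr = buf_msg(skb);
                seqno = msg_named_seqno(hdr);

                /* 1) The tagged last bulk message opens up normal reception */
                if (msg_is_last_bulk(hdr)) {
                        *rcv_nxt = seqno;
                        *open = true;
                }
                /* 3) + 4) Bulk and legacy messages bypass the seqno check */
                if (msg_is_bulk(hdr) || msg_is_legacy(hdr)) {
                        __skb_unlink(skb, namedq);
                        return skb;
                }
                /* 2) In-sequence broadcast/replicast updates are delivered */
                if (*open && seqno == *rcv_nxt) {
                        (*rcv_nxt)++;
                        __skb_unlink(skb, namedq);
                        return skb;
                }
                /* Obsolete updates are dropped; the rest stay queued */
                if (less(seqno, *rcv_nxt)) {
                        __skb_unlink(skb, namedq);
                        kfree_skb(skb);
                }
        }
        return NULL;
}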

Signed-off-by: Hoang Huu Le 
Acked-by: Jon Maloy 
---
 net/tipc/bcast.c  |   6 +--
 net/tipc/bcast.h  |   4 +-
 net/tipc/link.c   |   2 +-
 net/tipc/msg.h|  40 
 net/tipc/name_distr.c | 109 +++---
 net/tipc/name_distr.h |   9 ++--
 net/tipc/name_table.c |   9 +++-
 net/tipc/name_table.h |   2 +
 net/tipc/node.c   |  29 ---
 net/tipc/node.h   |   8 ++--
 10 files changed, 170 insertions(+), 48 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 383f87bc1061..940d176e0e87 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -250,8 +250,8 @@ static void tipc_bcast_select_xmit_method(struct net *net, 
int dests,
  * Consumes the buffer chain.
  * Returns 0 if success, otherwise errno: -EHOSTUNREACH,-EMSGSIZE
  */
-static int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
-  u16 *cong_link_cnt)
+int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
+   u16 *cong_link_cnt)
 {
struct tipc_link *l = tipc_bc_sndlink(net);
struct sk_buff_head xmitq;
@@ -752,7 +752,7 @@ void tipc_nlist_purge(struct tipc_nlist *nl)
nl->local = false;
 }
 
-u32 tipc_bcast_get_broadcast_mode(struct net *net)
+u32 tipc_bcast_get_mode(struct net *net)
 {
struct tipc_bc_base *bb = tipc_bc_base(net);
 
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index 4240c95188b1..2d9352dc7b0e 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -90,6 +90,8 @@ void tipc_bcast_toggle_rcast(struct net *net, bool supp);
 int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
struct tipc_mc_method *method, struct tipc_nlist *dests,
u16 *cong_link_cnt);
+int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
+   u16 *cong_link_cnt);
 int tipc_bcast_rcv(struct net *net, struct tipc_link *l, struct sk_buff *skb);
 void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
struct tipc_msg *hdr);
@@ -101,7 +103,7 @@ int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg 
*msg,
 int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]);
 int tipc_bclink_reset_stats(struct net *net, struct tipc_link *l);
 
-u32 tipc_bcast_get_broadcast_mode(struct net *net);
+u32 tipc_bcast_get_mode(struct net *net);
 u32 tipc_bcast_get_broadcast_ratio(struct net *net);
 
 void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq,
diff --git a/net/tipc/link.c b/net/tipc/link.c
index ee3b8d0576b8..eac89a3e22ce 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -

Re: [tipc-discussion] [next-net v6] tipc: update a binding service via broadcast

2020-06-06 Thread Hoang Huu Le
-Original Message-
From: Jon Maloy  
Sent: Friday, June 5, 2020 8:03 PM
To: Hoang Huu Le ; ma...@donjonn.com; 
ying@windriver.com; tipc-discussion@lists.sourceforge.net
Subject: Re: [next-net v6] tipc: update a binding service via broadcast



On 6/5/20 3:52 AM, Hoang Huu Le wrote:
> Currently, updating binding table (add service binding to
> name table/withdraw a service binding) is being sent over replicast.
> However, if we are scaling up clusters to > 100 nodes/containers this
> method is less affection because of looping through nodes in a cluster one
> by one.
>
> It is worth to use broadcast to update a binding service. This way, the
> binding table can be updated on all peer nodes in one shot.
>
> Broadcast is used when all peer nodes, as indicated by a new capability
> flag TIPC_NAMED_BCAST, support reception of this message type.
>
> Four problems need to be considered when introducing this feature.
> 1) When establishing a link to a new peer node we still update this by a
> unicast 'bulk' update. This may lead to race conditions, where a later
> broadcast publication/withdrawal bypass the 'bulk', resulting in
> disordered publications, or even that a withdrawal may arrive before the
> corresponding publication. We solve this by adding an 'is_last_bulk' bit
> in the last bulk messages so that it can be distinguished from all other
> messages. Only when this message has arrived do we open up for reception
> of broadcast publications/withdrawals.
> 2) When a first legacy node is added to the cluster all distribution
> will switch over to use the legacy 'replicast' method, while the
> opposite happens when the last legacy node leaves the cluster. This
> entails another risk of message disordering that has to be handled. We
> solve this by adding a sequence number to the broadcast/replicast
> messages, so that disordering can be discovered and corrected. Note
> however that we don't need to consider potential message loss or
> duplication at this protocol level.
> 3) Bulk messages don't contain any sequence numbers, and will always
> arrive in order. Hence we must exempt those from the sequence number
> control and deliver them unconditionally. We solve this by adding a new
> 'is_bulk' bit in those messages so that they can be recognized.
> 4) Legacy messages, which don't contain any new bits or sequence
> numbers, but neither can arrive out of order, also need to be exempt
> from the initial synchronization and sequence number check, and
> delivered unconditionally. Therefore, we add another 'is_not_legacy' bit
> to all new messages so that those can be distinguished from legacy
> messages and the latter delivered directly.
---
If you add the above three dashes before the version info that will not 
by accident be considered part of the commit log.
But of course our internal versioning is irrelevant to David M, so you 
should just remove this info before posting.
[Hoang] Thanks Jon. Frankly, I didn't know that.

> v2: resolve synchronization problem when switching from unicast to
> broadcast
>
> v5: - never use broadcast if there is a single node not understanding it
>  - always use broadcast otherwise
>  - add sequence numbering to non-bulk messages
>
> v6: update Jon's comment
>
> Signed-off-by: Hoang Huu Le 
> Acked-by: Jon Maloy 
> ---
>   net/tipc/bcast.c  |   6 +--
>   net/tipc/bcast.h  |   4 +-
>   net/tipc/link.c   |   2 +-
>   net/tipc/msg.h|  40 +++
>   net/tipc/name_distr.c | 115 +++---
>   net/tipc/name_distr.h |   9 ++--
>   net/tipc/name_table.c |   9 +++-
>   net/tipc/name_table.h |   2 +
>   net/tipc/node.c   |  28 +++---
>   net/tipc/node.h   |   8 +--
>   10 files changed, 175 insertions(+), 48 deletions(-)
>
> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> index 4c20be08b9c4..9d085ad6f0cf 100644
> --- a/net/tipc/bcast.c
> +++ b/net/tipc/bcast.c
> @@ -249,8 +249,8 @@ static void tipc_bcast_select_xmit_method(struct net 
> *net, int dests,
>* Consumes the buffer chain.
>* Returns 0 if success, otherwise errno: -EHOSTUNREACH,-EMSGSIZE
>*/
> -static int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
> -u16 *cong_link_cnt)
> +int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
> + u16 *cong_link_cnt)
>   {
>   struct tipc_link *l = tipc_bc_sndlink(net);
>   struct sk_buff_head xmitq;
> @@ -746,7 +746,7 @@ void tipc_nlist_purge(struct tipc_nlist *nl)
>   nl->local = false;
>   }
>   
> -u32 tipc_bcast_get_broadcast_mode(struct net *net)
> +u32 tipc_bcast_get_mode(struct net *net)
>   {
>   struct tipc_bc_base *bb = tipc_bc_base

[tipc-discussion] [next-net v6] tipc: update a binding service via broadcast

2020-06-05 Thread Hoang Huu Le
Currently, binding table updates (adding a service binding to the
name table/withdrawing a service binding) are sent over replicast.
However, if we are scaling up clusters to > 100 nodes/containers this
method is less efficient, because it loops through the nodes in a cluster
one by one.

It is worth using broadcast to update a binding service. This way, the
binding table can be updated on all peer nodes in one shot.

Broadcast is used when all peer nodes, as indicated by a new capability
flag TIPC_NAMED_BCAST, support reception of this message type.

Four problems need to be considered when introducing this feature.
1) When establishing a link to a new peer node we still update this by a
unicast 'bulk' update. This may lead to race conditions, where a later
broadcast publication/withdrawal bypass the 'bulk', resulting in
disordered publications, or even that a withdrawal may arrive before the
corresponding publication. We solve this by adding an 'is_last_bulk' bit
in the last bulk messages so that it can be distinguished from all other
messages. Only when this message has arrived do we open up for reception
of broadcast publications/withdrawals.
2) When a first legacy node is added to the cluster all distribution
will switch over to use the legacy 'replicast' method, while the
opposite happens when the last legacy node leaves the cluster. This
entails another risk of message disordering that has to be handled. We
solve this by adding a sequence number to the broadcast/replicast
messages, so that disordering can be discovered and corrected. Note
however that we don't need to consider potential message loss or
duplication at this protocol level.
3) Bulk messages don't contain any sequence numbers, and will always
arrive in order. Hence we must exempt those from the sequence number
control and deliver them unconditionally. We solve this by adding a new
'is_bulk' bit in those messages so that they can be recognized.
4) Legacy messages, which don't contain any new bits or sequence
numbers, but neither can arrive out of order, also need to be exempt
from the initial synchronization and sequence number check, and
delivered unconditionally. Therefore, we add another 'is_not_legacy' bit
to all new messages so that those can be distinguished from legacy
messages and the latter delivered directly.

v2: resolve synchronization problem when switching from unicast to
broadcast

v5: - never use broadcast if there is a single node not understanding it
- always use broadcast otherwise
- add sequence numbering to non-bulk messages

v6: update Jon's comment

Signed-off-by: Hoang Huu Le 
Acked-by: Jon Maloy 
---
 net/tipc/bcast.c  |   6 +--
 net/tipc/bcast.h  |   4 +-
 net/tipc/link.c   |   2 +-
 net/tipc/msg.h|  40 +++
 net/tipc/name_distr.c | 115 +++---
 net/tipc/name_distr.h |   9 ++--
 net/tipc/name_table.c |   9 +++-
 net/tipc/name_table.h |   2 +
 net/tipc/node.c   |  28 +++---
 net/tipc/node.h   |   8 +--
 10 files changed, 175 insertions(+), 48 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 4c20be08b9c4..9d085ad6f0cf 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -249,8 +249,8 @@ static void tipc_bcast_select_xmit_method(struct net *net, 
int dests,
  * Consumes the buffer chain.
  * Returns 0 if success, otherwise errno: -EHOSTUNREACH,-EMSGSIZE
  */
-static int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
-  u16 *cong_link_cnt)
+int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
+   u16 *cong_link_cnt)
 {
struct tipc_link *l = tipc_bc_sndlink(net);
struct sk_buff_head xmitq;
@@ -746,7 +746,7 @@ void tipc_nlist_purge(struct tipc_nlist *nl)
nl->local = false;
 }
 
-u32 tipc_bcast_get_broadcast_mode(struct net *net)
+u32 tipc_bcast_get_mode(struct net *net)
 {
struct tipc_bc_base *bb = tipc_bc_base(net);
 
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index 9e847d9617d3..b3b883e2a823 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -89,6 +89,8 @@ void tipc_bcast_toggle_rcast(struct net *net, bool supp);
 int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
struct tipc_mc_method *method, struct tipc_nlist *dests,
u16 *cong_link_cnt);
+int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
+   u16 *cong_link_cnt);
 int tipc_bcast_rcv(struct net *net, struct tipc_link *l, struct sk_buff *skb);
 void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
struct tipc_msg *hdr);
@@ -98,7 +100,7 @@ int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg 
*msg);
 int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]);
 int tipc_bclink_reset_stats(struct net *net);
 
-u32 tipc_bcast_get_broadcast_mode(struct net *net);
+u32 tipc_bcast_get_mode(struct net *net);
 u32 tipc_bcast_get_broadcast

Re: [tipc-discussion] [net-next] tipc: update a binding service via broadcast

2020-06-04 Thread Hoang Huu Le
Hi Jon,

See my inline comment.

Regards,
Hoang
-Original Message-
From: Jon Maloy  
Sent: Thursday, June 4, 2020 8:42 PM
To: Hoang Huu Le ; ma...@donjonn.com; 
ying@windriver.com; tipc-discussion@lists.sourceforge.net
Subject: Re: [net-next] tipc: update a binding service via broadcast



On 6/4/20 5:21 AM, Hoang Huu Le wrote:
> Currently, updating binding table (add service binding to
> name table/withdraw a service binding) is being sent over replicast.
> However, if we are scaling up clusters to > 100 nodes/containers this
> method is less affection because of looping through nodes in a cluster one
> by one.
>
> It is worth to use broadcast to update a binding service.
s/Then binding table updates in all nodes for one shot./This way, the 
binding table can be updated on all peer nodes in one shot./

Broadcast is used when all peer nodes, as indicated by a new capability 
flag TIPC_NAMED_BCAST, support reception of this message type.

Four problems need to be considered when introducing this feature.
1) When establishing a link to a new peer node we still update this by a 
unicast 'bulk' update. This may lead to race conditions, where a later 
broadcast publication/withdrawal bypass the 'bulk', resulting in 
disordered publications, or even that a withdrawal may arrive before the 
corresponding publication. We solve this by adding an 'is_last_bulk' bit 
in the last bulk messages so that it can be distinguished from all other 
messages. Only when this message has arrived do we open up for reception 
of broadcast publications/withdrawals.
2) When a first legacy node is added to the cluster all distribution 
will switch over to use the legacy 'replicast' method, while the 
opposite happens when the last legacy node leaves the cluster. This 
entails another risk of message disordering that has to be handled. We 
solve this by adding a sequence number to the broadcast/replicast 
messages, so that disordering can be discovered and corrected. Note 
however that we don't need to consider potential message loss or 
duplication at this protocol level.
3) Bulk messages don't contain any sequence numbers, and will always 
arrive in order. Hence we must exempt those from the sequence number 
control and deliver them unconditionally. We solve this by adding a new 
'is_bulk' bit in those messages so that they can be recognized.
4) Legacy messages, which don't contain any new bits or sequence 
numbers, but neither can arrive out of order, also need to be exempt 
from the initial synchronization and sequence number check, and 
delivered unconditionally. Therefore, we add another 'is_not_legacy' bit 
to all new messages so that those can be distinguished from legacy 
messages and the latter delivered directly.

> Signed-off-by: Hoang Huu Le 
> ---
>   net/tipc/bcast.c  |   6 +--
>   net/tipc/bcast.h  |   4 +-
>   net/tipc/link.c   |   2 +-
>   net/tipc/msg.h|  40 +++
>   net/tipc/name_distr.c | 115 +++---
>   net/tipc/name_distr.h |   9 ++--
>   net/tipc/name_table.c |   9 +++-
>   net/tipc/name_table.h |   2 +
>   net/tipc/node.c   |  28 +++---
>   net/tipc/node.h   |   8 +--
>   10 files changed, 175 insertions(+), 48 deletions(-)
>
> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> index 4c20be08b9c4..9d085ad6f0cf 100644
> --- a/net/tipc/bcast.c
> +++ b/net/tipc/bcast.c
> @@ -249,8 +249,8 @@ static void tipc_bcast_select_xmit_method(struct net 
> *net, int dests,
>* Consumes the buffer chain.
>* Returns 0 if success, otherwise errno: -EHOSTUNREACH,-EMSGSIZE
>*/
> -static int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
> -u16 *cong_link_cnt)
> +int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
> + u16 *cong_link_cnt)
>   {
>   struct tipc_link *l = tipc_bc_sndlink(net);
>   struct sk_buff_head xmitq;
> @@ -746,7 +746,7 @@ void tipc_nlist_purge(struct tipc_nlist *nl)
>   nl->local = false;
>   }
>   
> -u32 tipc_bcast_get_broadcast_mode(struct net *net)
> +u32 tipc_bcast_get_mode(struct net *net)
>   {
>   struct tipc_bc_base *bb = tipc_bc_base(net);
>   
> diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
> index 9e847d9617d3..b3b883e2a823 100644
> --- a/net/tipc/bcast.h
> +++ b/net/tipc/bcast.h
> @@ -89,6 +89,8 @@ void tipc_bcast_toggle_rcast(struct net *net, bool supp);
>   int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
>   struct tipc_mc_method *method, struct tipc_nlist *dests,
>   u16 *cong_link_cnt);
> +int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
> + u16 *cong_link_cnt);
>   int tipc_bcast_rcv(struct net *net, struct tipc_link *l, struct sk_buff 
> *skb);
&

Re: [tipc-discussion] FW: [PATCH 2/2] tipc: update a binding service via broadcast

2020-06-04 Thread Hoang Huu Le


From: Jon Maloy 
Sent: Thursday, June 4, 2020 8:59 PM
To: Hoang Huu Le ; 
tipc-discussion@lists.sourceforge.net; tipc-dek ; 
Tuong Tong Lien ; Xin Long ; 
Tung Quang Nguyen ; Ying Xue 

Subject: Re: FW: [PATCH 2/2] tipc: update a binding service via broadcast


On 6/4/20 5:14 AM, Hoang Huu Le wrote:
Hi Jon,

Please see my inline comment

Regards,
Hoang
From: Jon Maloy <jma...@redhat.com>
Sent: Friday, May 29, 2020 11:11 PM
To: tipc-discussion@lists.sourceforge.net; tipc-dek <tipc-...@dektech.com.au>; 
Tuong Tong Lien <tuong.t.l...@dektech.com.au>; Xin Long <lucien@gmail.com>; 
Tung Quang Nguyen <tung.q.ngu...@dektech.com.au>; Ying Xue <ying@windriver.com>
Subject: Fwd: Re: FW: [PATCH 2/2] tipc: update a binding service via broadcast

Added more recipients.


 Forwarded Message 
Subject:
Re: FW: [PATCH 2/2] tipc: update a binding service via broadcast
Date:
Fri, 29 May 2020 12:08:02 -0400
From:
Jon Maloy <jma...@redhat.com>
To:
Hoang Huu Le <hoang.h...@dektech.com.au>, ma...@donjonn.com


Hi Hoang,
See below.

On 5/27/20 6:49 AM, Hoang Huu Le wrote:


Hi Jon,

I have a DRAFT version based on your idea (attached file).
But from my point of view, this version introduces too much code at the
sending side.
I don't think it is a good idea to keep rcast_list and bcast_list in the name
table.
I think we should find a new way or simply drop the feature.
Yes, you are right.
I came up with a new idea, to just add a sequence number to the 
broadcast/replicast messages and re-order them at reception. This even handles 
the case where the first broadcast message arrives before the first bulk message, 
something we had not anticipated before.
I couldn't resist the temptation to code it, as you can see in the patch 
I just sent out.
It is totally untested, I just added the code as I thought it should be and 
made sure it compiled.
There is still a little too much new code to my taste, but this might be a way 
forward.
Please give your feedback on this.
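
For reference, the sending side of that idea boils down to stamping every
cluster-scope update with the binding table's next sequence number before it
goes out; a minimal sketch follows (the posted patch does this inside
tipc_named_publish()/tipc_named_withdraw(); the wrapper name here is only
illustrative):

static struct sk_buff *named_update_sketch(struct net *net, u32 type,
                                           struct publication *publ)
{
        struct name_table *nt = tipc_name_table(net);
        struct distr_item *item;
        struct sk_buff *skb;

        skb = named_prepare_buf(net, type, ITEM_SIZE, 0); /* PUBLICATION or WITHDRAWAL */
        if (!skb)
                return NULL;
        /* Stamp with the next binding-table sequence number and mark the
         * message as carrying one, so receivers can re-order if needed.
         */
        msg_set_named_seqno(buf_msg(skb), nt->snd_nxt++);
        msg_set_non_legacy(buf_msg(skb));
        item = (struct distr_item *)msg_data(buf_msg(skb));
        publ_to_item(item, publ);
        return skb;
}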


I also noticed a couple of things while working with this:

1) There is still an 'expires' field in the tipc_mcast_method, and it seems to 
even be considered when switching bcast/rcast. The whole point of adding 
the mcast synchronization mechanism was to get rid of this delay. Have you 
tested that synchronization really works without the 'expires'?
[Hoang] Yes, I did.
I issued the command below to force rcast/bcast regardless of 'expires'.

$ tipc link set bro REPLICAST
or
$ tipc link set bro BROADCAST
Yes, but what happens when the protocol selects by itself, based on the number of 
destinations, and this number has just passed a threshold?
[Hoang] It is really hard to find a strategy that makes this happen. So I haven't 
covered it in the past, but I will try.

2) There are some remnants of old code for the name table dist_queue. This 
functionality was made redundant by me at least two years ago, so this should 
be cleaned up.
[Hoang] I will check and clean that stuff up in a separate commit.

3) We might have a potential race condition when new nodes come up, so that 
publications are distributed twice.
a) A publication is added to the name table, and the name table lock is 
released.
b) A new node comes up, and the new publication is delivered in the bulk 
message.
c) The broadcast of the publication goes ahead and sends it out to all 
nodes, even the one that just came up.
d) We end up with a double publication on one of the nodes.
e) One of those will linger in the name table after the publication is 
withdrawn.
I have never seen this happen, and my analysis might be wrong, but to me 
this looks like a possible scenario.
Note that my patch doesn't fix this, but we could possibly arrange it by 
adding a 'distributed' flag in the publication item on the sending side, so that 
the bulk distribution will ignore it.
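
A small sketch of that idea; the 'distributed' field and the helper names are
purely hypothetical and not part of any posted patch:

/* Hypothetical: a flag in struct publication, set once the item has been
 * sent by broadcast, so a concurrent bulk distribution skips it and a
 * newly discovered node can never receive the same publication twice.
 */
struct publication_sketch {
        bool distributed;       /* assumed new field in struct publication */
        /* ... existing fields ... */
};

static void named_mark_distributed(struct publication_sketch *p)
{
        p->distributed = true;          /* set on the broadcast path */
}

static bool named_bulk_wants(const struct publication_sketch *p)
{
        return !p->distributed;         /* bulk path skips already-broadcast items */
}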

[Hoang] I tried to simulate your scenario on 8 nodes, publishing >100 services 
at the same time while bringing the interface down/up. 
Sometimes I got the error logs below:
[  537.322414] tipc: Failed to remove binding 1000,2 from 1001001
[…]
[  537.358957] tipc: Failed to remove binding 1000,11 from 1001001

I'm not sure whether the above counts as a bug or not. If yes, we should fix it in 
another commit too.
This is not what I expected, but might be another manifestation of the same 
problem.
We are probably observing a replicast withdraw arriving before the 
corresponding bulk publication.
If you see a binding <1000,2> in the table after this printout that would be a 
confirmation.

[Hoang] No, I don’t see this in my testing.

For my scenario:
Do you see duplicate publication instances before you do withdraw?
Do you see lingering publication after a withdraw?

Luck

[tipc-discussion] [net-next] tipc: update a binding service via broadcast

2020-06-04 Thread Hoang Huu Le
Currently, binding table updates (adding a service binding to the
name table/withdrawing a service binding) are sent over replicast.
However, if we are scaling up clusters to > 100 nodes/containers this
method is less efficient, because it loops through the nodes in a cluster
one by one.

It is worth using broadcast to update a binding service. Then the binding
table is updated on all nodes in one shot.

The mechanism is backward compatible, as sync messages are silently dropped.

v2: resolve synchronization problem when switching from unicast to
broadcast

v5: - never use broadcast if there is a single node not understanding it
- always use broadcast otherwise
- add sequence numbering to non-bulk messages

Signed-off-by: Hoang Huu Le 
---
 net/tipc/bcast.c  |   6 +--
 net/tipc/bcast.h  |   4 +-
 net/tipc/link.c   |   2 +-
 net/tipc/msg.h|  40 +++
 net/tipc/name_distr.c | 115 +++---
 net/tipc/name_distr.h |   9 ++--
 net/tipc/name_table.c |   9 +++-
 net/tipc/name_table.h |   2 +
 net/tipc/node.c   |  28 +++---
 net/tipc/node.h   |   8 +--
 10 files changed, 175 insertions(+), 48 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 4c20be08b9c4..9d085ad6f0cf 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -249,8 +249,8 @@ static void tipc_bcast_select_xmit_method(struct net *net, 
int dests,
  * Consumes the buffer chain.
  * Returns 0 if success, otherwise errno: -EHOSTUNREACH,-EMSGSIZE
  */
-static int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
-  u16 *cong_link_cnt)
+int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
+   u16 *cong_link_cnt)
 {
struct tipc_link *l = tipc_bc_sndlink(net);
struct sk_buff_head xmitq;
@@ -746,7 +746,7 @@ void tipc_nlist_purge(struct tipc_nlist *nl)
nl->local = false;
 }
 
-u32 tipc_bcast_get_broadcast_mode(struct net *net)
+u32 tipc_bcast_get_mode(struct net *net)
 {
struct tipc_bc_base *bb = tipc_bc_base(net);
 
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index 9e847d9617d3..b3b883e2a823 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -89,6 +89,8 @@ void tipc_bcast_toggle_rcast(struct net *net, bool supp);
 int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
struct tipc_mc_method *method, struct tipc_nlist *dests,
u16 *cong_link_cnt);
+int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
+   u16 *cong_link_cnt);
 int tipc_bcast_rcv(struct net *net, struct tipc_link *l, struct sk_buff *skb);
 void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
struct tipc_msg *hdr);
@@ -98,7 +100,7 @@ int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg 
*msg);
 int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]);
 int tipc_bclink_reset_stats(struct net *net);
 
-u32 tipc_bcast_get_broadcast_mode(struct net *net);
+u32 tipc_bcast_get_mode(struct net *net);
 u32 tipc_bcast_get_broadcast_ratio(struct net *net);
 
 void tipc_mcast_filter_msg(struct net *net, struct sk_buff_head *defq,
diff --git a/net/tipc/link.c b/net/tipc/link.c
index d4675e922a8f..da0b30733549 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -2646,7 +2646,7 @@ int tipc_nl_add_bc_link(struct net *net, struct 
tipc_nl_msg *msg)
struct nlattr *attrs;
struct nlattr *prop;
struct tipc_net *tn = net_generic(net, tipc_net_id);
-   u32 bc_mode = tipc_bcast_get_broadcast_mode(net);
+   u32 bc_mode = tipc_bcast_get_mode(net);
u32 bc_ratio = tipc_bcast_get_broadcast_ratio(net);
struct tipc_link *bcl = tn->bcl;
 
diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index 871feadbbc19..d53914316684 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -409,6 +409,36 @@ static inline void msg_set_errcode(struct tipc_msg *m, u32 
err)
msg_set_bits(m, 1, 25, 0xf, err);
 }
 
+static inline void msg_set_bulk(struct tipc_msg *m)
+{
+   msg_set_bits(m, 1, 28, 0x1, 1);
+}
+
+static inline u32 msg_is_bulk(struct tipc_msg *m)
+{
+   return msg_bits(m, 1, 28, 0x1);
+}
+
+static inline void msg_set_last_bulk(struct tipc_msg *m)
+{
+   msg_set_bits(m, 1, 27, 0x1, 1);
+}
+
+static inline u32 msg_is_last_bulk(struct tipc_msg *m)
+{
+   return msg_bits(m, 1, 27, 0x1);
+}
+
+static inline void msg_set_non_legacy(struct tipc_msg *m)
+{
+   msg_set_bits(m, 1, 26, 0x1, 1);
+}
+
+static inline u32 msg_is_legacy(struct tipc_msg *m)
+{
+   return !msg_bits(m, 1, 26, 0x1);
+}
+
 static inline u32 msg_reroute_cnt(struct tipc_msg *m)
 {
return msg_bits(m, 1, 21, 0xf);
@@ -538,6 +568,16 @@ static inline void msg_set_origport(struct tipc_msg *m, 
u32 p)
msg_set_word(m, 4, p);
 }
 
+static inline u16 msg_named_seqno(struct tipc_msg *m)
+{
+   return msg_bits(m, 4, 0, 0x);
+}
+
+static inline void msg_set_named_seqno(struct ti

Re: [tipc-discussion] FW: [PATCH 2/2] tipc: update a binding service via broadcast

2020-06-04 Thread Hoang Huu Le
Hi Jon,

Please see my inline comment

Regards,
Hoang
From: Jon Maloy 
Sent: Friday, May 29, 2020 11:11 PM
To: tipc-discussion@lists.sourceforge.net; tipc-dek ; 
Tuong Tong Lien ; Xin Long ; 
Tung Quang Nguyen ; Ying Xue 

Subject: Fwd: Re: FW: [PATCH 2/2] tipc: update a binding service via broadcast

Added more recipients.


 Forwarded Message 
Subject:
Re: FW: [PATCH 2/2] tipc: update a binding service via broadcast
Date:
Fri, 29 May 2020 12:08:02 -0400
From:
Jon Maloy <jma...@redhat.com>
To:
Hoang Huu Le <hoang.h...@dektech.com.au>, ma...@donjonn.com


Hi Hoang,
See below.

On 5/27/20 6:49 AM, Hoang Huu Le wrote:

Hi Jon,

I have a DRAFT version based on your idea (attached file).
But from my point of view, this version introduces too much code at the
sending side.
I don't think it is a good idea to keep rcast_list and bcast_list in the name
table.
I think we should find a new way or simply drop the feature.
Yes, you are right.
I came up with a new idea, to just add a sequence number to the 
broadcast/replicast messages and re-order them at reception. This even handles 
the case where the first broadcast message arrives before the first bulk message, 
something we had not anticipated before.
I couldn't resist the temptation to code it, as you can see in the patch 
I just sent out.
It is totally untested, I just added the code as I thought it should be and 
made sure it compiled.
There is still a little too much new code to my taste, but this might be a way 
forward.
Please give your feedback on this.


I also noticed a couple of things while working with this:

1) There is still an 'expires' field in the tipc_mcast_method, and it seems to 
even be considered when switching bcast/rcast. The whole point of adding 
the mcast synchronization mechanism was to get rid of this delay. Have you 
tested that synchronization really works without the 'expires'?
[Hoang] Yes, I did.
I issued the command below to force rcast/bcast regardless of 'expires'.

$ tipc link set bro REPLICAST
or
$ tipc link set bro BROADCAST

2) There are some remnants of old code for the name table dist_queue. This 
functionality was made redundant by me at least two years ago, so this should 
be cleaned up.
[Hoang] I will check and clean that stuff up in a separate commit.

3) We might have a potential race condition when new nodes come up, so that 
publications are distributed twice.
a) A publication is added to the name table, and the name table lock is 
released.
b) A new node comes up, and the new publication is delivered in the bulk 
message.
c) The broadcast of the publication goes ahead and sends it out to all 
nodes, even the one that just came up.
d) We end up with a double publication on one of the nodes.
e) One of those will linger in the name table after the publication is 
withdrawn.
I have never seen this happen, and my analysis might be wrong, but to me 
this looks like a possible scenario.
Note that my patch doesn't fix this, but we could possibly arrange it by 
adding a 'distributed' flag in the publication item on the sending side, so that 
the bulk distribution will ignore it.

[Hoang] I tried to simulate your scenario on 8 nodes, publishing >100 services 
at the same time while bringing the interface down/up. 
Sometimes I got the error logs below:
[  537.322414] tipc: Failed to remove binding 1000,2 from 1001001
[…]
[  537.358957] tipc: Failed to remove binding 1000,11 from 1001001

I'm not sure whether the above counts as a bug or not. If yes, we should fix it in 
another commit too.

Regards
///jon




Regards,
Hoang
-Original Message-
From: Jon Maloy <jma...@redhat.com>
Sent: Thursday, May 21, 2020 9:44 PM
To: Hoang Huu Le <hoang.h...@dektech.com.au>; ma...@donjonn.com
Subject: Re: FW: [PATCH 2/2] tipc: update a binding service via broadcast

Hi,
I have one more comment below. Looking forward to your feedback.
///jon


On 5/20/20 9:03 PM, Hoang Huu Le wrote:

Yeah, thanks Jon.
I will investigate your idea more once I have sorted out the issue with the lab 
infrastructure.

Hoang
-Original Message-
From: Jon Maloy <jma...@redhat.com>
Sent: Thursday, May 21, 2020 7:13 AM
To: Hoang Huu Le <hoang.h...@dektech.com.au>; ma...@donjonn.com
Subject: Re: FW: [PATCH 2/2] tipc: update a binding service via broadcast

Hi Hoang,
Below I try to summarize my newest proposal in relation to v2 of your patch.

1) The bulk can be sent just as is done now, with the addition that we
add the NOT_LAST bit to the header.
2) We need a new capability bit identifying nodes which support
broadcast NAME_DISTR. I don't see we can just
 reuse TIPC_MCAST_RBCTL, because the recipients need to have code
for handling concurrent
 bulk/broadcast receptions. This bit is not added
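
For context, in the patch that was eventually posted this capability check
boils down to counting the peers that lack TIPC_NAMED_BCAST into nt->rc_dests
and using broadcast only while that count is zero; a minimal sketch (function
names are illustrative, and the exact increment/decrement sites follow the
node state machine):

static void named_cap_peer_up(struct name_table *nt, u16 capabilities)
{
        if (!(capabilities & TIPC_NAMED_BCAST))
                nt->rc_dests++;         /* one more peer needing replicast */
}

static void named_cap_peer_down(struct name_table *nt, u16 capabilities)
{
        if (!(capabilities & TIPC_NAMED_BCAST))
                nt->rc_dests--;         /* peer gone: broadcast may resume */
}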

Re: [tipc-discussion] [ ] tipc: update a binding service via broadcast

2020-06-02 Thread Hoang Huu Le


-Original Message-
From: Jon Maloy  
Sent: Tuesday, June 2, 2020 8:37 PM
To: Hoang Huu Le ; 
tipc-discussion@lists.sourceforge.net
Cc: Tung Quang Nguyen ; Tuong Tong Lien 
; ma...@donjonn.com; x...@redhat.com; 
ying@windriver.com; parthasarathy.bhuvara...@gmail.com
Subject: Re: [ ] tipc: update a binding service via broadcast



On 6/1/20 7:45 PM, Hoang Huu Le wrote:
> Hi Jon,
>
> See my comment inline.
>
> Hoang
> -Original Message-
> From: Jon Maloy 
> Sent: Monday, June 1, 2020 7:40 PM
> To: Hoang Huu Le ; 
> tipc-discussion@lists.sourceforge.net
> Cc: Tung Quang Nguyen ; Tuong Tong Lien 
> ; ma...@donjonn.com; x...@redhat.com; 
> ying@windriver.com; parthasarathy.bhuvara...@gmail.com
> Subject: Re: [ ] tipc: update a binding service via broadcast
>
> Hi Hoang.
> See below.
>
> On 6/1/20 5:33 AM, Hoang Huu Le wrote:
>> Hi Jon,
>>
>> There is a concern in function tipc_node_broadcast.
>> See my comment inline.
>>
>> Regards,
>> Hoang
>> -Original Message-
>> From: jma...@redhat.com 
>> Sent: Friday, May 29, 2020 10:34 PM
>> To: tipc-discussion@lists.sourceforge.net
>> Cc: Tung Quang Nguyen ; Hoang Huu Le 
>> ; Tuong Tong Lien ; 
>> jma...@redhat.com; ma...@donjonn.com; x...@redhat.com; 
>> ying@windriver.com; parthasarathy.bhuvara...@gmail.com
>> Subject: [ ] tipc: update a binding service via broadcast
>>
>> From: Hoang Huu Le 
>>
>> Currently, updating binding table (add service binding to
>> name table/withdraw a service binding) is being sent over replicast.
>> However, if we are scaling up clusters to > 100 nodes/containers this
>> method is less affection because of looping through nodes in a cluster one
>> by one.
> [...]
>>
>> +/* Use broadcast if all nodes support it */
>> +if (!rc_dests) {
>> +__skb_queue_head_init();
>> +__skb_queue_tail(, skb);
>> +tipc_bcast_xmit(net, , );
>> +return;
>> +}
>> +
>> [Hoang]
>> We could not use 'rc_dests' to send as broadcast mode because of it is not 
>> fully broadcast supporting in the cluster.
>> As stated, if there is a node in the cluster not supporting broadcast mode, 
>> we need to use replicast instead.
>> That's why we introduced the feature "Smooth switch between 
>> replicast/broadcast in bcast.c" and a new command line to configure the 
>> broadcast link.
>> If we assume base on 'rc_dests' (i.e capability flags TIPC_NAMED_BCAST), 
>> probably 'forced replicast' configuration on broadcast link becomes useless. 
>> Then, destination nodes will be missed the publication item.
> You misunderstand this function. rc_dests is a *counter*, not a flag. It
> counts the number of peer nodes that don't support TIPC_NAMED_BCAST, and
> if this is non-zero, we use replicast. So this is safe.
>
> [Hoang] Yes, I know about this counter. What I'm considering about the L2 
> interface (e.g VXLAN) pseudo support broadcast (we discussed the topic a year 
> ago), then, the sending side must be switching to replicast 
> (method->force_bcast = true). But above likely forcing to use broadcast 
> careless setting from broadcast/replicast from L2.
Yes you are right. I only checked this for UDP, where it seems to be ok, 
but forgot about VXLAN.
Still, we can avoid building up 'dest' lists by making a small change to 
tipc_node_bcast():

/* Use broadcast if bearer and all nodes support it */
if (!rc_dests && tipc_bcast_get_broadcast_mode(net) != BCLINK_MODE_RCAST) {
  u16 dummy;
          __skb_queue_head_init();
  __skb_queue_tail(, skb);
  tipc_bcast_xmit(net, , );
  return;
}
[Hoang] That way sounds fine to me.
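
Filled in, that would look roughly like the sketch below (using the renamed
tipc_bcast_get_mode() from the patch); the local queue name, the u16 dummy and
the rc_dests parameter are assumptions, since the archive stripped the call
arguments above, and the legacy replicast loop is left out:

void tipc_node_broadcast(struct net *net, struct sk_buff *skb, int rc_dests)
{
        struct sk_buff_head xmitq;
        u16 dummy;

        /* Broadcast only if the bearer allows it and every peer advertises
         * TIPC_NAMED_BCAST (i.e. rc_dests == 0).
         */
        if (!rc_dests && tipc_bcast_get_mode(net) != BCLINK_MODE_RCAST) {
                __skb_queue_head_init(&xmitq);
                __skb_queue_tail(&xmitq, skb);
                tipc_bcast_xmit(net, &xmitq, &dummy);
                return;
        }

        /* Otherwise fall back to the existing per-peer replicast loop
         * (unchanged from today, omitted in this sketch), which ends by
         * freeing the original skb.
         */
}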

I really believe sequence numbering and these three new bits solve our 
problem in a simple and robust way.
Strictly, even the 'non_legacy' bit could be omitted, since the receiver 
already has this info from the capability map, but I found it simpler to 
do it that way. Also, I realized that tipc_named_dequeue() function can 
be simplified further, since there can be no duplicate messages.

[Hoang] I will do further testing of your solution and improve on it. 

BR
///jon

>
> ///jon
>
>> +/* Otherwise use legacy replicast method */
>>  rcu_read_lock();
>>  list_for_each_entry_rcu(n, tipc_nodes(net), list) {
>>  dst = n->addr;
>> @@ -1749,7 +1762,6 @@ void tipc_node_broadcast(struct net *net, struct 
>> sk_buff *skb)
>>  tipc_node_xmit_skb(net, txskb, dst, 0);
>>  }
>>  rcu_read_unlock();
>> -
>>  kfree_skb(skb);
>>  

Re: [tipc-discussion] [ ] tipc: update a binding service via broadcast

2020-06-01 Thread Hoang Huu Le
Hi Jon,

See my comment inline.

Hoang
-Original Message-
From: Jon Maloy  
Sent: Monday, June 1, 2020 7:40 PM
To: Hoang Huu Le ; 
tipc-discussion@lists.sourceforge.net
Cc: Tung Quang Nguyen ; Tuong Tong Lien 
; ma...@donjonn.com; x...@redhat.com; 
ying@windriver.com; parthasarathy.bhuvara...@gmail.com
Subject: Re: [ ] tipc: update a binding service via broadcast

Hi Hoang.
See below.

On 6/1/20 5:33 AM, Hoang Huu Le wrote:
> Hi Jon,
>
> There is a concern in function tipc_node_broadcast.
> See my comment inline.
>
> Regards,
> Hoang
> -Original Message-
> From: jma...@redhat.com 
> Sent: Friday, May 29, 2020 10:34 PM
> To: tipc-discussion@lists.sourceforge.net
> Cc: Tung Quang Nguyen ; Hoang Huu Le 
> ; Tuong Tong Lien ; 
> jma...@redhat.com; ma...@donjonn.com; x...@redhat.com; 
> ying@windriver.com; parthasarathy.bhuvara...@gmail.com
> Subject: [ ] tipc: update a binding service via broadcast
>
> From: Hoang Huu Le 
>
> Currently, updating binding table (add service binding to
> name table/withdraw a service binding) is being sent over replicast.
> However, if we are scaling up clusters to > 100 nodes/containers this
> method is less affection because of looping through nodes in a cluster one
> by one.
[...]
>   
> + /* Use broadcast if all nodes support it */
> + if (!rc_dests) {
> + __skb_queue_head_init();
> + __skb_queue_tail(, skb);
> + tipc_bcast_xmit(net, , );
> + return;
> + }
> +
> [Hoang]
> We could not use 'rc_dests' to send as broadcast mode because of it is not 
> fully broadcast supporting in the cluster.
> As stated, if there is a node in the cluster not supporting broadcast mode, 
> we need to use replicast instead.
> That's why we introduced the feature "Smooth switch between 
> replicast/broadcast in bcast.c" and a new command line to configure the 
> broadcast link.
> If we assume base on 'rc_dests' (i.e capability flags TIPC_NAMED_BCAST), 
> probably 'forced replicast' configuration on broadcast link becomes useless. 
> Then, destination nodes will be missed the publication item.
You misunderstand this function. rc_dests is a *counter*, not a flag. It 
counts the number of peer nodes that don't support TIPC_NAMED_BCAST, and 
if this is non-zero, we use replicast. So this is safe.

[Hoang] Yes, I know about this counter. What I'm considering is the case of an L2 
interface (e.g. VXLAN) with only pseudo broadcast support (we discussed the topic 
a year ago); there, the sending side must switch to replicast 
(method->force_bcast = true). But the code above will likely force broadcast 
regardless of the broadcast/replicast setting coming from L2.

///jon

>
> + /* Otherwise use legacy replicast method */
>   rcu_read_lock();
>   list_for_each_entry_rcu(n, tipc_nodes(net), list) {
>   dst = n->addr;
> @@ -1749,7 +1762,6 @@ void tipc_node_broadcast(struct net *net, struct 
> sk_buff *skb)
>   tipc_node_xmit_skb(net, txskb, dst, 0);
>   }
>   rcu_read_unlock();
> -
>   kfree_skb(skb);
>   }
>   
> @@ -1844,7 +1856,9 @@ static void tipc_node_bc_rcv(struct net *net, struct 
> sk_buff *skb, int bearer_id
>   
>   /* Handle NAME_DISTRIBUTOR messages sent from 1.7 nodes */
>   if (!skb_queue_empty(>bc_entry.namedq))
> - tipc_named_rcv(net, >bc_entry.namedq);
> + tipc_named_rcv(net, >bc_entry.namedq,
> +>bc_entry.named_rcv_nxt,
> +>bc_entry.named_open);
>   
>   /* If reassembly or retransmission failure => reset all links to peer */
>   if (rc & TIPC_LINK_DOWN_EVT)
> @@ -2109,7 +2123,9 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, 
> struct tipc_bearer *b)
>   tipc_node_link_down(n, bearer_id, false);
>   
>   if (unlikely(!skb_queue_empty(>bc_entry.namedq)))
> - tipc_named_rcv(net, >bc_entry.namedq);
> + tipc_named_rcv(net, >bc_entry.namedq,
> +>bc_entry.named_rcv_nxt,
> +>bc_entry.named_open);
>   
>   if (unlikely(!skb_queue_empty(>bc_entry.inputq1)))
>   tipc_node_mcast_rcv(n);
> diff --git a/net/tipc/node.h b/net/tipc/node.h
> index a6803b449a2c..9f6f13f1604f 100644
> --- a/net/tipc/node.h
> +++ b/net/tipc/node.h
> @@ -55,7 +55,8 @@ enum {
>   TIPC_MCAST_RBCTL  = (1 << 7),
>   TIPC_GAP_ACK_BLOCK= (1 << 8),
>   TIPC_TUNNEL_ENHANCED  = (1 << 9),
> - TIPC_NAGLE= (1 << 10)
> + TIPC_NAGLE= (1 << 10),
> + TIPC_NAMED_BCAST  = (1 << 11

Re: [tipc-discussion] [ ] tipc: update a binding service via broadcast

2020-06-01 Thread Hoang Huu Le
Hi Jon,

There is a concern in function tipc_node_broadcast.
See my comment inline.

Regards,
Hoang
-Original Message-
From: jma...@redhat.com  
Sent: Friday, May 29, 2020 10:34 PM
To: tipc-discussion@lists.sourceforge.net
Cc: Tung Quang Nguyen ; Hoang Huu Le 
; Tuong Tong Lien ; 
jma...@redhat.com; ma...@donjonn.com; x...@redhat.com; ying@windriver.com; 
parthasarathy.bhuvara...@gmail.com
Subject: [ ] tipc: update a binding service via broadcast

From: Hoang Huu Le 

Currently, binding table updates (adding a service binding to the
name table/withdrawing a service binding) are sent over replicast.
However, if we are scaling up clusters to > 100 nodes/containers this
method is less efficient, because it loops through the nodes in a cluster
one by one.

It is worth using broadcast to update a binding service. Then the binding
table is updated on all nodes in one shot.

The mechanism is backward compatible, as sync messages are silently dropped.

v2: resolve synchronization problem when switching from unicast to
broadcast

v5: - never use broadcast if there is a single node not understanding it
- always use broadcast otherwise
- add sequence numbering to non-bulk messages

Signed-off-by: Hoang Le 
---
 net/tipc/bcast.c  |   4 +-
 net/tipc/bcast.h  |   2 +
 net/tipc/msg.h|  40 +++
 net/tipc/name_distr.c | 113 +++---
 net/tipc/name_distr.h |   9 ++--
 net/tipc/name_table.c |   9 +++-
 net/tipc/name_table.h |   2 +
 net/tipc/node.c   |  28 ---
 net/tipc/node.h   |   8 +--
 9 files changed, 170 insertions(+), 45 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 4c20be08b9c4..e3fce4c579c0 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -249,8 +249,8 @@ static void tipc_bcast_select_xmit_method(struct net *net, 
int dests,
  * Consumes the buffer chain.
  * Returns 0 if success, otherwise errno: -EHOSTUNREACH,-EMSGSIZE
  */
-static int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
-  u16 *cong_link_cnt)
+int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
+   u16 *cong_link_cnt)
 {
struct tipc_link *l = tipc_bc_sndlink(net);
struct sk_buff_head xmitq;
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index 9e847d9617d3..f095a2ac27cb 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -89,6 +89,8 @@ void tipc_bcast_toggle_rcast(struct net *net, bool supp);
 int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
struct tipc_mc_method *method, struct tipc_nlist *dests,
u16 *cong_link_cnt);
+int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
+   u16 *cong_link_cnt);
 int tipc_bcast_rcv(struct net *net, struct tipc_link *l, struct sk_buff *skb);
 void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
struct tipc_msg *hdr);
diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index 871feadbbc19..a0930d664958 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -409,6 +409,36 @@ static inline void msg_set_errcode(struct tipc_msg *m, u32 
err)
msg_set_bits(m, 1, 25, 0xf, err);
 }
 
+static inline void msg_set_bulk(struct tipc_msg *m)
+{
+   msg_set_bits(m, 1, 28, 0xf, 1);
+}
+
+static inline u32 msg_is_bulk(struct tipc_msg *m)
+{
+   return msg_bits(m, 1, 28, 0xf);
+}
+
+static inline void msg_set_last_bulk(struct tipc_msg *m)
+{
+   msg_set_bits(m, 1, 27, 0xf, 1);
+}
+
+static inline u32 msg_is_last_bulk(struct tipc_msg *m)
+{
+   return msg_bits(m, 1, 27, 0xf);
+}
+
+static inline void msg_set_non_legacy(struct tipc_msg *m)
+{
+   msg_set_bits(m, 1, 26, 0xf, 1);
+}
+
+static inline u32 msg_is_legacy(struct tipc_msg *m)
+{
+   return !msg_bits(m, 1, 26, 0xf);
+}
+
 static inline u32 msg_reroute_cnt(struct tipc_msg *m)
 {
return msg_bits(m, 1, 21, 0xf);
@@ -538,6 +568,16 @@ static inline void msg_set_origport(struct tipc_msg *m, 
u32 p)
msg_set_word(m, 4, p);
 }
 
+static inline u16 msg_named_seqno(struct tipc_msg *m)
+{
+   return msg_bits(m, 4, 0, 0x);
+}
+
+static inline void msg_set_named_seqno(struct tipc_msg *m, u16 n)
+{
+   msg_set_bits(m, 4, 0, 0x, n);
+}
+
 static inline u32 msg_destport(struct tipc_msg *m)
 {
return msg_word(m, 5);
diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
index 5feaf3b67380..c58b19418c42 100644
--- a/net/tipc/name_distr.c
+++ b/net/tipc/name_distr.c
@@ -102,7 +102,8 @@ struct sk_buff *tipc_named_publish(struct net *net, struct 
publication *publ)
pr_warn("Publication distribution failure\n");
return NULL;
}
-
+   msg_set_named_seqno(buf_msg(skb), nt->snd_nxt++);
+   msg_set_non_legacy(buf_msg(skb));
item = (struct distr_item *)msg_data(buf_msg(skb));
publ_to_item(item, publ);
return skb;
@@ -114,8 +115,8 @@ struct sk_buff *tipc_name

Re: [tipc-discussion] [PATCH RFC] tipc: make peer node cleanup timer configurable

2020-04-14 Thread Hoang Huu Le
Hi Tuong,

The name 'node_cleanup' is not clear in meaning. I'd prefer 
'node_cleanup_time_secs'.
I think we should use 'milliseconds', as that is more common than 'seconds' in 
proc/sysctl.

Regards,
Hoang
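
A sketch of that variant relative to the RFC quoted below, keeping the value
in milliseconds; the variable and sysctl names are illustrative only:

int sysctl_tipc_node_cleanup_ms __read_mostly = 300000;  /* 5 min, 0 = disabled */

#define NODE_CLEANUP_AFTER      msecs_to_jiffies(sysctl_tipc_node_cleanup_ms)

/* and in net/tipc/sysctl.c the table entry would become e.g.:
 *      .procname = "node_cleanup_ms",
 *      .data     = &sysctl_tipc_node_cleanup_ms,
 */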
-Original Message-
From: Tuong Tong Lien  
Sent: Tuesday, April 14, 2020 2:30 PM
To: jma...@redhat.com; ma...@donjonn.com; ying@windriver.com; 
tipc-discussion@lists.sourceforge.net
Cc: tipc-dek 
Subject: [PATCH RFC] tipc: make peer node cleanup timer configurable

Since commit 6a939f365bdb ("tipc: Auto removal of peer down node
instance"), a peer node instance is automatically clean up after 5 mins
without contact. This changed behavior is not really expected from user
who would expect to see the node with the "down" state in the node list
as before instead.
Also, the timer value is said to be small that the peer node might just
be rebooting, so will come back soon. This is absolutely no problem but
one wants to extend it to 10 minutes.

This commit makes the timer configurable via the sysctl file:

/proc/sys/net/tipc/node_cleanup

which indicates after how many seconds a down peer node should be removed.
The default value is 300 i.e. 5 mins, while a value of '0' will disable
the auto removal feature.

Signed-off-by: Tuong Lien 
---
 net/tipc/core.h   |  1 +
 net/tipc/node.c   | 15 +--
 net/tipc/sysctl.c |  8 
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/net/tipc/core.h b/net/tipc/core.h
index 631d83c9705f..7d07f9086936 100644
--- a/net/tipc/core.h
+++ b/net/tipc/core.h
@@ -89,6 +89,7 @@ struct tipc_crypto;
 extern unsigned int tipc_net_id __read_mostly;
 extern int sysctl_tipc_rmem[3] __read_mostly;
 extern int sysctl_tipc_named_timeout __read_mostly;
+extern int sysctl_tipc_node_cleanup __read_mostly;
 
 struct tipc_net {
u8  node_id[NODE_ID_LEN];
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 2373a66f3b59..5cf01e182730 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -46,8 +46,11 @@
 #include "trace.h"
 #include "crypto.h"
 
+int sysctl_tipc_node_cleanup __read_mostly = 300; /* in seconds */
+
 #define INVALID_NODE_SIG   0x1
-#define NODE_CLEANUP_AFTER 30
+#define NODE_CLEANUP_AFTER \
+   msecs_to_jiffies(sysctl_tipc_node_cleanup * 1000)
 
 /* Flags used to take different actions according to flag type
  * TIPC_NOTIFY_NODE_DOWN: notify node is down
@@ -130,7 +133,7 @@ struct tipc_node {
unsigned long keepalive_intv;
struct timer_list timer;
struct rcu_head rcu;
-   unsigned long delete_at;
+   unsigned long checkpt;
struct net *peer_net;
u32 peer_hash_mix;
 #ifdef CONFIG_TIPC_CRYPTO
@@ -537,7 +540,7 @@ struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id,
for (i = 0; i < MAX_BEARERS; i++)
spin_lock_init(&n->links[i].lock);
n->state = SELF_DOWN_PEER_LEAVING;
-   n->delete_at = jiffies + msecs_to_jiffies(NODE_CLEANUP_AFTER);
+   n->checkpt = jiffies;
n->signature = INVALID_NODE_SIG;
n->active_links[0] = INVALID_BEARER_ID;
n->active_links[1] = INVALID_BEARER_ID;
@@ -726,8 +729,8 @@ static bool tipc_node_cleanup(struct tipc_node *peer)
return false;
 
tipc_node_write_lock(peer);
-
-   if (!node_is_up(peer) && time_after(jiffies, peer->delete_at)) {
+   if (!node_is_up(peer) && NODE_CLEANUP_AFTER &&
+   time_after(jiffies, peer->checkpt + NODE_CLEANUP_AFTER)) {
tipc_node_clear_links(peer);
tipc_node_delete_from_list(peer);
deleted = true;
@@ -1478,7 +1481,7 @@ static void node_lost_contact(struct tipc_node *n,
uint i;
 
pr_debug("Lost contact with %x\n", n->addr);
-   n->delete_at = jiffies + msecs_to_jiffies(NODE_CLEANUP_AFTER);
+   n->checkpt = jiffies;
trace_tipc_node_lost_contact(n, true, " ");
 
/* Clean up broadcast state */
diff --git a/net/tipc/sysctl.c b/net/tipc/sysctl.c
index 58ab3d6dcdce..087a89b27b9a 100644
--- a/net/tipc/sysctl.c
+++ b/net/tipc/sysctl.c
@@ -75,6 +75,14 @@ static struct ctl_table tipc_table[] = {
.extra1 = SYSCTL_ONE,
},
 #endif
+   {
+   .procname   = "node_cleanup",
+   .data   = &sysctl_tipc_node_cleanup,
+   .maxlen = sizeof(sysctl_tipc_node_cleanup),
+   .mode   = 0644,
+   .proc_handler   = proc_dointvec_minmax,
+   .extra1 = SYSCTL_ZERO,
+   },
{}
 };
 
-- 
2.13.7



___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [PATCH RFC 2/4] tipc: add back link trace events

2020-03-27 Thread Hoang Huu Le
Yes, I got the same results.

Hoang
-Original Message-
From: Jon Maloy  
Sent: Friday, March 27, 2020 8:43 PM
To: Tuong Tong Lien ; ma...@donjonn.com; 
ying@windriver.com; tipc-discussion@lists.sourceforge.net
Cc: tipc-...@dektech.com.au
Subject: Re: [tipc-discussion] [PATCH RFC 2/4] tipc: add back link trace events

I received [2/4], [3/4] and [4/4] of this series but no [0/4] and [1/4].
This is the case both for my redhat account and my private account, so I 
assume the problem is at your end.

Please re-post.

///jon


On 3/27/20 7:56 AM, Tuong Lien wrote:
> In the previous commit ("tipc: add Gap ACK blocks support for broadcast
> link"), we have removed the following link trace events due to the code
> changes:
>
> - tipc_link_bc_ack
> - tipc_link_retrans
>
> This commit adds them back along with some minor changes to adapt to
> the new code.
>
> Signed-off-by: Tuong Lien 
> ---
>   net/tipc/link.c  |  3 +++
>   net/tipc/trace.h | 13 -
>   2 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index 1b60ba665504..405ccf597e59 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -1517,6 +1517,8 @@ static int tipc_link_advance_transmq(struct tipc_link 
> *l, struct tipc_link *r,
>   bool is_uc = !link_is_bc_sndlink(l);
>   bool bc_has_acked = false;
>   
> + trace_tipc_link_retrans(r, acked + 1, acked + gap, &l->transmq);
> +
>   /* Determine Gap ACK blocks if any for the particular link */
>   if (ga && is_uc) {
>   /* Get the Gap ACKs, uc part */
> @@ -2423,6 +2425,7 @@ int tipc_link_bc_ack_rcv(struct tipc_link *r, u16 
> acked, u16 gap,
>   if (less(acked, r->acked) || (acked == r->acked && !gap && !ga))
>   return 0;
>   
> + trace_tipc_link_bc_ack(r, acked, gap, &l->transmq);
>   tipc_link_advance_transmq(l, r, acked, gap, ga, xmitq, &unused, &rc);
>   
>   tipc_link_advance_backlog(l, xmitq);
> diff --git a/net/tipc/trace.h b/net/tipc/trace.h
> index 4d8e00483afc..e7535ab75255 100644
> --- a/net/tipc/trace.h
> +++ b/net/tipc/trace.h
> @@ -299,8 +299,10 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class,
>   __entry->from = f;
>   __entry->to = t;
>   __entry->len = skb_queue_len(tq);
> - __entry->fseqno = msg_seqno(buf_msg(skb_peek(tq)));
> - __entry->lseqno = msg_seqno(buf_msg(skb_peek_tail(tq)));
> + __entry->fseqno = __entry->len ?
> +   msg_seqno(buf_msg(skb_peek(tq))) : 0;
> + __entry->lseqno = __entry->len ?
> +   msg_seqno(buf_msg(skb_peek_tail(tq))) : 0;
>   ),
>   
>   TP_printk("<%s> retrans req: [%u-%u] transmq: %u [%u-%u]\n",
> @@ -308,15 +310,16 @@ DECLARE_EVENT_CLASS(tipc_link_transmq_class,
> __entry->len, __entry->fseqno, __entry->lseqno)
>   );
>   
> -DEFINE_EVENT(tipc_link_transmq_class, tipc_link_retrans,
> +DEFINE_EVENT_CONDITION(tipc_link_transmq_class, tipc_link_retrans,
>   TP_PROTO(struct tipc_link *r, u16 f, u16 t, struct sk_buff_head *tq),
> - TP_ARGS(r, f, t, tq)
> + TP_ARGS(r, f, t, tq),
> + TP_CONDITION(less_eq(f, t))
>   );
>   
>   DEFINE_EVENT_PRINT(tipc_link_transmq_class, tipc_link_bc_ack,
>   TP_PROTO(struct tipc_link *r, u16 f, u16 t, struct sk_buff_head *tq),
>   TP_ARGS(r, f, t, tq),
> - TP_printk("<%s> acked: [%u-%u] transmq: %u [%u-%u]\n",
> + TP_printk("<%s> acked: %u gap: %u transmq: %u [%u-%u]\n",
> __entry->name, __entry->from, __entry->to,
> __entry->len, __entry->fseqno, __entry->lseqno)
>   );
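
For anyone who wants to check the re-added events once the series is applied,
here is a minimal userspace sketch that enables them through tracefs. The
mount point /sys/kernel/tracing is an assumption; on some systems tracefs
lives under /sys/kernel/debug/tracing instead. The event names follow the
tracepoint names in this patch.

/* enable_tipc_trace.c - turn on the two re-added link tracepoints */
#include <stdio.h>

static int enable_event(const char *name)
{
	char path[256];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/kernel/tracing/events/tipc/%s/enable", name);
	f = fopen(path, "w");
	if (!f)
		return -1;
	fputs("1\n", f);
	fclose(f);
	return 0;
}

int main(void)
{
	if (enable_event("tipc_link_retrans") ||
	    enable_event("tipc_link_bc_ack"))
		perror("enable tipc events");
	return 0;
}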




___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion