Hi Jon,
Thanks for the comment. I totally agree that skb_may_pull is the one
changing the data content, but that procedure is a bit hard to trace. So
the correct data is probably only available after validate has linearized the
data. It seems safe to make the change anyway, since no one uses bc_ack before
validate.
We are checking whether we have your recommended patch. Is that patch already
in kernel 4.9.11?
Thanks,
Matthew
-----Original Message-----
From: Jon Maloy [mailto:[email protected]]
Sent: Wednesday, February 22, 2017 2:57 PM
To: Wong, Matthew; [email protected]
Subject: RE: [tipc-discussion] tipc multicast stuck (hit max window) due to
invalid bc_ack value
Hi Matthew,
See below for my comment.
Also, although this is about a different problem, you should check if you have
the following patch, and the one it is referring to:
commit 06bd2b1ed04ca9f ("tipc: fix broadcast link synchronization problem")
> -----Original Message-----
> From: Wong, Matthew [mailto:[email protected]]
> Sent: Wednesday, February 22, 2017 12:41 PM
> To: [email protected]
> Subject: [tipc-discussion] tipc multicast stuck (hit max window) due
> to invalid bc_ack value
>
>
> Hi all,
>
> I'm currently working on the 4.4.0 kernel and am observing the
> following issues with tipc multicast.
>
>
> 1. I have a system setup with 3 CPUs each using tipc to multicast to
> processes running on each CPU. After sending around 50 messages (the
> max window size), the far end did not receive messages any more.
> When looking at the tipc-conf -ls data, it said the broadcast-link
> start bundling
[...]
>
> 4. It seems that tipc_msg_validate modified the skb message and the hdr.
> The modified data looks fine and has the correct expected bc-ack/ack
> values in the message. However, the bc_ack and ack values are currently
> initialized before tipc_msg_validate, so we use those values,
> which may cause issues in my bc_ack update and comparison.
The only possible culprit here is the function skb_may_pull(), which is called
from msg_validate() in the rare case that the header part of the packet buffer
is non-linear. The function is a little hard to follow, but as I understand it,
it linearizes the buffer in such cases, and header fields read before the
validation will obviously be wrong.
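To illustrate the ordering problem, here is a minimal user-space sketch. It is
not kernel code: struct mock_buf, mock_validate(), mock_hdr_bc_ack() and
mock_split_packet() are hypothetical names made up for this example, and the
4-byte header layout is a toy. It only assumes what is described above, namely
that validation pulls missing header bytes into the linear area, so a field
read before validation can return stale data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_HDR_LEN 4  /* toy header: bytes 0-1 = ack, bytes 2-3 = bc_ack */

/* Hypothetical stand-in for a packet buffer; not the kernel's struct sk_buff. */
struct mock_buf {
    uint8_t linear[MOCK_HDR_LEN]; /* linear area the header is parsed from */
    size_t linear_len;            /* number of valid bytes in linear[] */
    const uint8_t *frag;          /* rest of the packet, held in a fragment */
    size_t frag_len;              /* number of bytes left in the fragment */
};

/* Read the 16-bit big-endian bc_ack field straight from the linear area. */
static uint16_t mock_hdr_bc_ack(const struct mock_buf *b)
{
    return (uint16_t)((b->linear[2] << 8) | b->linear[3]);
}

/*
 * Toy validation: ensure the whole header is in the linear area by copying
 * the missing bytes from the fragment. Returns 1 on success, 0 if the packet
 * is too short to hold a full header.
 */
static int mock_validate(struct mock_buf *b)
{
    assert(b != NULL);
    if (b->linear_len < MOCK_HDR_LEN) {
        size_t need = MOCK_HDR_LEN - b->linear_len;

        if (need > b->frag_len)
            return 0;
        memcpy(b->linear + b->linear_len, b->frag, need);
        b->linear_len += need;
        b->frag += need;
        b->frag_len -= need;
    }
    return 1;
}

/* Fragment bytes carrying the real bc_ack value (0x002a). */
static const uint8_t mock_frag_bytes[] = { 0x00, 0x2a };

/*
 * Build a packet whose header is split: ack (0x0001) is in the linear area,
 * bc_ack is still in the fragment, and linear[2..3] hold stale 0xff bytes.
 */
static struct mock_buf mock_split_packet(void)
{
    struct mock_buf b = {
        { 0x00, 0x01, 0xff, 0xff }, 2,
        mock_frag_bytes, sizeof(mock_frag_bytes)
    };
    return b;
}
```

With the packet from mock_split_packet(), calling mock_hdr_bc_ack() before
mock_validate() returns the stale 0xffff, while calling it afterwards returns
0x002a. This is the same ordering that moving the bc_ack/ack reads after
validation is meant to enforce.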
>
>
>
> 5. If I move the bc_ack and ack reads after tipc_msg_validate, I no
> longer have any tipc multicast stuck issue. I have run it for half a day
> with multicast on 4 CPUs and so far there is no tipc multicast bundle
> trigger and no bogus bc_ack issue. All multicast messages have been sent
> and received properly.
>
>
>
> 6. Is this known behavior, and is it an issue? If yes, is there a
> patch for it, and will 4.4.48 have the same issue? Is the
> tipc_msg_validate function supposed to modify the hdr data, and should
> we use the bc_ack/ack values only after the modification is completed?
We have never seen this before, but your diagnosis is totally credible. I
will post a patch for this asap.
Nice job!
BR
///jon
>
>
>
> Any comment is appreciated.
>
>
>
> Regards,
>
> Matthew
>
> Sonus network.
>
> ----------------------------------------------------------------------
> -------- Check out the vibrant tech community on one of the world's
> most engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> _______________________________________________
> tipc-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/tipc-discussion