FWIW, my code changes did manage to fix the problems I was having with idempotent datagram requests, where I could retransmit either way safely. Things got messier for a non-idempotent service with either dropped datagrams or dropped connections due to the link reset. So far, the only thing I have found to mitigate the problem is increasing the UDP media/bearer/link tolerance from 1500 to 5000. At least the tests I have run so far seem happier with it.
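For anyone following along, here is a minimal sketch of the retransmit-on-timeout pattern for the idempotent case. This is not my actual code change: the function name, parameters, and default values are hypothetical, and it assumes an already-connected datagram socket. It does not tag requests with an ID, so a late reply to an earlier attempt would be accepted as the answer to a later one. That is tolerable only because every attempt carries the same idempotent request.

```python
import socket


def request_with_retry(sock, payload, attempts=3, timeout=0.5):
    """Send payload on a connected datagram socket and wait for one reply.

    Resends the same payload whenever the wait times out. Only safe for
    idempotent requests, because the server may process the request more
    than once if a reply (rather than the request) was lost.
    """
    sock.settimeout(timeout)
    for _ in range(attempts):
        sock.send(payload)
        try:
            return sock.recv(65535)
        except socket.timeout:
            continue  # request or reply was lost; resend the same payload
    raise TimeoutError(f"no reply after {attempts} attempts")
```

A non-idempotent service cannot use this loop as-is, since the blind resend may apply the same request twice.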
Gary Duzan
IT Architect Senior
GT.M Core Team

________________________________
From: Duzan, Gary D via tipc-discussion <tipc-discussion@lists.sourceforge.net>
Sent: Thursday, May 16, 2024 4:46 PM
To: Tung Quang Nguyen <tung.q.ngu...@dektech.com.au>; tipc-discussion@lists.sourceforge.net <tipc-discussion@lists.sourceforge.net>
Subject: Re: [tipc-discussion] Tuning/Debugging of "Retransmission failure"

Tung Quang Nguyen 5/15/2024 11:56 PM
> Not sure what you mean by "pushing TIPC a bit". If it means dropping TIPC messages, then "Retransmission failure" is expected.

It just means that I increased the amount of traffic across TIPC in the cluster. I only noticed because it appeared from the application level that messages were being dropped. I do have TIPC_DEST_DROPPABLE and TIPC_SRC_DROPPABLE set to zero, but I just realized that I only have the TIPC_ERRINFO handling on one end. I should fix that.

Tung Quang Nguyen 5/15/2024 11:56 PM
> If you did not intentionally drop TIPC messages, then the issue could be due to packet drops at the NIC (in your VM node or host or Switch/Router etc.). You need to do the tuning at the NIC.

Running some more tests, it does appear that my clusters of larger servers with eth bearers are not encountering this issue, but my clusters of smaller servers with udp bearers are. There is also some disparity in net.tipc.tipc_rmem settings, so I should address that.

So it looks like I have some things to try. I'll follow up if that doesn't address the problem.

Thanks.

Gary Duzan
IT Architect Senior
GT.M Core Team

________________________________
From: Tung Quang Nguyen <tung.q.ngu...@dektech.com.au>
Sent: Wednesday, May 15, 2024 11:56 PM
To: Duzan, Gary D <gary.du...@fisglobal.com>; tipc-discussion@lists.sourceforge.net <tipc-discussion@lists.sourceforge.net>
Subject: RE: Tuning/Debugging of "Retransmission failure"

> I've started to notice messages like this when pushing TIPC a bit:

Not sure what you mean by "pushing TIPC a bit". If it means dropping TIPC messages, then "Retransmission failure" is expected.

> Is there any tuning I can do to avoid the problem, or other data to collect to
> better understand it?

If you did not intentionally drop TIPC messages, then the issue could be due to packet drops at the NIC (in your VM node or host or Switch/Router etc.). You need to do the tuning at the NIC.

The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you.
Message Encrypted via TLS connection
_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion