I will have to look for leaks the next time I can make it happen. While trying things, I shut down an unrelated program that had some TIPC sockets open on a different address (104), and as soon as I did, the sends started working again.
It is possible that one of those unrelated sockets has something stuck (one of them is only ever used to send RDM messages, and nothing ever reads from it).

Any suggestions as to what to start looking at (netstat, tipc, tipc_config or kernel params) to try to track it down?

The problem with testing a patch (or using Ubuntu 18 LTS) is that we cannot reliably make it happen.

-----Original Message-----
From: Jon Maloy <jon.ma...@ericsson.com>
Sent: Thursday, October 17, 2019 14:35

Hi Rune,
Do you see any signs of a general memory leak ("free") on your node?
Anyway, there can be no doubt that this happens because the big buffer pool is running empty.
We fixed that in commit 4c94cc2d3d57 ("tipc: fall back to smaller MTU if allocation of local send skb fails"), which was delivered to Linux 4.16.
Do you have any opportunity to apply that patch and try it?

BR
///jon

> -----Original Message-----
> From: Rune Torgersen <ru...@innovsys.com>
> Sent: 17-Oct-19 12:38
> To: 'tipc-discussion@lists.sourceforge.net' <tipc-discuss...@lists.sourceforge.net>
> Subject: [tipc-discussion] Error allocating memeory error when sending RDM message
>
> Hi.
>
> I am running into an issue when sending SOCK_RDM or SOCK_DGRAM
> messages. On a system that has been up for a while (120+ days in this case),
> I cannot send any RDM/DGRAM type TIPC messages that are larger than about
> 16000 bytes (16033+ fails, 15100 and smaller still works).
> Any larger message fails with error code 12: "Cannot allocate memory".
>
> The really odd thing is that it only happens on some connections and not
> others on the same system (for example, sending to tipc node 103:1003 gets
> no error, while sending to 103:3 gets an error).
> When it gets into this state, it seems to happen forever on the same
> destination address, and not on others, until the system is rebooted
> (restarting the server side application makes no difference).
> The sends are done on the same node as the receiver is on.
>
> Kernel is Ubuntu 16.04 LTS 4.4.0-150 in this case, also seen on 161.
>
> Nametable for 103:
> 103  2     2     <1.1.1:2328193343>  2328193344  cluster
> 103  3     3     <1.1.2:3153441800>  3153441801  cluster
> 103  5     5     <1.1.4:269294867>   269294868   cluster
> 103  1002  1002  <1.1.1:490133365>   490133366   cluster
> 103  1003  1003  <1.1.2:2552019732>  2552019733  cluster
> 103  1005  1005  <1.1.4:625110186>   625110187   cluster
>
> _______________________________________________
> tipc-discussion mailing list
> tipc-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/tipc-discussion
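
Since the failure is hard to reproduce, it may help to pin down the exact size threshold the next time a destination gets into this state. The following is only a rough sketch, not anything taken from the thread: the helper names `largest_ok_size` and `make_tipc_sender` are made up for illustration, and it assumes a Python build with `AF_TIPC` support, a loaded TIPC kernel module, and that every size above the threshold fails with errno 12 (ENOMEM) while every size below it succeeds. Note that each probe delivers a real message to whatever is bound to the destination name.

```python
import errno
import socket


def largest_ok_size(try_send, lo, hi):
    """Binary-search the largest size for which try_send(size) returns True.

    Assumes try_send(lo) succeeds, try_send(hi) fails, and that failures
    are monotonic in size (an assumption, not something the thread confirms).
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if try_send(mid):
            lo = mid   # mid still works, threshold is above it
        else:
            hi = mid   # mid fails, threshold is below it
    return lo


def make_tipc_sender(srv_type, instance):
    """Return a try_send(size) callable for one TIPC name (hypothetical helper).

    Sends a zero-filled SOCK_RDM message of the given size and reports
    whether the kernel accepted it. Only ENOMEM counts as a "failure";
    any other error is re-raised so it is not silently misread.
    """
    sock = socket.socket(socket.AF_TIPC, socket.SOCK_RDM)
    dest = (socket.TIPC_ADDR_NAME, srv_type, instance, 0)

    def try_send(size):
        try:
            sock.sendto(b"\0" * size, dest)
            return True
        except OSError as exc:
            if exc.errno == errno.ENOMEM:
                return False
            raise

    return try_send
```

With the numbers from the report, a call such as `largest_ok_size(make_tipc_sender(103, 3), 15100, 16033)` would narrow the limit for the failing destination, and running the same search against 103:1003 would show whether the working destination has any limit in that range at all.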