And looking at Ubuntu's git repo for xenial, that patch was never backported.

-----Original Message-----
From: Rune Torgersen <ru...@innovsys.com> 
Sent: Friday, October 18, 2019 08:28

cat /proc/buddyinfo
Node 0, zone      DMA      2      2      1      1      3      0      1      0      1      1      3
Node 0, zone    DMA32   9275  11572    137      6      0      0      0      0      0      0      0
Node 0, zone   Normal  35213  15049    476     11      1      0      1      1      1      0      0
Node 1, zone   Normal   5917  25209    490      8      6      3      1      1      0      0      0
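
For anyone who wants to compare these numbers across systems, the per-order counts can be totalled with a short script. This is only a minimal sketch and not part of the original report: the function names are made up for illustration, and the cut-off of order 3 is an assumption (4 KiB pages, and a roughly 16 KB message plus buffer overhead no longer fitting in a 16 KiB order-2 block). Each column after the zone name is the number of free blocks of that order, starting at order 0.

```python
from typing import Dict, List, Tuple

# Sample taken from the buddyinfo output in this thread (lines unwrapped).
SAMPLE = """\
Node 0, zone      DMA      2      2      1      1      3      0      1      0      1      1      3
Node 0, zone    DMA32   9275  11572    137      6      0      0      0      0      0      0      0
Node 0, zone   Normal  35213  15049    476     11      1      0      1      1      1      0      0
Node 1, zone   Normal   5917  25209    490      8      6      3      1      1      0      0      0
"""


def parse_buddyinfo(text: str) -> Dict[Tuple[int, str], List[int]]:
    """Map (node, zone) to a list of free-block counts, indexed by order."""
    zones: Dict[Tuple[int, str], List[int]] = {}
    for line in text.splitlines():
        parts = line.replace(",", " ").split()
        # Expected shape: Node <n> zone <name> <order-0 count> <order-1 count> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        zones[(int(parts[1]), parts[3])] = [int(c) for c in parts[4:]]
    return zones


def free_blocks_at_or_above(counts: List[int], min_order: int) -> int:
    """Count free blocks whose order is min_order or higher."""
    return sum(counts[min_order:])


if __name__ == "__main__":
    MIN_ORDER = 3  # assumption: smallest order that can hold the failing sends
    for (node, zone), counts in sorted(parse_buddyinfo(SAMPLE).items()):
        total = free_blocks_at_or_above(counts, MIN_ORDER)
        print(f"node {node} zone {zone}: {total} free blocks of order >= {MIN_ORDER}")
```

To run it against a live system, pass the contents of /proc/buddyinfo to parse_buddyinfo() in place of SAMPLE.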

And I'm aware of the commit, as I reported the problem it fixes.
I was under the impression that it had been backported to the TIPC driver in the 
Ubuntu 16.04 LTS 4.4.0 branch (around 4.4.0-110, I think).

Either the fix was never incorporated in the 4.4.0 branch, or it was reverted 
recently.

-----Original Message-----
From: Partha <parthasarathy.bhuvara...@gmail.com>
Sent: Friday, October 18, 2019 08:17

Hi Rune,

Your system's memory seems to be fragmented, and you need to perform a
forced reclaim. Can you check buddyinfo for higher-order allocations?
  cat /proc/buddyinfo

BTW, I fixed this in:
57d5f64d83ab tipc: allocate user memory with GFP_KERNEL flag

And it was Reported-by: Rune Torgersen <ru...@innovsys.com>

It's in upstream v4.10-rc3-167-g57d5f64d83ab

regards
Partha

On 2019-10-17 22:08, Rune Torgersen wrote:
> Looks like I can kind of make it happen on one system now.
> Stopping some programs (no pattern in which ones) makes it work, and 
> starting some back up again makes it fail.
>
> The TIPC nametable has 231 entries when failing and 183 entries when succeeding 
> (however, on a different system the nametable has 251 entries and it is not 
> failing).
>
> How do I look for memory used by TIPC in the kernel?
>
> -----Original Message-----
> From: Rune Torgersen <ru...@innovsys.com>
> Sent: Thursday, October 17, 2019 14:53
>
>
> I will have to look for leaks next time I can make it happen.
> I was trying stuff and shut down a different program that was unrelated (but 
> had some TIPC sockets open on a different address (104)), and as soon as I 
> did, the sends started working again.
>
> It is possible that one of those unrelated sockets has something stuck (as 
> one of them was only ever used to send RDM messages but nothing ever reads 
> it).
>
> Any suggestions as to what to start looking at (netstat, tipc, tipc_config or 
> kernel params) to try to track it down?
>
> The problem with testing a patch (or using Ubuntu 18 LTS) is that we cannot 
> reliably make it happen.
>
> -----Original Message-----
> From: Jon Maloy <jon.ma...@ericsson.com>
> Sent: Thursday, October 17, 2019 14:35
>
>
> Hi Rune,
>
> Do you see any signs of general memory leak ("free") on your node?
>
> Anyway, there can be no doubt that this happens because the big buffer pool is 
> running empty.
>
> We fixed that in commit 4c94cc2d3d57 ("tipc: fall back to smaller MTU if 
> allocation of local send skb fails") which was delivered to Linux 4.16.
>
> Do you have any opportunity to apply that patch and try it?
>
> BR
> ///jon
>
>> -----Original Message-----
>> From: Rune Torgersen <ru...@innovsys.com>
>> Sent: 17-Oct-19 12:38
>> To: 'tipc-discussion@lists.sourceforge.net' <tipc-
>> discuss...@lists.sourceforge.net>
>> Subject: [tipc-discussion] Error allocating memeory error when sending RDM
>> message
>>
>> Hi.
>>
>> I am running into an issue when sending SOCK_RDM or SOCK_DGRAM
>> messages. On a system that has been up for a while (120+ days in this case), I
>> cannot send any RDM/DGRAM type TIPC messages that are larger than about
>> 16000 bytes (16033+ fails, 15100 and smaller still works).
>> Any larger message fails with error code 12: "Cannot allocate memory".
>>
>> The really odd thing is that it only happens on some connections and not
>> others on the same system (for example, sending to TIPC node 103:1003 gets no
>> error, while sending to 103:3 gets an error).
>> When it gets into this state, it seems to happen forever on the same
>> destination address, and not on others, until the system is rebooted
>> (restarting the server-side application makes no difference).
>> The sends are done on the same node as the receiver is on.
>>
>> Kernel is Ubuntu 16.04 LTS 4.4.0-150 in this case, also seen on 161.
>>
>> Nametable for 103:
>> 103        2          2          <1.1.1:2328193343>         2328193344  cluster
>> 103        3          3          <1.1.2:3153441800>         3153441801  cluster
>> 103        5          5          <1.1.4:269294867>          269294868   cluster
>> 103        1002       1002       <1.1.1:490133365>          490133366   cluster
>> 103        1003       1003       <1.1.2:2552019732>         2552019733  cluster
>> 103        1005       1005       <1.1.4:625110186>          625110187   cluster
>>
>> _______________________________________________
>> tipc-discussion mailing list
>> tipc-discussion@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/tipc-discussion
>
>
