Re: swap maxed out when plenty of RAM available

2022-03-25 Thread Nicholas Geovanis
On Fri, Mar 25, 2022 at 12:27 PM Greg Wooledge  wrote:

> On Fri, Mar 25, 2022 at 04:51:51PM +0000, Adam Weremczuk wrote:
> > [Tue Mar 22 00:24:10 2022] Tasks state (memory values in pages):
> > [Tue Mar 22 00:24:10 2022] [  pid  ]   uid  tgid total_vm  rss pgtables_bytes swapents oom_score_adj name
> > [Tue Mar 22 00:24:10 2022] [   2211] 0  2211    14228 228   159744  127 0 systemd
> > [Tue Mar 22 00:24:10 2022] [   2622] 0  2622    93208 59485   753664   73 0 systemd-journal
>
> Well, at this point systemd-journald (I assume the name is truncated)
> was using more memory than anything else.
> .

> > [Tue Mar 22 00:24:10 2022] Memory cgroup out of memory: Killed process 11695
>
> So, next it killed dhcpd.  And it still wasn't done.
>
> > [Tue Mar 22 00:24:10 2022] [  21057] 0 21057 1069 31    53248    0 0 apt.systemd.dai
> > [Tue Mar 22 00:24:10 2022] [  21065] 0 21065    17753 2552   180224    0 0 apt-get
> > [Tue Mar 22 00:24:10 2022] [  21068] 0 21068 9475 110   110592    0 0 systemd-journal
> > [Tue Mar 22 00:24:10 2022] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=ns,mems_allowed=0-1,oom_memcg=/lxc/101,task_memcg=/lxc/101/ns,task=apt-get,pid=21065,uid=0
> > [Tue Mar 22 00:24:10 2022] Memory cgroup out of memory: Killed process 21065
>
> At this point, it killed apt-get.
>
> Looks like this system doesn't have enough memory to perform its daily
> tasks (including what I'm guessing are unattended upgrades, triggering
> calls to apt-get from a systemd timer).  You'll either need to stop
> letting it run those daily tasks, or add more memory, or add more swap,
> or get rid of some of the other programs that are using memory.
>

If you have reconfigured your apt repositories but made a mistake, or
lost contact with them for other network reasons, you will see those
automated apt-get commands stack up over time. They reach a point in
execution where they acquire a lock that blocks all other updates, so
they just keep piling up. Each consumes swap. If the CPU workload is the
local bottleneck, you may notice a CPU or two pegged at 100% as the
first symptom.


> If you really want the unattended upgrades, adding more swap would be
> the easiest solution, but be warned that this could mean the system
> will run extremely slowly during those unattended upgrades.  That could
> be something you don't care about, or something that matters a lot.  Only
> you would know.
>

If it's a server, servers should not swap and they should not get upgraded
without purpose.


Re: swap maxed out when plenty of RAM available

2022-03-25 Thread Greg Wooledge
On Fri, Mar 25, 2022 at 04:51:51PM +0000, Adam Weremczuk wrote:
> [Tue Mar 22 00:24:10 2022] Tasks state (memory values in pages):
> [Tue Mar 22 00:24:10 2022] [  pid  ]   uid  tgid total_vm  rss
> pgtables_bytes swapents oom_score_adj name
> [Tue Mar 22 00:24:10 2022] [   2211] 0  2211    14228 228   159744 
> 127 0 systemd
> [Tue Mar 22 00:24:10 2022] [   2622] 0  2622    93208 59485  
> 753664   73 0 systemd-journal

Well, at this point systemd-journald (I assume the name is truncated)
was using more memory than anything else.

> [Tue Mar 22 00:24:10 2022] 
> oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=ns,mems_allowed=0-1,oom_memcg=/lxc/101,task_memcg=/lxc/101/ns,task=systemd-journal,pid=2622,uid=0
> [Tue Mar 22 00:24:10 2022] Memory cgroup out of memory: Killed process 2622

And the kernel killed it.

But there still wasn't enough memory, so it continued:

> [Tue Mar 22 00:24:10 2022] pgsteal 9855948
> [Tue Mar 22 00:24:10 2022] Tasks state (memory values in pages):
> [Tue Mar 22 00:24:10 2022] [  pid  ]   uid  tgid total_vm  rss
> pgtables_bytes swapents oom_score_adj name
> [Tue Mar 22 00:24:10 2022] [   2211] 0  2211    14228 228   159744 
> 127 0 systemd
[...]
> [Tue Mar 22 00:24:10 2022] [  11695] 0 11695    10289 2685  
> 114688    0 0 dhcpd
[...]
> [Tue Mar 22 00:24:10 2022] [  21053] 0 21053 1069 17    53248   
> 0 0 apt.systemd.dai
> [Tue Mar 22 00:24:10 2022] [  21057] 0 21057 1069 31    53248   
> 0 0 apt.systemd.dai
> [Tue Mar 22 00:24:10 2022] [  21065] 0 21065    17753 1712  
> 180224    0 0 apt-get
> [Tue Mar 22 00:24:10 2022] 
> oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=ns,mems_allowed=0-1,oom_memcg=/lxc/101,task_memcg=/lxc/101/ns,task=dhcpd,pid=11695,uid=0
> [Tue Mar 22 00:24:10 2022] Memory cgroup out of memory: Killed process 11695

So, next it killed dhcpd.  And it still wasn't done.

> [Tue Mar 22 00:24:10 2022] [  21057] 0 21057 1069 31    53248   
> 0 0 apt.systemd.dai
> [Tue Mar 22 00:24:10 2022] [  21065] 0 21065    17753 2552  
> 180224    0 0 apt-get
> [Tue Mar 22 00:24:10 2022] [  21068] 0 21068 9475 110  
> 110592    0 0 systemd-journal
> [Tue Mar 22 00:24:10 2022] 
> oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=ns,mems_allowed=0-1,oom_memcg=/lxc/101,task_memcg=/lxc/101/ns,task=apt-get,pid=21065,uid=0
> [Tue Mar 22 00:24:10 2022] Memory cgroup out of memory: Killed process 21065

At this point, it killed apt-get.

Looks like this system doesn't have enough memory to perform its daily
tasks (including what I'm guessing are unattended upgrades, triggering
calls to apt-get from a systemd timer).  You'll either need to stop
letting it run those daily tasks, or add more memory, or add more swap,
or get rid of some of the other programs that are using memory.

If you really want the unattended upgrades, adding more swap would be
the easiest solution, but be warned that this could mean the system
will run extremely slowly during those unattended upgrades.  That could
be something you don't care about, or something that matters a lot.  Only
you would know.
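
A note on reading the task dumps quoted above: the total_vm and rss columns are counted in pages, not bytes. A minimal sketch of the conversion follows. It assumes the 4 KiB page size that is usual on x86-64; the logs do not state the page size, and `getconf PAGESIZE` would report the real value.

```python
# Convert page counts from an OOM "Tasks state" dump into MiB.
# PAGE_SIZE is an assumption (4 KiB, the usual x86-64 value).
PAGE_SIZE = 4096


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Return the size of `pages` memory pages in MiB."""
    return pages * page_size / (1024 * 1024)


# rss values copied from the dumps quoted in this thread
for name, rss_pages in [("systemd-journal", 59485),
                        ("dhcpd", 2685),
                        ("apt-get", 2552)]:
    print(f"{name:16s} {pages_to_mib(rss_pages):7.1f} MiB")
```

Under that assumption, systemd-journal's resident set is roughly 232 MiB of the container's 512 MiB limit. Note that rss in this dump also counts file-backed and shared pages the process has mapped, so it is not necessarily private heap memory.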



Re: swap maxed out when plenty of RAM available

2022-03-25 Thread Adam Weremczuk

Thanks all for picking up the discussion and for all the useful hints.

The container has been behaving since the reboot, using 0 swap ATM.

The post-mortem on the dhcp service crash shows 100% memory usage and an
OOM kill, though I'm still not sure why:


[Tue Mar 22 00:24:10 2022] apt-get invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[Tue Mar 22 00:24:10 2022] CPU: 1 PID: 21065 Comm: apt-get Tainted: P   OE 5.4.44-1-pve #1
[Tue Mar 22 00:24:10 2022] Hardware name: IBM System x3650 M3 -[7945F2G]-/69Y4438, BIOS -[D6E162AUS-1.20]- 05/07/2014

[Tue Mar 22 00:24:10 2022] Call Trace:
[Tue Mar 22 00:24:10 2022]  dump_stack+0x6d/0x9a
[Tue Mar 22 00:24:10 2022]  dump_header+0x4f/0x1e1
[Tue Mar 22 00:24:10 2022]  oom_kill_process.cold.33+0xb/0x10
[Tue Mar 22 00:24:10 2022]  out_of_memory+0x1ad/0x490
[Tue Mar 22 00:24:10 2022]  mem_cgroup_out_of_memory+0xc4/0xd0
[Tue Mar 22 00:24:10 2022]  try_charge+0x76b/0x7e0
[Tue Mar 22 00:24:10 2022]  ? sched_clock+0x9/0x10
[Tue Mar 22 00:24:10 2022]  mem_cgroup_try_charge+0x71/0x190
[Tue Mar 22 00:24:10 2022]  __add_to_page_cache_locked+0x265/0x340
[Tue Mar 22 00:24:10 2022]  ? scan_shadow_nodes+0x30/0x30
[Tue Mar 22 00:24:10 2022]  add_to_page_cache_lru+0x4f/0xd0
[Tue Mar 22 00:24:10 2022]  pagecache_get_page+0xff/0x2e0
[Tue Mar 22 00:24:10 2022]  filemap_fault+0x86f/0xa60
[Tue Mar 22 00:24:10 2022]  ? xas_load+0xc/0x80
[Tue Mar 22 00:24:10 2022]  ? xas_find+0x17e/0x1b0
[Tue Mar 22 00:24:10 2022]  ? filemap_map_pages+0x28d/0x3b0
[Tue Mar 22 00:24:10 2022]  ? mem_cgroup_try_charge+0x71/0x190
[Tue Mar 22 00:24:10 2022]  __do_fault+0x3c/0x130
[Tue Mar 22 00:24:10 2022]  __handle_mm_fault+0xeb0/0x12e0
[Tue Mar 22 00:24:10 2022]  handle_mm_fault+0xc9/0x1f0
[Tue Mar 22 00:24:10 2022]  __do_page_fault+0x233/0x4c0
[Tue Mar 22 00:24:10 2022]  do_page_fault+0x2c/0xe0
[Tue Mar 22 00:24:10 2022]  page_fault+0x34/0x40
[Tue Mar 22 00:24:10 2022] RIP: 0033:0x7fa5788f1afc
[Tue Mar 22 00:24:10 2022] Code: Bad RIP value.
[Tue Mar 22 00:24:10 2022] RSP: 002b:00007ffe3ca68390 EFLAGS: 00010246
[Tue Mar 22 00:24:10 2022] RAX: 00007fa574d14010 RBX: 000055e8fade31c8 RCX: 0000000000008d55
[Tue Mar 22 00:24:10 2022] RDX: 00007fa574ebc008 RSI: 0000000000000000 RDI: 00007fa574d14018
[Tue Mar 22 00:24:10 2022] RBP: 000055e8fc9e6140 R08: 0000000000000000 R09: 0000000000000000
[Tue Mar 22 00:24:10 2022] R10: 0000000000000022 R11: 0000000000000246 R12: 000055e8fc9e0b50
[Tue Mar 22 00:24:10 2022] R13: 000000000000d1af R14: 0000000000000000 R15: 0000000000000000
[Tue Mar 22 00:24:10 2022] memory: usage 524288kB, limit 524288kB, failcnt 451174
[Tue Mar 22 00:24:10 2022] memory+swap: usage 1048548kB, limit 1048576kB, failcnt 2828266
[Tue Mar 22 00:24:10 2022] kmem: usage 15708kB, limit 9007199254740988kB, failcnt 0

[Tue Mar 22 00:24:10 2022] Memory cgroup stats for /lxc/101:
[Tue Mar 22 00:24:10 2022] anon 21909504
[Tue Mar 22 00:24:10 2022] file 499474432
[Tue Mar 22 00:24:10 2022] kernel_stack 442368
[Tue Mar 22 00:24:10 2022] slab 12877824
[Tue Mar 22 00:24:10 2022] sock 0
[Tue Mar 22 00:24:10 2022] shmem 499851264
[Tue Mar 22 00:24:10 2022] file_mapped 140439552
[Tue Mar 22 00:24:10 2022] file_dirty 0
[Tue Mar 22 00:24:10 2022] file_writeback 135168
[Tue Mar 22 00:24:10 2022] anon_thp 0
[Tue Mar 22 00:24:10 2022] inactive_anon 291426304
[Tue Mar 22 00:24:10 2022] active_anon 229199872
[Tue Mar 22 00:24:10 2022] inactive_file 913408
[Tue Mar 22 00:24:10 2022] active_file 0
[Tue Mar 22 00:24:10 2022] unevictable 0
[Tue Mar 22 00:24:10 2022] slab_reclaimable 6713344
[Tue Mar 22 00:24:10 2022] slab_unreclaimable 6164480
[Tue Mar 22 00:24:10 2022] pgfault 1443466134
[Tue Mar 22 00:24:10 2022] pgmajfault 9724242
[Tue Mar 22 00:24:10 2022] workingset_refault 9709788
[Tue Mar 22 00:24:10 2022] workingset_activate 2071971
[Tue Mar 22 00:24:10 2022] workingset_nodereclaim 0
[Tue Mar 22 00:24:10 2022] pgrefill 12114220
[Tue Mar 22 00:24:10 2022] pgscan 18064348
[Tue Mar 22 00:24:10 2022] pgsteal 9855662
[Tue Mar 22 00:24:10 2022] Tasks state (memory values in pages):
[Tue Mar 22 00:24:10 2022] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[Tue Mar 22 00:24:10 2022] [   2211]     0  2211    14228      228   159744      127             0 systemd
[Tue Mar 22 00:24:10 2022] [   2622]     0  2622    93208    59485   753664       73             0 systemd-journal
[Tue Mar 22 00:24:10 2022] [   2691]   107  2691    11282       43   126976       71          -900 dbus-daemon
[Tue Mar 22 00:24:10 2022] [   2729]     0  2729    62560      312   135168       87             0 rsyslogd
[Tue Mar 22 00:24:10 2022] [   2731]     0  2731     9495       28   114688       89             0 systemd-logind
[Tue Mar 22 00:24:10 2022] [   2730]     0  2730     6998       17    98304       44             0 cron
[Tue Mar 22 00:24:10 2022] [  11695]     0 11695    10289     2685   114688        0             0 dhcpd
[Tue Mar 22 00:24:10 2022] [   3025]     0  3025    20295       27   135168      172
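
The figures in the OOM report above can be put side by side with a little arithmetic. The sketch below only reuses numbers printed in the log; the helper names are made up for illustration, and the interpretation in the note afterwards is a reading of the numbers, not something the log states.

```python
# Figures copied from the OOM report above (cgroup /lxc/101).
MEM_USAGE_KB = 524288      # "memory: usage"
MEM_LIMIT_KB = 524288      # "memory: ... limit"
MEMSW_USAGE_KB = 1048548   # "memory+swap: usage"
STATS_BYTES = {"anon": 21909504, "shmem": 499851264, "slab": 12877824}


def share_of_limit(nbytes, limit_kb=MEM_LIMIT_KB):
    """Return nbytes as a percentage of the cgroup memory limit."""
    return 100.0 * nbytes / (limit_kb * 1024)


def swap_in_use_kb(memsw_usage_kb=MEMSW_USAGE_KB, mem_usage_kb=MEM_USAGE_KB):
    """Swap charged to the cgroup: memory+swap usage minus memory usage."""
    return memsw_usage_kb - mem_usage_kb


for key, value in STATS_BYTES.items():
    print(f"{key:6s} {value / 2**20:7.1f} MiB  "
          f"{share_of_limit(value):5.1f}% of limit")
print(f"swap in use: {swap_in_use_kb()} kB")
```

With these inputs, shmem (tmpfs and other shared memory) comes to about 477 MiB, roughly 93% of the 512 MiB limit, while anonymous process memory is about 21 MiB, and swap in use is 524260 kB, i.e. practically all of it. That is in line with the "logs on a tmpfs" guess made elsewhere in the thread, although the log alone does not say which tmpfs files were responsible. It may also explain the low "RAM in use" reading, since tools such as free often report shared memory under shared/cache rather than used.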

Re: swap maxed out when plenty of RAM available

2022-03-24 Thread Nathanael Schweers


Adam Weremczuk  writes:

> The container was running like that for several months until this morning 
> when its core service (dhcp) started failing.

Just a wild guess, but do you know what caused dhcp to fail?  Was it too
little memory?
>
> I logged in to investigate and noticed 100% of swap being used with maybe 
> 10-20% of RAM in use.

If I recall correctly, Linux may choose to swap out pages that have not
been touched for a while so that the freed physical memory can be used
for buffers and caches.  This is a performance optimization: if I/O
would benefit from a larger cache, evicting those idle pages is
actually good for performance.

Regards,
Nathanael



Re: swap maxed out when plenty of RAM available

2022-03-23 Thread Charles Kroeger
I use

dphys-swapfile

This is a system service that automatically configures a swap file at
boot, without requiring a static partition.

It computes the size of an optimal swap file and/or resizes an existing
swap file if necessary.  It mounts and dismounts the swap file, and
deletes it if it is no longer wanted.  It doesn't dynamically resize
swap during runtime.

-- 
pa'lante 



Re: swap maxed out when plenty of RAM available

2022-03-22 Thread David Christensen

On 3/22/22 07:55, Adam Weremczuk wrote:

> Hi all,
>
> I run a tiny and lightweight Debian 9.9 LXC container on Proxmox 6.2-6.
>
> It has 512 MB of memory and 512 MB of swap assigned and typically needs
> 50-100 MB to operate.
>
> Last year I started seeing about half of swap being used with very
> little use of RAM.
>
> I then made the following changes:
>
> /etc/sysctl.d/60-my-swappiness.conf
> vm.swappiness=10
>
> /etc/sysctl.conf
> vm.swappiness=10
>
> and rebooted.
>
> The container was running like that for several months until this
> morning when its core service (dhcp) started failing.
>
> I logged in to investigate and noticed 100% of swap being used with
> maybe 10-20% of RAM in use.
>
> Before I had time to look into details the container crashed (powered off).
>
> I'll probably try to get rid of swap entirely as an experiment to see
> what happens.
>
> Unless somebody has any better ideas and hints?
>
> Regards,
> Adam



Debian 9.9 is out of date.  Please update/upgrade the LXC container to
Debian 9.13.



If problems persist, please post console sessions with relevant details 
for the host machine (Proxmox hypervisor), the LXC container, and a DHCP 
client when the LXC container DHCP service is malfunctioning.



David



Re: swap maxed out when plenty of RAM available

2022-03-22 Thread Greg Wooledge
On Tue, Mar 22, 2022 at 04:00:23PM -0400, Kenneth Parker wrote:
> On Tue, Mar 22, 2022 at 2:17 PM Greg Wooledge  wrote:
> 
> > On Tue, Mar 22, 2022 at 01:00:42PM -0500, Nicholas Geovanis wrote:
> > > That's the usual issue. The /tmp filesystem is usually configured to live
> > > in RAM,
> >
> > That's not the default in Debian.  Of course, it might have been set up
> > that way on the OP's system.
> >
> 
> 
> This is an education for me.  You are quite right that /tmp is not in
> tmpfs, but other things are, including /dev/shm, which has the same,
> "liberal" permissions as /tmp.
> 
> So where *does* /tmp reside?

Depends on how you partition things during the installation.  It could
be a separate file system on disk, or just a plain old directory inside
the root file system.

The "default" (as far as that term has any meaning) is to be a plain old
directory.



Re: swap maxed out when plenty of RAM available

2022-03-22 Thread Kenneth Parker
On Tue, Mar 22, 2022 at 2:17 PM Greg Wooledge  wrote:

> On Tue, Mar 22, 2022 at 01:00:42PM -0500, Nicholas Geovanis wrote:
> > That's the usual issue. The /tmp filesystem is usually configured to live
> > in RAM,
>
> That's not the default in Debian.  Of course, it might have been set up
> that way on the OP's system.
>


This is an education for me.  You are quite right that /tmp is not in
tmpfs, but other things are, including /dev/shm, which has the same,
"liberal" permissions as /tmp.

So where *does* /tmp reside?

Thank you,

Kenneth Parker


Re: swap maxed out when plenty of RAM available

2022-03-22 Thread Greg Wooledge
On Tue, Mar 22, 2022 at 01:00:42PM -0500, Nicholas Geovanis wrote:
> That's the usual issue. The /tmp filesystem is usually configured to live
> in RAM,

That's not the default in Debian.  Of course, it might have been set up
that way on the OP's system.

> at some point an application needed to use lots of it. It may not have
> freed it properly
> from dying or maybe it's still running, or just misbehaving :-)

When an application terminates, all of the memory that it used for
private variables and such gets freed, and becomes available to other
processes.  However, this doesn't cause swapped-out pages belonging
to other processes to be swapped back in.  So, for example, immediately
after closing a web browser that was using gobs and gobs of memory, you
might see that the "used" memory drops dramatically, and "free" memory
grows.  But the amount of swap being used won't change immediately.

Swap space is used primarily (or ideally) by *inactive* processes.  It
won't be released until one of those processes wakes up and needs the
data that has been swapped out.  Or until one of those inactive
processes terminates, at which point all of its swap usage is released.

(When swap is used by an *active* process, that's called thrashing.  At
that point, you're in trouble.)
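
To see which processes are actually holding swap, the VmSwap field of /proc/<pid>/status can be read for every process. The following is a minimal sketch; the function names are illustrative, and it assumes a Linux /proc is mounted.

```python
import os
import re


def parse_vmswap_kb(status_text):
    """Return the VmSwap value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def parse_name(status_text):
    """Return the process name from /proc/<pid>/status text."""
    match = re.search(r"^Name:\s+(.*)$", status_text, re.MULTILINE)
    return match.group(1) if match else "?"


def swap_by_process(proc_root="/proc"):
    """Return a list of (swap_kb, pid, name), largest swap user first."""
    if not os.path.isdir(proc_root):
        return []
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                text = fh.read()
        except OSError:
            continue  # process exited or is not readable
        swap_kb = parse_vmswap_kb(text)
        if swap_kb:  # skip kernel threads (no VmSwap) and zero users
            rows.append((swap_kb, int(entry), parse_name(text)))
    return sorted(rows, reverse=True)


if __name__ == "__main__":
    for swap_kb, pid, name in swap_by_process()[:10]:
        print(f"{swap_kb:>10} kB  {pid:>7}  {name}")
```

One caveat: according to proc(5), VmSwap does not include swapped-out shmem/tmpfs pages. If the per-process totals add up to far less than the swap in use, the remainder may belong to tmpfs data rather than to any inactive process.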



Re: swap maxed out when plenty of RAM available

2022-03-22 Thread Nicholas Geovanis
On Tue, Mar 22, 2022 at 12:21 PM Dan Ritter  wrote:

> Charles Curley wrote:
> > On Tue, 22 Mar 2022 14:55:34 +0000
> > Adam Weremczuk  wrote:
> >
> > > It has 512 MB of memory and 512 MB of swap assigned and typically
> > > needs 50-100 MB to operate.
> >
> > The rule of thumb to which I am accustomed is to have a swap space
> > double the physical RAM. If necessary, you can create a swap file and
> > add that to your /etc/fstab. That might help with your current problem.
> .
> That said, there is probably something else going on here. Logs
> on a tmpfs, maybe?
>

That's the usual issue. The /tmp filesystem is usually configured to live
in RAM, and at some point an application needed to use lots of it. It may
not have freed it properly when it died, or maybe it's still running, or
just misbehaving :-)
If this happens often, consider building a larger /tmp in a real filesystem
on a real drive.


> -dsr-
>
>


Re: swap maxed out when plenty of RAM available

2022-03-22 Thread Dan Ritter
Charles Curley wrote: 
> On Tue, 22 Mar 2022 14:55:34 +0000
> Adam Weremczuk  wrote:
> 
> > It has 512 MB of memory and 512 MB of swap assigned and typically
> > needs 50-100 MB to operate.
> 
> The rule of thumb to which I am accustomed is to have a swap space
> double the physical RAM. If necessary, you can create a swap file and
> add that to your /etc/fstab. That might help with your current problem.

In the other direction, I would recommend removing swap and
reducing RAM to 256MB. For a service like DHCPd, it's better to
fail and restart than to just try to eat all available memory.
Then set vm.overcommit_memory=2 in sysctl to enforce that rather
than OOMd.
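
For reference, with vm.overcommit_memory=2 the kernel refuses allocations beyond a commit limit instead of overcommitting and later invoking the OOM killer. A rough sketch of that limit, following the formula in the kernel's overcommit-accounting documentation and assuming vm.overcommit_kbytes is not set (the function name is illustrative):

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50, hugetlb_kb=0):
    """Approximate CommitLimit in kB under vm.overcommit_memory=2.

    CommitLimit = (RAM - huge pages) * overcommit_ratio / 100 + swap
    overcommit_ratio defaults to 50, the kernel's default.
    """
    return (ram_kb - hugetlb_kb) * overcommit_ratio // 100 + swap_kb


# 256 MB of RAM, no swap, default ratio of 50
print(commit_limit_kb(256 * 1024, 0))        # 131072 kB = 128 MiB
# same machine with vm.overcommit_ratio=100
print(commit_limit_kb(256 * 1024, 0, 100))   # 262144 kB = 256 MiB
```

So at the default ratio, a 256 MB swapless machine could commit only about 128 MiB unless vm.overcommit_ratio is raised. Two things to check before relying on this in the setup discussed here: these sysctls are system-wide and computed from the machine's total RAM rather than from a cgroup limit, and an LXC container may not be permitted to change them from the inside.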

That said, there is probably something else going on here. Logs
on a tmpfs, maybe? 
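
One way to check the tmpfs theory is to list every tmpfs mount and how much data it holds. A minimal sketch, roughly equivalent to `df -t tmpfs`; the function names are illustrative:

```python
import os


def tmpfs_mount_points(mounts_text):
    """Return mount points of tmpfs filesystems from /proc/mounts text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, ...
        if len(fields) >= 3 and fields[2] == "tmpfs":
            points.append(fields[1])
    return points


def used_mib(path):
    """Return the space used on the filesystem holding `path`, in MiB."""
    st = os.statvfs(path)
    return (st.f_blocks - st.f_bfree) * st.f_frsize / 2**20


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as fh:
            text = fh.read()
    except OSError:
        text = ""  # no /proc/mounts on this system
    for point in tmpfs_mount_points(text):
        try:
            print(f"{used_mib(point):8.1f} MiB  {point}")
        except OSError:
            print(f"{'?':>8} MiB  {point}")
```

A tmpfs that holds hundreds of MiB (for example a journal kept under /run, if that is how the system is configured) counts against the container's memory limit and can be pushed out to swap like any other memory.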

-dsr-



Re: swap maxed out when plenty of RAM available

2022-03-22 Thread Greg Wooledge
On Tue, Mar 22, 2022 at 10:35:40AM -0600, Charles Curley wrote:
> On Tue, 22 Mar 2022 14:55:34 +0000
> Adam Weremczuk  wrote:
> 
> > It has 512 MB of memory and 512 MB of swap assigned and typically
> > needs 50-100 MB to operate.
> 
> The rule of thumb to which I am accustomed is to have a swap space
> double the physical RAM. If necessary, you can create a swap file and
> add that to your /etc/fstab. That might help with your current problem.

All those rules of thumb are crap.  They assume so many things that
you just can't assume.

If a system is filling up swap, that means that *at some point* there
was enough simultaneous memory demand from applications to require
filling up swap.  That memory demand might not exist *right now*, but
at some point, it did.  This might mean it's likely to occur again.
Or maybe it was a one-time thing.  Who can say?  Only the person whose
system it is.

The OP is going to need to gather a lot more data, unless they can
remember something like "Oh wait, I compiled emacs on this system the
other day; maybe that used a lot of memory."  Maybe set up a service
to run "vmstat 5" at boot and redirect it to a file.  Then if the full
swap partition is observed again, they can look at this file and at
least get an estimated timeframe for when the high memory use occurred,
and some supporting data.  There may be better tools, but vmstat is
the first one I can think of.
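
As an alternative to redirecting "vmstat 5" to a file, a small script can append timestamped samples from /proc/meminfo. This is only a sketch of the same idea: the field selection, function names, and log format are arbitrary choices, not an established tool.

```python
import time


def parse_meminfo(text):
    """Return the numeric fields of /proc/meminfo (most are in kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def swap_used_kb(info):
    """Swap in use, derived from SwapTotal and SwapFree."""
    return info["SwapTotal"] - info["SwapFree"]


def sample_line(text, now=None):
    """Format one timestamped log line from /proc/meminfo text."""
    info = parse_meminfo(text)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return (f"{stamp} swap_used_kB={swap_used_kb(info)} "
            f"shmem_kB={info.get('Shmem', 0)} "
            f"free_kB={info.get('MemFree', 0)}")


def log_samples(log_path, interval=5, count=3, meminfo_path="/proc/meminfo"):
    """Append `count` samples to log_path, `interval` seconds apart."""
    with open(log_path, "a") as log:
        for i in range(count):
            with open(meminfo_path) as fh:
                log.write(sample_line(fh.read()) + "\n")
            log.flush()
            if i + 1 < count:
                time.sleep(interval)
```

Calling something like log_samples("/var/log/swap-samples.log", interval=5, count=17280) would cover about a day; the path and counts are placeholders. Inside a container, /proc/meminfo may be synthesized from the cgroup limits (by lxcfs, for instance), so the numbers should be read as the container's view rather than the host's.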



Re: swap maxed out when plenty of RAM available

2022-03-22 Thread Charles Curley
On Tue, 22 Mar 2022 14:55:34 +0000
Adam Weremczuk  wrote:

> It has 512 MB of memory and 512 MB of swap assigned and typically
> needs 50-100 MB to operate.

The rule of thumb to which I am accustomed is to have a swap space
double the physical RAM. If necessary, you can create a swap file and
add that to your /etc/fstab. That might help with your current problem.

-- 
Does anybody read signatures any more?

https://charlescurley.com
https://charlescurley.com/blog/



swap maxed out when plenty of RAM available

2022-03-22 Thread Adam Weremczuk

Hi all,

I run a tiny and lightweight Debian 9.9 LXC container on Proxmox 6.2-6.

It has 512 MB of memory and 512 MB of swap assigned and typically needs 
50-100 MB to operate.


Last year I started seeing about half of swap being used with very 
little use of RAM.


I then made the following changes:

/etc/sysctl.d/60-my-swappiness.conf
vm.swappiness=10

/etc/sysctl.conf
vm.swappiness=10

and rebooted.
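
After a change like this it is worth confirming which value is actually in effect. A minimal sketch follows; both paths are assumptions about this particular setup (the second is the cgroup v1 per-group knob as it might appear inside a container), and either may be missing or unreadable.

```python
def parse_swappiness(text):
    """Return the integer in `text`, or None if it is not a number."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_swappiness(path):
    """Return the swappiness stored at `path`, or None if unreadable."""
    try:
        with open(path) as fh:
            return parse_swappiness(fh.read())
    except OSError:
        return None


# Assumed locations; adjust for the system at hand.
CANDIDATES = [
    "/proc/sys/vm/swappiness",                  # global sysctl
    "/sys/fs/cgroup/memory/memory.swappiness",  # cgroup v1, per group
]

for path in CANDIDATES:
    print(path, "->", read_swappiness(path))
```

Whether a container is allowed to change the global value, and which of the two values governs reclaim for its cgroup, depends on the host configuration, which is not shown here.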

The container was running like that for several months until this 
morning when its core service (dhcp) started failing.


I logged in to investigate and noticed 100% of swap being used with 
maybe 10-20% of RAM in use.


Before I had time to look into details the container crashed (powered off).

I'll probably try to get rid of swap entirely as an experiment to see 
what happens.


Unless somebody has any better ideas and hints?

Regards,
Adam