> > Well, I have a script that loops through all running processes, checks
> > the swap space used by each one, and calculates the sum of all of them
> > together.
> > When running the script, it always tells me: "Swap usage overall: 0" -
> > and checking the processes individually I also only see 0.
> > So I really don't understand how swap can be in use when no process has
> > a reference to it.
> > Maybe a little background: the XML database requires having swap
> > available and it also seems to use it.
> 
> My low-level kernel internal knowledge (beyond basic tuning) is even
> more dated than my JRE/VM internals (I really need to change that), so
> I'd have to dive deeper into how various vm/proc information is
> instrumented in the kernel before I answer further.  This could
> include how pages are allocated and marked by the program itself.
Don't worry too much - I'll take a look at it with the application guys and
probably with the vendor of the XML DB. They may well already have an answer
for that.
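
For reference, the check is basically just a sum over /proc - a rough sketch
of the loop (assuming the kernel exposes a per-process "Swap:" field in
/proc/<pid>/smaps; if it doesn't, the sum will always come out as 0):

  #!/bin/bash
  # Rough sketch: sum the swap usage reported per process in /proc/<pid>/smaps.
  total=0
  for pid in $(ls /proc | grep -E '^[0-9]+$'); do
      [ -r /proc/$pid/smaps ] || continue
      swap=$(awk '/^Swap:/ { sum += $2 } END { print sum+0 }' /proc/$pid/smaps 2>/dev/null)
      swap=${swap:-0}
      [ "$swap" -gt 0 ] && echo "PID $pid: ${swap} kB"
      total=$((total + swap))
  done
  echo "Swap usage overall: ${total} kB"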

> > The problem is, when we stop the XML database (which is Java) and the
> > Oracle Database, swap isn't freed up.
> > When I do a swapoff -a (when all applications are shut down) it takes
> > about 2 hours to free it up and have it available for maintenance.
> 
> Then they really are pages that are virtually never utilized.  My
> apologies, as I might have missed it, but are you seeing any
> performance issues?  Is there a "wake-up" event that causes paging en
> masse?
> 
> Things like sysstat's sa reporting, where you capture the disk
> statistics, would be extremely helpful, or just run its vmstat for 24
> hours and look for spikes in paging.  If you see reads, and a
> reduction of swap usage, it could be the event where that data and/or
> object is finally utilized.
We have sysstat running on that server, but I don't see any peaks in there,
except during the backup window of the filesystem and the XML DB!

But both of them are only visible in the I/O statistics. Swap is not touched
at all.
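
For completeness, the counters I'm checking are the usual sysstat/procps
ones, e.g.:

  # Swapping activity (pages swapped in/out per second) for the current day:
  sar -W
  # Paging statistics (pgpgin/s, pgpgout/s, major faults):
  sar -B
  # Live view: the "si"/"so" columns show swap-in/swap-out activity:
  vmstat 5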

 
> > I already raised a change this morning to bring swappiness down to 1 -
> > but my concern is that it won't fix the problem (the change is not yet
> > approved, but hopefully soon).
> 
> Again, it's probably best to "define" the problem.  Is this affecting
> performance at all?  That is the question.  Sysstat helps
> tremendously.
> 
> Just looking back at your "snapshot," you are using only 8GiB for
> resident objects/data, and another 56GiB is used for read caching.
> Your buffers are less than 1GiB in the snapshot.  So your system is
> easily not swamped, at least at that point-in-time, and using nearly
> all of your memory for caching reads.
> 
> > Also, the server isn't very busy during normal operation (from an I/O
> > and load perspective) - only when they load the XML database with new
> > data do the load and I/O go up.
> 
> Is that XML database largely in memory?  Or is it sizable, but
> possibly accounting for much of your read cache?
I need to check that with the XML DB guys - looking at it I'd say
yes, the biggest part is in memory ... or somewhere in the swap area ;-)

But I also saw that a lot of data gets swapped right after the initial
startup of the application.
The interesting part is that the process I mentioned earlier is the GUI
application which is supposed to be used for the XML DB management.
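
Regarding the numbers from that snapshot: I'm pulling them from the usual
places, so they should be easy to reproduce, e.g.:

  # Overall split between resident use, buffers, cache and swap:
  free -m
  # Same figures straight from the kernel:
  grep -E '^(MemTotal|MemFree|Buffers|Cached|SwapTotal|SwapFree):' /proc/meminfo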

 
> > Anyhow, I'd like to share some more kernel settings with you - maybe you
> > can see something we are doing wrong or at least where we can improve:
> > kernel.shmmax = 56968879104 (this value is set by puppet to 75% of the
> > real available memory)
> > kernel.shmall = 4294967296
> > vm.max_reclaims_in_progress = 0
> > vm.pagecache = 100
> > vm.swap_token_timeout = 300     0
> > vm.vfs_cache_pressure = 100
> > vm.max_map_count = 65536
> > vm.percpu_pagelist_fraction = 0
> > vm.min_free_kbytes = 34511
> 
> Yep, only 0.03GiB must remain free, so you literally have plenty of room.
> 
> > vm.lowmem_reserve_ratio = 256   256     32
> > vm.swappiness = 60
> 
> Default swappiness is the root of all evil.  If you don't want to use
> swap aggressively, you never leave this in the double digits.  If you
> leave it at 60, expect swap to always be utilized.
> 
> Just because swap is utilized does _not_ mean you're out of physical
> memory.  It's just the kernel, aggressively in this case, leaving free
> memory for buffers that may need to be used immediately, etc.
> 
> In fact, considering your dirty background ratio is 10%, that's
> 7.2GiB, and roughly the amount of free memory you have.  I.e., the
> kernel is likely reserving that amount, and not using it for read
> cache (stopping at 56GiB, with 8GiB of other resident usage), in case
> your system receives a multi-GiB write in a fraction of a second.
I'll change the value of vm.swappiness to 1 and see what happens.

But as mentioned above, the application starts swapping right after
initialization, and even then the server has enough free memory available.
So I think it's best to talk to the vendor about that ...
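
The change itself should be straightforward - something along these lines
(apply at runtime and make it persistent):

  # Apply immediately:
  sysctl -w vm.swappiness=1
  # Persist across reboots (append to /etc/sysctl.conf, then reload):
  echo "vm.swappiness = 1" >> /etc/sysctl.conf
  sysctl -p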

 
> > vm.dirty_expire_centisecs = 2999
> > vm.dirty_writeback_centisecs = 499
> > vm.dirty_ratio = 40
> > vm.dirty_background_ratio = 10
> > vm.page-cluster = 3
> 
> These are all defaults and never ideal for such large memory systems.
> _However_, as I mentioned earlier, this likely does not impact your
> symptoms at all.  At most, as I mentioned above, the dirty background
> ratio of 10% could be a factor in why the kernel is leaving around
> 8GiB free, and not using more for read cache.
It's on my todo list to configure the server according to the given hardware
configuration - right now I did not have time for it and the customer did not
complain ... never wake a sleeping dog ;-)

But I'll look at this too and will hopefully find some time soon to do proper
tuning.
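
As a first rough idea (illustrative values only, nothing I have tested on
this box yet), I'd probably start by lowering the dirty thresholds so
writeback kicks in earlier on a machine with this much memory:

  # Illustrative only - on ~72GiB of RAM, 10%/40% means several GiB of
  # dirty data can pile up before writeback gets aggressive.
  vm.dirty_background_ratio = 5
  vm.dirty_ratio = 10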

Thanks for your help and your suggestions and all the best,
Si

_______________________________________________
rhelv5-list mailing list
rhelv5-list@redhat.com
https://www.redhat.com/mailman/listinfo/rhelv5-list
