I had a production Linux server freeze today, and we had
to power cycle it to get the service back up.
(It runs Fedora Core 6, kernel 2.6.22.14-72.fc6, and is slated for
upgrade to CentOS 5.x in the near future.)
I have 5 minute snapshots of "free", "vmstat", "top", "ps",
and "iostat", so I have the system profile before the outage
and earlier, when it was running fine (and of course, now).
Normally swap is unused on this server (vmstat si and so are both 0).
Right before the crash, "free" showed very little memory used by
the OS for buffers and cache, and non-zero swap used, plus
sar and vmstat show swapping in and out, which leads me to
suspect memory pressure from an application:
             total       used       free     shared    buffers     cached
Mem:       5098080    5071132      26948          0        344       2908
-/+ buffers/cache:    5067880      30200
Swap:      4192888      42644    4150244
I want to find out the root source of the memory pressure... which app used
up all the RAM?
If I sum up the "VSZ" numbers in "ps", I get 4178540 KB or about 4 GB.
If I sum up the "RSS" numbers in "ps", I get 6028 KB.
If I go back a week, to when the system was running fine, the same sums are
4743300 KB for "VSZ" and 191248 KB for "RSS".
What gives? By these numbers, processes were using less memory right before
the freeze than a week ago, when the system was running fine.
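
For anyone who wants to repeat the sums, here is a minimal sketch of how I would total the columns from a saved snapshot. It assumes the snapshot was taken with something like "ps -eo pid,vsz,rss,comm", so that VSZ and RSS (both in KB) appear before the command column. The function name and the sample data are made up for illustration and are not from the server in question.

```python
def sum_ps_columns(ps_text, columns=("VSZ", "RSS")):
    """Sum the named numeric columns (in KB) of a saved ps snapshot."""
    lines = [ln for ln in ps_text.splitlines() if ln.strip()]
    header = lines[0].split()
    # Locate each requested column by its header name.
    index = {name: header.index(name) for name in columns}
    totals = {name: 0 for name in columns}
    for line in lines[1:]:
        fields = line.split()
        for name, i in index.items():
            totals[name] += int(fields[i])
    return totals


# Hypothetical three-process snapshot, NOT data from the real server.
SAMPLE = """\
  PID    VSZ   RSS COMMAND
    1   2064   640 init
  412  10240  1500 sshd
  977 250000 80000 java
"""

print(sum_ps_columns(SAMPLE))  # {'VSZ': 262304, 'RSS': 82140}
```

Keep in mind that a summed RSS counts shared pages once for every process that maps them, so it is only a rough figure.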
Here is what "free" looked like, a week ago:
             total       used       free     shared    buffers     cached
Mem:       5098080    5068128      29952          0      50628      99104
-/+ buffers/cache:    4918396     179684
Swap:      4192888        168    4192720
Compare that to right before the crash:
             total       used       free     shared    buffers     cached
Mem:       5098080    5071132      26948          0        344       2908
-/+ buffers/cache:    5067880      30200
Swap:      4192888      42644    4150244
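
As a sanity check on how I am reading "free", the "-/+ buffers/cache" row can be recomputed from the "Mem:" row: used minus buffers and cached, and free plus buffers and cached. A small sketch follows, using the figures from the two snapshots above; the function name is just for illustration.

```python
def buffers_cache_row(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' row from the 'Mem:' row (all in KB)."""
    used_excluding_cache = used - buffers - cached
    free_including_cache = free + buffers + cached
    return used_excluding_cache, free_including_cache


# Figures copied from the two "free" snapshots in this message.
week_ago = buffers_cache_row(used=5068128, free=29952, buffers=50628, cached=99104)
before_crash = buffers_cache_row(used=5071132, free=26948, buffers=344, cached=2908)

print(week_ago)      # (4918396, 179684)
print(before_crash)  # (5067880, 30200)
```

Both results match what "free" printed, so buffers and cache shrank from roughly 146 MB to about 3 MB between the two snapshots.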
So, what I'd like to understand:
1. Why does "ps" say I am using 4 GB of RAM (VSZ + RSS),
and "free" says 5 GB?
2. Why is my RSS so low and "free" so high? What's using the
physical RAM?
3. What should I be using to measure per-process memory usage on Linux?
4. How do I account for the decrease in RSS from 186 MB to 6 MB?
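
To put a number on question 2, here is a rough sketch of the gap between what "free" reports as used (excluding buffers and cache) and the summed RSS from "ps". All figures are the ones quoted in this message; the variable names are illustrative. The gap is only an approximation, since summed RSS double-counts shared pages and kernel-side memory never shows up in any process's RSS.

```python
KB_PER_MB = 1024

# "used" from the -/+ buffers/cache row of free, and the summed ps RSS (KB).
snapshots = {
    "week ago": {"used_minus_cache": 4918396, "rss_sum": 191248},
    "before crash": {"used_minus_cache": 5067880, "rss_sum": 6028},
}


def unaccounted_kb(snapshot):
    """Memory that free counts as used but summed RSS does not cover."""
    return snapshot["used_minus_cache"] - snapshot["rss_sum"]


for label, snap in snapshots.items():
    gap = unaccounted_kb(snap)
    print(f"{label}: {gap} KB (about {gap / KB_PER_MB:.0f} MB) not covered by RSS")
```

In both snapshots well over 4 GB is not explained by RSS, which is the part I cannot account for.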
Am I running into a swap-happy kernel? I am considering decreasing
vm.swappiness from the default 60 to 10. Have you all run into any
issues with the default setting of 60? Please share your experience.
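
For reference, a minimal sketch of how the current value could be checked before changing anything. It assumes the standard /proc/sys/vm/swappiness file; the helper name is made up. Changing the value would normally be done with "sysctl -w vm.swappiness=10" as root, plus an entry in /etc/sysctl.conf to make it persistent.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value as an integer."""
    return int(Path(path).read_text().strip())


# Only attempt the read where the procfs entry exists (i.e. on Linux).
if Path("/proc/sys/vm/swappiness").exists():
    print("vm.swappiness =", read_swappiness())
```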
Best,
Aleksey
_______________________________________________
Tech mailing list
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/