On 09/12/11 13:10, Brian Ruthven - Solaris Network Sustaining - Oracle UK wrote:

A quick look through the script shows that $ram_locked is calculated
from "$pageslocked - $pp_kernel", which in turn effectively comes from
the output of "kstat -p | grep system_pages".
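
For reference, the same number can be reproduced by hand with something along these lines (only a rough sketch; I'm assuming the script does the page-size conversion this way, and the variable names here are mine - use pagesize(1) rather than hard-coding 4k or 8k):

$ PGSZ=$(pagesize)
$ LOCKED=$(kstat -p unix:0:system_pages:pageslocked | awk '{print $2}')
$ KERNEL=$(kstat -p unix:0:system_pages:pp_kernel | awk '{print $2}')
$ echo "ram_locked = $(( (LOCKED - KERNEL) * PGSZ / 1048576 )) MB"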

AFAIK, "locked" memory means pages of memory which are not pageable to
the swap device; i.e. they must remain in RAM at all times. Off the top
of my head, I believe any RT processes fall into this category (xntpd
being the most common cause for me), along with explicit requests via
interfaces such as mlockall(3c). I suspect ISM or DISM (as used heavily
by multi-process databases) probably locks pages in RAM, but it's been a
while since I looked into this.

[ "ipcs -a" can be used to display ISM segments and their creators ]

Thanks for your answer Brian.
So, if I understand correctly, locked RAM means "RAM not pageable to the swap device", but the system can still use it?

If I've grasped the right end of the right stick correctly, I don't
believe removing the swap device will have any effect. All this will do
is to force any paged out memory pages back into RAM. By definition,
they were paged out because the system ran short on RAM, looked for
pages which had not been used for a while, and moved them out of RAM
onto the swap device.

IMO, it's best to leave them there until the system decides it needs to
load them back in. Otherwise you just create a bunch of IO requests now
to read them back in, and after you re-add the swap device, the next
time the system is short of memory again, it has to spend more time
paging back out those things which you forced back into RAM earlier...
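
(For reference, the operations being discussed would be something like the following - only a sketch, and the zvol path is just an example, not your actual device:)

# swap -l                            (list swap devices and how much of each is in use)
# swap -d /dev/zvol/dsk/rpool/swap   (deleting a device forces its paged-out pages back into RAM)
# swap -a /dev/zvol/dsk/rpool/swap   (re-adding it afterwards)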

So, to the real question - what is the problem you are trying to
investigate and/or fix?

"The problem is I have several boxes with a lot of RAM Locked" - is this
really a problem? What issue is it causing you?

On my desktop system at the time of writing this, I have:

unix:0:system_pages:pageslocked 624224
unix:0:system_pages:pagestotal 1502540

This equates to ~2.38Gb locked out of a total of 5.73Gb. As a
percentage, it's 41.5% (higher than the 37.8% in your example), but as
far as I'm aware, it's not causing me a problem, so I don't need to do
anything about it. If my system was thrashing the swap device and
running slowly because it was running out of RAM and constantly paging
around the locked memory, then I might look at this. Until that point is
reached, it shouldn't need any investigation.
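
(For anyone who wants to verify the arithmetic, assuming the usual 4k x86 page size - an assumption; check with pagesize(1) - bc is just one way to do it:)

$ echo '624224 * 4096 / 1024^3' | bc -l       # ~2.38 GB locked
$ echo '1502540 * 4096 / 1024^3' | bc -l      # ~5.73 GB total
$ echo '624224 / 1502540 * 100' | bc -l       # ~41.5% locked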

Anyway, it's curious, because:

# host_A
RAM  _____Total 49141.8 Mb
RAM    Unusable  1290.1 Mb
RAM      Kernel  3397.1 Mb
RAM      Locked 18468.9 Mb
RAM        Used 24988.4 Mb
RAM       Avail   997.4 Mb

Disk _____Total 32773.2 Mb
Disk      Alloc  9422.3 Mb
Disk       Free 23351.0 Mb

Swap _____Total 72288.6 Mb
Swap      Alloc 49919.2 Mb
Swap    Unalloc  2525.3 Mb
Swap      Avail 19844.1 Mb
Swap  (MinFree)  6141.7 Mb

but on almost all of the other servers it looks like this:

# host_B
RAM  _____Total 49141.8 Mb
RAM    Unusable  1290.1 Mb
RAM      Kernel  1774.9 Mb
RAM      Locked     0.0 Mb
RAM        Used 17981.2 Mb
RAM       Avail 28095.7 Mb

Disk _____Total 32773.2 Mb
Disk      Alloc     0.0 Mb
Disk       Free 32773.2 Mb

Swap _____Total 72708.4 Mb
Swap      Alloc 14486.8 Mb
Swap    Unalloc  2588.9 Mb
Swap      Avail 55632.7 Mb
Swap  (MinFree)  6141.7 Mb

I don't understand such a big difference in this value. Here is some more useful info:

# host_A
$ prstat -s size | head -n 10
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
  7384 sjsasadm 3567M 3532M sleep    1    0 257:38:54 0.9% java/732
 14821 sjsasadm 3550M 3515M sleep   60    0 274:26:30 0.9% java/706
 16201 sjsasadm 3544M 3475M cpu5     1    0  66:54:19 0.7% java/706
  5272 sjsasadm 3325M 1886M sleep   58    0   0:42:24 0.5% java/705
 24527 sjsasadm  631M  578M sleep   60    0  10:32:36 0.0% java/173
  8693 sjsasadm  258M  126M sleep   59    0   0:57:40 0.0% java/53
  1993 sjsasadm  237M   57M sleep   59    0   0:59:07 0.0% java/36
  5738 sjsasadm  237M   45M sleep   59    0   1:01:27 0.0% java/37
 24240 sjsasadm  237M   57M sleep   59    0   0:57:31 0.0% java/36


# host_B
$ prstat -s size | head -n 10
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
   353 sjsasadm 3559M 3511M sleep   60    0  72:58:50 0.7% java/706
 18098 sjsasadm 3552M 3527M cpu2     1    0 234:36:15 2.5% java/723
 24440 sjsasadm 3543M 3503M sleep    1    0 258:41:06 0.9% java/702
 20531 sjsasadm 3324M 1836M sleep   60    0   0:38:21 0.6% java/700
  9522 sjsasadm  914M  845M sleep   28    0  16:58:56 0.0% java/257
 20626 sjsasadm  238M   72M sleep   59    0   1:41:40 0.0% java/42
 13256 sjsasadm  237M   58M sleep   59    0   1:02:46 0.0% java/36
 15106 sjsasadm  237M   58M sleep   59    0   0:59:37 0.0% java/36
  1320 sjsasadm  237M   57M sleep   59    0   1:00:55 0.0% java/36


As you can see, host A and host B are serving a Java app (it's even the same app!) and they're using more or less the same amount of memory. So... why this difference?

I don't want to reboot these machines, because:

1) They're in production
2) As you said, the system should be smart enough to manage its memory properly