2013/5/12 Mihai Rotaru <[email protected]>:
> Hi,
>
> I'm facing a rather bizarre problem in which I can't figure out where
> about 20 GB "disappear". I tested with RedHat 6.4 (default 2.6.x kernel)
> and with Debian 7.0 (default 3.x kernel), and the problem shows up on
> both distributions.
>
> I'm running an application on a dedicated server with 64 GB of RAM. Five
> million clients connect to the application over TCP and stay connected.
>
> The kernel uses 16 GB of memory for SLAB for the 5 million sockets (see
> "slabtop" and Slab in "/proc/meminfo").
>
> The application uses about 25 GB (see RES in "top" and Active in
> "/proc/meminfo").
>
> So: 25 GB used by the application plus 16 GB used by the kernel (slab)
> equals 41 GB. Out of the 64 GB total, about 23 GB should therefore
> remain free. And yet only 2 GB of free memory remain, as reported by
> "vmstat" and "top". I tested this (by trying to connect more clients),
> and the real free memory is indeed what "vmstat" and "top" report,
> i.e. 2 GB.
>
> Does anyone have any idea where those roughly 20 GB "disappeared" to?
> More precisely: what is using those 20 GB of RAM, and why doesn't Linux
> report them, or if it does report them, where? Below is the output of
> top, slabtop, vmstat, and /proc/meminfo.
>
> Thanks a lot,
> Mihai
>
> -bash-4.1# more /proc/meminfo
> MemTotal:       65956340 kB
> MemFree:         2867416 kB
> Buffers:           23972 kB
> Cached:            75244 kB
> SwapCached:            0 kB
> Active:         25330980 kB
> Inactive:          67204 kB
> Active(anon):   25299084 kB
> Inactive(anon):       80 kB
> Active(file):      31896 kB
> Inactive(file):    67124 kB
> Unevictable:           0 kB
> Mlocked:               0 kB
> SwapTotal:      33046520 kB
> SwapFree:       33046520 kB
> Dirty:                12 kB
> Writeback:             0 kB
> AnonPages:      25298960 kB
> Mapped:            14084 kB
> Shmem:               204 kB
> Slab:           17095636 kB
> SReclaimable:    5011828 kB
> SUnreclaim:     12083808 kB
> KernelStack:        2568 kB
> PageTables:        51892 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:    66024688 kB
> Committed_AS:   27313564 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:      457356 kB
> VmallocChunk:   34325066836 kB
> HardwareCorrupted:     0 kB
> AnonHugePages:  25171968 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:        5604 kB
> DirectMap2M:     2078720 kB
> DirectMap1G:    65011712 kB
>
> -bash-4.1# slabtop
>
>  Active / Total Objects (% used)    : 30118719 / 30135494 (99.9%)
>  Active / Total Slabs (% used)      : 4269889 / 4269910 (100.0%)
>  Active / Total Caches (% used)     : 99 / 181 (54.7%)
>  Active / Total Size (% used)       : 15391683.09K / 15394406.68K (100.0%)
>  Minimum / Average / Maximum Object : 0.02K / 0.51K / 4096.00K
>
>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
>
> 5012780 5012730  99%    0.19K 250639       20   1002556K dentry
> 5001540 5001512  99%    0.19K 250077       20   1000308K filp
> 5000285 5000025  99%    0.07K  94345       53    377380K eventpoll_pwq
> 5000165 5000161  99%    0.69K 1000033        5   4000132K sock_inode_cache
> 5000160 5000025  99%    0.12K 166672       30    666688K eventpoll_epi
> 5000018 5000012  99%    1.81K 2500009        2  10000036K TCPv6
>  20832  20334  97%    0.03K    186    112      744K size-32
>  14337  14197  99%    0.14K    531     27     2124K sysfs_dir_cache
>  13570  10962  80%    0.06K    230     59      920K size-64
>   8843   8820  99%    0.10K    239     37      956K buffer_head
>   8268   7879  95%    0.07K    156     53      624K selinux_inode_security
>
> -bash-4.1# top
>
> top - 17:53:33 up  6:57,  5 users,  load average: 0.00, 0.03, 0.07
> Tasks: 255 total,   1 running, 254 sleeping,   0 stopped,   0 zombie
> Cpu(s): 10.8%us,  8.1%sy,  0.0%ni, 74.4%id,  0.0%wa,  0.0%hi,  6.7%si,  0.0%st
> Mem:  65956340k total, 63087348k used,  2868992k free,    24020k buffers
> Swap: 33046520k total,        0k used, 33046520k free,    75244k cached
>
> PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME  COMMAND
>
> 32676 root      20   0 29.9g  24g 9916 S 252.1 38.4 149:44.21 java
> ...
>
> -bash-4.1# vmstat 5
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi    bo     in     cs us sy id wa st
>  3  0      0 2869620  24036  75244    0    0     0     0     65     57  3  3 93  0  0
>  0  0      0 2868628  24036  75244    0    0     0     0 101608 154075  7  9 84  0  0

One page is also held for each active socket. As far as I recall, those
pages are not accounted for in /proc/meminfo. With 5M active connections
that can add up to about 20 GB. In newer kernel versions there is a
change so that these pages are no longer per socket but per process. You
could try 3.8 and see whether the situation improves.

sorin
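[Editor's note: a quick sanity check of the arithmetic in this thread.
The per-socket page count and the meminfo fields below are taken from
the posts above; the "one 4 KiB page per socket" figure is Sorin's
recollection, not something confirmed by the kernel source here.]

```python
# Assumed values, copied from the thread; the one-page-per-socket
# overhead is a hypothesis from the reply, not a measured fact.
PAGE_KB = 4                  # typical x86-64 page size, in kB
CONNECTIONS = 5_000_000      # connected clients from the original post

# If each active socket pins one page, the unaccounted memory would be:
per_socket_kb = CONNECTIONS * PAGE_KB
print(f"per-socket estimate: ~{per_socket_kb / 1024**2:.1f} GB")  # ~19.1 GB

# Cross-check against the /proc/meminfo dump in the post (values in kB):
mem_total = 65_956_340       # MemTotal
mem_free  = 2_867_416        # MemFree
anon      = 25_298_960       # AnonPages (the Java heap)
slab      = 17_095_636       # Slab
cached    = 75_244 + 23_972  # Cached + Buffers

gap_kb = mem_total - mem_free - anon - slab - cached
print(f"unaccounted gap:     ~{gap_kb / 1024**2:.1f} GB")  # ~19.6 GB
```

The gap computed from meminfo (~19.6 GB) is close to the one-page-per-socket
estimate (~19.1 GB), which is consistent with the explanation in the reply.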
_______________________________________________
RLUG mailing list
[email protected]
http://lists.lug.ro/mailman/listinfo/rlug
