Bryan,

I believe the problem here might be related to Linux memory management and
the large ATS RAM cache.
Here's some testing I've done, looking simply at what happens when ATS is
stopped after traffic has filled up the RAM cache.

[root@psp6cdatsec05 ~]# sysctl -a | grep shm
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
kernel.shmmni = 4096

Traffic Server 5.3.2 + a few patches.

psp6cdedge03: Dell R630 256GB 8x3.6T SSD
- ram_cache = 128GB
- ram_drive = 64GB
- hugepages = 65536 (128GB)

psp6cdedge05: Dell R730xd 196GB 24x1.2TB HDD
- ram_cache = 96GB
- ram_drive = 32GB
- no hugepages

psp6cdedge06: Dell R730xd 196GB 24x1.2TB HDD
- ram_cache = 96GB
- ram_drive = 32GB
- hugepages = 64000 (120GB)
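
For reference, this is roughly how huge pages get set up on these boxes; a
sketch, since the page counts are per host and persistence (sysctl.conf vs.
kernel cmdline) may differ:

# reserve 2MB huge pages at runtime (64000 pages, as on psp6cdedge06)
sysctl -w vm.nr_hugepages=64000
# then let the ATS allocator use them, in records.config
CONFIG proxy.config.allocator.hugepages INT 1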

Notes:
- Between the two R730xd hosts there was no difference in performance (with
or without huge pages).
- After 20 minutes of runtime (about 3Gbps of VOD traffic each, 10000
titles), there was a huge difference in page table size:
  - psp6cdedge05 : PageTables:       205124 kB
  - psp6cdedge06 : PageTables:         7136 kB
- After 30 minutes:
  - All huge pages had been used on psp6cdedge03.
  - There were still ~11000 huge pages available on psp6cdedge06.
- I stopped the streams after 30 minutes, waited another 30 minutes, and
then stopped trafficserver.
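
The meminfo snapshots below come from polling /proc/meminfo; something like
this lets you watch the pool drain live:

watch -n 5 'egrep "Huge|Page" /proc/meminfo'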

## psp6cdedge03, 128GB RAM cache, 128GB of huge pages: not enough, pool depleted.
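
Back-of-the-envelope: 65536 pages x 2048 kB = exactly 128GB, i.e. the pool
only covers the ram_cache itself. My assumption is that whatever ATS
allocates beyond the cache body (index, buffers, etc.) competes for the
same pool, so it drains to zero.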

[root@psp6cdedge03 ~]# cat /proc/meminfo  | egrep "Huge|Page"
AnonPages:       9768284 kB
PageTables:        24352 kB
AnonHugePages:         0 kB
HugePages_Total:   65536
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

[root@psp6cdedge03 ~]# service trafficserver stop
Stopping traffic_cop:                                      [  OK  ]
Stopping traffic_manager:                                  [  OK  ]
Stopping traffic_server:                                   [FAILED]

This took about 10 seconds, but the stop reported FAILED.

[root@psp6cdedge03 ~]# cat /proc/meminfo  | egrep "Huge|Page"
AnonPages:        138636 kB
PageTables:         4584 kB
AnonHugePages:         0 kB
HugePages_Total:   65536
HugePages_Free:    65536
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

## psp6cdedge06, 96GB RAM cache, 120GB of huge pages: just fine!

[root@psp6cdatsec06 ~]# cat /proc/meminfo | egrep "Huge|Page"
AnonPages:       1159908 kB
PageTables:         7372 kB
AnonHugePages:         0 kB
HugePages_Total:   64000
HugePages_Free:    11029
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

[root@psp6cdatsec06 ~]# service trafficserver stop
Stopping traffic_cop:                                      [  OK  ]
Stopping traffic_manager:                                  [  OK  ]
Stopping traffic_server:                                   [  OK  ]

[root@psp6cdatsec06 ~]# cat /proc/meminfo | egrep "Huge|Page"
AnonPages:        187368 kB
PageTables:         4656 kB
AnonHugePages:         0 kB
HugePages_Total:   64000
HugePages_Free:    64000
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

## psp6cdedge05, 96GB RAM cache, no huge pages: disaster! It took 15
minutes for the process to stop.

[root@psp6cdatsec05 ~]# cat /proc/meminfo | egrep "Huge|Page"
AnonPages:      104075516 kB
PageTables:       208460 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

[root@psp6cdatsec05 ~]# service trafficserver stop
Stopping traffic_cop:                                      [  OK  ]
Stopping traffic_manager:                                  [  OK  ]
Stopping traffic_server:                                   [FAILED]

[root@psp6cdatsec05 ~]# date
Sat Aug 20 01:43:24 UTC 2016

Running "perf top" shows the ET_AIO thread stuck in the following:
  93.26%  [kernel]             [k] try_to_unmap_anon
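
For a deeper look next time, a call-graph capture while it's stuck should
work; hypothetical invocation, I didn't record one here:

sudo perf record -g -p $(pidof traffic_server) -- sleep 10
sudo perf report --stdio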

[root@psp6cdatsec05 ~]# date
Sat Aug 20 01:58:27 UTC 2016

[root@psp6cdatsec05 ~]# ps aux | grep ET_NET
root      39293  0.0  0.0 103312   900 pts/0    S+   01:58   0:00 grep ET_NET

(i.e. the traffic_server threads were finally gone by 01:58, about 15
minutes after the stop command.)

[root@psp6cdatsec05 ~]# cat /proc/meminfo | egrep "Huge|Page"
AnonPages:        171152 kB
PageTables:         4596 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
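
If the pattern holds, the takeaway is to size the huge page pool with
headroom above ram_cache; e.g. for the 128GB cache on psp6cdedge03,
something larger than 65536 pages. The margin below is a guess, the right
overhead isn't established by this test:

sysctl -w vm.nr_hugepages=70000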

On Thu, Aug 18, 2016 at 7:24 AM, Steve Malenfant <[email protected]>
wrote:

> Bryan,
>
> Somehow the problem came back after a few hours (not as bad yet)... This
> is running 5.3.2 with a few patches (available on Traffic Control GitHub).
>
> Per-node numastat info (in MBs):
>                           Node 0          Node 1           Total
>                  --------------- --------------- ---------------
> Numa_Hit               831049.44       684360.21      1515409.66
> Numa_Miss                   0.00            0.00            0.00
> Numa_Foreign                0.00            0.00            0.00
> Interleave_Hit            423.24          423.16          846.40
> Local_Node             831046.29       683892.71      1514939.01
> Other_Node                  3.15          467.50          470.65
>
>
> #  sysctl -a | grep huge
> vm.nr_hugepages = 0
> vm.nr_hugepages_mempolicy = 0
> vm.hugetlb_shm_group = 0
> vm.hugepages_treat_as_movable = 0
> vm.nr_overcommit_hugepages = 0
>
> We have ATS set with CONFIG proxy.config.allocator.hugepages INT 0
>
> On Wed, Aug 17, 2016 at 5:11 PM, Bryan Call <[email protected]> wrote:
>
>> Can you run numastat -v?  Looking up try_to_unmap_anon, it looks like the
>> kernel is trying to balance memory usage across the NUMA nodes.  There
>> was a recent change to how memory allocation is done on the AIO threads in
>> this ticket:
>> https://issues.apache.org/jira/browse/TS-4719
>>
>> Can you also run sudo sysctl -a | grep huge and paste the output?
>>
>> -Bryan
>>
>>
>>
>> On Aug 16, 2016, at 11:00 PM, Steve Malenfant <[email protected]>
>> wrote:
>>
>> Bryan, here's the output during one of those events...
>>
>> This server's RAM is about 80% used. Huge pages are not enabled. 256GB of
>> RAM cache on a server with 384GB of RAM.
>>
>> Samples: 756K of event 'cycles', Event count (approx.): 26719576313
>> Overhead  Shared Object            Symbol
>>   70.35%  [kernel]                 [k] try_to_unmap_anon
>>    2.42%  [kernel]                 [k] find_busiest_group
>>    1.34%  [kernel]                 [k] _spin_lock
>>    0.73%  [kernel]                 [k] thread_return
>>    0.64%  libpthread-2.12.so       [.] pthread_cond_timedwait@@GLIBC_2.3.2
>>    0.52%  [kernel]                 [k] schedule
>>    0.50%  [kernel]                 [k] find_next_bit
>>    0.42%  [kernel]                 [k] cpumask_next_and
>>    0.38%  [kernel]                 [k] _spin_lock_irqsave
>>
>> [~]# cat /proc/meminfo
>> MemTotal:       396790288 kB
>> MemFree:        52626016 kB
>> Buffers:          151148 kB
>> Cached:          2035288 kB
>> SwapCached:            0 kB
>> Active:          1198652 kB
>> Inactive:        1171944 kB
>> Active(anon):     222464 kB
>> Inactive(anon):      152 kB
>> Active(file):     976188 kB
>> Inactive(file):  1171792 kB
>> Unevictable:    289593148 kB
>> Mlocked:        289595216 kB
>> SwapTotal:       4095996 kB
>> SwapFree:        4095996 kB
>> Dirty:               208 kB
>> Writeback:             0 kB
>> AnonPages:      289777432 kB
>> Mapped:           128752 kB
>> Shmem:               244 kB
>> Slab:           15548892 kB
>> SReclaimable:   15454924 kB
>> SUnreclaim:        93968 kB
>> KernelStack:       17648 kB
>> PageTables:       573316 kB
>> NFS_Unstable:          0 kB
>> Bounce:                0 kB
>> WritebackTmp:          0 kB
>> CommitLimit:    202491140 kB
>> Committed_AS:   291736828 kB
>> VmallocTotal:   34359738367 kB
>> VmallocUsed:     1383676 kB
>> VmallocChunk:   34157955876 kB
>> HardwareCorrupted:     0 kB
>> AnonHugePages:         0 kB
>> HugePages_Total:       0
>> HugePages_Free:        0
>> HugePages_Rsvd:        0
>> HugePages_Surp:        0
>> Hugepagesize:       2048 kB
>> DirectMap4k:       58368 kB
>> DirectMap2M:     1968128 kB
>> DirectMap1G:    400556032 kB
>>
>> On Tue, Aug 16, 2016 at 8:54 PM, Bryan Call <[email protected]> wrote:
>>
>>> Can you run sudo perf top or sudo perf record during the time you are
>>> having a spike in CPU usage?
>>>
>>> -Bryan
>>>
>>>
>>>
>>>
>>> On Aug 16, 2016, at 3:09 PM, Steve Malenfant <[email protected]>
>>> wrote:
>>>
>>> We have a condition where 4 caches, every 33-34 minutes, go to high CPU
>>> utilization (1000%) and stop accepting incoming connections. This lasts
>>> about 30 seconds and then goes away.
>>>
>>> It seems to occur with some specific origin/content, but I don't know
>>> exactly which one so far.
>>>
>>> I'm including some details (in case something is obvious), but how do we
>>> profile a running server to see what's causing this high CPU usage?
>>>
>>> The load is normal, maybe about 100Mbps (basically idling).
>>>
>>> top - 17:20:33 up 61 days,  3:32,  4 users,  load average: 4.74, 1.20, 0.53
>>> Tasks: 616 total,  16 running, 600 sleeping,   0 stopped,   0 zombie
>>> Cpu(s):  2.6%us, 22.3%sy,  0.0%ni, 74.9%id,  0.0%wa,  0.0%hi,  0.2%si,  0.0%st
>>> Mem:  396790288k total, 342530608k used, 54259680k free,   148764k buffers
>>> Swap:  4095996k total,        0k used,  4095996k free,  1618892k cached
>>>
>>>    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>   8703 ats       20   0  282g 275g  39m D 853.9 72.7  12496:54 [ET_NET 0]
>>>     61 root      20   0     0    0    0 R  5.6  0.0   5:10.11 ksoftirqd/14
>>>     77 root      20   0     0    0    0 R  5.0  0.0   5:15.08 ksoftirqd/18
>>>    125 root      20   0     0    0    0 R  5.0  0.0   5:02.48 ksoftirqd/30
>>>     13 root      20   0     0    0    0 R  4.6  0.0   5:23.63 ksoftirqd/2
>>>     37 root      20   0     0    0    0 R  4.6  0.0   5:10.31 ksoftirqd/8
>>>     45 root      20   0     0    0    0 R  4.6  0.0   5:10.36 ksoftirqd/10
>>>     93 root      20   0     0    0    0 R  4.6  0.0   5:29.05 ksoftirqd/22
>>>    109 root      20   0     0    0    0 R  4.6  0.0   5:06.27 ksoftirqd/26
>>>     21 root      20   0     0    0    0 R  4.3  0.0   5:12.82 ksoftirqd/4
>>>     29 root      20   0     0    0    0 R  4.3  0.0   5:10.79 ksoftirqd/6
>>>     85 root      20   0     0    0    0 R  4.3  0.0   5:18.36 ksoftirqd/20
>>>    101 root      20   0     0    0    0 R  4.3  0.0   5:13.54 ksoftirqd/24
>>>      4 root      20   0     0    0    0 R  4.0  0.0   5:16.25 ksoftirqd/0
>>>     69 root      20   0     0    0    0 R  4.0  0.0   5:09.72 ksoftirqd/16
>>>    117 root      20   0     0    0    0 R  3.0  0.0   5:03.74 ksoftirqd/28
>>>     53 root      20   0     0    0    0 S  1.7  0.0   5:07.40 ksoftirqd/12
>>>
>>> Here's the "dstat" output from that server while this is occurring,
>>> updated every second.
>>>
>>> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
>>> usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
>>>   0   0  98   1   0   0|7940k   10M|6270k   18M|   0     0 |  13k   35k
>>>   0   0  99   0   0   0|9738k 4160k|1600k   10M|   0     0 |  12k   34k
>>>   0   0  99   0   0   0|3232k 4136k|1390k 5724k|   0     0 |  11k   34k
>>>   0   0 100   0   0   0|   0  4096k|  34k  555k|   0     0 |  10k   33k
>>>   0   0  99   0   0   0|1316k 9095k|7028k 7240k|   0     0 |  11k   34k
>>>   0   1  99   0   0   0| 288k 8658k|6189k 5878k|   0     0 |  29k   36k
>>>   0   0  99   1   0   0|  13M 6204k|2870k   17M|   0     0 |  14k   35k
>>>   0   0  99   0   0   0|   0  4096k|3009k 3387k|   0     0 |  14k   35k
>>>   0   0  99   0   0   0|1028k 6384k|5138k 6047k|   0     0 |  14k   35k
>>>   0   0  99   1   0   0|5127k 8192k|4262k 7417k|   0     0 |  13k   34k
>>>   0   0  99   0   0   0|2372k 2612k|2011k 4944k|   0     0 |  14k   35k
>>>   0   0  99   0   0   0|   0  4172k|1633k 2201k|   0     0 |  13k   35k
>>>   0   0  99   0   0   0|  11M  608k|2528k   12M|   0     0 |  13k   35k
>>>   0   3  95   1   0   0|  15M 8409k|  11M   28M|   0     0 |  16k   37k
>>>   0   3  95   1   0   0|6324k 4244k|7959k   20M|   0     0 |  12k   32k
>>>   0   4  96   0   0   0| 256k 2128k| 853k 2823k|   0     0 |  27k   32k
>>>   0   3  96   0   0   0|1028k    0 | 472k  562k|   0     0 |  10k   31k
>>>   0   3  96   0   0   0|1028k   68k|1457k 1454k|   0     0 |  11k   32k
>>>   0   3  97   0   0   0|   0     0 | 316k  317k|   0     0 |  11k   31k
>>>   0   3  97   0   0   0|3076k    0 | 415k 3251k|   0     0 |9498    30k
>>>   0   3  97   0   0   0|2688k   84k|  16k 2985k|   0     0 |8623    30k
>>>   0   3  97   0   0   0|   0     0 |  64k   65k|   0     0 |9121    30k
>>>   0   3  97   0   0   0|   0    68k|1607k 1556k|   0     0 |  11k   31k
>>>   0   3  97   0   0   0|   0     0 |  18k  120k|   0     0 |7669    28k
>>>   0   3  97   0   0   0|   0     0 | 536k  537k|   0     0 |8660    30k
>>>   0   4  96   0   0   0|   0    72k|  65k   75k|   0     0 |  26k   30k
>>>   0   3  96   0   0   0|   0  2104k|3079k 3093k|   0     0 |9267    30k
>>>   0   3  97   0   0   0|   0   108k| 593k  506k|   0     0 |7591    28k
>>>   0   3  96   0   0   0|1152k 4096B| 124k  238k|   0     0 |7914    29k
>>>   0   3  97   0   0   0|   0     0 |5719B 1574B|   0     0 |8884    29k
>>>   0   3  96   0   0   0|   0   224k|  18k   51k|   0     0 |9092    30k
>>>   0   3  97   0   0   0|   0     0 |  87k  108k|   0     0 |8998    29k
>>>   0   3  96   0   0   0|5328k    0 |1594k 6790k|   0     0 |8499    29k
>>>   0   3  97   0   0   0|   0   156k|  44k  203k|   0     0 |8039    29k
>>>   0   3  97   0   0   0|   0     0 |  17k   82k|   0     0 |5377    25k
>>>   0   4  96   0   0   0|   0   252k|3224B 6858B|   0     0 |  28k   31k
>>>   0   3  96   0   0   0|   0     0 |2139B 8535B|   0     0 |  10k   29k
>>>   0   3  97   0   0   0|5508k    0 |  20k 5516k|   0     0 |7762    29k
>>>   0   3  97   0   0   0|   0    72k|  10k   28k|   0     0 |7026    29k
>>>   0   3  97   0   0   0|   0     0 |3782B  712B|   0     0 |5829    29k
>>>   0   4  96   0   0   0|   0   264k|  11k   10k|   0     0 |5170   183k
>>>   0   4  95   0   0   0|   0     0 |  15k 1415B|   0     0 |4125   245k
>>>   0   3  97   0   0   0|   0     0 |5416B  966B|   0     0 |3608    40k
>>>   0   3  97   0   0   0|   0  4096B|6233B 3868B|   0     0 |3636    41k
>>>   0   3  96   0   0   0|   0    64k|5761B 1594B|   0     0 |3579    41k
>>>   2  20  78   0   0   0|   0    80k|  11k 1544B|   0     0 |  16k 3590k
>>>   3  26  71   0   0   0|   0     0 |  13k 9997B|   0     0 |  12k 4969k
>>>   3  26  71   0   0   0|   0     0 |6685B 1042B|   0     0 |  12k 5017k
>>>   3  25  71   0   0   0|   0     0 |  11k 1660B|   0     0 |  12k 4957k
>>>   3  25  72   0   0   0|   0     0 |7765B 4463B|   0     0 |  12k 4862k
>>>   3  25  72   0   0   0|   0   128k|8377B 1478B|   0     0 |  12k 4875k
>>>   3  25  72   0   0   0|   0     0 |4407B  992B|   0     0 |  12k 4814k
>>>   3  25  71   0   0   0|   0     0 |  17k 1959B|   0     0 |  12k 4993k
>>>   3  26  71   0   0   0|   0     0 |7774B 1478B|   0     0 |  12k 4986k
>>>   3  25  72   0   0   0|   0     0 |5728B 1066B|   0     0 |  12k 4921k
>>>   3  25  71   0   0   0|   0   108k|  14k 1816B|   0     0 |  19k 4922k
>>>   3  25  72   0   0   0|   0     0 |6296B 9096B|   0     0 |  12k 4899k
>>>   3  25  72   0   0   0|   0     0 |8974B 1166B|   0     0 |  12k 4914k
>>>   3  26  71   0   0   0|   0     0 |  12k 1038B|   0     0 |  12k 5133k
>>>   3  25  72   0   0   0|   0     0 |9956B 1668B|   0     0 |  12k 4873k
>>>   3  25  72   0   0   0|   0   228k|8025B 2172B|   0     0 |  12k 4841k
>>>   3  25  72   0   0   0|   0     0 |7743B 1728B|   0     0 |  12k 4888k
>>>   3  26  71   0   0   0|   0     0 |8287B 6061B|   0     0 |  12k 4955k
>>>   3  26  70   0   0   0|   0     0 |8702B 1602B|   0     0 |  12k 5064k
>>> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
>>> usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
>>>   3  26  71   0   0   0|   0     0 |8898B 1198B|   0     0 |  12k 4983k
>>>   3  25  72   0   0   0|   0   152k|  10k   48k|   0     0 |  17k 4840k
>>>   3  25  72   0   0   0|   0     0 |4030B 8837B|   0     0 |  15k 4900k
>>>   3  25  72   0   0   0|   0     0 |  12k 2260B|   0     0 |  12k 4909k
>>>   3  26  71   0   0   0|   0     0 |  14k 2169B|   0     0 |  12k 5001k
>>>   3  26  70   0   0   0|   0     0 |  12k 1470B|   0     0 |  12k 5047k
>>>   3  26  71   0   0   0|   0   232k|  14k 1964B|   0     0 |  12k 4967k
>>>   3  25  72   0   0   0|   0     0 |  14k 2115B|   0     0 |  12k 4846k
>>>   3  25  72   0   0   0|   0     0 |  15k 2144B|   0     0 |  12k 4945k
>>>   3  25  72   0   0   0|   0     0 |  15k 1544B|   0     0 |  12k 4862k
>>>   3  26  71   0   0   0|   0     0 |6595B 1124B|   0     0 |  12k 4991k
>>>   3  27  70   0   0   0|   0   128k|  39k 7960B|   0     0 |  17k 5065k
>>>   3  26  71   0   0   0|   0     0 |  13k   11k|   0     0 |  12k 4942k
>>>   3  25  72   0   0   0|   0     0 |9035B 2072B|   0     0 |  12k 4881k
>>>   3  25  72   0   0   0|   0     0 |9057B 2788B|   0     0 |  12k 4825k
>>>   1   5  90   4   0   0|  41M   28M|  49M  104M|   0     0 |  30k  848k
>>>   0   1  92   7   0   0|  28M   51M|  46M  166M|   0     0 |  22k   39k
>>>   0   0  97   2   0   0|  19M   20M|  18M   48M|   0     0 |  15k   36k
>>>   0   0  97   2   0   0|  21M   14M|  13M   41M|   0     0 |  15k   36k
>>>   0   0  97   3   0   0|  13M   16M|  21M   33M|   0     0 |  14k   36k
>>>   0   0  96   3   0   0|  14M   20M|  21M   42M|   0     0 |  14k   36k
>>>   0   1  96   3   0   0|  15M   15M|  16M   32M|   0     0 |  24k   36k
>>>   0   0  99   1   0   0|  11M 4351k|3789k   18M|   0     0 |  12k   34k
>>>   0   0  99   0   0   0|7182k    0 |1881k 8056k|   0     0 |  11k   33k
>>>   0   0  99   0   0   0|3164k  100k|2091k 6605k|   0     0 |  11k   35k
>>>   0   0  99   0   0   0|3079k 2647k|6850k 6604k|   0     0 |  11k   33k
>>>   0   0  98   1   0   0|8640k 7195k|6742k   17M|   0     0 |  12k   34k
>>>   0   0  99   1   0   0|   0  4451k|6093k 7067k|   0     0 |  10k   33k
>>>   0   0  99   1   0   0|3908k 4787k|1294k 4556k|   0     0 |  11k   33k
>>>   0   0  99   0   0   0|7460k   44k|1505k 9884k|   0     0 |9809    32k
>>>
>>> Thanks,
>>>
>>> Steve
>>>
>>>
>>>
>>
>>
>
