> On Aug 19, 2016, at 10:00 PM, Steve Malenfant <[email protected]> wrote:
>
> Bryan,
>
> I believe the problem here might be related to Linux memory management and
> a big ATS ram cache.
> Here's some testing I've done, and some results from simply stopping ATS
> after some traffic had filled up the RAM cache.

Yeah, Phil Sorber has posted some insight on this before, but tl;dr: turn off
anonymous huge pages (it's a boot directive on at least my kernels):

transparent_hugepage=never

— Leif
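For reference, the same toggle is also exposed at runtime through sysfs on
mainline kernels; a minimal sketch, assuming the stock sysfs layout (RHEL 6-era
kernels expose it under /sys/kernel/mm/redhat_transparent_hugepage/ instead):

    # Permanent: append to the kernel command line in the bootloader config:
    #   transparent_hugepage=never
    # Runtime: stop new THP allocations (pages already collapsed stay huge):
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    echo never > /sys/kernel/mm/transparent_hugepage/defrag

Note this governs transparent (anonymous) huge pages only; the explicitly
reserved HugePages_* pool exercised in the tests below is a separate mechanism.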
> [root@psp6cdatsec05 ~]# sysctl -a | grep shm
> kernel.shmmax = 68719476736
> kernel.shmall = 4294967296
> kernel.shmmni = 4096
>
> Traffic Server 5.3.2 + a few patches.
>
> psp6cdedge03: Dell R630, 256GB, 8x3.6T SSD
> - ram_cache = 128GB
> - Ram_drive = 64GB
> - hugepages = 65536 (128GB)
>
> psp6cdedge05: Dell R730xd, 196GB, 24x1.2TB HDD
> - ram_cache = 96GB
> - Ram_drive = 32GB
> - no hugepages
>
> psp6cdedge06: Dell R730xd, 196GB, 24x1.2TB HDD
> - ram_cache = 96GB
> - Ram_drive = 32GB
> - hugepages = 64000 (120GB)
>
> Notes:
> - Between the two R730xd boxes, there was no performance difference (with or
>   without huge pages).
> - After 20 minutes of runtime (about 3Gbps each of VOD traffic, 10000
>   titles), there was a huge difference in page table size:
>   - psp6cdedge05: PageTables: 205124 kB
>   - psp6cdedge06: PageTables: 7136 kB
> - After 30 minutes:
>   - All hugepages had been used on psp6cdedge03.
>   - There were still 11000 hugepages available on psp6cdedge06.
> - Stopped the streams after 30 minutes, waited another 30 minutes, and then
>   stopped trafficserver.
>
> ## psp6cdedge03, 128GB RAM cache, 128GB of huge pages: not enough, depleted.
>
> [root@psp6cdedge03 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 9768284 kB
> PageTables: 24352 kB
> AnonHugePages: 0 kB
> HugePages_Total: 65536
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
>
> [root@psp6cdedge03 ~]# service trafficserver stop
> Stopping traffic_cop:     [  OK  ]
> Stopping traffic_manager: [  OK  ]
> Stopping traffic_server:  [FAILED]
>
> Took about 10 seconds, but failed to stop.
>
> [root@psp6cdedge03 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 138636 kB
> PageTables: 4584 kB
> AnonHugePages: 0 kB
> HugePages_Total: 65536
> HugePages_Free: 65536
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
>
> ## psp6cdedge06, 96GB RAM cache, 120GB of huge pages: just fine!
>
> [root@psp6cdatsec06 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 1159908 kB
> PageTables: 7372 kB
> AnonHugePages: 0 kB
> HugePages_Total: 64000
> HugePages_Free: 11029
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
>
> [root@psp6cdatsec06 ~]# service trafficserver stop
> Stopping traffic_cop:     [  OK  ]
> Stopping traffic_manager: [  OK  ]
> Stopping traffic_server:  [  OK  ]
>
> [root@psp6cdatsec06 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 187368 kB
> PageTables: 4656 kB
> AnonHugePages: 0 kB
> HugePages_Total: 64000
> HugePages_Free: 64000
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
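Sizing the static pool is the key detail in these two results: psp6cdedge03's
65536-page pool exactly matched its 128GB ram_cache and was drained to zero,
while psp6cdedge06's 64000-page pool kept ~11000 pages free above its 96GB
cache. A minimal sketch of reserving a pool like edge06's and pointing ATS at
it; the records.config line mirrors the proxy.config.allocator.hugepages
record mentioned later in this thread, so treat the enable value as an
assumption for your build:

    # Reserve 64000 x 2MB huge pages; allocation can fall short on a
    # fragmented machine, so verify what was actually reserved.
    sysctl -w vm.nr_hugepages=64000
    grep -E "HugePages_(Total|Free)" /proc/meminfo

    # records.config: let the ATS allocator draw from the huge page pool.
    CONFIG proxy.config.allocator.hugepages INT 1

As edge03 shows, a pool merely equal to the ram_cache size is not enough once
allocator overhead is counted; leave headroom above it.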
> ## psp6cdedge05, 96GB RAM cache, no huge pages: disaster! 15 minutes for the process to stop.
>
> [root@psp6cdatsec05 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 104075516 kB
> PageTables: 208460 kB
> AnonHugePages: 0 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
>
> [root@psp6cdatsec05 ~]# service trafficserver stop
> Stopping traffic_cop:     [  OK  ]
> Stopping traffic_manager: [  OK  ]
> Stopping traffic_server:  [FAILED]
>
> [root@psp6cdatsec05 ~]# date
> Sat Aug 20 01:43:24 UTC 2016
>
> Running "perf top" shows the ET_AIO thread stuck in the following:
> 93.26% [kernel] [k] try_to_unmap_anon
>
> [root@psp6cdatsec05 ~]# date
> Sat Aug 20 01:58:27 UTC 2016
>
> [root@psp6cdatsec05 ~]# ps aux | grep ET_NET
> root 39293 0.0 0.0 103312 900 pts/0 S+ 01:58 0:00 grep ET_NET
>
> [root@psp6cdatsec05 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 171152 kB
> PageTables: 4596 kB
> AnonHugePages: 0 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
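The 15-minute hang above appears to be the kernel working through ~104GB of
ordinary 4KB anonymous pages, which would explain try_to_unmap_anon dominating
perf top. A minimal sketch for watching that teardown from a second terminal,
using only standard procps tools:

    # AnonPages and PageTables should fall toward their idle values as the
    # exit completes; on psp6cdatsec05 that took roughly 15 minutes.
    watch -n 10 'egrep "AnonPages|PageTables" /proc/meminfo'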
> On Thu, Aug 18, 2016 at 7:24 AM, Steve Malenfant <[email protected]> wrote:
> Bryan,
>
> Somehow the problem came back after a few hours (not as bad yet)... This is
> running 5.3.2 with a few patches (available on Traffic Control GitHub).
>
> Per-node numastat info (in MBs):
>                          Node 0          Node 1           Total
>                 --------------- --------------- ---------------
> Numa_Hit              831049.44       684360.21      1515409.66
> Numa_Miss                  0.00            0.00            0.00
> Numa_Foreign               0.00            0.00            0.00
> Interleave_Hit           423.24          423.16          846.40
> Local_Node            831046.29       683892.71      1514939.01
> Other_Node                 3.15          467.50          470.65
>
> # sysctl -a | grep huge
> vm.nr_hugepages = 0
> vm.nr_hugepages_mempolicy = 0
> vm.hugetlb_shm_group = 0
> vm.hugepages_treat_as_movable = 0
> vm.nr_overcommit_hugepages = 0
>
> We have ATS set with CONFIG proxy.config.allocator.hugepages = INT 0
>
> On Wed, Aug 17, 2016 at 5:11 PM, Bryan Call <[email protected]> wrote:
> Can you run: numastat -v? Looking up try_to_unmap_anon, it looks like the
> kernel is trying to balance the memory usage across the NUMA nodes. There
> was a recent change to how memory allocation is done on the AIO threads in
> this ticket:
> https://issues.apache.org/jira/browse/TS-4719
>
> Can you also run sudo sysctl -a | grep huge and paste the output?
>
> -Bryan
>
>> On Aug 16, 2016, at 11:00 PM, Steve Malenfant <[email protected]> wrote:
>>
>> Bryan, here's the output during one of those events...
>>
>> This server's RAM is about 80% used. Hugepages are not enabled. 256GB of
>> RAM cache on a server with 384GB of RAM.
>>
>> Samples: 756K of event 'cycles', Event count (approx.): 26719576313
>> Overhead  Shared Object       Symbol
>>   70.35%  [kernel]            [k] try_to_unmap_anon
>>    2.42%  [kernel]            [k] find_busiest_group
>>    1.34%  [kernel]            [k] _spin_lock
>>    0.73%  [kernel]            [k] thread_return
>>    0.64%  libpthread-2.12.so  [.] pthread_cond_timedwait@@GLIBC_2.3.2
>>    0.52%  [kernel]            [k] schedule
>>    0.50%  [kernel]            [k] find_next_bit
>>    0.42%  [kernel]            [k] cpumask_next_and
>>    0.38%  [kernel]            [k] _spin_lock_irqsave
>>
>> [~]# cat /proc/meminfo
>> MemTotal: 396790288 kB
>> MemFree: 52626016 kB
>> Buffers: 151148 kB
>> Cached: 2035288 kB
>> SwapCached: 0 kB
>> Active: 1198652 kB
>> Inactive: 1171944 kB
>> Active(anon): 222464 kB
>> Inactive(anon): 152 kB
>> Active(file): 976188 kB
>> Inactive(file): 1171792 kB
>> Unevictable: 289593148 kB
>> Mlocked: 289595216 kB
>> SwapTotal: 4095996 kB
>> SwapFree: 4095996 kB
>> Dirty: 208 kB
>> Writeback: 0 kB
>> AnonPages: 289777432 kB
>> Mapped: 128752 kB
>> Shmem: 244 kB
>> Slab: 15548892 kB
>> SReclaimable: 15454924 kB
>> SUnreclaim: 93968 kB
>> KernelStack: 17648 kB
>> PageTables: 573316 kB
>> NFS_Unstable: 0 kB
>> Bounce: 0 kB
>> WritebackTmp: 0 kB
>> CommitLimit: 202491140 kB
>> Committed_AS: 291736828 kB
>> VmallocTotal: 34359738367 kB
>> VmallocUsed: 1383676 kB
>> VmallocChunk: 34157955876 kB
>> HardwareCorrupted: 0 kB
>> AnonHugePages: 0 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>> DirectMap4k: 58368 kB
>> DirectMap2M: 1968128 kB
>> DirectMap1G: 400556032 kB
>>
>> On Tue, Aug 16, 2016 at 8:54 PM, Bryan Call <[email protected]> wrote:
>> Can you run sudo perf top or sudo perf record during the time you are
>> having a spike in CPU usage?
>>
>> -Bryan
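Since each event only lasts about 30 seconds, a recorded profile can be easier
to catch than a live perf top session; a minimal sketch using standard perf
options (system-wide, with call graphs):

    # Record everything for ~30s spanning the spike, then summarize the
    # hottest symbols offline.
    sudo perf record -a -g -- sleep 30
    sudo perf report --stdio | head -40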
>>> On Aug 16, 2016, at 3:09 PM, Steve Malenfant <[email protected]> wrote:
>>>
>>> We have a condition where 4 caches, every 33-34 minutes, go to high CPU
>>> utilization (1000%) and stop accepting incoming connections. This lasts
>>> about 30 seconds and then goes away.
>>>
>>> It seems to occur with some specific origin/content, but I don't know
>>> exactly which one so far.
>>>
>>> I'm including some details (in case it's obvious), but how do we profile
>>> a running server to see what's causing this high CPU usage?
>>>
>>> The load is normal, maybe about 100Mbps (basically idling).
>>>
>>> top - 17:20:33 up 61 days, 3:32, 4 users, load average: 4.74, 1.20, 0.53
>>> Tasks: 616 total, 16 running, 600 sleeping, 0 stopped, 0 zombie
>>> Cpu(s): 2.6%us, 22.3%sy, 0.0%ni, 74.9%id, 0.0%wa, 0.0%hi, 0.2%si, 0.0%st
>>> Mem: 396790288k total, 342530608k used, 54259680k free, 148764k buffers
>>> Swap: 4095996k total, 0k used, 4095996k free, 1618892k cached
>>>
>>>  PID USER PR NI VIRT RES  SHR S %CPU  %MEM TIME+    COMMAND
>>> 8703 ats  20  0 282g 275g 39m D 853.9 72.7 12496:54 [ET_NET 0]
>>>   61 root 20  0    0    0   0 R   5.6  0.0  5:10.11 ksoftirqd/14
>>>   77 root 20  0    0    0   0 R   5.0  0.0  5:15.08 ksoftirqd/18
>>>  125 root 20  0    0    0   0 R   5.0  0.0  5:02.48 ksoftirqd/30
>>>   13 root 20  0    0    0   0 R   4.6  0.0  5:23.63 ksoftirqd/2
>>>   37 root 20  0    0    0   0 R   4.6  0.0  5:10.31 ksoftirqd/8
>>>   45 root 20  0    0    0   0 R   4.6  0.0  5:10.36 ksoftirqd/10
>>>   93 root 20  0    0    0   0 R   4.6  0.0  5:29.05 ksoftirqd/22
>>>  109 root 20  0    0    0   0 R   4.6  0.0  5:06.27 ksoftirqd/26
>>>   21 root 20  0    0    0   0 R   4.3  0.0  5:12.82 ksoftirqd/4
>>>   29 root 20  0    0    0   0 R   4.3  0.0  5:10.79 ksoftirqd/6
>>>   85 root 20  0    0    0   0 R   4.3  0.0  5:18.36 ksoftirqd/20
>>>  101 root 20  0    0    0   0 R   4.3  0.0  5:13.54 ksoftirqd/24
>>>    4 root 20  0    0    0   0 R   4.0  0.0  5:16.25 ksoftirqd/0
>>>   69 root 20  0    0    0   0 R   4.0  0.0  5:09.72 ksoftirqd/16
>>>  117 root 20  0    0    0   0 R   3.0  0.0  5:03.74 ksoftirqd/28
>>>   53 root 20  0    0    0   0 S   1.7  0.0  5:07.40 ksoftirqd/12
>>>
>>> Here's a "dstat" output of that server when this is occurring... updated
>>> every second.
>>>
>>> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
>>> usr sys idl wai hiq siq| read writ| recv send| in out | int csw
>>> 0 0 98 1 0 0|7940k 10M|6270k 18M| 0 0 | 13k 35k
>>> 0 0 99 0 0 0|9738k 4160k|1600k 10M| 0 0 | 12k 34k
>>> 0 0 99 0 0 0|3232k 4136k|1390k 5724k| 0 0 | 11k 34k
>>> 0 0 100 0 0 0| 0 4096k| 34k 555k| 0 0 | 10k 33k
>>> 0 0 99 0 0 0|1316k 9095k|7028k 7240k| 0 0 | 11k 34k
>>> 0 1 99 0 0 0| 288k 8658k|6189k 5878k| 0 0 | 29k 36k
>>> 0 0 99 1 0 0| 13M 6204k|2870k 17M| 0 0 | 14k 35k
>>> 0 0 99 0 0 0| 0 4096k|3009k 3387k| 0 0 | 14k 35k
>>> 0 0 99 0 0 0|1028k 6384k|5138k 6047k| 0 0 | 14k 35k
>>> 0 0 99 1 0 0|5127k 8192k|4262k 7417k| 0 0 | 13k 34k
>>> 0 0 99 0 0 0|2372k 2612k|2011k 4944k| 0 0 | 14k 35k
>>> 0 0 99 0 0 0| 0 4172k|1633k 2201k| 0 0 | 13k 35k
>>> 0 0 99 0 0 0| 11M 608k|2528k 12M| 0 0 | 13k 35k
>>> 0 3 95 1 0 0| 15M 8409k| 11M 28M| 0 0 | 16k 37k
>>> 0 3 95 1 0 0|6324k 4244k|7959k 20M| 0 0 | 12k 32k
>>> 0 4 96 0 0 0| 256k 2128k| 853k 2823k| 0 0 | 27k 32k
>>> 0 3 96 0 0 0|1028k 0 | 472k 562k| 0 0 | 10k 31k
>>> 0 3 96 0 0 0|1028k 68k|1457k 1454k| 0 0 | 11k 32k
>>> 0 3 97 0 0 0| 0 0 | 316k 317k| 0 0 | 11k 31k
>>> 0 3 97 0 0 0|3076k 0 | 415k 3251k| 0 0 |9498 30k
>>> 0 3 97 0 0 0|2688k 84k| 16k 2985k| 0 0 |8623 30k
>>> 0 3 97 0 0 0| 0 0 | 64k 65k| 0 0 |9121 30k
>>> 0 3 97 0 0 0| 0 68k|1607k 1556k| 0 0 | 11k 31k
>>> 0 3 97 0 0 0| 0 0 | 18k 120k| 0 0 |7669 28k
>>> 0 3 97 0 0 0| 0 0 | 536k 537k| 0 0 |8660 30k
>>> 0 4 96 0 0 0| 0 72k| 65k 75k| 0 0 | 26k 30k
>>> 0 3 96 0 0 0| 0 2104k|3079k 3093k| 0 0 |9267 30k
>>> 0 3 97 0 0 0| 0 108k| 593k 506k| 0 0 |7591 28k
>>> 0 3 96 0 0 0|1152k 4096B| 124k 238k| 0 0 |7914 29k
>>> 0 3 97 0 0 0| 0 0 |5719B 1574B| 0 0 |8884 29k
>>> 0 3 96 0 0 0| 0 224k| 18k 51k| 0 0 |9092 30k
>>> 0 3 97 0 0 0| 0 0 | 87k 108k| 0 0 |8998 29k
>>> 0 3 96 0 0 0|5328k 0 |1594k 6790k| 0 0 |8499 29k
>>> 0 3 97 0 0 0| 0 156k| 44k 203k| 0 0 |8039 29k
>>> 0 3 97 0 0 0| 0 0 | 17k 82k| 0 0 |5377 25k
>>> 0 4 96 0 0 0| 0 252k|3224B 6858B| 0 0 | 28k 31k
>>> 0 3 96 0 0 0| 0 0 |2139B 8535B| 0 0 | 10k 29k
>>> 0 3 97 0 0 0|5508k 0 | 20k 5516k| 0 0 |7762 29k
>>> 0 3 97 0 0 0| 0 72k| 10k 28k| 0 0 |7026 29k
>>> 0 3 97 0 0 0| 0 0 |3782B 712B| 0 0 |5829 29k
>>> 0 4 96 0 0 0| 0 264k| 11k 10k| 0 0 |5170 183k
>>> 0 4 95 0 0 0| 0 0 | 15k 1415B| 0 0 |4125 245k
>>> 0 3 97 0 0 0| 0 0 |5416B 966B| 0 0 |3608 40k
>>> 0 3 97 0 0 0| 0 4096B|6233B 3868B| 0 0 |3636 41k
>>> 0 3 96 0 0 0| 0 64k|5761B 1594B| 0 0 |3579 41k
>>> 2 20 78 0 0 0| 0 80k| 11k 1544B| 0 0 | 16k 3590k
>>> 3 26 71 0 0 0| 0 0 | 13k 9997B| 0 0 | 12k 4969k
>>> 3 26 71 0 0 0| 0 0 |6685B 1042B| 0 0 | 12k 5017k
>>> 3 25 71 0 0 0| 0 0 | 11k 1660B| 0 0 | 12k 4957k
>>> 3 25 72 0 0 0| 0 0 |7765B 4463B| 0 0 | 12k 4862k
>>> 3 25 72 0 0 0| 0 128k|8377B 1478B| 0 0 | 12k 4875k
>>> 3 25 72 0 0 0| 0 0 |4407B 992B| 0 0 | 12k 4814k
>>> 3 25 71 0 0 0| 0 0 | 17k 1959B| 0 0 | 12k 4993k
>>> 3 26 71 0 0 0| 0 0 |7774B 1478B| 0 0 | 12k 4986k
>>> 3 25 72 0 0 0| 0 0 |5728B 1066B| 0 0 | 12k 4921k
>>> 3 25 71 0 0 0| 0 108k| 14k 1816B| 0 0 | 19k 4922k
>>> 3 25 72 0 0 0| 0 0 |6296B 9096B| 0 0 | 12k 4899k
>>> 3 25 72 0 0 0| 0 0 |8974B 1166B| 0 0 | 12k 4914k
>>> 3 26 71 0 0 0| 0 0 | 12k 1038B| 0 0 | 12k 5133k
>>> 3 25 72 0 0 0| 0 0 |9956B 1668B| 0 0 | 12k 4873k
>>> 3 25 72 0 0 0| 0 228k|8025B 2172B| 0 0 | 12k 4841k
>>> 3 25 72 0 0 0| 0 0 |7743B 1728B| 0 0 | 12k 4888k
>>> 3 26 71 0 0 0| 0 0 |8287B 6061B| 0 0 | 12k 4955k
>>> 3 26 70 0 0 0| 0 0 |8702B 1602B| 0 0 | 12k 5064k
>>> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
>>> usr sys idl wai hiq siq| read writ| recv send| in out | int csw
>>> 3 26 71 0 0 0| 0 0 |8898B 1198B| 0 0 | 12k 4983k
>>> 3 25 72 0 0 0| 0 152k| 10k 48k| 0 0 | 17k 4840k
>>> 3 25 72 0 0 0| 0 0 |4030B 8837B| 0 0 | 15k 4900k
>>> 3 25 72 0 0 0| 0 0 | 12k 2260B| 0 0 | 12k 4909k
>>> 3 26 71 0 0 0| 0 0 | 14k 2169B| 0 0 | 12k 5001k
>>> 3 26 70 0 0 0| 0 0 | 12k 1470B| 0 0 | 12k 5047k
>>> 3 26 71 0 0 0| 0 232k| 14k 1964B| 0 0 | 12k 4967k
>>> 3 25 72 0 0 0| 0 0 | 14k 2115B| 0 0 | 12k 4846k
>>> 3 25 72 0 0 0| 0 0 | 15k 2144B| 0 0 | 12k 4945k
>>> 3 25 72 0 0 0| 0 0 | 15k 1544B| 0 0 | 12k 4862k
>>> 3 26 71 0 0 0| 0 0 |6595B 1124B| 0 0 | 12k 4991k
>>> 3 27 70 0 0 0| 0 128k| 39k 7960B| 0 0 | 17k 5065k
>>> 3 26 71 0 0 0| 0 0 | 13k 11k| 0 0 | 12k 4942k
>>> 3 25 72 0 0 0| 0 0 |9035B 2072B| 0 0 | 12k 4881k
>>> 3 25 72 0 0 0| 0 0 |9057B 2788B| 0 0 | 12k 4825k
>>> 1 5 90 4 0 0| 41M 28M| 49M 104M| 0 0 | 30k 848k
>>> 0 1 92 7 0 0| 28M 51M| 46M 166M| 0 0 | 22k 39k
>>> 0 0 97 2 0 0| 19M 20M| 18M 48M| 0 0 | 15k 36k
>>> 0 0 97 2 0 0| 21M 14M| 13M 41M| 0 0 | 15k 36k
>>> 0 0 97 3 0 0| 13M 16M| 21M 33M| 0 0 | 14k 36k
>>> 0 0 96 3 0 0| 14M 20M| 21M 42M| 0 0 | 14k 36k
>>> 0 1 96 3 0 0| 15M 15M| 16M 32M| 0 0 | 24k 36k
>>> 0 0 99 1 0 0| 11M 4351k|3789k 18M| 0 0 | 12k 34k
>>> 0 0 99 0 0 0|7182k 0 |1881k 8056k| 0 0 | 11k 33k
>>> 0 0 99 0 0 0|3164k 100k|2091k 6605k| 0 0 | 11k 35k
>>> 0 0 99 0 0 0|3079k 2647k|6850k 6604k| 0 0 | 11k 33k
>>> 0 0 98 1 0 0|8640k 7195k|6742k 17M| 0 0 | 12k 34k
>>> 0 0 99 1 0 0| 0 4451k|6093k 7067k| 0 0 | 10k 33k
>>> 0 0 99 1 0 0|3908k 4787k|1294k 4556k| 0 0 | 11k 33k
>>> 0 0 99 0 0 0|7460k 44k|1505k 9884k| 0 0 |9809 32k
>>>
>>> Thanks,
>>>
>>> Steve
