> On Aug 19, 2016, at 10:00 PM, Steve Malenfant <[email protected]> wrote:
>
> Bryan,
>
> I believe the problem here might be related to Linux memory management and
> a big ATS ram cache.
> Here's some testing I've done, and some results from simply stopping ATS
> after some traffic had filled up the RAM cache.

Yeah, Phil Sorber has posted some insight on this before, but tl;dr: turn off
anonymous huge pages (it's a boot directive on at least my kernels):

transparent_hugepage=never

— Leif
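For reference, the same toggle is also exposed at runtime through sysfs on
mainline kernels; a minimal sketch, assuming the stock sysfs layout (RHEL 6-era
kernels expose it under /sys/kernel/mm/redhat_transparent_hugepage/ instead):

    # Permanent: append to the kernel command line in the bootloader config:
    #   transparent_hugepage=never
    # Runtime: stop new THP allocations (pages already collapsed stay huge):
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    echo never > /sys/kernel/mm/transparent_hugepage/defrag

Note this governs transparent (anonymous) huge pages only; the explicitly
reserved HugePages_* pool exercised in the tests below is a separate mechanism.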
> [root@psp6cdatsec05 ~]# sysctl -a | grep shm
> kernel.shmmax = 68719476736
> kernel.shmall = 4294967296
> kernel.shmmni = 4096
>
> Traffic Server 5.3.2 + a few patches.
>
> psp6cdedge03: Dell R630, 256GB, 8x3.6T SSD
> - ram_cache = 128GB
> - Ram_drive = 64GB
> - hugepages = 65536 (128GB)
>
> psp6cdedge05: Dell R730xd, 196GB, 24x1.2TB HDD
> - ram_cache = 96GB
> - Ram_drive = 32GB
> - no hugepages
>
> psp6cdedge06: Dell R730xd, 196GB, 24x1.2TB HDD
> - ram_cache = 96GB
> - Ram_drive = 32GB
> - hugepages = 64000 (120GB)
>
> Notes:
> - Between the two R730xd boxes, there was no performance difference (with or
>   without huge pages).
> - After 20 minutes of runtime (about 3Gbps each of VOD traffic, 10000
>   titles), there was a huge difference in page table size:
>   - psp6cdedge05: PageTables: 205124 kB
>   - psp6cdedge06: PageTables: 7136 kB
> - After 30 minutes:
>   - All hugepages had been used on psp6cdedge03.
>   - There were still 11000 hugepages available on psp6cdedge06.
> - Stopped the streams after 30 minutes, waited another 30 minutes, and then
>   stopped trafficserver.
>
> ## psp6cdedge03, 128GB RAM cache, 128GB of huge pages: not enough, depleted.
>
> [root@psp6cdedge03 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 9768284 kB
> PageTables: 24352 kB
> AnonHugePages: 0 kB
> HugePages_Total: 65536
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
>
> [root@psp6cdedge03 ~]# service trafficserver stop
> Stopping traffic_cop:     [  OK  ]
> Stopping traffic_manager: [  OK  ]
> Stopping traffic_server:  [FAILED]
>
> Took about 10 seconds, but failed to stop.
>
> [root@psp6cdedge03 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 138636 kB
> PageTables: 4584 kB
> AnonHugePages: 0 kB
> HugePages_Total: 65536
> HugePages_Free: 65536
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
>
> ## psp6cdedge06, 96GB RAM cache, 120GB of huge pages: just fine!
>
> [root@psp6cdatsec06 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 1159908 kB
> PageTables: 7372 kB
> AnonHugePages: 0 kB
> HugePages_Total: 64000
> HugePages_Free: 11029
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
>
> [root@psp6cdatsec06 ~]# service trafficserver stop
> Stopping traffic_cop:     [  OK  ]
> Stopping traffic_manager: [  OK  ]
> Stopping traffic_server:  [  OK  ]
>
> [root@psp6cdatsec06 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 187368 kB
> PageTables: 4656 kB
> AnonHugePages: 0 kB
> HugePages_Total: 64000
> HugePages_Free: 64000
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
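Sizing the static pool is the key detail in these two results: psp6cdedge03's
65536-page pool exactly matched its 128GB ram_cache and was drained to zero,
while psp6cdedge06's 64000-page pool kept ~11000 pages free above its 96GB
cache. A minimal sketch of reserving a pool like edge06's and pointing ATS at
it; the records.config line mirrors the proxy.config.allocator.hugepages
record mentioned later in this thread, so treat the enable value as an
assumption for your build:

    # Reserve 64000 x 2MB huge pages; allocation can fall short on a
    # fragmented machine, so verify what was actually reserved.
    sysctl -w vm.nr_hugepages=64000
    grep -E "HugePages_(Total|Free)" /proc/meminfo

    # records.config: let the ATS allocator draw from the huge page pool.
    CONFIG proxy.config.allocator.hugepages INT 1

As edge03 shows, a pool merely equal to the ram_cache size is not enough once
allocator overhead is counted; leave headroom above it.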
> ## psp6cdedge05, 96GB RAM cache, no huge pages: disaster! 15 minutes for the process to stop.
>
> [root@psp6cdatsec05 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 104075516 kB
> PageTables: 208460 kB
> AnonHugePages: 0 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
>
> [root@psp6cdatsec05 ~]# service trafficserver stop
> Stopping traffic_cop:     [  OK  ]
> Stopping traffic_manager: [  OK  ]
> Stopping traffic_server:  [FAILED]
>
> [root@psp6cdatsec05 ~]# date
> Sat Aug 20 01:43:24 UTC 2016
>
> Running "perf top" shows the ET_AIO thread stuck in the following:
> 93.26% [kernel] [k] try_to_unmap_anon
>
> [root@psp6cdatsec05 ~]# date
> Sat Aug 20 01:58:27 UTC 2016
>
> [root@psp6cdatsec05 ~]# ps aux | grep ET_NET
> root 39293 0.0 0.0 103312 900 pts/0 S+ 01:58 0:00 grep ET_NET
>
> [root@psp6cdatsec05 ~]# cat /proc/meminfo | egrep "Huge|Page"
> AnonPages: 171152 kB
> PageTables: 4596 kB
> AnonHugePages: 0 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
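The 15-minute hang above appears to be the kernel working through ~104GB of
ordinary 4KB anonymous pages, which would explain try_to_unmap_anon dominating
perf top. A minimal sketch for watching that teardown from a second terminal,
using only standard procps tools:

    # AnonPages and PageTables should fall toward their idle values as the
    # exit completes; on psp6cdatsec05 that took roughly 15 minutes.
    watch -n 10 'egrep "AnonPages|PageTables" /proc/meminfo'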
> On Thu, Aug 18, 2016 at 7:24 AM, Steve Malenfant <[email protected]> wrote:
> Bryan,
>
> Somehow the problem came back after a few hours (not as bad yet)... This is
> running 5.3.2 with a few patches (available on Traffic Control GitHub).
>
> Per-node numastat info (in MBs):
>                          Node 0          Node 1           Total
>                 --------------- --------------- ---------------
> Numa_Hit              831049.44       684360.21      1515409.66
> Numa_Miss                  0.00            0.00            0.00
> Numa_Foreign               0.00            0.00            0.00
> Interleave_Hit           423.24          423.16          846.40
> Local_Node            831046.29       683892.71      1514939.01
> Other_Node                 3.15          467.50          470.65
>
> # sysctl -a | grep huge
> vm.nr_hugepages = 0
> vm.nr_hugepages_mempolicy = 0
> vm.hugetlb_shm_group = 0
> vm.hugepages_treat_as_movable = 0
> vm.nr_overcommit_hugepages = 0
>
> We have ATS set with CONFIG proxy.config.allocator.hugepages = INT 0
>
> On Wed, Aug 17, 2016 at 5:11 PM, Bryan Call <[email protected]> wrote:
> Can you run: numastat -v? Looking up try_to_unmap_anon, it looks like the
> kernel is trying to balance the memory usage across the NUMA nodes. There
> was a recent change to how memory allocation is done on the AIO threads in
> this ticket:
> https://issues.apache.org/jira/browse/TS-4719
>
> Can you also run sudo sysctl -a | grep huge and paste the output?
>
> -Bryan
>
>> On Aug 16, 2016, at 11:00 PM, Steve Malenfant <[email protected]> wrote:
>>
>> Bryan, here's the output during one of those events...
>>
>> This server's RAM is about 80% used. Hugepages are not enabled. 256GB of
>> RAM cache on a server with 384GB of RAM.
>>
>> Samples: 756K of event 'cycles', Event count (approx.): 26719576313
>> Overhead  Shared Object       Symbol
>>   70.35%  [kernel]            [k] try_to_unmap_anon
>>    2.42%  [kernel]            [k] find_busiest_group
>>    1.34%  [kernel]            [k] _spin_lock
>>    0.73%  [kernel]            [k] thread_return
>>    0.64%  libpthread-2.12.so  [.] pthread_cond_timedwait@@GLIBC_2.3.2
>>    0.52%  [kernel]            [k] schedule
>>    0.50%  [kernel]            [k] find_next_bit
>>    0.42%  [kernel]            [k] cpumask_next_and
>>    0.38%  [kernel]            [k] _spin_lock_irqsave
>>
>> [~]# cat /proc/meminfo
>> MemTotal: 396790288 kB
>> MemFree: 52626016 kB
>> Buffers: 151148 kB
>> Cached: 2035288 kB
>> SwapCached: 0 kB
>> Active: 1198652 kB
>> Inactive: 1171944 kB
>> Active(anon): 222464 kB
>> Inactive(anon): 152 kB
>> Active(file): 976188 kB
>> Inactive(file): 1171792 kB
>> Unevictable: 289593148 kB
>> Mlocked: 289595216 kB
>> SwapTotal: 4095996 kB
>> SwapFree: 4095996 kB
>> Dirty: 208 kB
>> Writeback: 0 kB
>> AnonPages: 289777432 kB
>> Mapped: 128752 kB
>> Shmem: 244 kB
>> Slab: 15548892 kB
>> SReclaimable: 15454924 kB
>> SUnreclaim: 93968 kB
>> KernelStack: 17648 kB
>> PageTables: 573316 kB
>> NFS_Unstable: 0 kB
>> Bounce: 0 kB
>> WritebackTmp: 0 kB
>> CommitLimit: 202491140 kB
>> Committed_AS: 291736828 kB
>> VmallocTotal: 34359738367 kB
>> VmallocUsed: 1383676 kB
>> VmallocChunk: 34157955876 kB
>> HardwareCorrupted: 0 kB
>> AnonHugePages: 0 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>> DirectMap4k: 58368 kB
>> DirectMap2M: 1968128 kB
>> DirectMap1G: 400556032 kB
>>
>> On Tue, Aug 16, 2016 at 8:54 PM, Bryan Call <[email protected]> wrote:
>> Can you run sudo perf top or sudo perf record during the time you are
>> having a spike in CPU usage?
>>
>> -Bryan
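Since each event only lasts about 30 seconds, a recorded profile can be easier
to catch than a live perf top session; a minimal sketch using standard perf
options (system-wide, with call graphs):

    # Record everything for ~30s spanning the spike, then summarize the
    # hottest symbols offline.
    sudo perf record -a -g -- sleep 30
    sudo perf report --stdio | head -40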
>>> On Aug 16, 2016, at 3:09 PM, Steve Malenfant <[email protected]> wrote:
>>>
>>> We have a condition where 4 caches, every 33-34 minutes, go to high CPU
>>> utilization (1000%) and stop accepting incoming connections. This lasts
>>> about 30 seconds and then goes away.
>>>
>>> It seems to occur with some specific origin/content, but I don't know
>>> exactly which one so far.
>>>
>>> I'm including some details (in case it's obvious), but how do we profile
>>> a running server to see what's causing this high CPU usage?
>>>
>>> The load is normal, maybe about 100Mbps (basically idling).
>>>
>>> top - 17:20:33 up 61 days, 3:32, 4 users, load average: 4.74, 1.20, 0.53
>>> Tasks: 616 total, 16 running, 600 sleeping, 0 stopped, 0 zombie
>>> Cpu(s): 2.6%us, 22.3%sy, 0.0%ni, 74.9%id, 0.0%wa, 0.0%hi, 0.2%si, 0.0%st
>>> Mem: 396790288k total, 342530608k used, 54259680k free, 148764k buffers
>>> Swap: 4095996k total, 0k used, 4095996k free, 1618892k cached
>>>
>>>  PID USER PR NI VIRT RES  SHR S %CPU  %MEM TIME+    COMMAND
>>> 8703 ats  20  0 282g 275g 39m D 853.9 72.7 12496:54 [ET_NET 0]
>>>   61 root 20  0    0    0   0 R   5.6  0.0  5:10.11 ksoftirqd/14
>>>   77 root 20  0    0    0   0 R   5.0  0.0  5:15.08 ksoftirqd/18
>>>  125 root 20  0    0    0   0 R   5.0  0.0  5:02.48 ksoftirqd/30
>>>   13 root 20  0    0    0   0 R   4.6  0.0  5:23.63 ksoftirqd/2
>>>   37 root 20  0    0    0   0 R   4.6  0.0  5:10.31 ksoftirqd/8
>>>   45 root 20  0    0    0   0 R   4.6  0.0  5:10.36 ksoftirqd/10
>>>   93 root 20  0    0    0   0 R   4.6  0.0  5:29.05 ksoftirqd/22
>>>  109 root 20  0    0    0   0 R   4.6  0.0  5:06.27 ksoftirqd/26
>>>   21 root 20  0    0    0   0 R   4.3  0.0  5:12.82 ksoftirqd/4
>>>   29 root 20  0    0    0   0 R   4.3  0.0  5:10.79 ksoftirqd/6
>>>   85 root 20  0    0    0   0 R   4.3  0.0  5:18.36 ksoftirqd/20
>>>  101 root 20  0    0    0   0 R   4.3  0.0  5:13.54 ksoftirqd/24
>>>    4 root 20  0    0    0   0 R   4.0  0.0  5:16.25 ksoftirqd/0
>>>   69 root 20  0    0    0   0 R   4.0  0.0  5:09.72 ksoftirqd/16
>>>  117 root 20  0    0    0   0 R   3.0  0.0  5:03.74 ksoftirqd/28
>>>   53 root 20  0    0    0   0 S   1.7  0.0  5:07.40 ksoftirqd/12
>>>
>>> Here's a "dstat" output of that server when this is occurring... updated
>>> every second.
>>>
>>> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
>>> usr sys idl wai hiq siq| read writ| recv send| in out | int csw
>>> 0 0 98 1 0 0|7940k 10M|6270k 18M| 0 0 | 13k 35k
>>> 0 0 99 0 0 0|9738k 4160k|1600k 10M| 0 0 | 12k 34k
>>> 0 0 99 0 0 0|3232k 4136k|1390k 5724k| 0 0 | 11k 34k
>>> 0 0 100 0 0 0| 0 4096k| 34k 555k| 0 0 | 10k 33k
>>> 0 0 99 0 0 0|1316k 9095k|7028k 7240k| 0 0 | 11k 34k
>>> 0 1 99 0 0 0| 288k 8658k|6189k 5878k| 0 0 | 29k 36k
>>> 0 0 99 1 0 0| 13M 6204k|2870k 17M| 0 0 | 14k 35k
>>> 0 0 99 0 0 0| 0 4096k|3009k 3387k| 0 0 | 14k 35k
>>> 0 0 99 0 0 0|1028k 6384k|5138k 6047k| 0 0 | 14k 35k
>>> 0 0 99 1 0 0|5127k 8192k|4262k 7417k| 0 0 | 13k 34k
>>> 0 0 99 0 0 0|2372k 2612k|2011k 4944k| 0 0 | 14k 35k
>>> 0 0 99 0 0 0| 0 4172k|1633k 2201k| 0 0 | 13k 35k
>>> 0 0 99 0 0 0| 11M 608k|2528k 12M| 0 0 | 13k 35k
>>> 0 3 95 1 0 0| 15M 8409k| 11M 28M| 0 0 | 16k 37k
>>> 0 3 95 1 0 0|6324k 4244k|7959k 20M| 0 0 | 12k 32k
>>> 0 4 96 0 0 0| 256k 2128k| 853k 2823k| 0 0 | 27k 32k
>>> 0 3 96 0 0 0|1028k 0 | 472k 562k| 0 0 | 10k 31k
>>> 0 3 96 0 0 0|1028k 68k|1457k 1454k| 0 0 | 11k 32k
>>> 0 3 97 0 0 0| 0 0 | 316k 317k| 0 0 | 11k 31k
>>> 0 3 97 0 0 0|3076k 0 | 415k 3251k| 0 0 |9498 30k
>>> 0 3 97 0 0 0|2688k 84k| 16k 2985k| 0 0 |8623 30k
>>> 0 3 97 0 0 0| 0 0 | 64k 65k| 0 0 |9121 30k
>>> 0 3 97 0 0 0| 0 68k|1607k 1556k| 0 0 | 11k 31k
>>> 0 3 97 0 0 0| 0 0 | 18k 120k| 0 0 |7669 28k
>>> 0 3 97 0 0 0| 0 0 | 536k 537k| 0 0 |8660 30k
>>> 0 4 96 0 0 0| 0 72k| 65k 75k| 0 0 | 26k 30k
>>> 0 3 96 0 0 0| 0 2104k|3079k 3093k| 0 0 |9267 30k
>>> 0 3 97 0 0 0| 0 108k| 593k 506k| 0 0 |7591 28k
>>> 0 3 96 0 0 0|1152k 4096B| 124k 238k| 0 0 |7914 29k
>>> 0 3 97 0 0 0| 0 0 |5719B 1574B| 0 0 |8884 29k
>>> 0 3 96 0 0 0| 0 224k| 18k 51k| 0 0 |9092 30k
>>> 0 3 97 0 0 0| 0 0 | 87k 108k| 0 0 |8998 29k
>>> 0 3 96 0 0 0|5328k 0 |1594k 6790k| 0 0 |8499 29k
>>> 0 3 97 0 0 0| 0 156k| 44k 203k| 0 0 |8039 29k
>>> 0 3 97 0 0 0| 0 0 | 17k 82k| 0 0 |5377 25k
>>> 0 4 96 0 0 0| 0 252k|3224B 6858B| 0 0 | 28k 31k
>>> 0 3 96 0 0 0| 0 0 |2139B 8535B| 0 0 | 10k 29k
>>> 0 3 97 0 0 0|5508k 0 | 20k 5516k| 0 0 |7762 29k
>>> 0 3 97 0 0 0| 0 72k| 10k 28k| 0 0 |7026 29k
>>> 0 3 97 0 0 0| 0 0 |3782B 712B| 0 0 |5829 29k
>>> 0 4 96 0 0 0| 0 264k| 11k 10k| 0 0 |5170 183k
>>> 0 4 95 0 0 0| 0 0 | 15k 1415B| 0 0 |4125 245k
>>> 0 3 97 0 0 0| 0 0 |5416B 966B| 0 0 |3608 40k
>>> 0 3 97 0 0 0| 0 4096B|6233B 3868B| 0 0 |3636 41k
>>> 0 3 96 0 0 0| 0 64k|5761B 1594B| 0 0 |3579 41k
>>> 2 20 78 0 0 0| 0 80k| 11k 1544B| 0 0 | 16k 3590k
>>> 3 26 71 0 0 0| 0 0 | 13k 9997B| 0 0 | 12k 4969k
>>> 3 26 71 0 0 0| 0 0 |6685B 1042B| 0 0 | 12k 5017k
>>> 3 25 71 0 0 0| 0 0 | 11k 1660B| 0 0 | 12k 4957k
>>> 3 25 72 0 0 0| 0 0 |7765B 4463B| 0 0 | 12k 4862k
>>> 3 25 72 0 0 0| 0 128k|8377B 1478B| 0 0 | 12k 4875k
>>> 3 25 72 0 0 0| 0 0 |4407B 992B| 0 0 | 12k 4814k
>>> 3 25 71 0 0 0| 0 0 | 17k 1959B| 0 0 | 12k 4993k
>>> 3 26 71 0 0 0| 0 0 |7774B 1478B| 0 0 | 12k 4986k
>>> 3 25 72 0 0 0| 0 0 |5728B 1066B| 0 0 | 12k 4921k
>>> 3 25 71 0 0 0| 0 108k| 14k 1816B| 0 0 | 19k 4922k
>>> 3 25 72 0 0 0| 0 0 |6296B 9096B| 0 0 | 12k 4899k
>>> 3 25 72 0 0 0| 0 0 |8974B 1166B| 0 0 | 12k 4914k
>>> 3 26 71 0 0 0| 0 0 | 12k 1038B| 0 0 | 12k 5133k
>>> 3 25 72 0 0 0| 0 0 |9956B 1668B| 0 0 | 12k 4873k
>>> 3 25 72 0 0 0| 0 228k|8025B 2172B| 0 0 | 12k 4841k
>>> 3 25 72 0 0 0| 0 0 |7743B 1728B| 0 0 | 12k 4888k
>>> 3 26 71 0 0 0| 0 0 |8287B 6061B| 0 0 | 12k 4955k
>>> 3 26 70 0 0 0| 0 0 |8702B 1602B| 0 0 | 12k 5064k
>>> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
>>> usr sys idl wai hiq siq| read writ| recv send| in out | int csw
>>> 3 26 71 0 0 0| 0 0 |8898B 1198B| 0 0 | 12k 4983k
>>> 3 25 72 0 0 0| 0 152k| 10k 48k| 0 0 | 17k 4840k
>>> 3 25 72 0 0 0| 0 0 |4030B 8837B| 0 0 | 15k 4900k
>>> 3 25 72 0 0 0| 0 0 | 12k 2260B| 0 0 | 12k 4909k
>>> 3 26 71 0 0 0| 0 0 | 14k 2169B| 0 0 | 12k 5001k
>>> 3 26 70 0 0 0| 0 0 | 12k 1470B| 0 0 | 12k 5047k
>>> 3 26 71 0 0 0| 0 232k| 14k 1964B| 0 0 | 12k 4967k
>>> 3 25 72 0 0 0| 0 0 | 14k 2115B| 0 0 | 12k 4846k
>>> 3 25 72 0 0 0| 0 0 | 15k 2144B| 0 0 | 12k 4945k
>>> 3 25 72 0 0 0| 0 0 | 15k 1544B| 0 0 | 12k 4862k
>>> 3 26 71 0 0 0| 0 0 |6595B 1124B| 0 0 | 12k 4991k
>>> 3 27 70 0 0 0| 0 128k| 39k 7960B| 0 0 | 17k 5065k
>>> 3 26 71 0 0 0| 0 0 | 13k 11k| 0 0 | 12k 4942k
>>> 3 25 72 0 0 0| 0 0 |9035B 2072B| 0 0 | 12k 4881k
>>> 3 25 72 0 0 0| 0 0 |9057B 2788B| 0 0 | 12k 4825k
>>> 1 5 90 4 0 0| 41M 28M| 49M 104M| 0 0 | 30k 848k
>>> 0 1 92 7 0 0| 28M 51M| 46M 166M| 0 0 | 22k 39k
>>> 0 0 97 2 0 0| 19M 20M| 18M 48M| 0 0 | 15k 36k
>>> 0 0 97 2 0 0| 21M 14M| 13M 41M| 0 0 | 15k 36k
>>> 0 0 97 3 0 0| 13M 16M| 21M 33M| 0 0 | 14k 36k
>>> 0 0 96 3 0 0| 14M 20M| 21M 42M| 0 0 | 14k 36k
>>> 0 1 96 3 0 0| 15M 15M| 16M 32M| 0 0 | 24k 36k
>>> 0 0 99 1 0 0| 11M 4351k|3789k 18M| 0 0 | 12k 34k
>>> 0 0 99 0 0 0|7182k 0 |1881k 8056k| 0 0 | 11k 33k
>>> 0 0 99 0 0 0|3164k 100k|2091k 6605k| 0 0 | 11k 35k
>>> 0 0 99 0 0 0|3079k 2647k|6850k 6604k| 0 0 | 11k 33k
>>> 0 0 98 1 0 0|8640k 7195k|6742k 17M| 0 0 | 12k 34k
>>> 0 0 99 1 0 0| 0 4451k|6093k 7067k| 0 0 | 10k 33k
>>> 0 0 99 1 0 0|3908k 4787k|1294k 4556k| 0 0 | 11k 33k
>>> 0 0 99 0 0 0|7460k 44k|1505k 9884k| 0 0 |9809 32k
>>>
>>> Thanks,
>>>
>>> Steve
