On Tue, Feb 5, 2013 at 11:23 PM, Josh Krupka <jkru...@gmail.com> wrote:
> I've been looking into something on our system that sounds similar to what
> you're seeing. I'm still researching it, but I'm suspecting the memory
> compaction that runs as part of transparent huge pages when memory is
> allocated... yet to be proven. The tunable you mentioned controls the
> compaction process that runs at allocation time so it can try to allocate
> large pages; there's a separate one that controls whether the compaction is
> done in khugepaged, and a separate one that controls whether THP is used at
> all or not (/sys/kernel/mm/transparent_hugepage/enabled, or perhaps
> different in your distro).

BTW, I sent /defrag yesterday, but /enabled had the same output.

> What's the output of this command?
>
>   egrep 'trans|thp|compact_' /proc/vmstat
>
> compact_stall represents the number of processes that were stalled to do
> a compaction; the other metrics have to do with other parts of THP. If you
> see compact_stall climbing, from what I can tell those might be causing
> your spikes. I haven't found a way of telling how long the processes have
> been stalled. You could probably get a little more insight into the
> processes with some tracing, assuming you can catch it quickly enough.
> Running perf top will also show the compaction happening, but that doesn't
> necessarily mean it's impacting your running processes.

Interesting:

# egrep 'trans|thp|compact_' /proc/vmstat
nr_anon_transparent_hugepages 643
compact_blocks_moved 22629094
compact_pages_moved 532129382
compact_pagemigrate_failed 0
compact_stall 398051
compact_fail 80453
compact_success 317598
thp_fault_alloc 8254106
thp_fault_fallback 167286
thp_collapse_alloc 622783
thp_collapse_alloc_failed 3321
thp_split 122833
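
Side note on those numbers: compact_success + compact_fail (317598 + 80453)
adds up exactly to compact_stall (398051), so every recorded stall ended in
either a success or a fail. A crude way to watch whether compact_stall is
climbing while a latency spike is happening (the 5-second interval is
arbitrary, adjust to taste):

  # timestamped sample of the compaction counters every 5 seconds
  while sleep 5; do
      date
      egrep 'compact_stall|compact_fail|compact_success' /proc/vmstat
  done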
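
Also, for anyone else checking their boxes: all three tunables Josh mentions
can be dumped in one go with something like the below (assuming the stock
sysfs layout; some RHEL kernels put this under redhat_transparent_hugepage
instead):

  # print the current setting of each THP tunable:
  # allocation-time defrag, khugepaged defrag, and THP on/off
  for f in defrag khugepaged/defrag enabled; do
      echo "$f: $(cat /sys/kernel/mm/transparent_hugepage/$f)"
  done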