On Wed, 13 Apr 2016, kernel test robot wrote:

> FYI, we noticed a -5.5% regression of vm-scalability.throughput on
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> commit d7c7d56ca61aec18e5e0cb3a64e50073c42195f7 ("huge tmpfs: avoid premature exposure of new pagetable")

Very useful info, thank you.  I presume it confirms exactly what Kirill
warned me of: that doing the map_pages after the fault, instead of before
it, comes with a performance disadvantage.  I shall look into it, but not
immediately (and we know of other reasons why that patch has to be
revisited).

Hugh

> 
> 
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
>   gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/300s/lkp-hsw01/lru-file-mmap-read-rand/vm-scalability
> 
> commit: 
>   517348161d2725b8b596feb10c813bf596dc6a47
>   d7c7d56ca61aec18e5e0cb3a64e50073c42195f7
> 
> 517348161d2725b8 d7c7d56ca61aec18e5e0cb3a64 
> ---------------- -------------------------- 
>        fail:runs  %reproduction    fail:runs
>            |             |             |    
>    1801726 ±  0%      -5.5%    1702808 ±  0%  vm-scalability.throughput
>     317.89 ±  0%      +2.9%     327.15 ±  0%  vm-scalability.time.elapsed_time
>     317.89 ±  0%      +2.9%     327.15 ±  0%  vm-scalability.time.elapsed_time.max
>     872240 ±  4%      +8.5%     946467 ±  1%  vm-scalability.time.involuntary_context_switches
>   6.73e+08 ±  0%     -92.5%   50568722 ±  0%  vm-scalability.time.major_page_faults
>    2109093 ±  9%     -25.8%    1564815 ±  7%  vm-scalability.time.maximum_resident_set_size
>      37881 ±  0%    +586.9%     260194 ±  0%  vm-scalability.time.minor_page_faults
>       5087 ±  0%      +3.7%       5277 ±  0%  vm-scalability.time.percent_of_cpu_this_job_got
>      16047 ±  0%      +7.5%      17252 ±  0%  vm-scalability.time.system_time
>     127.19 ±  0%     -88.3%      14.93 ±  1%  vm-scalability.time.user_time
>      72572 ±  7%     +56.0%     113203 ±  3%  cpuidle.C1-HSW.usage
>  9.879e+08 ±  4%     -32.5%   6.67e+08 ±  8%  cpuidle.C6-HSW.time
>     605545 ±  3%     -12.9%     527295 ±  1%  softirqs.RCU
>     164170 ±  7%     +20.5%     197881 ±  6%  softirqs.SCHED
>    2584429 ±  3%     -25.5%    1925241 ±  2%  vmstat.memory.free
>     252507 ±  0%     +36.2%     343994 ±  0%  vmstat.system.in
>  2.852e+08 ±  5%    +163.9%  7.527e+08 ±  1%  numa-numastat.node0.local_node
>  2.852e+08 ±  5%    +163.9%  7.527e+08 ±  1%  numa-numastat.node0.numa_hit
>  2.876e+08 ±  6%    +162.8%  7.559e+08 ±  0%  numa-numastat.node1.local_node
>  2.876e+08 ±  6%    +162.8%  7.559e+08 ±  0%  numa-numastat.node1.numa_hit
>   6.73e+08 ±  0%     -92.5%   50568722 ±  0%  time.major_page_faults
>    2109093 ±  9%     -25.8%    1564815 ±  7%  time.maximum_resident_set_size
>      37881 ±  0%    +586.9%     260194 ±  0%  time.minor_page_faults
>     127.19 ±  0%     -88.3%      14.93 ±  1%  time.user_time
>      94.37 ±  0%      +2.0%      96.27 ±  0%  turbostat.%Busy
>       2919 ±  0%      +2.0%       2977 ±  0%  turbostat.Avg_MHz
>       5.12 ±  4%     -38.7%       3.14 ±  5%  turbostat.CPU%c6
>       2.00 ± 13%     -44.8%       1.10 ± 22%  turbostat.Pkg%pc2
>     240.00 ±  0%      +4.2%     250.14 ±  0%  turbostat.PkgWatt
>      55.36 ±  3%     +16.3%      64.40 ±  2%  turbostat.RAMWatt
>      17609 ±103%     -59.4%       7148 ± 72%  latency_stats.avg.pipe_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
>      63966 ±152%     -68.4%      20204 ± 64%  latency_stats.max.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
>     299681 ±123%     -89.7%      30889 ± 13%  latency_stats.max.pipe_wait.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%      35893 ± 10%  latency_stats.max.wait_on_page_bit_killable.__lock_page_or_retry.filemap_fault.__do_fault.do_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>      90871 ±125%     -56.2%      39835 ± 74%  latency_stats.sum.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
>      61821 ± 22%     -86.6%       8254 ± 62%  latency_stats.sum.sigsuspend.SyS_rt_sigsuspend.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%      59392 ±118%  latency_stats.sum.throttle_direct_reclaim.try_to_free_pages.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault.__do_fault.do_fault.handle_mm_fault.__do_page_fault
>       0.00 ± -1%      +Inf%    1549096 ± 24%  latency_stats.sum.wait_on_page_bit_killable.__lock_page_or_retry.filemap_fault.__do_fault.do_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>     639.30 ±  8%     -38.8%     391.40 ±  6%  slabinfo.RAW.active_objs
>     639.30 ±  8%     -38.8%     391.40 ±  6%  slabinfo.RAW.num_objs
>     555.90 ± 14%     -50.7%     274.10 ± 36%  slabinfo.nfs_commit_data.active_objs
>     555.90 ± 14%     -50.7%     274.10 ± 36%  slabinfo.nfs_commit_data.num_objs
>   10651978 ±  0%     -80.0%    2126718 ±  0%  slabinfo.radix_tree_node.active_objs
>     218915 ±  0%     -81.9%      39535 ±  0%  slabinfo.radix_tree_node.active_slabs
>   12259274 ±  0%     -81.9%    2213762 ±  0%  slabinfo.radix_tree_node.num_objs
>     218915 ±  0%     -81.9%      39535 ±  0%  slabinfo.radix_tree_node.num_slabs
>    8503640 ±  1%     -87.8%    1038681 ±  0%  meminfo.Active
>    8155208 ±  1%     -91.5%     692744 ±  0%  meminfo.Active(file)
>   47732497 ±  0%     +13.9%   54365008 ±  0%  meminfo.Cached
>   38794624 ±  0%     +36.4%   52899738 ±  0%  meminfo.Inactive
>   38748440 ±  0%     +36.4%   52853183 ±  0%  meminfo.Inactive(file)
>   45315491 ±  0%     -24.0%   34459599 ±  0%  meminfo.Mapped
>    2693407 ±  5%     -30.7%    1867438 ±  3%  meminfo.MemFree
>    7048370 ±  0%     -81.5%    1303216 ±  0%  meminfo.SReclaimable
>    7145508 ±  0%     -80.4%    1400313 ±  0%  meminfo.Slab
>    4168849 ±  2%     -88.1%     496040 ± 27%  numa-meminfo.node0.Active
>    3987391 ±  1%     -91.3%     346768 ±  0%  numa-meminfo.node0.Active(file)
>   23809283 ±  0%     +13.8%   27087077 ±  0%  numa-meminfo.node0.FilePages
>   19423374 ±  0%     +35.8%   26379857 ±  0%  numa-meminfo.node0.Inactive
>   19402281 ±  0%     +35.8%   26356354 ±  0%  numa-meminfo.node0.Inactive(file)
>   22594121 ±  0%     -24.1%   17153129 ±  0%  numa-meminfo.node0.Mapped
>    1430871 ±  5%     -31.2%     984861 ±  2%  numa-meminfo.node0.MemFree
>    3457483 ±  1%     -81.4%     642147 ±  0%  numa-meminfo.node0.SReclaimable
>    3507005 ±  1%     -80.3%     692577 ±  0%  numa-meminfo.node0.Slab
>    4349443 ±  3%     -87.5%     543711 ± 24%  numa-meminfo.node1.Active
>    4181422 ±  3%     -91.7%     346861 ±  1%  numa-meminfo.node1.Active(file)
>   23896184 ±  0%     +14.2%   27287954 ±  0%  numa-meminfo.node1.FilePages
>   19329324 ±  0%     +37.2%   26528591 ±  0%  numa-meminfo.node1.Inactive
>   19304364 ±  0%     +37.3%   26505692 ±  0%  numa-meminfo.node1.Inactive(file)
>   22671758 ±  0%     -23.7%   17303673 ±  0%  numa-meminfo.node1.Mapped
>    1299430 ±  7%     -32.8%     873435 ±  6%  numa-meminfo.node1.MemFree
>    3589265 ±  1%     -81.6%     661650 ±  0%  numa-meminfo.node1.SReclaimable
>    3636880 ±  1%     -80.5%     708315 ±  0%  numa-meminfo.node1.Slab
>     994864 ±  1%     -91.3%      86711 ±  0%  numa-vmstat.node0.nr_active_file
>    5952715 ±  0%     +13.8%    6773427 ±  0%  numa-vmstat.node0.nr_file_pages
>     356982 ±  5%     -31.5%     244513 ±  3%  numa-vmstat.node0.nr_free_pages
>    4853127 ±  0%     +35.8%    6590709 ±  0%  numa-vmstat.node0.nr_inactive_file
>     394.70 ± 15%     -62.9%     146.60 ± 32%  numa-vmstat.node0.nr_isolated_file
>    5649360 ±  0%     -24.1%    4288873 ±  0%  numa-vmstat.node0.nr_mapped
>      28030 ± 53%     -97.7%     648.30 ± 10%  numa-vmstat.node0.nr_pages_scanned
>     864516 ±  1%     -81.4%     160512 ±  0%  numa-vmstat.node0.nr_slab_reclaimable
>  1.522e+08 ±  4%    +155.9%  3.893e+08 ±  1%  numa-vmstat.node0.numa_hit
>  1.521e+08 ±  4%    +155.9%  3.893e+08 ±  1%  numa-vmstat.node0.numa_local
>     217926 ±  3%     -84.4%      33949 ±  2%  numa-vmstat.node0.workingset_activate
>   60138428 ±  2%     -72.5%   16533446 ±  0%  numa-vmstat.node0.workingset_nodereclaim
>    4367580 ±  3%    +158.4%   11285489 ±  1%  numa-vmstat.node0.workingset_refault
>    1043245 ±  3%     -91.7%      86749 ±  1%  numa-vmstat.node1.nr_active_file
>    5974941 ±  0%     +14.2%    6823255 ±  0%  numa-vmstat.node1.nr_file_pages
>     323798 ±  7%     -33.0%     216945 ±  5%  numa-vmstat.node1.nr_free_pages
>    4829122 ±  1%     +37.2%    6627644 ±  0%  numa-vmstat.node1.nr_inactive_file
>     395.80 ±  8%     -68.5%     124.80 ± 46%  numa-vmstat.node1.nr_isolated_file
>    5669082 ±  0%     -23.7%    4326551 ±  0%  numa-vmstat.node1.nr_mapped
>      32004 ± 60%     -99.9%      47.00 ±  9%  numa-vmstat.node1.nr_pages_scanned
>     897351 ±  1%     -81.6%     165406 ±  0%  numa-vmstat.node1.nr_slab_reclaimable
>  1.535e+08 ±  4%    +154.6%  3.909e+08 ±  0%  numa-vmstat.node1.numa_hit
>  1.535e+08 ±  4%    +154.7%  3.909e+08 ±  0%  numa-vmstat.node1.numa_local
>     235134 ±  5%     -85.7%      33507 ±  2%  numa-vmstat.node1.workingset_activate
>   59647268 ±  1%     -72.1%   16626347 ±  0%  numa-vmstat.node1.workingset_nodereclaim
>    4535102 ±  4%    +151.1%   11389137 ±  0%  numa-vmstat.node1.workingset_refault
>     347641 ± 13%     +97.0%     684832 ±  0%  proc-vmstat.allocstall
>       7738 ±  9%    +236.5%      26042 ±  0%  proc-vmstat.kswapd_low_wmark_hit_quickly
>    2041367 ±  1%     -91.5%     173206 ±  0%  proc-vmstat.nr_active_file
>    1233230 ±  0%     +11.7%    1378011 ±  0%  proc-vmstat.nr_dirty_background_threshold
>    2466460 ±  0%     +11.7%    2756024 ±  0%  proc-vmstat.nr_dirty_threshold
>   11933740 ±  0%     +13.9%   13594909 ±  0%  proc-vmstat.nr_file_pages
>     671934 ±  5%     -31.1%     463093 ±  3%  proc-vmstat.nr_free_pages
>    9685062 ±  0%     +36.5%   13216819 ±  0%  proc-vmstat.nr_inactive_file
>     792.80 ± 10%     -67.9%     254.20 ± 34%  proc-vmstat.nr_isolated_file
>   11327952 ±  0%     -23.9%    8616859 ±  0%  proc-vmstat.nr_mapped
>      73994 ± 51%     -99.1%     657.00 ±  7%  proc-vmstat.nr_pages_scanned
>    1762423 ±  0%     -81.5%     325807 ±  0%  proc-vmstat.nr_slab_reclaimable
>      72.30 ± 23%    +852.4%     688.60 ± 58%  proc-vmstat.nr_vmscan_immediate_reclaim
>       5392 ±  2%     -11.9%       4750 ±  2%  proc-vmstat.numa_hint_faults
>  5.728e+08 ±  5%    +163.4%  1.509e+09 ±  0%  proc-vmstat.numa_hit
>  5.728e+08 ±  5%    +163.4%  1.509e+09 ±  0%  proc-vmstat.numa_local
>       5638 ±  4%     -12.5%       4935 ±  3%  proc-vmstat.numa_pte_updates
>       8684 ±  8%    +215.8%      27427 ±  0%  proc-vmstat.pageoutrun
>    3220941 ±  0%     -90.2%     315751 ±  0%  proc-vmstat.pgactivate
>   17739240 ±  1%    +143.6%   43217427 ±  0%  proc-vmstat.pgalloc_dma32
>    6.6e+08 ±  0%    +138.1%  1.572e+09 ±  0%  proc-vmstat.pgalloc_normal
>  6.737e+08 ±  0%     -92.4%   51517407 ±  0%  proc-vmstat.pgfault
>  6.767e+08 ±  0%    +138.5%  1.614e+09 ±  0%  proc-vmstat.pgfree
>   6.73e+08 ±  0%     -92.5%   50568722 ±  0%  proc-vmstat.pgmajfault
>   31567471 ±  1%     +91.6%   60472288 ±  0%  proc-vmstat.pgscan_direct_dma32
>  1.192e+09 ±  2%     +84.5%  2.199e+09 ±  0%  proc-vmstat.pgscan_direct_normal
>   16309661 ±  0%    +150.4%   40841573 ±  0%  proc-vmstat.pgsteal_direct_dma32
>  6.151e+08 ±  0%    +140.8%  1.481e+09 ±  0%  proc-vmstat.pgsteal_direct_normal
>     939746 ± 18%    +101.3%    1891322 ±  6%  proc-vmstat.pgsteal_kswapd_dma32
>   27432476 ±  4%    +162.4%   71970660 ±  2%  proc-vmstat.pgsteal_kswapd_normal
>  4.802e+08 ±  5%     -81.5%   88655347 ±  0%  proc-vmstat.slabs_scanned
>     452671 ±  2%     -85.1%      67360 ±  1%  proc-vmstat.workingset_activate
>  1.198e+08 ±  1%     -72.4%   33135682 ±  0%  proc-vmstat.workingset_nodereclaim
>    8898128 ±  1%    +154.6%   22657102 ±  0%  proc-vmstat.workingset_refault
>     613962 ± 12%     -18.6%     499880 ±  9%  sched_debug.cfs_rq:/.min_vruntime.stddev
>      31.47 ± 38%    +203.5%      95.52 ± 29%  sched_debug.cfs_rq:/.nr_spread_over.max
>       6.19 ± 32%    +150.9%      15.53 ± 24%  sched_debug.cfs_rq:/.nr_spread_over.stddev
>      41.71 ± 51%     -42.3%      24.07 ± 12%  sched_debug.cfs_rq:/.runnable_load_avg.avg
>       1094 ±106%     -60.9%     427.95 ± 25%  sched_debug.cfs_rq:/.runnable_load_avg.max
>     163.22 ± 92%     -63.2%      60.09 ± 28%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
>     613932 ± 12%     -18.6%     499833 ±  9%  sched_debug.cfs_rq:/.spread0.stddev
>      35.20 ±  8%     -29.1%      24.97 ± 11%  sched_debug.cpu.cpu_load[0].avg
>     731.80 ± 11%     -36.1%     467.45 ± 21%  sched_debug.cpu.cpu_load[0].max
>     116.23 ± 10%     -43.5%      65.72 ± 23%  sched_debug.cpu.cpu_load[0].stddev
>      35.25 ±  8%     -25.6%      26.23 ± 10%  sched_debug.cpu.cpu_load[1].avg
>     722.47 ± 10%     -30.2%     504.05 ± 18%  sched_debug.cpu.cpu_load[1].max
>     115.25 ± 10%     -38.5%      70.82 ± 19%  sched_debug.cpu.cpu_load[1].stddev
>      35.37 ±  8%     -22.4%      27.45 ±  8%  sched_debug.cpu.cpu_load[2].avg
>     721.90 ±  9%     -27.7%     521.60 ± 16%  sched_debug.cpu.cpu_load[2].max
>      10.85 ± 14%     +16.9%      12.68 ±  6%  sched_debug.cpu.cpu_load[2].min
>     114.93 ±  9%     -35.1%      74.62 ± 16%  sched_debug.cpu.cpu_load[2].stddev
>      35.20 ±  8%     -21.3%      27.70 ±  5%  sched_debug.cpu.cpu_load[3].avg
>     705.73 ±  9%     -29.6%     496.57 ± 13%  sched_debug.cpu.cpu_load[3].max
>      10.95 ± 13%     +18.7%      13.00 ±  4%  sched_debug.cpu.cpu_load[3].min
>     112.58 ±  9%     -34.8%      73.35 ± 12%  sched_debug.cpu.cpu_load[3].stddev
>      34.96 ±  8%     -21.7%      27.39 ±  5%  sched_debug.cpu.cpu_load[4].avg
>     684.63 ± 10%     -32.0%     465.83 ± 11%  sched_debug.cpu.cpu_load[4].max
>      11.10 ± 12%     +17.7%      13.07 ±  3%  sched_debug.cpu.cpu_load[4].min
>     110.03 ±  9%     -36.1%      70.28 ± 10%  sched_debug.cpu.cpu_load[4].stddev
>     293.58 ± 28%    +110.8%     618.85 ± 32%  sched_debug.cpu.curr->pid.min
>      18739 ±  3%     +10.5%      20713 ±  1%  sched_debug.cpu.nr_switches.avg
>      33332 ± 10%     +21.0%      40337 ±  6%  sched_debug.cpu.nr_switches.max
>       4343 ± 10%     +34.8%       5852 ±  8%  sched_debug.cpu.nr_switches.stddev
>      19363 ±  3%      +9.2%      21136 ±  1%  sched_debug.cpu.sched_count.avg
>      20.35 ± 17%     -31.5%      13.93 ± 22%  sched_debug.cpu.sched_goidle.min
>       9245 ±  3%     +12.5%      10398 ±  0%  sched_debug.cpu.ttwu_count.avg
>      16837 ± 10%     +27.0%      21390 ±  8%  sched_debug.cpu.ttwu_count.max
>       2254 ±  8%     +39.5%       3143 ±  8%  sched_debug.cpu.ttwu_count.stddev
>       8052 ±  4%     +16.2%       9353 ±  0%  sched_debug.cpu.ttwu_local.avg
>       5846 ±  4%     +11.0%       6491 ±  2%  sched_debug.cpu.ttwu_local.min
>       1847 ± 11%     +39.8%       2582 ±  8%  sched_debug.cpu.ttwu_local.stddev
>       3.66 ±  4%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.filemap_fault.xfs_filemap_fault.__do_fault
>       0.00 ± -1%      +Inf%       1.12 ±  0%  perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead
>       0.00 ± -1%      +Inf%      77.72 ±  0%  perf-profile.cycles-pp.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.__do_page_cache_readahead.filemap_fault
>      79.28 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.filemap_fault.xfs_filemap_fault
>      11.43 ±  5%     -89.4%       1.21 ±  4%  perf-profile.cycles-pp.__delete_from_page_cache.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
>       0.00 ± -1%      +Inf%      96.93 ±  0%  perf-profile.cycles-pp.__do_fault.do_fault.handle_mm_fault.__do_page_fault.do_page_fault
>      91.04 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.__do_fault.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault
>       0.00 ± -1%      +Inf%      96.66 ±  0%  perf-profile.cycles-pp.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault.__do_fault.do_fault
>      29.86 ±  3%     -96.9%       0.92 ± 19%  perf-profile.cycles-pp.__list_lru_walk_one.isra.3.list_lru_walk_one.scan_shadow_nodes.shrink_slab.shrink_zone
>       1.59 ± 14%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.filemap_fault.xfs_filemap_fault
>       0.00 ± -1%      +Inf%       5.67 ±  5%  perf-profile.cycles-pp.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.mpage_readpages.xfs_vm_readpages
>       0.00 ± -1%      +Inf%      78.11 ±  0%  perf-profile.cycles-pp.__page_cache_alloc.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault.__do_fault
>      79.40 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.__page_cache_alloc.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault
>       1.28 ±  4%     -38.7%       0.78 ±  1%  perf-profile.cycles-pp.__radix_tree_lookup.__delete_from_page_cache.__remove_mapping.shrink_page_list.shrink_inactive_list
>      25.30 ±  6%     -84.2%       3.99 ±  5%  perf-profile.cycles-pp.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone
>       0.56 ±  0%     +98.2%       1.11 ±  0%  perf-profile.cycles-pp.__rmqueue.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc
>       0.00 ± -1%      +Inf%       1.11 ±  0%  perf-profile.cycles-pp.__xfs_get_blocks.xfs_get_blocks.do_mpage_readpage.mpage_readpages.xfs_vm_readpages
>       0.01 ±133%  +30254.3%       2.66 ±  8%  perf-profile.cycles-pp._raw_spin_lock.free_pcppages_bulk.free_hot_cold_page.free_hot_cold_page_list.shrink_page_list
>       5.07 ± 25%    +268.7%      18.71 ±  3%  perf-profile.cycles-pp._raw_spin_lock.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc
>       9.16 ±  6%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp._raw_spin_lock.list_lru_add.__delete_from_page_cache.__remove_mapping.shrink_page_list
>       0.69 ± 64%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp._raw_spin_lock.list_lru_del.__add_to_page_cache_locked.add_to_page_cache_lru.filemap_fault
>      27.69 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp._raw_spin_lock.shadow_lru_isolate.__list_lru_walk_one.list_lru_walk_one.scan_shadow_nodes
>      10.77 ± 10%    +238.5%      36.45 ±  1%  perf-profile.cycles-pp._raw_spin_lock_irq.shrink_inactive_list.shrink_zone_memcg.shrink_zone.do_try_to_free_pages
>       0.35 ±  9%    +193.4%       1.02 ± 13%  perf-profile.cycles-pp._raw_spin_lock_irq.shrink_inactive_list.shrink_zone_memcg.shrink_zone.kswapd
>      12.86 ±  9%     -89.4%       1.36 ± 17%  perf-profile.cycles-pp._raw_spin_lock_irqsave.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
>       1.11 ± 18%    +333.5%       4.83 ±  6%  perf-profile.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.lru_cache_add.add_to_page_cache_lru
>       5.38 ±  5%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.add_to_page_cache_lru.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault
>       0.00 ± -1%      +Inf%       7.15 ±  4%  perf-profile.cycles-pp.add_to_page_cache_lru.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead.filemap_fault
>       0.00 ± -1%      +Inf%      78.06 ±  0%  perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault
>      79.38 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.filemap_fault.xfs_filemap_fault.__do_fault
>       0.00 ± -1%      +Inf%      97.32 ±  0%  perf-profile.cycles-pp.do_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>       5.19 ±  2%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.do_mpage_readpage.mpage_readpage.xfs_vm_readpage.filemap_fault.xfs_filemap_fault
>       0.00 ± -1%      +Inf%      10.68 ±  1%  perf-profile.cycles-pp.do_mpage_readpage.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead.filemap_fault
>       0.72 ± 67%     -98.3%       0.01 ± 87%  perf-profile.cycles-pp.do_syscall_64.return_from_SYSCALL_64.__libc_fork
>      72.75 ±  1%     -23.2%      55.88 ±  1%  perf-profile.cycles-pp.do_try_to_free_pages.try_to_free_pages.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc
>       0.00 ± -1%      +Inf%      96.86 ±  0%  perf-profile.cycles-pp.filemap_fault.xfs_filemap_fault.__do_fault.do_fault.handle_mm_fault
>      90.80 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault.handle_mm_fault
>       2.39 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.filemap_map_pages.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault
>       0.97 ± 12%    +321.3%       4.07 ±  6%  perf-profile.cycles-pp.free_hot_cold_page.free_hot_cold_page_list.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
>       1.03 ±  9%    +303.4%       4.17 ±  5%  perf-profile.cycles-pp.free_hot_cold_page_list.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone
>       0.65 ± 23%    +451.9%       3.58 ±  6%  perf-profile.cycles-pp.free_pcppages_bulk.free_hot_cold_page.free_hot_cold_page_list.shrink_page_list.shrink_inactive_list
>       0.00 ± -1%      +Inf%      21.18 ±  2%  perf-profile.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.__do_page_cache_readahead
>       6.22 ± 21%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.filemap_fault
>      94.07 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>       0.57 ±  1%    +104.6%       1.16 ±  2%  perf-profile.cycles-pp.isolate_lru_pages.isra.47.shrink_inactive_list.shrink_zone_memcg.shrink_zone.do_try_to_free_pages
>       2.96 ±  7%     -30.5%       2.05 ±  9%  perf-profile.cycles-pp.kthread.ret_from_fork
>       9.58 ±  6%    -100.0%       0.00 ±229%  perf-profile.cycles-pp.list_lru_add.__delete_from_page_cache.__remove_mapping.shrink_page_list.shrink_inactive_list
>       1.88 ±  6%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.list_lru_del.__add_to_page_cache_locked.add_to_page_cache_lru.filemap_fault.xfs_filemap_fault
>      29.08 ±  3%     -97.0%       0.89 ± 19%  perf-profile.cycles-pp.list_lru_walk_one.scan_shadow_nodes.shrink_slab.shrink_zone.do_try_to_free_pages
>       1.59 ± 14%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.filemap_fault.xfs_filemap_fault.__do_fault
>       0.00 ± -1%      +Inf%       5.68 ±  5%  perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead
>       5.24 ±  2%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.mpage_readpage.xfs_vm_readpage.filemap_fault.xfs_filemap_fault.__do_fault
>       0.00 ± -1%      +Inf%      18.20 ±  1%  perf-profile.cycles-pp.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault
>       2.37 ± 14%     +79.9%       4.27 ± 13%  perf-profile.cycles-pp.native_flush_tlb_others.try_to_unmap_flush.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
>       0.01 ±133%  +30322.9%       2.66 ±  8%  perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.free_pcppages_bulk.free_hot_cold_page.free_hot_cold_page_list
>       5.07 ± 25%    +268.8%      18.71 ±  3%  perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current
>       9.16 ±  6%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.list_lru_add.__delete_from_page_cache.__remove_mapping
>       0.75 ± 57%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.list_lru_del.__add_to_page_cache_locked.add_to_page_cache_lru
>      27.68 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.shadow_lru_isolate.__list_lru_walk_one.list_lru_walk_one
>      11.09 ± 10%    +237.5%      37.44 ±  0%  perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.shrink_inactive_list.shrink_zone_memcg.shrink_zone
>      12.76 ±  9%     -90.9%       1.17 ± 22%  perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__remove_mapping.shrink_page_list.shrink_inactive_list
>       1.08 ± 19%    +338.2%       4.75 ±  7%  perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.lru_cache_add
>       1.81 ±  2%     -73.7%       0.48 ±  1%  perf-profile.cycles-pp.page_check_address_transhuge.page_referenced_one.rmap_walk_file.rmap_walk.page_referenced
>       3.24 ±  1%     -42.5%       1.87 ±  2%  perf-profile.cycles-pp.page_referenced.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone
>       2.20 ±  2%     -66.0%       0.75 ±  5%  perf-profile.cycles-pp.page_referenced_one.rmap_walk_file.rmap_walk.page_referenced.shrink_page_list
>       1.54 ± 14%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.filemap_fault
>       0.00 ± -1%      +Inf%       5.57 ±  5%  perf-profile.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.mpage_readpages
>       2.07 ±  4%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.radix_tree_next_chunk.filemap_map_pages.handle_pte_fault.handle_mm_fault.__do_page_fault
>       3.01 ±  6%     -31.7%       2.05 ±  9%  perf-profile.cycles-pp.ret_from_fork
>       0.72 ± 67%     -98.5%       0.01 ± 94%  perf-profile.cycles-pp.return_from_SYSCALL_64.__libc_fork
>       3.15 ±  1%     -46.0%       1.70 ±  1%  perf-profile.cycles-pp.rmap_walk.page_referenced.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
>       3.02 ±  2%     -48.4%       1.56 ±  1%  perf-profile.cycles-pp.rmap_walk_file.rmap_walk.page_referenced.shrink_page_list.shrink_inactive_list
>      29.08 ±  3%     -97.0%       0.89 ± 19%  perf-profile.cycles-pp.scan_shadow_nodes.shrink_slab.shrink_zone.do_try_to_free_pages.try_to_free_pages
>      28.89 ±  3%     -97.1%       0.84 ± 22%  perf-profile.cycles-pp.shadow_lru_isolate.__list_lru_walk_one.list_lru_walk_one.scan_shadow_nodes.shrink_slab
>      44.93 ±  4%     +21.9%      54.77 ±  1%  perf-profile.cycles-pp.shrink_inactive_list.shrink_zone_memcg.shrink_zone.do_try_to_free_pages.try_to_free_pages
>      33.07 ±  4%     -50.8%      16.28 ±  3%  perf-profile.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone.do_try_to_free_pages
>       1.11 ± 16%     -22.6%       0.86 ±  6%  perf-profile.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone.kswapd
>      29.15 ±  3%     -96.8%       0.94 ± 16%  perf-profile.cycles-pp.shrink_slab.shrink_zone.do_try_to_free_pages.try_to_free_pages.__alloc_pages_nodemask
>      73.07 ±  1%     -23.5%      55.91 ±  1%  perf-profile.cycles-pp.shrink_zone.do_try_to_free_pages.try_to_free_pages.__alloc_pages_nodemask.alloc_pages_current
>      45.01 ±  4%     +22.1%      54.95 ±  1%  perf-profile.cycles-pp.shrink_zone_memcg.shrink_zone.do_try_to_free_pages.try_to_free_pages.__alloc_pages_nodemask
>       2.35 ± 14%     +78.9%       4.21 ± 13%  perf-profile.cycles-pp.smp_call_function_many.native_flush_tlb_others.try_to_unmap_flush.shrink_page_list.shrink_inactive_list
>       0.00 ± -1%      +Inf%      55.91 ±  1%  perf-profile.cycles-pp.try_to_free_pages.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.__do_page_cache_readahead
>      72.76 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.try_to_free_pages.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.filemap_fault
>       2.38 ± 14%     +79.5%       4.28 ± 13%  perf-profile.cycles-pp.try_to_unmap_flush.shrink_page_list.shrink_inactive_list.shrink_zone_memcg.shrink_zone
>       0.58 ±  1%     +51.9%       0.88 ± 14%  perf-profile.cycles-pp.workingset_eviction.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_zone_memcg
>       0.00 ± -1%      +Inf%      96.89 ±  0%  perf-profile.cycles-pp.xfs_filemap_fault.__do_fault.do_fault.handle_mm_fault.__do_page_fault
>      91.02 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.xfs_filemap_fault.__do_fault.handle_pte_fault.handle_mm_fault.__do_page_fault
>       0.00 ± -1%      +Inf%       1.11 ±  0%  perf-profile.cycles-pp.xfs_get_blocks.do_mpage_readpage.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead
>       5.26 ±  2%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.xfs_vm_readpage.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault
>       0.00 ± -1%      +Inf%      18.21 ±  1%  perf-profile.cycles-pp.xfs_vm_readpages.__do_page_cache_readahead.filemap_fault.xfs_filemap_fault.__do_fault
> 
> 
> 
> lkp-hsw01: Grantley Haswell-EP
> Memory: 64G
> 
> 
> 
> 
>                            vm-scalability.time.user_time
> 
>   140 ++--------------------------------------------------------------------+
>       |******* ****** *************************************** **** ****** ***
>   120 *+      *    * *               *       *               *    *      *  |
>       |                                                                     |
>   100 ++                                                                    |
>       |                                                                     |
>    80 ++                                                                    |
>       |                                                                     |
>    60 ++                                                                    |
>       |                                                                     |
>    40 ++                                                                    |
>       |                                                                     |
>    20 OOOO      O   O OOO O  OO OO  OO                                      |
>       | O OOOOOO OOO O   OOOO OO  OO                                        |
>     0 ++--------------------------------------------------------------------+
> 
> 
>                         vm-scalability.time.major_page_faults
> 
>   7e+08 ++*-***-----*-----**--*-----*---****-*--*-***------------*---***--*-+
>         ** * * ***** ****** ** *********  * **** *  *****************  ******
>   6e+08 ++                                                                  |
>         |                                                                   |
>   5e+08 ++                                                                  |
>         |                                                                   |
>   4e+08 ++                                                                  |
>         |                                                                   |
>   3e+08 ++                                                                  |
>         |                                                                   |
>   2e+08 ++                                                                  |
>         |                                                                   |
>   1e+08 ++                                                                  |
>         OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO                                     |
>       0 ++------------------------------------------------------------------+
> 
> 
>       [*] bisect-good sample
>       [O] bisect-bad  sample
> 
> To reproduce:
> 
>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run     job.yaml
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> Thanks,
> Xiaolong Ye
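
As a sanity check on the quoted table (not part of the original report), the %change column appears to be the relative difference between the two commits' means. The sketch below assumes that reading; the helper name pct_change is made up for illustration, and the sample values are copied from the headline rows above.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent (assumed formula)."""
    return (new - base) / base * 100.0


# (mean at 517348161d27, mean at d7c7d56ca61a), copied from the quoted report
samples = {
    "vm-scalability.throughput": (1801726, 1702808),
    "vm-scalability.time.major_page_faults": (6.73e8, 50568722),
    "vm-scalability.time.minor_page_faults": (37881, 260194),
    "vm-scalability.time.user_time": (127.19, 14.93),
}

for name, (base, new) in samples.items():
    print(f"{pct_change(base, new):+8.1f}%  {name}")
```

Under that assumption the four rows come out matching the reported -5.5%, -92.5%, +586.9% and -88.3% to one decimal place.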
