Re: [lkp] [x86/build] b2c51106c75: -18.1% will-it-scale.per_process_ops
On Wed, 2015-08-05 at 10:38 +0200, Ingo Molnar wrote:
> * kernel test robot wrote:
> >
> > FYI, we noticed the below changes on
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/asm
> >   commit b2c51106c7581866c37ffc77c5d739f3d4b7cbc9 ("x86/build: Fix detection
> >   of GCC -mpreferred-stack-boundary support")
>
> Does the performance regression go away reproducibly if you do:
>
>   git revert b2c51106c7581866c37ffc77c5d739f3d4b7cbc9
>
> ?

Sorry for replying so late! Reverting the commit restores part of the performance, as below.

parent commit: f2a50f8b7da45ff2de93a71393e715a2ab9f3b68
the commit:    b2c51106c7581866c37ffc77c5d739f3d4b7cbc9
revert commit: 987d12601a4a82cc2f2151b1be704723eb84cb9d

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/cpufreq_governor/test:
  wsm/will-it-scale/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/performance/readseek2

commit:
  f2a50f8b7da45ff2de93a71393e715a2ab9f3b68
  b2c51106c7581866c37ffc77c5d739f3d4b7cbc9
  987d12601a4a82cc2f2151b1be704723eb84cb9d

f2a50f8b7da45ff2 b2c51106c7581866c37ffc77c5 987d12601a4a82cc2f2151b1be
---------------- -------------------------- --------------------------
   %stddev      %change    %stddev        %change    %stddev
       \           |           \              |          \
    879002 ±  0%     -18.1%     720270 ±  7%      -3.6%     847011 ±  2%  will-it-scale.per_process_ops
      0.02 ±  0%     +34.5%       0.02 ±  7%      +5.6%       0.02 ±  2%  will-it-scale.scalability
     11144 ±  0%      +0.1%      11156 ±  0%     +10.6%      12320 ±  0%  will-it-scale.time.minor_page_faults
    769.30 ±  0%      -0.9%     762.15 ±  0%      +1.1%     777.42 ±  0%  will-it-scale.time.system_time
  26153173 ±  0%      +7.0%   27977076 ±  0%      +3.5%   27078124 ±  0%  will-it-scale.time.voluntary_context_switches
      2964 ±  2%      +1.4%       3004 ±  1%     -51.9%       1426 ±  2%  proc-vmstat.pgactivate
      0.06 ± 27%    +154.5%       0.14 ± 44%    +122.7%       0.12 ± 24%  turbostat.CPU%c3
    370683 ±  0%      +6.2%     393491 ±  0%      +2.4%     379575 ±  0%  vmstat.system.cs
     11144 ±  0%      +0.1%      11156 ±  0%     +10.6%      12320 ±  0%  time.minor_page_faults
     15.70 ±  2%     +14.5%      17.98 ±  0%      +1.5%      15.94 ±  1%  time.user_time
    830343 ± 56%     -54.0%     382128 ± 39%     -22.3%     645308 ± 65%  cpuidle.C1E-NHM.time
    788.25 ± 14%     -21.7%     617.25 ± 16%     -12.3%     691.00 ±  3%  cpuidle.C1E-NHM.usage
   2489132 ± 20%     +79.3%    4464147 ± 33%     +78.4%    4440574 ± 21%  cpuidle.C3-NHM.time
   1082762 ±162%    -100.0%       0.00 ± -1%    +189.3%    3132030 ±110%  latency_stats.avg.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
    102189 ±  2%      -2.1%     100087 ±  5%     -32.9%      68568 ±  2%  latency_stats.hits.pipe_wait.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
   1082762 ±162%    -100.0%       0.00 ± -1%    +289.6%    4217977 ±109%  latency_stats.max.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
   1082762 ±162%    -100.0%       0.00 ± -1%    +478.5%    6264061 ±110%  latency_stats.sum.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
      5.10 ±  2%      -8.0%       4.69 ±  1%     +13.0%       5.76 ±  1%  perf-profile.cpu-cycles.__kernel_text_address.print_context_stack.dump_trace.save_stack_trace_tsk.__account_scheduler_latency
      2.58 ±  8%     +19.5%       3.09 ±  3%      -1.8%       2.54 ± 11%  perf-profile.cpu-cycles._raw_spin_lock_irqsave.finish_wait.__wait_on_bit_lock.__lock_page.find_lock_entry
      7.02 ±  3%      +9.2%       7.67 ±  2%      +7.1%       7.52 ±  3%  perf-profile.cpu-cycles._raw_spin_lock_irqsave.prepare_to_wait_exclusive.__wait_on_bit_lock.__lock_page.find_lock_entry
      3.07 ±  2%     +14.8%       3.53 ±  3%      -1.4%       3.03 ±  5%  perf-profile.cpu-cycles.finish_wait.__wait_on_bit_lock.__lock_page.find_lock_entry.shmem_getpage_gfp
      3.05 ±  5%      -8.4%       2.79 ±  4%      -5.2%       2.90 ±  5%  perf-profile.cpu-cycles.hrtimer_start_range_ns.tick_nohz_stop_sched_tick.__tick_nohz_idle_enter.tick_nohz_idle_enter.cpu_startup_entry
      0.89 ±  5%      -7.6%       0.82 ±  3%     +16.3%       1.03 ±  5%  perf-profile.cpu-cycles.is_ftrace_trampoline.__kernel_text_address.print_context_stack.dump_trace.save_stack_trace_tsk
      0.98 ±  3%     -25.1%       0.74 ±  7%     -16.8%       0.82 ±  2%
Re: [lkp] [x86/build] b2c51106c75: -18.1% will-it-scale.per_process_ops
On Wed, Aug 5, 2015 at 1:38 AM, Ingo Molnar wrote:
>
> * kernel test robot wrote:
>
>> FYI, we noticed the below changes on
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/asm
>> commit b2c51106c7581866c37ffc77c5d739f3d4b7cbc9 ("x86/build: Fix detection
>> of GCC -mpreferred-stack-boundary support")
>
> Does the performance regression go away reproducibly if you do:
>
>   git revert b2c51106c7581866c37ffc77c5d739f3d4b7cbc9
>
> ?

FWIW, I spot-checked the generated code. All I saw were the deletion of some dummy subtraction from rsp to align the stack for children, changes of stack frame offsets, and a couple of instances in which instructions got reordered.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
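The flag detection that the commit fixes can be illustrated with a small cc-option-style probe. This is my own sketch, not the kernel's actual Makefile test: the `probe_stack_align` helper and the exact flag combination are illustrative. The relevant wrinkle is that on x86-64, GCC accepts `-mpreferred-stack-boundary=3` (8-byte stack alignment) only when SSE code generation is disabled, so a probe that omits `-mno-sse` may wrongly conclude the flag is unsupported.

```shell
# Sketch of a cc-option-style compiler probe (illustrative; not the
# kernel's exact Makefile test). Compiles an empty translation unit with
# the candidate flags and reports support from the compiler's exit status.
probe_stack_align() {
    # $@ = flags to test; /dev/null serves as an empty C source file.
    if ${CC:-gcc} "$@" -S -x c -o /dev/null /dev/null 2>/dev/null; then
        echo "supported"
    else
        echo "not supported"
    fi
}

# Probing with -mno-sse, as the kernel builds with SSE disabled; without
# it, GCC on x86-64 rejects -mpreferred-stack-boundary values below 4.
probe_stack_align -mno-sse -mpreferred-stack-boundary=3
```

The outcome depends on the compiler installed, which is exactly why builds probe at configure time rather than hard-coding the flag.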
Re: [lkp] [x86/build] b2c51106c75: -18.1% will-it-scale.per_process_ops
* kernel test robot wrote:

> FYI, we noticed the below changes on
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/asm
> commit b2c51106c7581866c37ffc77c5d739f3d4b7cbc9 ("x86/build: Fix detection of
> GCC -mpreferred-stack-boundary support")

Does the performance regression go away reproducibly if you do:

  git revert b2c51106c7581866c37ffc77c5d739f3d4b7cbc9

?

Thanks,

	Ingo
[lkp] [x86/build] b2c51106c75: -18.1% will-it-scale.per_process_ops
FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/asm
commit b2c51106c7581866c37ffc77c5d739f3d4b7cbc9 ("x86/build: Fix detection of GCC -mpreferred-stack-boundary support")

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/cpufreq_governor/test:
  wsm/will-it-scale/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/performance/readseek2

commit:
  f2a50f8b7da45ff2de93a71393e715a2ab9f3b68
  b2c51106c7581866c37ffc77c5d739f3d4b7cbc9

f2a50f8b7da45ff2 b2c51106c7581866c37ffc77c5
---------------- --------------------------
   %stddev      %change    %stddev
       \           |           \
    879002 ±  0%     -18.1%     720270 ±  7%  will-it-scale.per_process_ops
      0.02 ±  0%     +34.5%       0.02 ±  7%  will-it-scale.scalability
  26153173 ±  0%      +7.0%   27977076 ±  0%  will-it-scale.time.voluntary_context_switches
     15.70 ±  2%     +14.5%      17.98 ±  0%  time.user_time
    370683 ±  0%      +6.2%     393491 ±  0%  vmstat.system.cs
    830343 ± 56%     -54.0%     382128 ± 39%  cpuidle.C1E-NHM.time
    788.25 ± 14%     -21.7%     617.25 ± 16%  cpuidle.C1E-NHM.usage
      3947 ±  2%     +10.6%       4363 ±  3%  slabinfo.kmalloc-192.active_objs
      3947 ±  2%     +10.6%       4363 ±  3%  slabinfo.kmalloc-192.num_objs
   1082762 ±162%    -100.0%       0.00 ± -1%  latency_stats.avg.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
   1082762 ±162%    -100.0%       0.00 ± -1%  latency_stats.max.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
   1082762 ±162%    -100.0%       0.00 ± -1%  latency_stats.sum.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
      2.58 ±  8%     +19.5%       3.09 ±  3%  perf-profile.cpu-cycles._raw_spin_lock_irqsave.finish_wait.__wait_on_bit_lock.__lock_page.find_lock_entry
      7.02 ±  3%      +9.2%       7.67 ±  2%  perf-profile.cpu-cycles._raw_spin_lock_irqsave.prepare_to_wait_exclusive.__wait_on_bit_lock.__lock_page.find_lock_entry
      3.07 ±  2%     +14.8%       3.53 ±  3%  perf-profile.cpu-cycles.finish_wait.__wait_on_bit_lock.__lock_page.find_lock_entry.shmem_getpage_gfp
      3.05 ±  5%      -8.4%       2.79 ±  4%  perf-profile.cpu-cycles.hrtimer_start_range_ns.tick_nohz_stop_sched_tick.__tick_nohz_idle_enter.tick_nohz_idle_enter.cpu_startup_entry
      0.98 ±  3%     -25.1%       0.74 ±  7%  perf-profile.cpu-cycles.is_ftrace_trampoline.print_context_stack.dump_trace.save_stack_trace_tsk.__account_scheduler_latency
      1.82 ± 18%     +46.6%       2.67 ±  3%  perf-profile.cpu-cycles.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.finish_wait.__wait_on_bit_lock.__lock_page
      8.05 ±  3%      +9.5%       8.82 ±  3%  perf-profile.cpu-cycles.prepare_to_wait_exclusive.__wait_on_bit_lock.__lock_page.find_lock_entry.shmem_getpage_gfp
      7.75 ± 34%     -64.5%       2.75 ± 64%  sched_debug.cfs_rq[2]:/.nr_spread_over
      1135 ± 20%     -43.6%     640.75 ± 49%  sched_debug.cfs_rq[3]:/.blocked_load_avg
      1215 ± 21%     -43.1%     691.50 ± 46%  sched_debug.cfs_rq[3]:/.tg_load_contrib
     38.50 ± 21%    +129.9%      88.50 ± 36%  sched_debug.cfs_rq[4]:/.load
     26.00 ± 20%     +98.1%      51.50 ± 46%  sched_debug.cfs_rq[4]:/.runnable_load_avg
    128.25 ± 18%    +227.5%     420.00 ± 43%  sched_debug.cfs_rq[4]:/.utilization_load_avg
      1015 ± 78%    +101.1%       2042 ± 25%  sched_debug.cfs_rq[6]:/.blocked_load_avg
      1069 ± 72%    +100.2%       2140 ± 23%  sched_debug.cfs_rq[6]:/.tg_load_contrib
     88.75 ± 14%     -47.3%      46.75 ± 36%  sched_debug.cfs_rq[9]:/.load
     59.25 ± 23%     -41.4%      34.75 ± 34%  sched_debug.cfs_rq[9]:/.runnable_load_avg
    315.50 ± 45%     -64.6%     111.67 ±  1%  sched_debug.cfs_rq[9]:/.utilization_load_avg
   2246758 ±  7%     +87.6%    4213925 ± 65%  sched_debug.cpu#0.nr_switches
   2249376 ±  7%     +87.4%    4215969 ± 65%  sched_debug.cpu#0.sched_count
   1121438 ±  7%     +81.0%    2030313 ± 61%  sched_debug.cpu#0.sched_goidle
   1151160 ±  7%     +86.5%    2146608 ± 64%  sched_debug.cpu#0.ttwu_count
     33.75 ± 15%     -22.2%      26.25 ±  6%  sched_debug.cpu#1.cpu_load[3]
     33.25 ± 10%     -18.0%      27.25 ±  7%  sched_debug.cpu#1.cpu_load[4]
     40.00 ± 18%     +24.4%      49.75 ± 18%  sched_debug.cpu#10.cpu_load[2]
     39.25 ± 14%     +22.3%      48.00 ± 10%  sched_debug.cpu#10.cpu_load[3]
     39.50 ± 15%     +20.3%      47.50 ±  6%  sched_debug.cpu#10.cpu_load[4]
   5269004 ±  1%     +27.8%