Re: [LKP] [lkp-robot] [tracing/x86] 1c758a2202: aim9.disk_rr.ops_per_sec -12.0% regression

2018-06-04 Thread Aaron Lu
On Fri, Jun 01, 2018 at 05:43:19PM -0400, Steven Rostedt wrote:
> On Mon, 28 May 2018 19:34:19 +0800
> kernel test robot  wrote:
> 
> > Greeting,
> > 
> > FYI, we noticed a -12.0% regression of aim9.disk_rr.ops_per_sec due to 
> > commit:
> > 
> > 
> > commit: 1c758a2202a6b4624d0703013a2c6cfa6e7455aa ("tracing/x86: Update 
> > syscall trace events to handle new prefixed syscall func names")
> 
> How can this commit cause a run time regression. It changes code
> in an __init call (that gets removed after boot)??

Sorry for the noise.

We actually enabled the syscall trace events for a duration of 10s
during many tests to hopefully capture syscall "noise", i.e. does some
syscall start to take more time etc. for a workload?

We failed to notice this monitor has stopped running due to commit
d5a00528b58c ("syscalls/core, syscalls/x86: Rename struct pt_regs-based
sys_*() to  __x64_sys_*()") as you pointed out in your commit's
changelog. And with your fix commit, syscall trace event starts to work
again, and I believe it is the 10s trace that caused some overhead
during the test that it triggered a bisect and ultimately sent out this
misleading report.

Regards,
Aaron

> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > in testcase: aim9
> > on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 
> > 512G memory
> > with following parameters:
> > 
> > testtime: 300s
> > test: disk_rr
> > cpufreq_governor: performance
> > 
> > test-description: Suite IX is the "AIM Independent Resource Benchmark:" the 
> > famous synthetic benchmark.
> > test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/
> > 
> > 
> > Details are as below:
> > -->
> >   
> > =
> > compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
> >   
> > gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-hsx04/disk_rr/aim9/300s
> > 
> > commit: 
> >   1cdae042fc ("tracing: Add missing forward declaration")
> >   1c758a2202 ("tracing/x86: Update syscall trace events to handle new 
> > prefixed syscall func names")
> > 
> > 1cdae042fc63dd98 1c758a2202a6b4624d0703013a 
> >  -- 
> >  %stddev %change %stddev
> >  \  |\  
> > 633310   -12.0% 557268aim9.disk_rr.ops_per_sec
> > 244.24+2.2% 249.57aim9.time.system_time
> >  55.76-9.6%  50.43aim9.time.user_time
> >   8135 ± 39% +73.2%  14091 ± 33%  numa-meminfo.node0.Shmem
> >   1328   -11.9%   1171pmeter.performance_per_watt
> > 450606 ±  3%  -9.5% 407878 ±  5%  meminfo.Committed_AS
> > 406.75 ±173%+300.1%   1627meminfo.Mlocked
> >  20124 ±  4%  +8.4%  21819 ±  6%  softirqs.NET_RX
> >8237636 ±  6% -15.4%6965294 ±  2%  softirqs.RCU
> >   2033 ± 39% +73.0%   3518 ± 33%  numa-vmstat.node0.nr_shmem
> >  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node2.nr_mlock
> >  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node3.nr_mlock
> >  9.408e+08 ±  6% +53.3%  1.442e+09 ± 20%  perf-stat.dTLB-load-misses
> >  47.39 ± 17% -10.4   36.99 ±  8%  perf-stat.iTLB-load-miss-rate%
> >   1279 ± 27% +63.4%   2089 ± 21%  
> > perf-stat.instructions-per-iTLB-miss
> >  46.73 ±  5%  -5.4   41.33 ±  5%  
> > perf-stat.node-store-miss-rate%
> >  19240+1.2%  19474
> > proc-vmstat.nr_indirectly_reclaimable
> >  18868+4.0%  19628
> > proc-vmstat.nr_slab_reclaimable
> >   48395423   -11.8%   42700849proc-vmstat.numa_hit
> >   48314997   -11.8%   42620296proc-vmstat.numa_local
> >3153408   -12.0%2775642proc-vmstat.pgactivate
> >   48365477   -11.8%   42678780proc-vmstat.pgfree
> >   3060   +38.9%   4250
> > slabinfo.ftrace_event_field.active_objs
> >   3060   +38.9%   4250
> > slabinfo.ftrace_event_field.num_objs
> >   2748 ±  3%  -8.9%   2502 ±  3%  
> > slabinfo.sighand_cache.active_objs
> >   2763 ±  3%  -8.9%   2517 ±  3%  
> > slabinfo.sighand_cache.num_objs
> >   4125 ±  4% -10.3%   3700 ±  2%  
> > slabinfo.signal_cache.active_objs
> >   4125 ±  4% -10.3%   3700 ±  2%  slabinfo.signal_cache.num_objs
> >   1104   +57.3%   1736
> > slabinfo.trace_event_file.active_objs
> >   1104   +57.3%   1736
> > slabinfo.trace_event_file.num_objs
> >   0.78 ±  4%  -0.10.67 ±  5%  
> > 

Re: [LKP] [lkp-robot] [tracing/x86] 1c758a2202: aim9.disk_rr.ops_per_sec -12.0% regression

2018-06-04 Thread Aaron Lu
On Fri, Jun 01, 2018 at 05:43:19PM -0400, Steven Rostedt wrote:
> On Mon, 28 May 2018 19:34:19 +0800
> kernel test robot  wrote:
> 
> > Greeting,
> > 
> > FYI, we noticed a -12.0% regression of aim9.disk_rr.ops_per_sec due to 
> > commit:
> > 
> > 
> > commit: 1c758a2202a6b4624d0703013a2c6cfa6e7455aa ("tracing/x86: Update 
> > syscall trace events to handle new prefixed syscall func names")
> 
> How can this commit cause a run time regression. It changes code
> in an __init call (that gets removed after boot)??

Sorry for the noise.

We actually enabled the syscall trace events for a duration of 10s
during many tests to hopefully capture syscall "noise", i.e. does some
syscall start to take more time etc. for a workload?

We failed to notice this monitor has stopped running due to commit
d5a00528b58c ("syscalls/core, syscalls/x86: Rename struct pt_regs-based
sys_*() to  __x64_sys_*()") as you pointed out in your commit's
changelog. And with your fix commit, syscall trace event starts to work
again, and I believe it is the 10s trace that caused some overhead
during the test that it triggered a bisect and ultimately sent out this
misleading report.

Regards,
Aaron

> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > in testcase: aim9
> > on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 
> > 512G memory
> > with following parameters:
> > 
> > testtime: 300s
> > test: disk_rr
> > cpufreq_governor: performance
> > 
> > test-description: Suite IX is the "AIM Independent Resource Benchmark:" the 
> > famous synthetic benchmark.
> > test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/
> > 
> > 
> > Details are as below:
> > -->
> >   
> > =
> > compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
> >   
> > gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-hsx04/disk_rr/aim9/300s
> > 
> > commit: 
> >   1cdae042fc ("tracing: Add missing forward declaration")
> >   1c758a2202 ("tracing/x86: Update syscall trace events to handle new 
> > prefixed syscall func names")
> > 
> > 1cdae042fc63dd98 1c758a2202a6b4624d0703013a 
> >  -- 
> >  %stddev %change %stddev
> >  \  |\  
> > 633310   -12.0% 557268aim9.disk_rr.ops_per_sec
> > 244.24+2.2% 249.57aim9.time.system_time
> >  55.76-9.6%  50.43aim9.time.user_time
> >   8135 ± 39% +73.2%  14091 ± 33%  numa-meminfo.node0.Shmem
> >   1328   -11.9%   1171pmeter.performance_per_watt
> > 450606 ±  3%  -9.5% 407878 ±  5%  meminfo.Committed_AS
> > 406.75 ±173%+300.1%   1627meminfo.Mlocked
> >  20124 ±  4%  +8.4%  21819 ±  6%  softirqs.NET_RX
> >8237636 ±  6% -15.4%6965294 ±  2%  softirqs.RCU
> >   2033 ± 39% +73.0%   3518 ± 33%  numa-vmstat.node0.nr_shmem
> >  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node2.nr_mlock
> >  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node3.nr_mlock
> >  9.408e+08 ±  6% +53.3%  1.442e+09 ± 20%  perf-stat.dTLB-load-misses
> >  47.39 ± 17% -10.4   36.99 ±  8%  perf-stat.iTLB-load-miss-rate%
> >   1279 ± 27% +63.4%   2089 ± 21%  
> > perf-stat.instructions-per-iTLB-miss
> >  46.73 ±  5%  -5.4   41.33 ±  5%  
> > perf-stat.node-store-miss-rate%
> >  19240+1.2%  19474
> > proc-vmstat.nr_indirectly_reclaimable
> >  18868+4.0%  19628
> > proc-vmstat.nr_slab_reclaimable
> >   48395423   -11.8%   42700849proc-vmstat.numa_hit
> >   48314997   -11.8%   42620296proc-vmstat.numa_local
> >3153408   -12.0%2775642proc-vmstat.pgactivate
> >   48365477   -11.8%   42678780proc-vmstat.pgfree
> >   3060   +38.9%   4250
> > slabinfo.ftrace_event_field.active_objs
> >   3060   +38.9%   4250
> > slabinfo.ftrace_event_field.num_objs
> >   2748 ±  3%  -8.9%   2502 ±  3%  
> > slabinfo.sighand_cache.active_objs
> >   2763 ±  3%  -8.9%   2517 ±  3%  
> > slabinfo.sighand_cache.num_objs
> >   4125 ±  4% -10.3%   3700 ±  2%  
> > slabinfo.signal_cache.active_objs
> >   4125 ±  4% -10.3%   3700 ±  2%  slabinfo.signal_cache.num_objs
> >   1104   +57.3%   1736
> > slabinfo.trace_event_file.active_objs
> >   1104   +57.3%   1736
> > slabinfo.trace_event_file.num_objs
> >   0.78 ±  4%  -0.10.67 ±  5%  
> > 

Re: [lkp-robot] [tracing/x86] 1c758a2202: aim9.disk_rr.ops_per_sec -12.0% regression

2018-06-01 Thread Steven Rostedt
On Mon, 28 May 2018 19:34:19 +0800
kernel test robot  wrote:

> Greeting,
> 
> FYI, we noticed a -12.0% regression of aim9.disk_rr.ops_per_sec due to commit:
> 
> 
> commit: 1c758a2202a6b4624d0703013a2c6cfa6e7455aa ("tracing/x86: Update 
> syscall trace events to handle new prefixed syscall func names")

How can this commit cause a run time regression. It changes code
in an __init call (that gets removed after boot)??

-- Steve
 

> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: aim9
> on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 
> 512G memory
> with following parameters:
> 
>   testtime: 300s
>   test: disk_rr
>   cpufreq_governor: performance
> 
> test-description: Suite IX is the "AIM Independent Resource Benchmark:" the 
> famous synthetic benchmark.
> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/
> 
> 
> Details are as below:
> -->
>   
> =
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
>   
> gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-hsx04/disk_rr/aim9/300s
> 
> commit: 
>   1cdae042fc ("tracing: Add missing forward declaration")
>   1c758a2202 ("tracing/x86: Update syscall trace events to handle new 
> prefixed syscall func names")
> 
> 1cdae042fc63dd98 1c758a2202a6b4624d0703013a 
>  -- 
>  %stddev %change %stddev
>  \  |\  
> 633310   -12.0% 557268aim9.disk_rr.ops_per_sec
> 244.24+2.2% 249.57aim9.time.system_time
>  55.76-9.6%  50.43aim9.time.user_time
>   8135 ± 39% +73.2%  14091 ± 33%  numa-meminfo.node0.Shmem
>   1328   -11.9%   1171pmeter.performance_per_watt
> 450606 ±  3%  -9.5% 407878 ±  5%  meminfo.Committed_AS
> 406.75 ±173%+300.1%   1627meminfo.Mlocked
>  20124 ±  4%  +8.4%  21819 ±  6%  softirqs.NET_RX
>8237636 ±  6% -15.4%6965294 ±  2%  softirqs.RCU
>   2033 ± 39% +73.0%   3518 ± 33%  numa-vmstat.node0.nr_shmem
>  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node2.nr_mlock
>  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node3.nr_mlock
>  9.408e+08 ±  6% +53.3%  1.442e+09 ± 20%  perf-stat.dTLB-load-misses
>  47.39 ± 17% -10.4   36.99 ±  8%  perf-stat.iTLB-load-miss-rate%
>   1279 ± 27% +63.4%   2089 ± 21%  
> perf-stat.instructions-per-iTLB-miss
>  46.73 ±  5%  -5.4   41.33 ±  5%  perf-stat.node-store-miss-rate%
>  19240+1.2%  19474
> proc-vmstat.nr_indirectly_reclaimable
>  18868+4.0%  19628proc-vmstat.nr_slab_reclaimable
>   48395423   -11.8%   42700849proc-vmstat.numa_hit
>   48314997   -11.8%   42620296proc-vmstat.numa_local
>3153408   -12.0%2775642proc-vmstat.pgactivate
>   48365477   -11.8%   42678780proc-vmstat.pgfree
>   3060   +38.9%   4250
> slabinfo.ftrace_event_field.active_objs
>   3060   +38.9%   4250
> slabinfo.ftrace_event_field.num_objs
>   2748 ±  3%  -8.9%   2502 ±  3%  
> slabinfo.sighand_cache.active_objs
>   2763 ±  3%  -8.9%   2517 ±  3%  slabinfo.sighand_cache.num_objs
>   4125 ±  4% -10.3%   3700 ±  2%  
> slabinfo.signal_cache.active_objs
>   4125 ±  4% -10.3%   3700 ±  2%  slabinfo.signal_cache.num_objs
>   1104   +57.3%   1736
> slabinfo.trace_event_file.active_objs
>   1104   +57.3%   1736
> slabinfo.trace_event_file.num_objs
>   0.78 ±  4%  -0.10.67 ±  5%  
> perf-profile.calltrace.cycles-pp.rcu_process_callbacks.__softirqentry_text_start.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt
>   2.10 ±  2%  -0.12.00
> perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
>   1.89-0.11.79
> perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt
>   1.71 ±  2%  -0.11.60
> perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
>   0.53 ± 12%  -0.10.44 ±  4%  
> perf-profile.children.cycles-pp._raw_spin_lock_irqsave
>   0.17 ±  7%  -0.00.15 ±  4%  
> perf-profile.children.cycles-pp.leave_mm
>   0.09 ±  8%  -0.00.08 ± 11%  
> 

Re: [lkp-robot] [tracing/x86] 1c758a2202: aim9.disk_rr.ops_per_sec -12.0% regression

2018-06-01 Thread Steven Rostedt
On Mon, 28 May 2018 19:34:19 +0800
kernel test robot  wrote:

> Greeting,
> 
> FYI, we noticed a -12.0% regression of aim9.disk_rr.ops_per_sec due to commit:
> 
> 
> commit: 1c758a2202a6b4624d0703013a2c6cfa6e7455aa ("tracing/x86: Update 
> syscall trace events to handle new prefixed syscall func names")

How can this commit cause a run time regression. It changes code
in an __init call (that gets removed after boot)??

-- Steve
 

> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: aim9
> on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 
> 512G memory
> with following parameters:
> 
>   testtime: 300s
>   test: disk_rr
>   cpufreq_governor: performance
> 
> test-description: Suite IX is the "AIM Independent Resource Benchmark:" the 
> famous synthetic benchmark.
> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/
> 
> 
> Details are as below:
> -->
>   
> =
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
>   
> gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-hsx04/disk_rr/aim9/300s
> 
> commit: 
>   1cdae042fc ("tracing: Add missing forward declaration")
>   1c758a2202 ("tracing/x86: Update syscall trace events to handle new 
> prefixed syscall func names")
> 
> 1cdae042fc63dd98 1c758a2202a6b4624d0703013a 
>  -- 
>  %stddev %change %stddev
>  \  |\  
> 633310   -12.0% 557268aim9.disk_rr.ops_per_sec
> 244.24+2.2% 249.57aim9.time.system_time
>  55.76-9.6%  50.43aim9.time.user_time
>   8135 ± 39% +73.2%  14091 ± 33%  numa-meminfo.node0.Shmem
>   1328   -11.9%   1171pmeter.performance_per_watt
> 450606 ±  3%  -9.5% 407878 ±  5%  meminfo.Committed_AS
> 406.75 ±173%+300.1%   1627meminfo.Mlocked
>  20124 ±  4%  +8.4%  21819 ±  6%  softirqs.NET_RX
>8237636 ±  6% -15.4%6965294 ±  2%  softirqs.RCU
>   2033 ± 39% +73.0%   3518 ± 33%  numa-vmstat.node0.nr_shmem
>  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node2.nr_mlock
>  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node3.nr_mlock
>  9.408e+08 ±  6% +53.3%  1.442e+09 ± 20%  perf-stat.dTLB-load-misses
>  47.39 ± 17% -10.4   36.99 ±  8%  perf-stat.iTLB-load-miss-rate%
>   1279 ± 27% +63.4%   2089 ± 21%  
> perf-stat.instructions-per-iTLB-miss
>  46.73 ±  5%  -5.4   41.33 ±  5%  perf-stat.node-store-miss-rate%
>  19240+1.2%  19474
> proc-vmstat.nr_indirectly_reclaimable
>  18868+4.0%  19628proc-vmstat.nr_slab_reclaimable
>   48395423   -11.8%   42700849proc-vmstat.numa_hit
>   48314997   -11.8%   42620296proc-vmstat.numa_local
>3153408   -12.0%2775642proc-vmstat.pgactivate
>   48365477   -11.8%   42678780proc-vmstat.pgfree
>   3060   +38.9%   4250
> slabinfo.ftrace_event_field.active_objs
>   3060   +38.9%   4250
> slabinfo.ftrace_event_field.num_objs
>   2748 ±  3%  -8.9%   2502 ±  3%  
> slabinfo.sighand_cache.active_objs
>   2763 ±  3%  -8.9%   2517 ±  3%  slabinfo.sighand_cache.num_objs
>   4125 ±  4% -10.3%   3700 ±  2%  
> slabinfo.signal_cache.active_objs
>   4125 ±  4% -10.3%   3700 ±  2%  slabinfo.signal_cache.num_objs
>   1104   +57.3%   1736
> slabinfo.trace_event_file.active_objs
>   1104   +57.3%   1736
> slabinfo.trace_event_file.num_objs
>   0.78 ±  4%  -0.10.67 ±  5%  
> perf-profile.calltrace.cycles-pp.rcu_process_callbacks.__softirqentry_text_start.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt
>   2.10 ±  2%  -0.12.00
> perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
>   1.89-0.11.79
> perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt
>   1.71 ±  2%  -0.11.60
> perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
>   0.53 ± 12%  -0.10.44 ±  4%  
> perf-profile.children.cycles-pp._raw_spin_lock_irqsave
>   0.17 ±  7%  -0.00.15 ±  4%  
> perf-profile.children.cycles-pp.leave_mm
>   0.09 ±  8%  -0.00.08 ± 11%  
>