Re: [PATCH 5/5] ftrace: Update dynamic ftrace calls only if necessary

2016-03-20 Thread Steven Rostedt

Patches 4 and 5 seem agnostic to patches 1-3. I'll pull 4 and 5 into my
tree, and 1-3 can go through tip. I'd like to run these through my tests.

OK?

-- Steve


On Wed, 16 Mar 2016 15:34:33 +0100
Jiri Olsa  wrote:

> Currently, dynamic ftrace calls are updated any time
> a ftrace_ops is registered or unregistered. If we do this
> update only when it is actually needed, we save a lot of time
> for perf system-wide ftrace function sampling/counting.
> 
> The reason is that for system-wide sampling/counting,
> perf creates an event for each CPU in the system.
> 
> Each event then registers a separate copy of ftrace_ops,
> each of which triggers a FTRACE_UPDATE_CALLS update. On servers
> with many CPUs that means a serious stall (here, a 240-CPU server):
> 
> Counting:
>   # time ./perf stat -e ftrace:function -a sleep 1
> 
>Performance counter stats for 'system wide':
> 
>   370,663  ftrace:function
> 
>   1.401427505 seconds time elapsed
> 
>   real3m51.743s
>   user0m0.023s
>   sys 3m48.569s
> 
> Sampling:
>   # time ./perf record -e ftrace:function -a sleep 1
>   [ perf record: Woken up 0 times to write data ]
>   Warning:
>   Processed 141200 events and lost 5 chunks!
> 
>   [ perf record: Captured and wrote 10.703 MB perf.data (135950 samples) ]
> 
>   real2m31.429s
>   user0m0.213s
>   sys 2m29.494s
> 
> There's no reason to do the FTRACE_UPDATE_CALLS update
> for each event in the perf case, because all the ftrace_ops
> always share the same filter, so the updated calls are
> always the same.
> 
> Only the first ftrace_ops registration needs to do the
> FTRACE_UPDATE_CALLS update (sometimes also the second, if
> the first one used the trampoline); the rest can simply be
> linked cheaply into the ftrace_ops list.
> 
> Counting:
>   # time ./perf stat -e ftrace:function -a sleep 1
> 
>Performance counter stats for 'system wide':
> 
>  398,571  ftrace:function
> 
>  1.377503733 seconds time elapsed
> 
>   real0m2.787s
>   user0m0.005s
>   sys 0m1.883s
> 
> Sampling:
>   # time ./perf record -e ftrace:function -a sleep 1
>   [ perf record: Woken up 0 times to write data ]
>   Warning:
>   Processed 261730 events and lost 9 chunks!
> 
>   [ perf record: Captured and wrote 19.907 MB perf.data (256293 samples) ]
> 
>   real1m31.948s
>   user0m0.309s
>   sys 1m32.051s
> 
> Signed-off-by: Jiri Olsa 
> ---


Re: [PATCH 5/5] ftrace: Update dynamic ftrace calls only if necessary

2016-03-19 Thread Namhyung Kim
On Wed, Mar 16, 2016 at 03:34:33PM +0100, Jiri Olsa wrote:
> Currently, dynamic ftrace calls are updated any time
> a ftrace_ops is registered or unregistered. If we do this
> update only when it is actually needed, we save a lot of time
> for perf system-wide ftrace function sampling/counting.
> 
> The reason is that for system-wide sampling/counting,
> perf creates an event for each CPU in the system.
> 
> Each event then registers a separate copy of ftrace_ops,
> each of which triggers a FTRACE_UPDATE_CALLS update. On servers
> with many CPUs that means a serious stall (here, a 240-CPU server):
> 
> Counting:
>   # time ./perf stat -e ftrace:function -a sleep 1
> 
>Performance counter stats for 'system wide':
> 
>   370,663  ftrace:function
> 
>   1.401427505 seconds time elapsed
> 
>   real3m51.743s
>   user0m0.023s
>   sys 3m48.569s
> 
> Sampling:
>   # time ./perf record -e ftrace:function -a sleep 1
>   [ perf record: Woken up 0 times to write data ]
>   Warning:
>   Processed 141200 events and lost 5 chunks!
> 
>   [ perf record: Captured and wrote 10.703 MB perf.data (135950 samples) ]
> 
>   real2m31.429s
>   user0m0.213s
>   sys 2m29.494s
> 
> There's no reason to do the FTRACE_UPDATE_CALLS update
> for each event in the perf case, because all the ftrace_ops
> always share the same filter, so the updated calls are
> always the same.
> 
> Only the first ftrace_ops registration needs to do the
> FTRACE_UPDATE_CALLS update (sometimes also the second, if
> the first one used the trampoline); the rest can simply be
> linked cheaply into the ftrace_ops list.
> 
> Counting:
>   # time ./perf stat -e ftrace:function -a sleep 1
> 
>Performance counter stats for 'system wide':
> 
>  398,571  ftrace:function
> 
>  1.377503733 seconds time elapsed
> 
>   real0m2.787s
>   user0m0.005s
>   sys 0m1.883s
> 
> Sampling:
>   # time ./perf record -e ftrace:function -a sleep 1
>   [ perf record: Woken up 0 times to write data ]
>   Warning:
>   Processed 261730 events and lost 9 chunks!
> 
>   [ perf record: Captured and wrote 19.907 MB perf.data (256293 samples) ]
> 
>   real1m31.948s
>   user0m0.309s
>   sys 1m32.051s
> 
> Signed-off-by: Jiri Olsa 

Acked-by: Namhyung Kim 

Thanks,
Namhyung


> ---
>  kernel/trace/ftrace.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 3a9a12215b50..22ef6fc597a2 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2651,7 +2651,6 @@ static int ftrace_startup(struct ftrace_ops *ops, int command)
>   return ret;
>  
>   ftrace_start_up++;
> - command |= FTRACE_UPDATE_CALLS;
>  
>   /*
>* Note that ftrace probes uses this to start up
> @@ -2672,7 +2671,8 @@ static int ftrace_startup(struct ftrace_ops *ops, int command)
>   return ret;
>   }
>  
> - ftrace_hash_rec_enable(ops, 1);
> + if (ftrace_hash_rec_enable(ops, 1))
> + command |= FTRACE_UPDATE_CALLS;
>  
>   ftrace_startup_enable(command);
>  
> @@ -2702,11 +2702,11 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
>  
>   /* Disabling ipmodify never fails */
>   ftrace_hash_ipmodify_disable(ops);
> - ftrace_hash_rec_disable(ops, 1);
>  
> - ops->flags &= ~FTRACE_OPS_FL_ENABLED;
> + if (ftrace_hash_rec_disable(ops, 1))
> + command |= FTRACE_UPDATE_CALLS;
>  
> - command |= FTRACE_UPDATE_CALLS;
> + ops->flags &= ~FTRACE_OPS_FL_ENABLED;
>  
>   if (saved_ftrace_func != ftrace_trace_function) {
>   saved_ftrace_func = ftrace_trace_function;
> -- 
> 2.4.3
> 


[PATCH 5/5] ftrace: Update dynamic ftrace calls only if necessary

2016-03-19 Thread Jiri Olsa
Currently, dynamic ftrace calls are updated any time
a ftrace_ops is registered or unregistered. If we do this
update only when it is actually needed, we save a lot of time
for perf system-wide ftrace function sampling/counting.

The reason is that for system-wide sampling/counting,
perf creates an event for each CPU in the system.

Each event then registers a separate copy of ftrace_ops,
each of which triggers a FTRACE_UPDATE_CALLS update. On servers
with many CPUs that means a serious stall (here, a 240-CPU server):

Counting:
  # time ./perf stat -e ftrace:function -a sleep 1

   Performance counter stats for 'system wide':

  370,663  ftrace:function

  1.401427505 seconds time elapsed

  real3m51.743s
  user0m0.023s
  sys 3m48.569s

Sampling:
  # time ./perf record -e ftrace:function -a sleep 1
  [ perf record: Woken up 0 times to write data ]
  Warning:
  Processed 141200 events and lost 5 chunks!

  [ perf record: Captured and wrote 10.703 MB perf.data (135950 samples) ]

  real2m31.429s
  user0m0.213s
  sys 2m29.494s

There's no reason to do the FTRACE_UPDATE_CALLS update
for each event in the perf case, because all the ftrace_ops
always share the same filter, so the updated calls are
always the same.

Only the first ftrace_ops registration needs to do the
FTRACE_UPDATE_CALLS update (sometimes also the second, if
the first one used the trampoline); the rest can simply be
linked cheaply into the ftrace_ops list.

Counting:
  # time ./perf stat -e ftrace:function -a sleep 1

   Performance counter stats for 'system wide':

 398,571  ftrace:function

 1.377503733 seconds time elapsed

  real0m2.787s
  user0m0.005s
  sys 0m1.883s

Sampling:
  # time ./perf record -e ftrace:function -a sleep 1
  [ perf record: Woken up 0 times to write data ]
  Warning:
  Processed 261730 events and lost 9 chunks!

  [ perf record: Captured and wrote 19.907 MB perf.data (256293 samples) ]

  real1m31.948s
  user0m0.309s
  sys 1m32.051s

Signed-off-by: Jiri Olsa 
---
 kernel/trace/ftrace.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 3a9a12215b50..22ef6fc597a2 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2651,7 +2651,6 @@ static int ftrace_startup(struct ftrace_ops *ops, int command)
return ret;
 
ftrace_start_up++;
-   command |= FTRACE_UPDATE_CALLS;
 
/*
 * Note that ftrace probes uses this to start up
@@ -2672,7 +2671,8 @@ static int ftrace_startup(struct ftrace_ops *ops, int command)
return ret;
}
 
-   ftrace_hash_rec_enable(ops, 1);
+   if (ftrace_hash_rec_enable(ops, 1))
+   command |= FTRACE_UPDATE_CALLS;
 
ftrace_startup_enable(command);
 
@@ -2702,11 +2702,11 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
 
/* Disabling ipmodify never fails */
ftrace_hash_ipmodify_disable(ops);
-   ftrace_hash_rec_disable(ops, 1);
 
-   ops->flags &= ~FTRACE_OPS_FL_ENABLED;
+   if (ftrace_hash_rec_disable(ops, 1))
+   command |= FTRACE_UPDATE_CALLS;
 
-   command |= FTRACE_UPDATE_CALLS;
+   ops->flags &= ~FTRACE_OPS_FL_ENABLED;
 
if (saved_ftrace_func != ftrace_trace_function) {
saved_ftrace_func = ftrace_trace_function;
-- 
2.4.3



[PATCH 5/5] ftrace: Update dynamic ftrace calls only if necessary

2016-03-09 Thread Jiri Olsa
Currently, dynamic ftrace calls are updated any time
a ftrace_ops is registered or unregistered. If we do this
update only when it is actually needed, we save a lot of time
for perf system-wide ftrace function sampling/counting.

The reason is that for system-wide sampling/counting,
perf creates an event for each CPU in the system.

Each event then registers a separate copy of ftrace_ops,
each of which triggers a FTRACE_UPDATE_CALLS update. On servers
with many CPUs that means a serious stall (here, a 240-CPU server):

Counting:
  # time ./perf stat -e ftrace:function -a sleep 1

   Performance counter stats for 'system wide':

  370,663  ftrace:function

  1.401427505 seconds time elapsed

  real3m51.743s
  user0m0.023s
  sys 3m48.569s

Sampling:
  # time ./perf record -e ftrace:function -a sleep 1
  [ perf record: Woken up 0 times to write data ]
  Warning:
  Processed 141200 events and lost 5 chunks!

  [ perf record: Captured and wrote 10.703 MB perf.data (135950 samples) ]

  real2m31.429s
  user0m0.213s
  sys 2m29.494s

There's no reason to do the FTRACE_UPDATE_CALLS update
for each event in the perf case, because all the ftrace_ops
always share the same filter, so the updated calls are
always the same.

Only the first ftrace_ops registration needs to do the
FTRACE_UPDATE_CALLS update (sometimes also the second, if
the first one used the trampoline); the rest can simply be
linked cheaply into the ftrace_ops list.

Counting:
  # time ./perf stat -e ftrace:function -a sleep 1

   Performance counter stats for 'system wide':

 398,571  ftrace:function

 1.377503733 seconds time elapsed

  real0m2.787s
  user0m0.005s
  sys 0m1.883s

Sampling:
  # time ./perf record -e ftrace:function -a sleep 1
  [ perf record: Woken up 0 times to write data ]
  Warning:
  Processed 261730 events and lost 9 chunks!

  [ perf record: Captured and wrote 19.907 MB perf.data (256293 samples) ]

  real1m31.948s
  user0m0.309s
  sys 1m32.051s

Signed-off-by: Jiri Olsa 
---
 kernel/trace/ftrace.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 123dddc660e9..48b491463549 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2650,7 +2650,6 @@ static int ftrace_startup(struct ftrace_ops *ops, int command)
return ret;
 
ftrace_start_up++;
-   command |= FTRACE_UPDATE_CALLS;
 
/*
 * Note that ftrace probes uses this to start up
@@ -2671,7 +2670,8 @@ static int ftrace_startup(struct ftrace_ops *ops, int command)
return ret;
}
 
-   ftrace_hash_rec_enable(ops, 1);
+   if (ftrace_hash_rec_enable(ops, 1))
+   command |= FTRACE_UPDATE_CALLS;
 
ftrace_startup_enable(command);
 
@@ -2701,11 +2701,11 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
 
/* Disabling ipmodify never fails */
ftrace_hash_ipmodify_disable(ops);
-   ftrace_hash_rec_disable(ops, 1);
 
-   ops->flags &= ~FTRACE_OPS_FL_ENABLED;
+   if (ftrace_hash_rec_disable(ops, 1))
+   command |= FTRACE_UPDATE_CALLS;
 
-   command |= FTRACE_UPDATE_CALLS;
+   ops->flags &= ~FTRACE_OPS_FL_ENABLED;
 
if (saved_ftrace_func != ftrace_trace_function) {
saved_ftrace_func = ftrace_trace_function;
-- 
2.4.3


