Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace

2012-12-17 Thread Hiraku Toyooka

(12/15/2012 02:08 AM), Steven Rostedt wrote:

On Fri, 2012-12-07 at 11:07 +0900, Hiraku Toyooka wrote:

Hi, Steven,

(2012/11/30 23:17), Steven Rostedt wrote:
[snip]
  >
  > Actually, I would have:
  >
  >  status\input  |      0      |      1     |    else    |
  > ---------------+-------------+------------+------------+
  >  not allocated | (do nothing)| alloc+swap |   EINVAL   |
  > ---------------+-------------+------------+------------+
  >    allocated   |     free    |    swap    |   clear    |
  > ---------------+-------------+------------+------------+
  >
  > Perhaps we don't need to do the clear on swap, just let the trace
  > continue where it left off? But in case we should swap...
  >

I think we don't need the clear on swap too.
I'll update my patches like this table.

  > There's a fast way to clear the tracer. Look at what the wakeup tracer
  > does. We can make that generic. If you want, I can write that code up
  > too. Hmm, maybe I'll do that, as it will speed things up for
  > everyone :-)
  >



BTW, any update on this? I'd really like to get this into 3.9.



I'm applying your review comments and rebasing now, so I'll post the
updated patch series in a few days.

Thanks,
Hiraku Toyooka



Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace

2012-12-14 Thread Steven Rostedt
On Fri, 2012-12-07 at 11:07 +0900, Hiraku Toyooka wrote:
> Hi, Steven,
> 
> (2012/11/30 23:17), Steven Rostedt wrote:
> [snip]
>  >
>  > Actually, I would have:
>  >
>  >  status\input  |      0      |      1     |    else    |
>  > ---------------+-------------+------------+------------+
>  >  not allocated | (do nothing)| alloc+swap |   EINVAL   |
>  > ---------------+-------------+------------+------------+
>  >    allocated   |     free    |    swap    |   clear    |
>  > ---------------+-------------+------------+------------+
>  >
>  > Perhaps we don't need to do the clear on swap, just let the trace
>  > continue where it left off? But in case we should swap...
>  >
> 
> I think we don't need the clear on swap too.
> I'll update my patches like this table.
> 
>  > There's a fast way to clear the tracer. Look at what the wakeup tracer
>  > does. We can make that generic. If you want, I can write that code up
>  > too. Hmm, maybe I'll do that, as it will speed things up for
>  > everyone :-)
>  >
> 

BTW, any update on this? I'd really like to get this into 3.9.

Thanks!

-- Steve




Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace

2012-12-07 Thread Steven Rostedt
On Fri, 2012-12-07 at 11:07 +0900, Hiraku Toyooka wrote:
> Hi, Steven,
> 
> (2012/11/30 23:17), Steven Rostedt wrote:
> [snip]
>  >
>  > Actually, I would have:
>  >
>  >  status\input  |      0      |      1     |    else    |
>  > ---------------+-------------+------------+------------+
>  >  not allocated | (do nothing)| alloc+swap |   EINVAL   |
>  > ---------------+-------------+------------+------------+
>  >    allocated   |     free    |    swap    |   clear    |
>  > ---------------+-------------+------------+------------+
>  >
>  > Perhaps we don't need to do the clear on swap, just let the trace
>  > continue where it left off? But in case we should swap...
>  >
> 
> I think we don't need the clear on swap too.
> I'll update my patches like this table.
> 
>  > There's a fast way to clear the tracer. Look at what the wakeup tracer
>  > does. We can make that generic. If you want, I can write that code up
>  > too. Hmm, maybe I'll do that, as it will speed things up for
>  > everyone :-)
>  >
> 
> (I looked over the wakeup tracer, but I couldn't find that code...)

Heh, sorry, you needed to look at the "update_max_tr()" in
kernel/trace/trace.c. Where we update the time_start value. Then the
output skips all timestamps before that start. This is much more
efficient than a 'reset', as we don't need to sync or anything. Just
record the timestamp of where we want to consider the buffer started,
and ignore any event before that.
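
To make that concrete, here is a minimal userspace sketch of the idea; the
time_start variable and the event structure below are illustrative
stand-ins, not the kernel's actual data structures:

#include <stdint.h>
#include <stdio.h>

struct event {
        uint64_t ts;            /* event timestamp */
        const char *msg;
};

/* "Clearing" is just remembering a new start timestamp. */
static uint64_t time_start;

static void fast_clear(uint64_t now)
{
        time_start = now;       /* no per-CPU reset, no synchronization */
}

static void dump_buffer(const struct event *ev, int n)
{
        for (int i = 0; i < n; i++)
                if (ev[i].ts >= time_start)     /* ignore events recorded before the "clear" */
                        printf("%llu: %s\n",
                               (unsigned long long)ev[i].ts, ev[i].msg);
}

int main(void)
{
        struct event buf[] = { { 10, "old" }, { 20, "old" }, { 30, "new" } };

        fast_clear(25);         /* everything older than ts 25 becomes invisible */
        dump_buffer(buf, 3);    /* prints only "30: new" */
        return 0;
}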


> I think that seq_read() calls s_stop() even if s_start() failed.
> 
> seq_read()@fs/seq_file.c:
> 
>  p = m->op->start(m, &pos);
>  while (1) {
>  err = PTR_ERR(p);
>  if (!p || IS_ERR(p))
>  break;
>  ...
>  }
>  m->op->stop(m, p);
> 
> So, I think we need the check in s_stop(), don't we?

Crap, you're right. Hmm, why was I thinking that it didn't. I better go
and review some of my recent code to make sure that I didn't have that
wrong assumption.
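
For what it's worth, a tiny userspace model of that seq_file flow, assuming
it behaves as quoted above (the ERR_PTR()/IS_ERR() macros below are crude
stand-ins for the kernel helpers): because ->stop() runs even after a failed
->start(), the -EBUSY condition has to be checked in both.

#include <errno.h>
#include <stdio.h>

/* crude stand-ins for the kernel's ERR_PTR()/IS_ERR() helpers */
#define ERR_PTR(err)    ((void *)(long)(err))
#define IS_ERR(p)       ((unsigned long)(p) >= (unsigned long)-4095)

static int snapshot_busy = 1;   /* pretend the current tracer already uses max_tr */

static void *s_start(void)
{
        if (snapshot_busy)
                return ERR_PTR(-EBUSY);         /* start fails ... */
        return "iterator";
}

static void s_stop(void *p)
{
        if (snapshot_busy)                      /* ... so stop has to bail out too */
                return;
        printf("releasing %s\n", (const char *)p);
}

int main(void)
{
        /* mirrors the seq_read() flow: ->stop() runs even after a failed ->start() */
        void *p = s_start();

        if (!p || IS_ERR(p))
                printf("start failed: %ld\n", (long)p);
        s_stop(p);
        return 0;
}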

-- Steve




Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace

2012-12-06 Thread Hiraku Toyooka

Hi, Steven,

(2012/11/30 23:17), Steven Rostedt wrote:
[snip]
>
> Actually, I would have:
>
>  status\input  |      0      |      1     |    else    |
> ---------------+-------------+------------+------------+
>  not allocated | (do nothing)| alloc+swap |   EINVAL   |
> ---------------+-------------+------------+------------+
>    allocated   |     free    |    swap    |   clear    |
> ---------------+-------------+------------+------------+
>
> Perhaps we don't need to do the clear on swap, just let the trace
> continue where it left off? But in case we should swap...
>

I think we don't need the clear on swap too.
I'll update my patches like this table.

> There's a fast way to clear the tracer. Look at what the wakeup tracer
> does. We can make that generic. If you want, I can write that code up
> too. Hmm, maybe I'll do that, as it will speed things up for
> everyone :-)
>

(I looked over the wakeup tracer, but I couldn't find that code...)

>
>> I think it is almost OK, but there is a problem.
>> When we echo "1" to the allocated snapshot, the clear operation adds
>> some delay because the time cost of tracing_reset_online_cpus() is in
>> proportion to the number of CPUs.
>> (It takes 72ms in my 8 CPU environment.)
>>
>> So, when the snapshot is already cleared by echoing "else" values, we
>> can avoid the delay on echoing "1" by keeping "cleared" status
>> internally. For example, we can add the "cleared" flag to struct tracer.
>> What do you think about it?
>>
>>  >
>>  > Also we can add a "trace_snapshot" to the kernel parameters to have it
>>  > allocated on boot. But I can add that if you update these patches.
>>  >
>>
>> OK, I'll update my patches.
>
> This part (kernel parameter) can be a separate patch.
>

Yes.

>
>>  > Either test here, or better yet, put the test into
>>  > tracing_reset_online_cpus().
>>  >
>>  > if (!buffer)
>>  > return;
>>  >
>>
>> I see. I'll add the test to tracing_reset_online_cpus(). Should I make a
>> separate patch?
>
> It's a small change, you can add it to your patch or make it separate.
> I'll leave that up to you.
>

I see. Perhaps I'll make it separate.

>> [snip]
>>  >> +static ssize_t tracing_snapshot_read(struct file *filp, char __user *ubuf,
>>  >> + size_t cnt, loff_t *ppos)
>>  >> +{
>>  >> +ssize_t ret = 0;
>>  >> +
>>  >> +mutex_lock(&trace_types_lock);
>>  >> +if (current_trace && current_trace->use_max_tr)
>>  >> +ret = -EBUSY;
>>  >> +mutex_unlock(&trace_types_lock);
>>  >
>>  > I don't like this, as it is racy. The current tracer could change after
>>  > the unlock, and you're back to the problem.
>>  >
>>
>> You're right...
>> This is racy.
>>
>>  > Now what we may be able to do, but it would take a little checking for
>>  > lock ordering with trace_access_lock() and trace_event_read_lock(), but
>>  > we could add the mutex to trace_types_lock to both s_start() and
>>  > s_stop() and do the check in s_start() if iter->snapshot is true.
>>  >
>>
>> If I catch your meaning, do s_start() and s_stop() become like following
>> code?
>> (Now, s_start() is used from two files - "trace" and "snapshot", so we
>> should change static "old_tracer" to per open-file.)
>
> Actually, let's nuke all the old_tracer totally, and instead do:
>
> if (unlikely(strcmp(iter->trace->name, current_trace->name) != 0)) {
>
> You can make this into a separate patch. You can add a check if
> current_trace is not NULL too, but I need to fix that, as current_trace
> should never be NULL (except for very early boot). But don't worry about
> that, I'll handle that.
>

O.K. I'll change all the old_tracer checking to the strcmp.

> Or I can write up this patch and send it to you, and you can include it
> in your series.
>
>> static void *s_start(struct seq_file *m, loff_t *pos)
>> {
>>   struct trace_iterator *iter = m->private;
>> -static struct tracer *old_tracer;
>> ...
>>   /* copy the tracer to avoid using a global lock all around */
>>   mutex_lock(&trace_types_lock);
>> -if (unlikely(old_tracer != current_trace && current_trace)) {
>> -old_tracer = current_trace;
>> +if (unlikely(iter->old_tracer != current_trace && current_trace)) {
>> +iter->old_tracer = current_trace;
>>   *iter->trace = *current_trace;
>>   }
>>   mutex_unlock(&trace_types_lock);
>>
>> +if (iter->snapshot && iter->trace->use_max_tr)
>> +return ERR_PTR(-EBUSY);
>> +
>> ...
>> }
>>
>> static void s_stop(struct seq_file *m, void *p)
>> {
>>   struct trace_iterator *iter = m->private;
>>
>> +if (iter->snapshot && iter->trace->use_max_tr)
>> +return;
>
> This part shouldn't be needed, as if s_start fails it won't call
> s_stop(). But if you are paranoid (like I am ;-) then we can do:
>
> if (WARN_ON_ONCE(iter->snapshot && iter->trace->use_max_tr))
> return;

I think that seq_read() calls s_stop() even if s_start() failed.

seq_read()@fs/seq_file.c:

p = m->op->start(m, &pos);
while (1) {
err = PTR_ERR(p);
if (!p || IS_ERR(p))
break;
...
}
m->op->stop(m, p);

So, I think we need the check in s_stop(), don't we?

Thanks,
Hiraku Toyooka

Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace

2012-11-30 Thread Steven Rostedt
On Fri, 2012-11-30 at 14:46 +0900, Hiraku Toyooka wrote:
> Hi, Steven,
> 
> Thank you for your review.
> 
> (2012/11/16 10:46), Steven Rostedt wrote:
> [snip]
>  > I was thinking about this some more, and I don't like the
>  > snapshot_allocate part. Also, I think the snapshot should not be
>  > allocated by default, and not used until the user explicitly asks for
>  > it.
>  >
>  > Have this:
>  >
>  > ---
>  >  # cat snapshot
>  > Snapshot is not allocated. To allocate it write the ASCII "1" into
>  > this file:
>  >
>  >   echo 1 > snapshot
>  >
>  > This will allocate the buffer for you. To free the snapshot echo "0"
>  > into this file.
>  >
>  >   echo "0" > snapshot
>  >
>  > Anything else will reset the snapshot if it is allocated, or return
>  > EINVAL if it is not allocated.
>  > ---
>  >
> 
> Your idea about "snapshot" is like following table, isn't it?
> 
>  status\input  |      0      |      1     |    else    |
> ---------------+-------------+------------+------------+
>  not allocated |    EINVAL   | alloc+swap |   EINVAL   |
> ---------------+-------------+------------+------------+
>    allocated   |     free    | clear+swap |   clear    |
> ---------------+-------------+------------+------------+

Actually, I would have:

  status\input  |      0      |      1     |    else    |
 ---------------+-------------+------------+------------+
  not allocated | (do nothing)| alloc+swap |   EINVAL   |
 ---------------+-------------+------------+------------+
    allocated   |     free    |    swap    |   clear    |
 ---------------+-------------+------------+------------+

Perhaps we don't need to do the clear on swap, just let the trace
continue where it left off? But in case we should swap...
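
Spelled out, the table is a small state machine. A userspace sketch of what
the write handler's decision logic might look like under that table (the
snapshot_* helper names are invented purely for illustration):

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

static bool allocated;

static int snapshot_alloc(void) { allocated = true;  puts("alloc+swap"); return 0; }
static int snapshot_free(void)  { allocated = false; puts("free");       return 0; }
static int snapshot_swap(void)  { puts("swap");  return 0; }
static int snapshot_clear(void) { puts("clear"); return 0; }

static int snapshot_write(unsigned long val)
{
        switch (val) {
        case 0:
                return allocated ? snapshot_free() : 0; /* not allocated: do nothing */
        case 1:
                return allocated ? snapshot_swap() : snapshot_alloc();
        default:
                return allocated ? snapshot_clear() : -EINVAL;
        }
}

int main(void)
{
        snapshot_write(1);      /* alloc+swap */
        snapshot_write(2);      /* clear */
        snapshot_write(1);      /* swap */
        snapshot_write(0);      /* free */
        return 0;
}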

There's a fast way to clear the tracer. Look at what the wakeup tracer
does. We can make that generic. If you want, I can write that code up
too. Hmm, maybe I'll do that, as it will speed things up for
everyone :-)


> 
> I think it is almost OK, but there is a problem.
> When we echo "1" to the allocated snapshot, the clear operation adds
> some delay because the time cost of tracing_reset_online_cpus() is in
> proportion to the number of CPUs.
> (It takes 72ms in my 8 CPU environment.)
> 
> So, when the snapshot is already cleared by echoing "else" values, we
> can avoid the delay on echoing "1" by keeping "cleared" status
> internally. For example, we can add the "cleared" flag to struct tracer.
> What do you think about it?
> 
>  >
>  > Also we can add a "trace_snapshot" to the kernel parameters to have it
>  > allocated on boot. But I can add that if you update these patches.
>  >
> 
> OK, I'll update my patches.

This part (kernel parameter) can be a separate patch.


>  > Either test here, or better yet, put the test into
>  > tracing_reset_online_cpus().
>  >
>  > if (!buffer)
>  > return;
>  >
> 
> I see. I'll add the test to tracing_reset_online_cpus(). Should I make a
> separated patch?

It's a small change, you can add it to your patch or make it separate.
I'll leave that up to you.

> 
> [snip]
>  >> +static ssize_t tracing_snapshot_read(struct file *filp, char __user *ubuf,
>  >> + size_t cnt, loff_t *ppos)
>  >> +{
>  >> +ssize_t ret = 0;
>  >> +
>  >> +mutex_lock(&trace_types_lock);
>  >> +if (current_trace && current_trace->use_max_tr)
>  >> +ret = -EBUSY;
>  >> +mutex_unlock(&trace_types_lock);
>  >
>  > I don't like this, as it is racy. The current tracer could change after
>  > the unlock, and you're back to the problem.
>  >
> 
> You're right...
> This is racy.
> 
>  > Now what we may be able to do, but it would take a little checking for
>  > lock ordering with trace_access_lock() and trace_event_read_lock(), but
>  > we could add the mutex to trace_types_lock to both s_start() and
>  > s_stop() and do the check in s_start() if iter->snapshot is true.
>  >
> 
> If I catch your meaning, do s_start() and s_stop() become like following
> code?
> (Now, s_start() is used from two files - "trace" and "snapshot", so we
> should change static "old_tracer" to per open-file.)

Actually, let's nuke all the old_tracer totally, and instead do:

if (unlikely(strcmp(iter->trace->name, current_trace->name) != 0)) {

You can make this into a separate patch. You can add a check if
current_trace is not NULL too, but I need to fix that, as current_trace
should never be NULL (except for very early boot). But don't worry about
that, I'll handle that.

Or I can write up this patch and send it to you, and you can include it
in your series.

> 
> static void *s_start(struct seq_file *m, loff_t *pos)
> {
>   struct trace_iterator *iter = m->private;
> -static struct tracer *old_tracer;
> ...
>   /* copy the tracer to avoid using a global lock all around */
>   mutex_lock(&trace_types_lock);
> -if (unlikely(old_tracer != current_trace && current_trace)) {
> -old_tracer = current_trace;
> +if (unlikely(iter->old_tracer != current_trace && current_trace)) {
> + 

Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace

2012-11-30 Thread Steven Rostedt
On Fri, 2012-11-30 at 14:46 +0900, Hiraku Toyooka wrote:
 Hi, Steven,
 
 Thank you for your review.
 
 (2012/11/16 10:46), Steven Rostedt wrote:
 [snip]
   I was thinking about this some more, and I don't like the
   snapshot_allocate part. Also, I think the snapshot should not be
   allocated by default, and not used until the user explicitly asks for
   it.
  
   Have this:
  
   ---
# cat snapshot
   Snapshot is not allocated. To allocate it write the ASCII 1 into
   this file:
  
 echo 1  snapshot
  
   This will allocate the buffer for you. To free the snapshot echo 0
   into this file.
  
 echo 0  snapshot
  
   Anything else will reset the snapshot if it is allocated, or return
   EINVAL if it is not allocated.
   ---
  
 
 Your idea about snapshot is like following table, isn't it?
 
   status\input | 0  | 1  |else|
 --++++
 not allocated |   EINVAL   | alloc+swap |   EINVAL   |
 --++++
allocated   |free| clear+swap |   clear|
 --++++

Actually, I would have:

  status\input | 0  | 1  |else|
 --++++
 not allocated |(do nothing)| alloc+swap |   EINVAL   |
 --++++
   allocated   |free|   swap |   clear|
 --++++

Perhaps we don't need to do the clear on swap, just let the trace
continue where it left off? But in case we should swap...

There's a fast way to clear the tracer. Look at what the wakeup tracer
does. We can make that generic. If you want, I can write that code up
too. Hmm, maybe I'll do that, as it will speed things up for
everyone :-)


 
 I think it is almost OK, but there is a problem.
 When we echo 1 to the allocated snapshot, the clear operation adds
 some delay because the time cost of tracing_reset_online_cpus() is in
 proportion to the number of CPUs.
 (It takes 72ms in my 8 CPU environment.)
 
 So, when the snapshot is already cleared by echoing else values, we
 can avoid the delay on echoing 1 by keeping cleared status
 internally. For example, we can add the cleared flag to struct tracer.
 What do you think about it?
 
  
   Also we can add a trace_snapshot to the kernel parameters to have it
   allocated on boot. But I can add that if you update these patches.
  
 
 OK, I'll update my patches.

This part (kernel parameter) can be a separate patch.


   Either test here, or better yet, put the test into
   tracing_reset_online_cpus().
  
   if (!buffer)
   return;
  
 
 I see. I'll add the test to tracing_reset_online_cpus(). Should I make a
 separated patch?

It's a small change, you can add it to your patch or make it separate.
I'll leave that up to you.

 
 [snip]
   +static ssize_t tracing_snapshot_read(struct file *filp, char __user 
 *ubuf,
   + size_t cnt, loff_t *ppos)
   +{
   +ssize_t ret = 0;
   +
   +mutex_lock(trace_types_lock);
   +if (current_trace  current_trace-use_max_tr)
   +ret = -EBUSY;
   +mutex_unlock(trace_types_lock);
  
   I don't like this, as it is racy. The current tracer could change after
   the unlock, and your back to the problem.
  
 
 You're right...
 This is racy.
 
   Now what we may be able to do, but it would take a little checking for
   lock ordering with trace_access_lock() and trace_event_read_lock(), but
   we could add the mutex to trace_types_lock to both s_start() and
   s_stop() and do the check in s_start() if iter-snapshot is true.
  
 
 If I catch your meaning, do s_start() and s_stop() become like following
 code?
 (Now, s_start() is used from two files - trace and snapshot, so we
 should change static old_tracer to per open-file.)

Actually, lets nuke all the old_tracer totally, and instead do:

if (unlikely(strcmp(iter-trace-name, current_trace-name) != 0)) {

You can make this into a separate patch. You can add a check if
current_trace is not NULL too, but I need to fix that, as current_trace
should never be NULL (except for very early boot). But don't worry about
that, I'll handle that.

Or I can write up this patch and send it to you, and you can include it
in your series.

 
 static void *s_start(struct seq_file *m, loff_t *pos)
 {
   struct trace_iterator *iter = m-private;
 -static struct tracer *old_tracer;
 ...
   /* copy the tracer to avoid using a global lock all around */
   mutex_lock(trace_types_lock);
 -if (unlikely(old_tracer != current_trace  current_trace)) {
 -old_tracer = current_trace;
 +if (unlikely(iter-old_tracer != current_trace  current_trace)) {
 +iter-old_tracer = current_trace;
   *iter-trace = *current_trace;
   }
   mutex_unlock(trace_types_lock);
 
 +if (iter-snapshot  iter-trace-use_max_tr)
 +return 

Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace

2012-11-29 Thread Hiraku Toyooka

Hi, Steven,

Thank you for your review.

(2012/11/16 10:46), Steven Rostedt wrote:
[snip]
> I was thinking about this some more, and I don't like the
> snapshot_allocate part. Also, I think the snapshot should not be
> allocated by default, and not used until the user explicitly asks for
> it.
>
> Have this:
>
> ---
>  # cat snapshot
> Snapshot is not allocated. To allocate it write the ASCII "1" into
> this file:
>
>   echo 1 > snapshot
>
> This will allocate the buffer for you. To free the snapshot echo "0"
> into this file.
>
>   echo "0" > snapshot
>
> Anything else will reset the snapshot if it is allocated, or return
> EINVAL if it is not allocated.
> ---
>

Your idea about "snapshot" is like following table, isn't it?

 status\input  |      0      |      1     |    else    |
---------------+-------------+------------+------------+
 not allocated |    EINVAL   | alloc+swap |   EINVAL   |
---------------+-------------+------------+------------+
   allocated   |     free    | clear+swap |   clear    |
---------------+-------------+------------+------------+

I think it is almost OK, but there is a problem.
When we echo "1" to the allocated snapshot, the clear operation adds
some delay because the time cost of tracing_reset_online_cpus() is in
proportion to the number of CPUs.
(It takes 72ms in my 8 CPU environment.)

So, when the snapshot is already cleared by echoing "else" values, we
can avoid the delay on echoing "1" by keeping "cleared" status
internally. For example, we can add the "cleared" flag to struct tracer.
What do you think about it?
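
As a rough illustration of the idea, a userspace model of the bookkeeping
(the names below are only placeholders; the real change would sit around
tracing_reset_online_cpus() and the swap path):

#include <stdbool.h>
#include <stdio.h>

static bool cleared = true;             /* a freshly allocated buffer is empty */

static void tracing_reset_online_cpus_model(void)
{
        printf("expensive per-CPU reset (~72ms on 8 CPUs)\n");
        cleared = true;
}

static void snapshot_clear(void)        /* echoing "else" values */
{
        if (!cleared)
                tracing_reset_online_cpus_model();
}

static void snapshot_take(void)         /* echoing "1" */
{
        snapshot_clear();               /* cheap if the buffer is still empty */
        printf("swap buffers\n");
        cleared = false;                /* the snapshot now holds trace data */
}

int main(void)
{
        snapshot_take();        /* buffer empty: swap only */
        snapshot_clear();       /* has data: pays the reset once */
        snapshot_take();        /* already cleared: swap only, no delay */
        return 0;
}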

>
> Also we can add a "trace_snapshot" to the kernel parameters to have it
> allocated on boot. But I can add that if you update these patches.
>

OK, I'll update my patches.

[snip]
>> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
>> index 4cea4f4..73d56d5 100644
>> --- a/kernel/trace/Kconfig
>> +++ b/kernel/trace/Kconfig
>> @@ -102,6 +102,17 @@ config RING_BUFFER_ALLOW_SWAP
>>   Allow the use of ring_buffer_swap_cpu.
>>   Adds a very slight overhead to tracing when enabled.
>>
>
> Move this config down below FTRACE_SYSCALLS and give it a prompt. As
> well as do not make it default y.
>

I'll modify it.

>
>> +config TRACER_SNAPSHOT
>> +bool
>
> bool "Create a snapshot trace buffer"
>

I'll fix it.


>
>> +default y
>> +select TRACER_MAX_TRACE
>> +help
>> +  Allow tracing users to take snapshot of the current buffer using the
>> +  ftrace interface, e.g.:
>> +
>> +  echo 1 > /sys/kernel/debug/tracing/snapshot
>> +  cat snapshot
>> +
[snip]
>>  static struct trace_iterator *
>> -__tracing_open(struct inode *inode, struct file *file)
>> +__tracing_open(struct inode *inode, struct file *file, int snapshot)
>
> bool snapshot
>

I'll fix it.


>>  {
>>  long cpu_file = (long) inode->i_private;
>>  struct trace_iterator *iter;
>> @@ -2408,10 +2410,11 @@ __tracing_open(struct inode *inode, struct file *file)
>>  if (!zalloc_cpumask_var(&iter->started, GFP_KERNEL))
>>  goto fail;
>>
>> -if (current_trace && current_trace->print_max)
>> +if ((current_trace && current_trace->print_max) || snapshot)
>>  iter->tr = &max_tr;
>>  else
>>  iter->tr = &global_trace;
>> +iter->snapshot = !!snapshot;
>
> Get rid of the !!
>

I'll fix it.

[snip]
>> @@ -2517,7 +2522,7 @@ static int tracing_open(struct inode *inode, struct file *file)
>>  }
>>
>>  if (file->f_mode & FMODE_READ) {
>> -iter = __tracing_open(inode, file);
>> +iter = __tracing_open(inode, file, 0);
>
> , false)
>

I'll fix it.

>>  if (IS_ERR(iter))
>>  ret = PTR_ERR(iter);
>>  else if (trace_flags & TRACE_ITER_LATENCY_FMT)
>> @@ -3186,7 +3191,8 @@ static int tracing_set_tracer(const char *buf)
>>  trace_branch_disable();
>>  if (current_trace && current_trace->reset)
>>  current_trace->reset(tr);
>> -if (current_trace && current_trace->use_max_tr) {
>> +if (current_trace && current_trace->allocated_snapshot) {
>> +tracing_reset_online_cpus(&max_tr);
>
> max_tr->buffer could be NULL.
>
> Either test here, or better yet, put the test into
> tracing_reset_online_cpus().
>
> if (!buffer)
> return;
>

I see. I'll add the test to tracing_reset_online_cpus(). Should I make a
separate patch?

[snip]
>> +static ssize_t tracing_snapshot_read(struct file *filp, char __user *ubuf,
>> + size_t cnt, loff_t *ppos)
>> +{
>> +ssize_t ret = 0;
>> +
>> +mutex_lock(&trace_types_lock);
>> +if (current_trace && current_trace->use_max_tr)
>> +ret = -EBUSY;
>> +mutex_unlock(&trace_types_lock);
>
> I don't like this, as it is racy. The current tracer could change after
>  > the unlock, and you're back to the problem.
>

You're right...
This is racy.

> Now what we may be able to do, but it would take a little checking for
> lock ordering with trace_access_lock() and trace_event_read_lock(), but
> we could add the mutex 

Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace

2012-11-15 Thread Steven Rostedt
On Wed, 2012-10-17 at 11:56 +0900, Hiraku Toyooka wrote:
> Ftrace has a snapshot feature available from kernel space and
> latency tracers (e.g. irqsoff) are using it. This patch enables
> user applications to take a snapshot via debugfs.
> 
> Add following two debugfs files in "tracing" directory.
> 
>   snapshot:
> This is used to take a snapshot and to read the output of the
> snapshot.
>  # echo 1 > snapshot
>  # cat snapshot
> 
>   snapshot_allocate:
> Echoing 1 to the "snapshot" allocates the max_tr buffer if
> it is not allocated. So taking a snapshot may be delayed or may fail
> because of memory allocation. To avoid that, this file
> provides a means to pre-allocate (or free) the max_tr buffer.
>  # echo 1 > snapshot_allocate
>  [...]
>  # echo 1 > snapshot

I was thinking about this some more, and I don't like the
snapshot_allocate part. Also, I think the snapshot should not be
allocated by default, and not used until the user explicitly asks for
it.

Have this:

---
 # cat snapshot
Snapshot is not allocated. To allocate it write the ASCII "1" into
this file:

  echo 1 > snapshot

This will allocate the buffer for you. To free the snapshot echo "0"
into this file.

  echo "0" > snapshot

Anything else will reset the snapshot if it is allocated, or return
EINVAL if it is not allocated.
---


Also we can add a "trace_snapshot" to the kernel parameters to have it
allocated on boot. But I can add that if you update these patches.



> 
> Signed-off-by: Hiraku Toyooka 
> Cc: Steven Rostedt 
> Cc: Frederic Weisbecker 
> Cc: Ingo Molnar 
> Cc: Jiri Olsa 
> Cc: Li Zefan 
> Cc: linux-kernel@vger.kernel.org
> ---
> 
>  include/linux/ftrace_event.h |3 +
>  kernel/trace/Kconfig |   11 ++
>  kernel/trace/trace.c |  190 
> +++---
>  kernel/trace/trace.h |1 
>  4 files changed, 193 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index 642928c..8c32c6e 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -84,6 +84,9 @@ struct trace_iterator {
>   long idx;
>  
>   cpumask_var_t   started;
> +
> + /* it's true when current open file is snapshot */
> + bool snapshot;
>  };
>  
> 
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index 4cea4f4..73d56d5 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -102,6 +102,17 @@ config RING_BUFFER_ALLOW_SWAP
>Allow the use of ring_buffer_swap_cpu.
>Adds a very slight overhead to tracing when enabled.
>  

Move this config down below FTRACE_SYSCALLS and give it a prompt. As
well as do not make it default y.


> +config TRACER_SNAPSHOT
> + bool

bool "Create a snapshot trace buffer"


> + default y
> + select TRACER_MAX_TRACE
> + help
> +   Allow tracing users to take snapshot of the current buffer using the
> +   ftrace interface, e.g.:
> +
> +   echo 1 > /sys/kernel/debug/tracing/snapshot
> +   cat snapshot
> +
>  # All tracer options should select GENERIC_TRACER. For those options that are
>  # enabled by all tracers (context switch and event tracer) they select 
> TRACING.
>  # This allows those options to appear when no other tracer is selected. But 
> the
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index d71eee1..b0d097e 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -699,7 +699,7 @@ update_max_tr(struct trace_array *tr, struct task_struct 
> *tsk, int cpu)
>   return;
>  
>   WARN_ON_ONCE(!irqs_disabled());
> - if (!current_trace->use_max_tr) {
> + if (!current_trace->allocated_snapshot) {
>   WARN_ON_ONCE(1);
>   return;
>   }
> @@ -729,7 +729,7 @@ update_max_tr_single(struct trace_array *tr, struct 
> task_struct *tsk, int cpu)
>   return;
>  
>   WARN_ON_ONCE(!irqs_disabled());
> - if (!current_trace->use_max_tr) {
> + if (!current_trace->allocated_snapshot) {
>   WARN_ON_ONCE(1);
>   return;
>   }
> @@ -1902,7 +1902,8 @@ static void *s_start(struct seq_file *m, loff_t *pos)
>   }
>   mutex_unlock(&trace_types_lock);
>
> - atomic_inc(&trace_record_cmdline_disabled);
> + if (!iter->snapshot)
> + atomic_inc(&trace_record_cmdline_disabled);
>  
>   if (*pos != iter->pos) {
>   iter->ent = NULL;
> @@ -1941,7 +1942,8 @@ static void s_stop(struct seq_file *m, void *p)
>  {
>   struct trace_iterator *iter = m->private;
>  
> - atomic_dec(&trace_record_cmdline_disabled);
> + if (!iter->snapshot)
> + atomic_dec(&trace_record_cmdline_disabled);
>   trace_access_unlock(iter->cpu_file);
>   trace_event_read_unlock();
>  }
> @@ -2375,7 +2377,7 @@ static const struct seq_operations tracer_seq_ops = {
>  };
>  
>  static struct trace_iterator *

[PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace

2012-10-16 Thread Hiraku Toyooka
Ftrace has a snapshot feature available from kernel space and
latency tracers (e.g. irqsoff) are using it. This patch enables
user applications to take a snapshot via debugfs.

Add following two debugfs files in "tracing" directory.

  snapshot:
This is used to take a snapshot and to read the output of the
snapshot.
 # echo 1 > snapshot
 # cat snapshot

  snapshot_allocate:
Echoing 1 to the "snapshot" allocates the max_tr buffer if
it is not allocated. So taking a snapshot may be delayed or may fail
because of memory allocation. To avoid that, this file
provides a means to pre-allocate (or free) the max_tr buffer.
 # echo 1 > snapshot_allocate
 [...]
 # echo 1 > snapshot
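
For example, a user program can drive these files with plain file I/O. A
minimal sketch, assuming debugfs is mounted at /sys/kernel/debug and with
error handling trimmed:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define TRACING_DIR "/sys/kernel/debug/tracing/"

static void write_str(const char *file, const char *val)
{
        int fd = open(file, O_WRONLY);

        if (fd < 0) {
                perror(file);
                return;
        }
        if (write(fd, val, 1) < 0)
                perror(file);
        close(fd);
}

int main(void)
{
        char buf[4096];
        ssize_t n;
        int fd;

        write_str(TRACING_DIR "snapshot_allocate", "1");  /* pre-allocate max_tr */
        write_str(TRACING_DIR "snapshot", "1");            /* take the snapshot */

        fd = open(TRACING_DIR "snapshot", O_RDONLY);       /* read it back */
        if (fd < 0) {
                perror("snapshot");
                return 1;
        }
        while ((n = read(fd, buf, sizeof(buf))) > 0)
                fwrite(buf, 1, n, stdout);
        close(fd);
        return 0;
}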

Signed-off-by: Hiraku Toyooka 
Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
Cc: Jiri Olsa 
Cc: Li Zefan 
Cc: linux-kernel@vger.kernel.org
---

 include/linux/ftrace_event.h |3 +
 kernel/trace/Kconfig |   11 ++
 kernel/trace/trace.c |  190 +++---
 kernel/trace/trace.h |1 
 4 files changed, 193 insertions(+), 12 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 642928c..8c32c6e 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -84,6 +84,9 @@ struct trace_iterator {
long idx;
 
cpumask_var_t   started;
+
+   /* it's true when current open file is snapshot */
+   bool snapshot;
 };
 
 
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 4cea4f4..73d56d5 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -102,6 +102,17 @@ config RING_BUFFER_ALLOW_SWAP
 Allow the use of ring_buffer_swap_cpu.
 Adds a very slight overhead to tracing when enabled.
 
+config TRACER_SNAPSHOT
+   bool
+   default y
+   select TRACER_MAX_TRACE
+   help
+ Allow tracing users to take snapshot of the current buffer using the
+ ftrace interface, e.g.:
+
+ echo 1 > /sys/kernel/debug/tracing/snapshot
+ cat snapshot
+
 # All tracer options should select GENERIC_TRACER. For those options that are
 # enabled by all tracers (context switch and event tracer) they select TRACING.
 # This allows those options to appear when no other tracer is selected. But the
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d71eee1..b0d097e 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -699,7 +699,7 @@ update_max_tr(struct trace_array *tr, struct task_struct 
*tsk, int cpu)
return;
 
WARN_ON_ONCE(!irqs_disabled());
-   if (!current_trace->use_max_tr) {
+   if (!current_trace->allocated_snapshot) {
WARN_ON_ONCE(1);
return;
}
@@ -729,7 +729,7 @@ update_max_tr_single(struct trace_array *tr, struct 
task_struct *tsk, int cpu)
return;
 
WARN_ON_ONCE(!irqs_disabled());
-   if (!current_trace->use_max_tr) {
+   if (!current_trace->allocated_snapshot) {
WARN_ON_ONCE(1);
return;
}
@@ -1902,7 +1902,8 @@ static void *s_start(struct seq_file *m, loff_t *pos)
}
mutex_unlock(&trace_types_lock);

-   atomic_inc(&trace_record_cmdline_disabled);
+   if (!iter->snapshot)
+   atomic_inc(&trace_record_cmdline_disabled);
 
if (*pos != iter->pos) {
iter->ent = NULL;
@@ -1941,7 +1942,8 @@ static void s_stop(struct seq_file *m, void *p)
 {
struct trace_iterator *iter = m->private;
 
-   atomic_dec(&trace_record_cmdline_disabled);
+   if (!iter->snapshot)
+   atomic_dec(&trace_record_cmdline_disabled);
trace_access_unlock(iter->cpu_file);
trace_event_read_unlock();
 }
@@ -2375,7 +2377,7 @@ static const struct seq_operations tracer_seq_ops = {
 };
 
 static struct trace_iterator *
-__tracing_open(struct inode *inode, struct file *file)
+__tracing_open(struct inode *inode, struct file *file, int snapshot)
 {
long cpu_file = (long) inode->i_private;
struct trace_iterator *iter;
@@ -2408,10 +2410,11 @@ __tracing_open(struct inode *inode, struct file *file)
if (!zalloc_cpumask_var(&iter->started, GFP_KERNEL))
goto fail;
 
-   if (current_trace && current_trace->print_max)
+   if ((current_trace && current_trace->print_max) || snapshot)
iter->tr = &max_tr;
else
iter->tr = &global_trace;
+   iter->snapshot = !!snapshot;
iter->pos = -1;
mutex_init(>mutex);
iter->cpu_file = cpu_file;
@@ -2424,8 +2427,9 @@ __tracing_open(struct inode *inode, struct file *file)
if (ring_buffer_overruns(iter->tr->buffer))
iter->iter_flags |= TRACE_FILE_ANNOTATE;
 
-   /* stop the trace while dumping */
-   tracing_stop();
+   /* stop the trace while dumping if we are not opening "snapshot" */
+   if 

[PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace

2012-10-16 Thread Hiraku Toyooka
Ftrace has a snapshot feature available from kernel space and
latency tracers (e.g. irqsoff) are using it. This patch enables
user applictions to take a snapshot via debugfs.

Add following two debugfs files in tracing directory.

  snapshot:
This is used to take a snapshot and to read the output of the
snapshot.
 # echo 1  snapshot
 # cat snapshot

  snapshot_allocate:
Echoing 1 to the snapshot allocates the max_tr buffer if
it is not allocated. So taking a snapshot may delay or fail
beacuse of memory allocation. To avoid that, this file
provides a mean to pre-allocate (or free) the max_tr buffer.
 # echo 1  snapshot_allocate
 [...]
 # echo 1  snapshot

Signed-off-by: Hiraku Toyooka hiraku.toyooka...@hitachi.com
Cc: Steven Rostedt rost...@goodmis.org
Cc: Frederic Weisbecker fweis...@gmail.com
Cc: Ingo Molnar mi...@redhat.com
Cc: Jiri Olsa jo...@redhat.com
Cc: Li Zefan l...@cn.fujitsu.com
Cc: linux-kernel@vger.kernel.org
---

 include/linux/ftrace_event.h |3 +
 kernel/trace/Kconfig |   11 ++
 kernel/trace/trace.c |  190 +++---
 kernel/trace/trace.h |1 
 4 files changed, 193 insertions(+), 12 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 642928c..8c32c6e 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -84,6 +84,9 @@ struct trace_iterator {
longidx;
 
cpumask_var_t   started;
+
+   /* it's true when current open file is snapshot */
+   boolsnapshot;
 };
 
 
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 4cea4f4..73d56d5 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -102,6 +102,17 @@ config RING_BUFFER_ALLOW_SWAP
 Allow the use of ring_buffer_swap_cpu.
 Adds a very slight overhead to tracing when enabled.
 
+config TRACER_SNAPSHOT
+   bool
+   default y
+   select TRACER_MAX_TRACE
+   help
+ Allow tracing users to take snapshot of the current buffer using the
+ ftrace interface, e.g.:
+
+ echo 1  /sys/kernel/debug/tracing/snapshot
+ cat snapshot
+
 # All tracer options should select GENERIC_TRACER. For those options that are
 # enabled by all tracers (context switch and event tracer) they select TRACING.
 # This allows those options to appear when no other tracer is selected. But the
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d71eee1..b0d097e 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -699,7 +699,7 @@ update_max_tr(struct trace_array *tr, struct task_struct 
*tsk, int cpu)
return;
 
WARN_ON_ONCE(!irqs_disabled());
-   if (!current_trace-use_max_tr) {
+   if (!current_trace-allocated_snapshot) {
WARN_ON_ONCE(1);
return;
}
@@ -729,7 +729,7 @@ update_max_tr_single(struct trace_array *tr, struct 
task_struct *tsk, int cpu)
return;
 
WARN_ON_ONCE(!irqs_disabled());
-   if (!current_trace-use_max_tr) {
+   if (!current_trace-allocated_snapshot) {
WARN_ON_ONCE(1);
return;
}
@@ -1902,7 +1902,8 @@ static void *s_start(struct seq_file *m, loff_t *pos)
}
mutex_unlock(trace_types_lock);
 
-   atomic_inc(trace_record_cmdline_disabled);
+   if (!iter-snapshot)
+   atomic_inc(trace_record_cmdline_disabled);
 
if (*pos != iter-pos) {
iter-ent = NULL;
@@ -1941,7 +1942,8 @@ static void s_stop(struct seq_file *m, void *p)
 {
struct trace_iterator *iter = m-private;
 
-   atomic_dec(trace_record_cmdline_disabled);
+   if (!iter-snapshot)
+   atomic_dec(trace_record_cmdline_disabled);
trace_access_unlock(iter-cpu_file);
trace_event_read_unlock();
 }
@@ -2375,7 +2377,7 @@ static const struct seq_operations tracer_seq_ops = {
 };
 
 static struct trace_iterator *
-__tracing_open(struct inode *inode, struct file *file)
+__tracing_open(struct inode *inode, struct file *file, int snapshot)
 {
long cpu_file = (long) inode-i_private;
struct trace_iterator *iter;
@@ -2408,10 +2410,11 @@ __tracing_open(struct inode *inode, struct file *file)
if (!zalloc_cpumask_var(iter-started, GFP_KERNEL))
goto fail;
 
-   if (current_trace  current_trace-print_max)
+   if ((current_trace  current_trace-print_max) || snapshot)
iter-tr = max_tr;
else
iter-tr = global_trace;
+   iter-snapshot = !!snapshot;
iter-pos = -1;
mutex_init(iter-mutex);
iter-cpu_file = cpu_file;
@@ -2424,8 +2427,9 @@ __tracing_open(struct inode *inode, struct file *file)
if (ring_buffer_overruns(iter-tr-buffer))
iter-iter_flags |= TRACE_FILE_ANNOTATE;
 
-   /* stop the trace while dumping */
-