Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
(12/15/2012 02:08 AM), Steven Rostedt wrote:
> On Fri, 2012-12-07 at 11:07 +0900, Hiraku Toyooka wrote:
> > Hi, Steven,
> >
> > (2012/11/30 23:17), Steven Rostedt wrote:
> > [snip]
> > >
> > > Actually, I would have:
> > >
> > >  status\input |     0      |     1      |  else  |
> > > --------------+------------+------------+--------+
> > > not allocated |(do nothing)| alloc+swap | EINVAL |
> > > --------------+------------+------------+--------+
> > >   allocated  |    free    |    swap    | clear  |
> > > --------------+------------+------------+--------+
> > >
> > > Perhaps we don't need to do the clear on swap, just let the trace
> > > continue where it left off? But in case we should swap...
> > >
> > I think we don't need the clear on swap too.
> > I'll update my patches like this table.
> >
> > > There's a fast way to clear the tracer. Look at what the wakeup tracer
> > > does. We can make that generic. If you want, I can write that code up
> > > too. Hmm, maybe I'll do that, as it will speed things up for
> > > everyone :-)
> > >
>
> BTW, any update on this? I really like to get this into 3.9.

I'm applying your review comments and rebasing now. So, I'll post the
updated patch series in a few days.

Thanks,
Hiraku Toyooka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
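For readers following the thread, the agreed-upon table can be modeled as a small user-space state machine. This is only an illustrative sketch — `snap_write` and the `snap_action` enum are invented names for this example, not the kernel's actual interface:

```c
#include <stdbool.h>

/* Possible outcomes of a write to the snapshot file (hypothetical model). */
enum snap_action {
	SNAP_NOP,        /* do nothing */
	SNAP_ALLOC_SWAP, /* allocate the snapshot buffer, then swap */
	SNAP_FREE,       /* free the snapshot buffer */
	SNAP_SWAP,       /* swap live and snapshot buffers */
	SNAP_CLEAR,      /* clear the snapshot buffer */
	SNAP_EINVAL      /* reject: nothing allocated to clear */
};

/* Dispatch per the table: row = allocation state, column = written value. */
static enum snap_action snap_write(bool allocated, long val)
{
	if (!allocated) {
		if (val == 0)
			return SNAP_NOP;
		if (val == 1)
			return SNAP_ALLOC_SWAP;
		return SNAP_EINVAL;
	}
	if (val == 0)
		return SNAP_FREE;
	if (val == 1)
		return SNAP_SWAP;
	return SNAP_CLEAR;
}
```

Note that per the discussion, echoing 1 into an already-allocated snapshot only swaps (no clear), and echoing 0 into an unallocated snapshot is a no-op rather than an error.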
Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
On Fri, 2012-12-07 at 11:07 +0900, Hiraku Toyooka wrote:
> Hi, Steven,
>
> (2012/11/30 23:17), Steven Rostedt wrote:
> [snip]
> >
> > Actually, I would have:
> >
> >  status\input |     0      |     1      |  else  |
> > --------------+------------+------------+--------+
> > not allocated |(do nothing)| alloc+swap | EINVAL |
> > --------------+------------+------------+--------+
> >   allocated  |    free    |    swap    | clear  |
> > --------------+------------+------------+--------+
> >
> > Perhaps we don't need to do the clear on swap, just let the trace
> > continue where it left off? But in case we should swap...
> >
> I think we don't need the clear on swap too.
> I'll update my patches like this table.
>
> > There's a fast way to clear the tracer. Look at what the wakeup tracer
> > does. We can make that generic. If you want, I can write that code up
> > too. Hmm, maybe I'll do that, as it will speed things up for
> > everyone :-)
> >

BTW, any update on this? I really like to get this into 3.9.

Thanks!

-- Steve
Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
On Fri, 2012-12-07 at 11:07 +0900, Hiraku Toyooka wrote:
> Hi, Steven,
>
> (2012/11/30 23:17), Steven Rostedt wrote:
> [snip]
> >
> > Actually, I would have:
> >
> >  status\input |     0      |     1      |  else  |
> > --------------+------------+------------+--------+
> > not allocated |(do nothing)| alloc+swap | EINVAL |
> > --------------+------------+------------+--------+
> >   allocated  |    free    |    swap    | clear  |
> > --------------+------------+------------+--------+
> >
> > Perhaps we don't need to do the clear on swap, just let the trace
> > continue where it left off? But in case we should swap...
> >
> I think we don't need the clear on swap too.
> I'll update my patches like this table.
>
> > There's a fast way to clear the tracer. Look at what the wakeup tracer
> > does. We can make that generic. If you want, I can write that code up
> > too. Hmm, maybe I'll do that, as it will speed things up for
> > everyone :-)
> >
> (I looked over the wakeup tracer, but I couldn't find that code...)

Heh, sorry, you needed to look at the "update_max_tr()" in
kernel/trace/trace.c, where we update the time_start value. Then the
output skips all timestamps before that start. This is much more
efficient than a 'reset', as we don't need to sync or anything. Just
record the timestamp of where we want to consider the buffer started,
and ignore any event before that.

> I think that seq_read() calls s_stop() even if s_start() failed.
>
> seq_read()@fs/seq_file.c:
>
> 	p = m->op->start(m, &pos);
> 	while (1) {
> 		err = PTR_ERR(p);
> 		if (!p || IS_ERR(p))
> 			break;
> 		...
> 	}
> 	m->op->stop(m, p);
>
> So, I think we need the check in s_stop(), don't we?

Crap, you're right. Hmm, why was I thinking that it didn't? I'd better
go and review some of my recent code to make sure I don't have that
wrong assumption anywhere else.

-- Steve
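The time_start trick Steve describes — record where the buffer should be considered to start, and have the reader discard anything older — can be modeled in a few lines of user-space C. This is illustrative only; `toy_buf`, `buf_clear`, and `buf_count_visible` are made-up names for the sketch, not kernel APIs:

```c
#include <stddef.h>

/* Toy event buffer: events are just timestamps here (illustrative model). */
struct toy_buf {
	unsigned long long events[16];   /* recorded event timestamps */
	size_t n;                        /* number of recorded events */
	unsigned long long time_start;   /* reader ignores events before this */
};

/* "Clearing" only moves time_start forward: O(1), no per-CPU reset or sync. */
static void buf_clear(struct toy_buf *b, unsigned long long now)
{
	b->time_start = now;
}

/* Reader-side filter: count events still considered part of the trace. */
static size_t buf_count_visible(const struct toy_buf *b)
{
	size_t i, vis = 0;

	for (i = 0; i < b->n; i++)
		if (b->events[i] >= b->time_start)
			vis++;
	return vis;
}
```

The point of the design is that the expensive walk happens on the (already slow) read path, while the "clear" itself is a single store — which is why it avoids the multi-millisecond `tracing_reset_online_cpus()` cost discussed earlier in the thread.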
Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
Hi, Steven,

(2012/11/30 23:17), Steven Rostedt wrote:
[snip]
>
> Actually, I would have:
>
>  status\input |     0      |     1      |  else  |
> --------------+------------+------------+--------+
> not allocated |(do nothing)| alloc+swap | EINVAL |
> --------------+------------+------------+--------+
>   allocated  |    free    |    swap    | clear  |
> --------------+------------+------------+--------+
>
> Perhaps we don't need to do the clear on swap, just let the trace
> continue where it left off? But in case we should swap...
>

I think we don't need the clear on swap too.
I'll update my patches like this table.

> There's a fast way to clear the tracer. Look at what the wakeup tracer
> does. We can make that generic. If you want, I can write that code up
> too. Hmm, maybe I'll do that, as it will speed things up for
> everyone :-)
>

(I looked over the wakeup tracer, but I couldn't find that code...)

> >> I think it is almost OK, but there is a problem.
> >> When we echo "1" to the allocated snapshot, the clear operation adds
> >> some delay because the time cost of tracing_reset_online_cpus() is in
> >> proportion to the number of CPUs.
> >> (It takes 72ms in my 8 CPU environment.)
> >>
> >> So, when the snapshot is already cleared by echoing "else" values, we
> >> can avoid the delay on echoing "1" by keeping "cleared" status
> >> internally. For example, we can add the "cleared" flag to struct tracer.
> >> What do you think about it?
> >>
> >> > Also we can add a "trace_snapshot" to the kernel parameters to have it
> >> > allocated on boot. But I can add that if you update these patches.
> >> >
> >> OK, I'll update my patches.
>
> This part (kernel parameter) can be a separate patch.
>

Yes.

> >> > Either test here, or better yet, put the test into
> >> > tracing_reset_online_cpus().
> >> >
> >> > 	if (!buffer)
> >> > 		return;
> >> >
> >> I see. I'll add the test to tracing_reset_online_cpus(). Should I make a
> >> separated patch?
>
> It's a small change, you can add it to your patch or make it separate.
> I'll leave that up to you.
>

I see. Perhaps I'll make it separate.

> >> [snip]
> >> >> +static ssize_t tracing_snapshot_read(struct file *filp, char __user *ubuf,
> >> >> +				     size_t cnt, loff_t *ppos)
> >> >> +{
> >> >> +	ssize_t ret = 0;
> >> >> +
> >> >> +	mutex_lock(&trace_types_lock);
> >> >> +	if (current_trace && current_trace->use_max_tr)
> >> >> +		ret = -EBUSY;
> >> >> +	mutex_unlock(&trace_types_lock);
> >> >
> >> > I don't like this, as it is racy. The current tracer could change after
> >> > the unlock, and you're back to the problem.
> >> >
> >> You're right...
> >> This is racy.
> >>
> >> > Now what we may be able to do, but it would take a little checking for
> >> > lock ordering with trace_access_lock() and trace_event_read_lock(), but
> >> > we could add the mutex to trace_types_lock to both s_start() and
> >> > s_stop() and do the check in s_start() if iter->snapshot is true.
> >> >
> >> If I catch your meaning, do s_start() and s_stop() become like the
> >> following code?
> >> (Now, s_start() is used from two files - "trace" and "snapshot", so we
> >> should change static "old_tracer" to per open-file.)
>
> Actually, let's nuke all the old_tracer totally, and instead do:
>
> 	if (unlikely(strcmp(iter->trace->name, current_trace->name) != 0)) {
>
> You can make this into a separate patch. You can add a check if
> current_trace is not NULL too, but I need to fix that, as current_trace
> should never be NULL (except for very early boot). But don't worry about
> that, I'll handle that.
>

O.K. I'll change all the old_tracer checking to the strcmp.

> Or I can write up this patch and send it to you, and you can include it
> in your series.
>
> >> static void *s_start(struct seq_file *m, loff_t *pos)
> >> {
> >> 	struct trace_iterator *iter = m->private;
> >> -	static struct tracer *old_tracer;
> >> ...
> >> 	/* copy the tracer to avoid using a global lock all around */
> >> 	mutex_lock(&trace_types_lock);
> >> -	if (unlikely(old_tracer != current_trace && current_trace)) {
> >> -		old_tracer = current_trace;
> >> +	if (unlikely(iter->old_tracer != current_trace && current_trace)) {
> >> +		iter->old_tracer = current_trace;
> >> 		*iter->trace = *current_trace;
> >> 	}
> >> 	mutex_unlock(&trace_types_lock);
> >>
> >> +	if (iter->snapshot && iter->trace->use_max_tr)
> >> +		return ERR_PTR(-EBUSY);
> >> +
> >> ...
> >> }
> >>
> >> static void s_stop(struct seq_file *m, void *p)
> >> {
> >> 	struct trace_iterator *iter = m->private;
> >>
> >> +	if (iter->snapshot && iter->trace->use_max_tr)
> >> +		return;
>
> This part shouldn't be needed, as if s_start fails it won't call
> s_stop(). But if you are paranoid (like I am ;-) then we can do:
>
> 	if (WARN_ON_ONCE(iter->snapshot && iter->trace->use_max_tr))
> 		return;

I think that seq_read() calls s_stop() even if s_start() failed.

seq_read()@fs/seq_file.c:

	p = m->op->start(m, &pos);
	while (1) {
		err = PTR_ERR(p);
		if (!p || IS_ERR(p))
			break;
		...
	}
	m->op->stop(m, p);

So, I think we need the check in s_stop(), don't we?

Thanks,
Hiraku Toyooka
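Hiraku's point about fs/seq_file.c can be demonstrated with a stripped-down model of the loop: even when the start callback returns an error pointer, the stop callback still runs. This is a sketch only — the `fake_*` names are invented, and the `IS_ERR()`-style check is approximated in plain C:

```c
#include <stdbool.h>

/* Minimal model of the seq_read() control flow in fs/seq_file.c. */
static bool start_failed;
static bool stop_called;

/* Model of a ->start() that fails, returning an ERR_PTR-like value. */
static void *fake_start(void)
{
	start_failed = true;
	return (void *)-16L;   /* stands in for ERR_PTR(-EBUSY) */
}

static void fake_stop(void *p)
{
	(void)p;
	stop_called = true;
}

static void fake_seq_read(void)
{
	void *p = fake_start();

	while (1) {
		long err = (long)p;

		/* approximation of (!p || IS_ERR(p)) */
		if (!p || (err < 0 && err > -4096))
			break;
		/* ... iterate records ... */
	}
	fake_stop(p);   /* note: runs even though start failed */
}
```

Because `stop` is called unconditionally after the loop, an `s_stop()` paired with a failing `s_start()` does need to tolerate the error case — which is the conclusion the thread reaches.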
Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
On Fri, 2012-11-30 at 14:46 +0900, Hiraku Toyooka wrote:
> Hi, Steven,
>
> Thank you for your review.
>
> (2012/11/16 10:46), Steven Rostedt wrote:
> [snip]
> > I was thinking about this some more, and I don't like the
> > snapshot_allocate part. Also, I think the snapshot should not be
> > allocated by default, and not used until the user explicitly asks for
> > it.
> >
> > Have this:
> >
> > ---
> > # cat snapshot
> > Snapshot is not allocated. To allocate it write the ASCII "1" into
> > this file:
> >
> >   echo 1 > snapshot
> >
> > This will allocate the buffer for you. To free the snapshot echo "0"
> > into this file.
> >
> >   echo "0" > snapshot
> >
> > Anything else will reset the snapshot if it is allocated, or return
> > EINVAL if it is not allocated.
> > ---
> >
>
> Your idea about "snapshot" is like the following table, isn't it?
>
>  status\input |     0      |     1      |  else  |
> --------------+------------+------------+--------+
> not allocated |   EINVAL   | alloc+swap | EINVAL |
> --------------+------------+------------+--------+
>   allocated  |    free    | clear+swap | clear  |
> --------------+------------+------------+--------+

Actually, I would have:

 status\input |     0      |     1      |  else  |
--------------+------------+------------+--------+
not allocated |(do nothing)| alloc+swap | EINVAL |
--------------+------------+------------+--------+
  allocated  |    free    |    swap    | clear  |
--------------+------------+------------+--------+

Perhaps we don't need to do the clear on swap, just let the trace
continue where it left off? But in case we should swap...

There's a fast way to clear the tracer. Look at what the wakeup tracer
does. We can make that generic. If you want, I can write that code up
too. Hmm, maybe I'll do that, as it will speed things up for
everyone :-)

> I think it is almost OK, but there is a problem.
> When we echo "1" to the allocated snapshot, the clear operation adds
> some delay because the time cost of tracing_reset_online_cpus() is in
> proportion to the number of CPUs.
> (It takes 72ms in my 8 CPU environment.)
>
> So, when the snapshot is already cleared by echoing "else" values, we
> can avoid the delay on echoing "1" by keeping "cleared" status
> internally. For example, we can add the "cleared" flag to struct tracer.
> What do you think about it?
>
> > Also we can add a "trace_snapshot" to the kernel parameters to have it
> > allocated on boot. But I can add that if you update these patches.
> >
> OK, I'll update my patches.

This part (kernel parameter) can be a separate patch.

> > Either test here, or better yet, put the test into
> > tracing_reset_online_cpus().
> >
> > 	if (!buffer)
> > 		return;
> >
> I see. I'll add the test to tracing_reset_online_cpus(). Should I make a
> separated patch?

It's a small change, you can add it to your patch or make it separate.
I'll leave that up to you.

> [snip]
> >> +static ssize_t tracing_snapshot_read(struct file *filp, char __user *ubuf,
> >> +				     size_t cnt, loff_t *ppos)
> >> +{
> >> +	ssize_t ret = 0;
> >> +
> >> +	mutex_lock(&trace_types_lock);
> >> +	if (current_trace && current_trace->use_max_tr)
> >> +		ret = -EBUSY;
> >> +	mutex_unlock(&trace_types_lock);
> >
> > I don't like this, as it is racy. The current tracer could change after
> > the unlock, and you're back to the problem.
> >
> You're right...
> This is racy.
>
> > Now what we may be able to do, but it would take a little checking for
> > lock ordering with trace_access_lock() and trace_event_read_lock(), but
> > we could add the mutex to trace_types_lock to both s_start() and
> > s_stop() and do the check in s_start() if iter->snapshot is true.
> >
> If I catch your meaning, do s_start() and s_stop() become like the
> following code?
> (Now, s_start() is used from two files - "trace" and "snapshot", so we
> should change static "old_tracer" to per open-file.)

Actually, let's nuke all the old_tracer totally, and instead do:

	if (unlikely(strcmp(iter->trace->name, current_trace->name) != 0)) {

You can make this into a separate patch. You can add a check if
current_trace is not NULL too, but I need to fix that, as current_trace
should never be NULL (except for very early boot). But don't worry about
that, I'll handle that.

Or I can write up this patch and send it to you, and you can include it
in your series.

> static void *s_start(struct seq_file *m, loff_t *pos)
> {
> 	struct trace_iterator *iter = m->private;
> -	static struct tracer *old_tracer;
> ...
> 	/* copy the tracer to avoid using a global lock all around */
> 	mutex_lock(&trace_types_lock);
> -	if (unlikely(old_tracer != current_trace && current_trace)) {
> -		old_tracer = current_trace;
> +	if (unlikely(iter->old_tracer != current_trace && current_trace)) {
> +		iter->old_tracer = current_trace;
> 		*iter->trace = *current_trace;
> 	}
> 	mutex_unlock(&trace_types_lock);
>
> +	if (iter->snapshot && iter->trace->use_max_tr)
> +		return ERR_PTR(-EBUSY);
> +
> ...
> }
>
> static void s_stop(struct seq_file *m, void *p)
> {
> 	struct trace_iterator *iter = m->private;
>
> +	if (iter->snapshot && iter->trace->use_max_tr)
> +		return;

This part shouldn't be needed, as if s_start fails it won't call
s_stop(). But if you are paranoid (like I am ;-) then we can do:

	if (WARN_ON_ONCE(iter->snapshot && iter->trace->use_max_tr))
		return;

-- Steve
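The replacement Steve suggests — comparing tracer names instead of caching a pointer in a static — can be shown in isolation. The types and the helper name here are hypothetical stand-ins; the real code lives in kernel/trace/trace.c:

```c
#include <stdbool.h>
#include <string.h>

/* Minimal stand-in for the kernel's struct tracer (name field only). */
struct tracer {
	const char *name;
};

/*
 * The iterator's private copy of the tracer is stale when its name no
 * longer matches the current tracer's name. Comparing names rather than
 * a cached pointer works per-open-file without any static state.
 */
static bool iter_trace_stale(const struct tracer *iter_trace,
			     const struct tracer *current_trace)
{
	if (!current_trace)   /* very early boot: nothing to compare against */
		return false;
	return strcmp(iter_trace->name, current_trace->name) != 0;
}
```

When `iter_trace_stale()` reports true, the caller would re-copy `*current_trace` into the iterator under `trace_types_lock`, which is exactly what the `s_start()` hunk in the thread does after the check.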
Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
Hi, Steven,

Thank you for your review.

(2012/11/16 10:46), Steven Rostedt wrote:
[snip]
> I was thinking about this some more, and I don't like the
> snapshot_allocate part. Also, I think the snapshot should not be
> allocated by default, and not used until the user explicitly asks for
> it.
>
> Have this:
>
> ---
> # cat snapshot
> Snapshot is not allocated. To allocate it write the ASCII "1" into
> this file:
>
>   echo 1 > snapshot
>
> This will allocate the buffer for you. To free the snapshot echo "0"
> into this file.
>
>   echo "0" > snapshot
>
> Anything else will reset the snapshot if it is allocated, or return
> EINVAL if it is not allocated.
> ---
>

Your idea about "snapshot" is like the following table, isn't it?

 status\input |     0      |     1      |  else  |
--------------+------------+------------+--------+
not allocated |   EINVAL   | alloc+swap | EINVAL |
--------------+------------+------------+--------+
  allocated  |    free    | clear+swap | clear  |
--------------+------------+------------+--------+

I think it is almost OK, but there is a problem.
When we echo "1" to the allocated snapshot, the clear operation adds
some delay because the time cost of tracing_reset_online_cpus() is in
proportion to the number of CPUs.
(It takes 72ms in my 8 CPU environment.)

So, when the snapshot is already cleared by echoing "else" values, we
can avoid the delay on echoing "1" by keeping "cleared" status
internally. For example, we can add the "cleared" flag to struct tracer.
What do you think about it?

> Also we can add a "trace_snapshot" to the kernel parameters to have it
> allocated on boot. But I can add that if you update these patches.
>

OK, I'll update my patches.

[snip]
>> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
>> index 4cea4f4..73d56d5 100644
>> --- a/kernel/trace/Kconfig
>> +++ b/kernel/trace/Kconfig
>> @@ -102,6 +102,17 @@ config RING_BUFFER_ALLOW_SWAP
>>  	  Allow the use of ring_buffer_swap_cpu.
>>  	  Adds a very slight overhead to tracing when enabled.
>>
>
> Move this config down below FTRACE_SYSCALLS and give it a prompt. As
> well as do not make it default y.
>

I'll modify it.

>> +config TRACER_SNAPSHOT
>> +	bool
>
> 	bool "Create a snapshot trace buffer"
>

I'll fix it.

>> +	default y
>> +	select TRACER_MAX_TRACE
>> +	help
>> +	  Allow tracing users to take snapshot of the current buffer using the
>> +	  ftrace interface, e.g.:
>> +
>> +	    echo 1 > /sys/kernel/debug/tracing/snapshot
>> +	    cat snapshot
>> +

[snip]
>>  static struct trace_iterator *
>> -__tracing_open(struct inode *inode, struct file *file)
>> +__tracing_open(struct inode *inode, struct file *file, int snapshot)
>
> 	bool snapshot
>

I'll fix it.

>>  {
>>  	long cpu_file = (long) inode->i_private;
>>  	struct trace_iterator *iter;
>> @@ -2408,10 +2410,11 @@ __tracing_open(struct inode *inode, struct file *file)
>>  	if (!zalloc_cpumask_var(&iter->started, GFP_KERNEL))
>>  		goto fail;
>>
>> -	if (current_trace && current_trace->print_max)
>> +	if ((current_trace && current_trace->print_max) || snapshot)
>>  		iter->tr = &max_tr;
>>  	else
>>  		iter->tr = &global_trace;
>> +	iter->snapshot = !!snapshot;
>
> Get rid of the !!
>

I'll fix it.

[snip]
>> @@ -2517,7 +2522,7 @@ static int tracing_open(struct inode *inode, struct file *file)
>>  	}
>>
>>  	if (file->f_mode & FMODE_READ) {
>> -		iter = __tracing_open(inode, file);
>> +		iter = __tracing_open(inode, file, 0);
>
> 	, false)
>

I'll fix it.

>>  	if (IS_ERR(iter))
>>  		ret = PTR_ERR(iter);
>>  	else if (trace_flags & TRACE_ITER_LATENCY_FMT)
>> @@ -3186,7 +3191,8 @@ static int tracing_set_tracer(const char *buf)
>>  	trace_branch_disable();
>>  	if (current_trace && current_trace->reset)
>>  		current_trace->reset(tr);
>> -	if (current_trace && current_trace->use_max_tr) {
>> +	if (current_trace && current_trace->allocated_snapshot) {
>> +		tracing_reset_online_cpus(&max_tr);
>
> max_tr->buffer could be NULL.
>
> Either test here, or better yet, put the test into
> tracing_reset_online_cpus().
>
> 	if (!buffer)
> 		return;
>

I see. I'll add the test to tracing_reset_online_cpus(). Should I make a
separated patch?

[snip]
>> +static ssize_t tracing_snapshot_read(struct file *filp, char __user *ubuf,
>> +				     size_t cnt, loff_t *ppos)
>> +{
>> +	ssize_t ret = 0;
>> +
>> +	mutex_lock(&trace_types_lock);
>> +	if (current_trace && current_trace->use_max_tr)
>> +		ret = -EBUSY;
>> +	mutex_unlock(&trace_types_lock);
>
> I don't like this, as it is racy. The current tracer could change after
> the unlock, and you're back to the problem.
>

You're right...
This is racy.

> Now what we may be able to do, but it would take a little checking for
> lock ordering with trace_access_lock() and trace_event_read_lock(), but
> we could add the mutex to trace_types_lock to both s_start() and
> s_stop() and do the check in s_start() if iter->snapshot is true.
>

If I catch your meaning, do s_start() and s_stop() become like the
following code?
(Now, s_start() is used from two files - "trace" and "snapshot", so we
should change static "old_tracer" to per open-file.)

static void *s_start(struct seq_file *m, loff_t *pos)
{
	struct trace_iterator *iter = m->private;
-	static struct tracer *old_tracer;
...
	/* copy the tracer to avoid using a global lock all around */
	mutex_lock(&trace_types_lock);
-	if (unlikely(old_tracer != current_trace && current_trace)) {
-		old_tracer = current_trace;
+	if (unlikely(iter->old_tracer != current_trace && current_trace)) {
+		iter->old_tracer = current_trace;
		*iter->trace = *current_trace;
	}
	mutex_unlock(&trace_types_lock);

+	if (iter->snapshot && iter->trace->use_max_tr)
+		return ERR_PTR(-EBUSY);
+
...
}

static void s_stop(struct seq_file *m, void *p)
{
	struct trace_iterator *iter = m->private;

+	if (iter->snapshot && iter->trace->use_max_tr)
+		return;
...
}

Thanks,
Hiraku Toyooka
Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
Hi, Steven,

Thank you for your review.

(2012/11/16 10:46), Steven Rostedt wrote:
[snip]
> I was thinking about this some more, and I don't like the
> snapshot_allocate part. Also, I think the snapshot should not be
> allocated by default, and not used until the user explicitly asks
> for it. Have this:
>
> ---
> # cat snapshot
> Snapshot is not allocated. To allocate it write the ASCII "1" into
> this file:
>   echo 1 > snapshot
> This will allocate the buffer for you. To free the snapshot echo "0"
> into this file.
>   echo 0 > snapshot
> Anything else will reset the snapshot if it is allocated, or return
> EINVAL if it is not allocated.
> ---

Your idea about the snapshot is like the following table, isn't it?

  status\input |     0      |     1      |    else    |
 --------------+------------+------------+------------+
 not allocated |   EINVAL   | alloc+swap |   EINVAL   |
 --------------+------------+------------+------------+
   allocated   |    free    | clear+swap |   clear    |
 --------------+------------+------------+------------+

I think it is almost OK, but there is a problem. When we echo 1 to the
allocated snapshot, the clear operation adds some delay because the time
cost of tracing_reset_online_cpus() is in proportion to the number of
CPUs. (It takes 72ms in my 8-CPU environment.) So, when the snapshot has
already been cleared by echoing other values, we can avoid the delay on
echoing 1 by keeping the cleared status internally. For example, we
could add a "cleared" flag to struct tracer. What do you think about it?

> Also we can add a "trace_snapshot" to the kernel parameters to have
> it allocated on boot. But I can add that if you update these patches.

OK, I'll update my patches.

[snip]
>> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
>> index 4cea4f4..73d56d5 100644
>> --- a/kernel/trace/Kconfig
>> +++ b/kernel/trace/Kconfig
>> @@ -102,6 +102,17 @@ config RING_BUFFER_ALLOW_SWAP
>>  	  Allow the use of ring_buffer_swap_cpu.
>>  	  Adds a very slight overhead to tracing when enabled.
>
> Move this config down below FTRACE_SYSCALLS and give it a prompt. As
> well as do not make it default y.

I'll modify it.

>> +config TRACER_SNAPSHOT
>> +	bool
>
> 	bool "Create a snapshot trace buffer"

I'll fix it.

>> +	default y
>> +	select TRACER_MAX_TRACE
>> +	help
>> +	  Allow tracing users to take snapshot of the current buffer using the
>> +	  ftrace interface, e.g.:
>> +
>> +	    echo 1 > /sys/kernel/debug/tracing/snapshot
>> +	    cat snapshot
>> +

[snip]
>>  static struct trace_iterator *
>> -__tracing_open(struct inode *inode, struct file *file)
>> +__tracing_open(struct inode *inode, struct file *file, int snapshot)
>
> bool snapshot

I'll fix it.

>>  {
>>  	long cpu_file = (long) inode->i_private;
>>  	struct trace_iterator *iter;
>> @@ -2408,10 +2410,11 @@ __tracing_open(struct inode *inode, struct file *file)
>>  	if (!zalloc_cpumask_var(&iter->started, GFP_KERNEL))
>>  		goto fail;
>>
>> -	if (current_trace && current_trace->print_max)
>> +	if ((current_trace && current_trace->print_max) || snapshot)
>>  		iter->tr = &max_tr;
>>  	else
>>  		iter->tr = &global_trace;
>> +	iter->snapshot = !!snapshot;
>
> Get rid of the !!

I'll fix it.

[snip]
>> @@ -2517,7 +2522,7 @@ static int tracing_open(struct inode *inode, struct file *file)
>>  	}
>>  	if (file->f_mode & FMODE_READ) {
>> -		iter = __tracing_open(inode, file);
>> +		iter = __tracing_open(inode, file, 0);
>
> 		, false)

I'll fix it.

>>  		if (IS_ERR(iter))
>>  			ret = PTR_ERR(iter);
>>  		else if (trace_flags & TRACE_ITER_LATENCY_FMT)
>> @@ -3186,7 +3191,8 @@ static int tracing_set_tracer(const char *buf)
>>  	trace_branch_disable();
>>  	if (current_trace && current_trace->reset)
>>  		current_trace->reset(tr);
>> -	if (current_trace && current_trace->use_max_tr) {
>> +	if (current_trace && current_trace->allocated_snapshot) {
>> +		tracing_reset_online_cpus(&max_tr);
>
> max_tr->buffer could be NULL. Either test here, or better yet, put the
> test into tracing_reset_online_cpus().
>
> 	if (!buffer)
> 		return;

I see. I'll add the test to tracing_reset_online_cpus(). Should I make
a separate patch?

[snip]
>> +static ssize_t tracing_snapshot_read(struct file *filp, char __user *ubuf,
>> +				     size_t cnt, loff_t *ppos)
>> +{
>> +	ssize_t ret = 0;
>> +
>> +	mutex_lock(&trace_types_lock);
>> +	if (current_trace && current_trace->use_max_tr)
>> +		ret = -EBUSY;
>> +	mutex_unlock(&trace_types_lock);
>
> I don't like this, as it is racy. The current tracer could change
> after the unlock, and you're back to the problem.

You're right... This is racy.

> Now what we may be able to do, but it would take a little checking
> for lock ordering with trace_access_lock() and trace_event_read_lock(),
> but we could add the mutex trace_types_lock to both s_start() and
> s_stop() and do the check in s_start() if iter->snapshot is true.

If I catch your meaning, do s_start() and s_stop() become like the
following code? (Now, s_start()
Re: [PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
On Wed, 2012-10-17 at 11:56 +0900, Hiraku Toyooka wrote:
> Ftrace has a snapshot feature available from kernel space and
> latency tracers (e.g. irqsoff) are using it. This patch enables
> user applications to take a snapshot via debugfs.
>
> Add the following two debugfs files in the "tracing" directory.
>
> snapshot:
>   This is used to take a snapshot and to read the output of the
>   snapshot.
>     # echo 1 > snapshot
>     # cat snapshot
>
> snapshot_allocate:
>   Echoing 1 to the "snapshot" allocates the max_tr buffer if
>   it is not allocated. So taking a snapshot may delay or fail
>   because of memory allocation. To avoid that, this file
>   provides a means to pre-allocate (or free) the max_tr buffer.
>     # echo 1 > snapshot_allocate
>     [...]
>     # echo 1 > snapshot

I was thinking about this some more, and I don't like the
snapshot_allocate part. Also, I think the snapshot should not be
allocated by default, and not used until the user explicitly asks
for it. Have this:

---
# cat snapshot
Snapshot is not allocated. To allocate it write the ASCII "1" into
this file:
  echo 1 > snapshot
This will allocate the buffer for you. To free the snapshot echo "0"
into this file.
  echo 0 > snapshot
Anything else will reset the snapshot if it is allocated, or return
EINVAL if it is not allocated.
---

Also we can add a "trace_snapshot" to the kernel parameters to have it
allocated on boot. But I can add that if you update these patches.

> Signed-off-by: Hiraku Toyooka
> Cc: Steven Rostedt
> Cc: Frederic Weisbecker
> Cc: Ingo Molnar
> Cc: Jiri Olsa
> Cc: Li Zefan
> Cc: linux-kernel@vger.kernel.org
> ---
>
>  include/linux/ftrace_event.h |    3 +
>  kernel/trace/Kconfig         |   11 ++
>  kernel/trace/trace.c         |  190 +++---
>  kernel/trace/trace.h         |    1
>  4 files changed, 193 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index 642928c..8c32c6e 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -84,6 +84,9 @@ struct trace_iterator {
>  	long			idx;
>
>  	cpumask_var_t		started;
> +
> +	/* it's true when current open file is snapshot */
> +	bool			snapshot;
>  };
>
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index 4cea4f4..73d56d5 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -102,6 +102,17 @@ config RING_BUFFER_ALLOW_SWAP
>  	  Allow the use of ring_buffer_swap_cpu.
>  	  Adds a very slight overhead to tracing when enabled.

Move this config down below FTRACE_SYSCALLS and give it a prompt. As
well as do not make it default y.

> +config TRACER_SNAPSHOT
> +	bool

	bool "Create a snapshot trace buffer"

> +	default y
> +	select TRACER_MAX_TRACE
> +	help
> +	  Allow tracing users to take snapshot of the current buffer using the
> +	  ftrace interface, e.g.:
> +
> +	    echo 1 > /sys/kernel/debug/tracing/snapshot
> +	    cat snapshot
> +
>  # All tracer options should select GENERIC_TRACER. For those options that are
>  # enabled by all tracers (context switch and event tracer) they select TRACING.
>  # This allows those options to appear when no other tracer is selected.
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index d71eee1..b0d097e 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -699,7 +699,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
>  		return;
>
>  	WARN_ON_ONCE(!irqs_disabled());
> -	if (!current_trace->use_max_tr) {
> +	if (!current_trace->allocated_snapshot) {
>  		WARN_ON_ONCE(1);
>  		return;
>  	}
> @@ -729,7 +729,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
>  		return;
>
>  	WARN_ON_ONCE(!irqs_disabled());
> -	if (!current_trace->use_max_tr) {
> +	if (!current_trace->allocated_snapshot) {
>  		WARN_ON_ONCE(1);
>  		return;
>  	}
> @@ -1902,7 +1902,8 @@ static void *s_start(struct seq_file *m, loff_t *pos)
>  	}
>  	mutex_unlock(&trace_types_lock);
>
> -	atomic_inc(&trace_record_cmdline_disabled);
> +	if (!iter->snapshot)
> +		atomic_inc(&trace_record_cmdline_disabled);
>
>  	if (*pos != iter->pos) {
>  		iter->ent = NULL;
> @@ -1941,7 +1942,8 @@ static void s_stop(struct seq_file *m, void *p)
>  {
>  	struct trace_iterator *iter = m->private;
>
> -	atomic_dec(&trace_record_cmdline_disabled);
> +	if (!iter->snapshot)
> +		atomic_dec(&trace_record_cmdline_disabled);
>  	trace_access_unlock(iter->cpu_file);
>  	trace_event_read_unlock();
>  }
> @@ -2375,7 +2377,7 @@ static const struct seq_operations tracer_seq_ops = {
>  };
>
>  static struct trace_iterator *
[PATCH v2 -tip 3/4] tracing: make a snapshot feature available from userspace
Ftrace has a snapshot feature available from kernel space and
latency tracers (e.g. irqsoff) are using it. This patch enables
user applications to take a snapshot via debugfs.

Add the following two debugfs files in the "tracing" directory.

snapshot:
  This is used to take a snapshot and to read the output of the
  snapshot.
    # echo 1 > snapshot
    # cat snapshot

snapshot_allocate:
  Echoing 1 to the "snapshot" allocates the max_tr buffer if
  it is not allocated. So taking a snapshot may delay or fail
  because of memory allocation. To avoid that, this file
  provides a means to pre-allocate (or free) the max_tr buffer.
    # echo 1 > snapshot_allocate
    [...]
    # echo 1 > snapshot

Signed-off-by: Hiraku Toyooka
Cc: Steven Rostedt
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Li Zefan
Cc: linux-kernel@vger.kernel.org
---

 include/linux/ftrace_event.h |    3 +
 kernel/trace/Kconfig         |   11 ++
 kernel/trace/trace.c         |  190 +++---
 kernel/trace/trace.h         |    1
 4 files changed, 193 insertions(+), 12 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 642928c..8c32c6e 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -84,6 +84,9 @@ struct trace_iterator {
 	long			idx;
 
 	cpumask_var_t		started;
+
+	/* it's true when current open file is snapshot */
+	bool			snapshot;
 };
 
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 4cea4f4..73d56d5 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -102,6 +102,17 @@ config RING_BUFFER_ALLOW_SWAP
 	  Allow the use of ring_buffer_swap_cpu.
 	  Adds a very slight overhead to tracing when enabled.
 
+config TRACER_SNAPSHOT
+	bool
+	default y
+	select TRACER_MAX_TRACE
+	help
+	  Allow tracing users to take snapshot of the current buffer using the
+	  ftrace interface, e.g.:
+
+	    echo 1 > /sys/kernel/debug/tracing/snapshot
+	    cat snapshot
+
 # All tracer options should select GENERIC_TRACER. For those options that are
 # enabled by all tracers (context switch and event tracer) they select TRACING.
 # This allows those options to appear when no other tracer is selected.
 
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d71eee1..b0d097e 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -699,7 +699,7 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
 		return;
 
 	WARN_ON_ONCE(!irqs_disabled());
-	if (!current_trace->use_max_tr) {
+	if (!current_trace->allocated_snapshot) {
 		WARN_ON_ONCE(1);
 		return;
 	}
@@ -729,7 +729,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
 		return;
 
 	WARN_ON_ONCE(!irqs_disabled());
-	if (!current_trace->use_max_tr) {
+	if (!current_trace->allocated_snapshot) {
 		WARN_ON_ONCE(1);
 		return;
 	}
@@ -1902,7 +1902,8 @@ static void *s_start(struct seq_file *m, loff_t *pos)
 	}
 	mutex_unlock(&trace_types_lock);
 
-	atomic_inc(&trace_record_cmdline_disabled);
+	if (!iter->snapshot)
+		atomic_inc(&trace_record_cmdline_disabled);
 
 	if (*pos != iter->pos) {
 		iter->ent = NULL;
@@ -1941,7 +1942,8 @@ static void s_stop(struct seq_file *m, void *p)
 {
 	struct trace_iterator *iter = m->private;
 
-	atomic_dec(&trace_record_cmdline_disabled);
+	if (!iter->snapshot)
+		atomic_dec(&trace_record_cmdline_disabled);
 	trace_access_unlock(iter->cpu_file);
 	trace_event_read_unlock();
 }
@@ -2375,7 +2377,7 @@ static const struct seq_operations tracer_seq_ops = {
 };
 
 static struct trace_iterator *
-__tracing_open(struct inode *inode, struct file *file)
+__tracing_open(struct inode *inode, struct file *file, int snapshot)
 {
 	long cpu_file = (long) inode->i_private;
 	struct trace_iterator *iter;
@@ -2408,10 +2410,11 @@ __tracing_open(struct inode *inode, struct file *file)
 	if (!zalloc_cpumask_var(&iter->started, GFP_KERNEL))
 		goto fail;
 
-	if (current_trace && current_trace->print_max)
+	if ((current_trace && current_trace->print_max) || snapshot)
 		iter->tr = &max_tr;
 	else
 		iter->tr = &global_trace;
+	iter->snapshot = !!snapshot;
 	iter->pos = -1;
 	mutex_init(&iter->mutex);
 	iter->cpu_file = cpu_file;
@@ -2424,8 +2427,9 @@ __tracing_open(struct inode *inode, struct file *file)
 	if (ring_buffer_overruns(iter->tr->buffer))
 		iter->iter_flags |= TRACE_FILE_ANNOTATE;
 
-	/* stop the trace while dumping */
-	tracing_stop();
+	/* stop the trace while dumping if we are not opening "snapshot" */
+	if