Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-19 Thread Steven Rostedt
On Thu, 19 Apr 2018 09:40:33 +0900
Namhyung Kim  wrote:

> > I totally agree. I just want to make sure that whatever we come up with
> > will be well thought out. Perhaps we can wait till November to talk
> > about it.  
> 
> I'm not sure I can go to LPC this year, but definitely interested in
> improving error logging for tracing.

Feel free to send any ideas to us that you would like for us to add to
this discussion.

-- Steve


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-19 Thread Steven Rostedt
On Thu, 19 Apr 2018 09:40:33 +0900
Namhyung Kim  wrote:

> > I totally agree. I just want to make sure that whatever we come up with
> > will be well thought out. Perhaps we can wait till November to talk
> > about it.  
> 
> I'm not sure I can go to LPC this year, but definitely interested in
> improving error logging for tracing.

Feel free to send any ideas to us that you would like for us to add to
this discussion.

-- Steve


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-18 Thread Namhyung Kim
Hi guys, :)

On Wed, Apr 18, 2018 at 09:49:24AM -0400, Steven Rostedt wrote:
> On Wed, 18 Apr 2018 18:34:34 +0900
> Masami Hiramatsu  wrote:
> 
> > On Fri, 13 Apr 2018 10:44:32 -0400
> > Steven Rostedt  wrote:
> > 
> > > On Fri, 13 Apr 2018 09:24:34 -0500
> > > Tom Zanussi  wrote:
> > >   
> > > > Yeah, I agree - I'd rather get it right than get it in now.  I thought
> > > > this made sense, and was based on input from Masami, which I may have
> > > > misinterpreted, but I'll wait for some more ideas about the best way to
> > > > do this.  
> > > 
> > > Too bad we are not closer to November, as this would actually be a good
> > > Plumbers topic. Maybe it's not that important and we should wait until
> > > then. I'd like to get some brain storming ideas out before we decide on
> > > anything, and this is something I believe is better done face to face
> > > than over email.  
> > 
> > OK, sounds good for me too :)
> > My point was that printk buffer is not good place for the parser error
> > of ftrace, nor each sub-features (like hist, trigger, probe_events etc.) 
> > has different place to show it. I just want to unify the user experience
> > over the ftrace UI.
> 
> I totally agree. I just want to make sure that whatever we come up with
> will be well thought out. Perhaps we can wait till November to talk
> about it.

I'm not sure I can go to LPC this year, but definitely interested in
improving error logging for tracing.

Thanks,
Namhyung


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-18 Thread Namhyung Kim
Hi guys, :)

On Wed, Apr 18, 2018 at 09:49:24AM -0400, Steven Rostedt wrote:
> On Wed, 18 Apr 2018 18:34:34 +0900
> Masami Hiramatsu  wrote:
> 
> > On Fri, 13 Apr 2018 10:44:32 -0400
> > Steven Rostedt  wrote:
> > 
> > > On Fri, 13 Apr 2018 09:24:34 -0500
> > > Tom Zanussi  wrote:
> > >   
> > > > Yeah, I agree - I'd rather get it right than get it in now.  I thought
> > > > this made sense, and was based on input from Masami, which I may have
> > > > misinterpreted, but I'll wait for some more ideas about the best way to
> > > > do this.  
> > > 
> > > Too bad we are not closer to November, as this would actually be a good
> > > Plumbers topic. Maybe it's not that important and we should wait until
> > > then. I'd like to get some brain storming ideas out before we decide on
> > > anything, and this is something I believe is better done face to face
> > > than over email.  
> > 
> > OK, sounds good for me too :)
> > My point was that printk buffer is not good place for the parser error
> > of ftrace, nor each sub-features (like hist, trigger, probe_events etc.) 
> > has different place to show it. I just want to unify the user experience
> > over the ftrace UI.
> 
> I totally agree. I just want to make sure that whatever we come up with
> will be well thought out. Perhaps we can wait till November to talk
> about it.

I'm not sure I can go to LPC this year, but definitely interested in
improving error logging for tracing.

Thanks,
Namhyung


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-18 Thread Steven Rostedt
On Wed, 18 Apr 2018 18:34:34 +0900
Masami Hiramatsu  wrote:

> On Fri, 13 Apr 2018 10:44:32 -0400
> Steven Rostedt  wrote:
> 
> > On Fri, 13 Apr 2018 09:24:34 -0500
> > Tom Zanussi  wrote:
> >   
> > > Yeah, I agree - I'd rather get it right than get it in now.  I thought
> > > this made sense, and was based on input from Masami, which I may have
> > > misinterpreted, but I'll wait for some more ideas about the best way to
> > > do this.  
> > 
> > Too bad we are not closer to November, as this would actually be a good
> > Plumbers topic. Maybe it's not that important and we should wait until
> > then. I'd like to get some brain storming ideas out before we decide on
> > anything, and this is something I believe is better done face to face
> > than over email.  
> 
> OK, sounds good for me too :)
> My point was that printk buffer is not good place for the parser error
> of ftrace, nor each sub-features (like hist, trigger, probe_events etc.) 
> has different place to show it. I just want to unify the user experience
> over the ftrace UI.

I totally agree. I just want to make sure that whatever we come up with
will be well thought out. Perhaps we can wait till November to talk
about it.

-- Steve


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-18 Thread Steven Rostedt
On Wed, 18 Apr 2018 18:34:34 +0900
Masami Hiramatsu  wrote:

> On Fri, 13 Apr 2018 10:44:32 -0400
> Steven Rostedt  wrote:
> 
> > On Fri, 13 Apr 2018 09:24:34 -0500
> > Tom Zanussi  wrote:
> >   
> > > Yeah, I agree - I'd rather get it right than get it in now.  I thought
> > > this made sense, and was based on input from Masami, which I may have
> > > misinterpreted, but I'll wait for some more ideas about the best way to
> > > do this.  
> > 
> > Too bad we are not closer to November, as this would actually be a good
> > Plumbers topic. Maybe it's not that important and we should wait until
> > then. I'd like to get some brain storming ideas out before we decide on
> > anything, and this is something I believe is better done face to face
> > than over email.  
> 
> OK, sounds good for me too :)
> My point was that printk buffer is not good place for the parser error
> of ftrace, nor each sub-features (like hist, trigger, probe_events etc.) 
> has different place to show it. I just want to unify the user experience
> over the ftrace UI.

I totally agree. I just want to make sure that whatever we come up with
will be well thought out. Perhaps we can wait till November to talk
about it.

-- Steve


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-18 Thread Masami Hiramatsu
On Fri, 13 Apr 2018 10:44:32 -0400
Steven Rostedt  wrote:

> On Fri, 13 Apr 2018 09:24:34 -0500
> Tom Zanussi  wrote:
> 
> > Yeah, I agree - I'd rather get it right than get it in now.  I thought
> > this made sense, and was based on input from Masami, which I may have
> > misinterpreted, but I'll wait for some more ideas about the best way to
> > do this.
> 
> Too bad we are not closer to November, as this would actually be a good
> Plumbers topic. Maybe it's not that important and we should wait until
> then. I'd like to get some brain storming ideas out before we decide on
> anything, and this is something I believe is better done face to face
> than over email.

OK, sounds good for me too :)
My point was that printk buffer is not good place for the parser error
of ftrace, nor each sub-features (like hist, trigger, probe_events etc.) 
has different place to show it. I just want to unify the user experience
over the ftrace UI.

Thanks,

-- 
Masami Hiramatsu 


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-18 Thread Masami Hiramatsu
On Fri, 13 Apr 2018 10:44:32 -0400
Steven Rostedt  wrote:

> On Fri, 13 Apr 2018 09:24:34 -0500
> Tom Zanussi  wrote:
> 
> > Yeah, I agree - I'd rather get it right than get it in now.  I thought
> > this made sense, and was based on input from Masami, which I may have
> > misinterpreted, but I'll wait for some more ideas about the best way to
> > do this.
> 
> Too bad we are not closer to November, as this would actually be a good
> Plumbers topic. Maybe it's not that important and we should wait until
> then. I'd like to get some brain storming ideas out before we decide on
> anything, and this is something I believe is better done face to face
> than over email.

OK, sounds good for me too :)
My point was that printk buffer is not good place for the parser error
of ftrace, nor each sub-features (like hist, trigger, probe_events etc.) 
has different place to show it. I just want to unify the user experience
over the ftrace UI.

Thanks,

-- 
Masami Hiramatsu 


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-14 Thread Masami Hiramatsu
On Thu, 12 Apr 2018 18:52:13 -0500
Tom Zanussi  wrote:

> Hi Steve,
> 
> On Thu, 2018-04-12 at 18:20 -0400, Steven Rostedt wrote:
> > On Thu, 12 Apr 2018 10:13:17 -0500
> > Tom Zanussi  wrote:
> > 
> > > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > > index 6fb46a0..f2dc7e6 100644
> > > --- a/kernel/trace/trace.h
> > > +++ b/kernel/trace/trace.h
> > > @@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct file 
> > > *file,
> > >   const char __user *buffer, size_t count, loff_t *ppos,
> > >   int (*createfn)(int, char**));
> > >  
> > > +extern void event_log_err(const char *loc, const char *cmd, const char 
> > > *fmt,
> > > +   ...);
> > > +
> > >  /*
> > >   * Normal trace_printk() and friends allocates special buffers
> > >   * to do the manipulation, as well as saves the print formats
> > > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > > index 05c7172..fd02e22 100644
> > > --- a/kernel/trace/trace_events.c
> > > +++ b/kernel/trace/trace_events.c
> > > @@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
> > >   return ret;
> > >  }
> > >  
> > > +#define EVENT_LOG_ERRS_MAX   (PAGE_SIZE / sizeof(struct 
> > > event_log_err))
> > 
> > > +#define EVENT_ERR_LOG_MASK   (EVENT_LOG_ERRS_MAX - 1)
> > 
> > BTW, the above only works if EVENT_LOG_ERRS_MAX is a power of two,
> > which it's not guaranteed to be.
> > 
> 
> My assumption was that we'd only ever need a page or two for the
> error_log and so would always would be a power of two, since the size of
> the struct event_log_err is 512.
> 
> Anyway, I should probably have put comments about all this in the code,
> and I will, but the way it works kind of assumes a very small number of
> errors - it's replacing a simple 'last error' facility for the hist
> triggers and making it a common facility for other things that have
> similar needs like Masami's kprobe_events errors.  For those purposes, I
> assumed it would suffice to simply be able to show that last 8 or some
> similar small number of errors and constantly recycle the slots.

Correct. I don't think the error log entry size over 16, it is too much
errors occur. User must check it before that. Or, we should push it
printk buffer.

> 
> Basically it just splits the page into 16 strings, 2 per error, one for
> the actual error text, the other for the command the user entered.  The
> struct event_log_err just overlays a struct on top of 2 strings just to
> make it easier to manage.
> 
> Anyway, because it is such a small number, and we start with a zeroed
> page, whenever we print the error log, we print all 16 strings even if
> we only have one error (2 strings).  The rest are NULL and print
> nothing.  We start with the tail, which could also be thought of as the
> 'oldest' or the 'first' error in the buffer and just cycle through them
> all.  Hope that clears up some of the other questions you had about how
> a non-full log gets printed, etc... 
> 
> > > +
> > > +struct event_log_err {
> > > + charerr[MAX_FILTER_STR_VAL];
> > > + charcmd[MAX_FILTER_STR_VAL];
> > > +};
> > 
> > I like the event_log_err idea, but the above can be shrunk to:
> > 
> > struct err_info {
> > u8  type; /* I can only imagine 254 types */
> > u8  pos;  /* MAX_FILTER_STR_VAR = 256 */
> > };
> > 
> > struct event_log_err {
> > struct err_info info;
> > charcmd[MAX_FILTER_STR_VAL];
> > };
> > 
> > There's no reason to put in a bunch of text that's going to be static
> > anyway. Have a lookup table like we do for filters.
> > 
> > +   log_err("Variable name not unique, need to use 
> > fully qualified name (%s) for variable: ", fqvar(system, event_name, 
> > var_name, true));
> > 
> 
> Hmm, most of the log_errs use printf strings that get expanded, so need
> a destination buffer, the event_log_err->err string, but I think I see
> what you're getting at - that we can get rid of the format strings
> altogether and make them static strings if we use the method of simply
> printing the static string and putting a caret where the error is as
> below.
> 
> > 
> > Instead of making the fqvar, find the location of the variable, and add:
> > 
> >  blah blah $var blah blah
> > ^
> >   Variable name not unique, need to use fully qualified name for variable
> > 
> 
> OK, if we can do this for every error type, then we can use the lookup
> table and the caret position instead of format strings.  Which means
> getting rid of the simple ring of strings, but whatever..

>From the viewpoint of kprobe events, I'm OK for either way.
I'll rewrite parser to get parsing position correctly.

Thanks!


-- 
Masami Hiramatsu 


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-14 Thread Masami Hiramatsu
On Thu, 12 Apr 2018 18:52:13 -0500
Tom Zanussi  wrote:

> Hi Steve,
> 
> On Thu, 2018-04-12 at 18:20 -0400, Steven Rostedt wrote:
> > On Thu, 12 Apr 2018 10:13:17 -0500
> > Tom Zanussi  wrote:
> > 
> > > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > > index 6fb46a0..f2dc7e6 100644
> > > --- a/kernel/trace/trace.h
> > > +++ b/kernel/trace/trace.h
> > > @@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct file 
> > > *file,
> > >   const char __user *buffer, size_t count, loff_t *ppos,
> > >   int (*createfn)(int, char**));
> > >  
> > > +extern void event_log_err(const char *loc, const char *cmd, const char 
> > > *fmt,
> > > +   ...);
> > > +
> > >  /*
> > >   * Normal trace_printk() and friends allocates special buffers
> > >   * to do the manipulation, as well as saves the print formats
> > > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > > index 05c7172..fd02e22 100644
> > > --- a/kernel/trace/trace_events.c
> > > +++ b/kernel/trace/trace_events.c
> > > @@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
> > >   return ret;
> > >  }
> > >  
> > > +#define EVENT_LOG_ERRS_MAX   (PAGE_SIZE / sizeof(struct 
> > > event_log_err))
> > 
> > > +#define EVENT_ERR_LOG_MASK   (EVENT_LOG_ERRS_MAX - 1)
> > 
> > BTW, the above only works if EVENT_LOG_ERRS_MAX is a power of two,
> > which it's not guaranteed to be.
> > 
> 
> My assumption was that we'd only ever need a page or two for the
> error_log and so would always would be a power of two, since the size of
> the struct event_log_err is 512.
> 
> Anyway, I should probably have put comments about all this in the code,
> and I will, but the way it works kind of assumes a very small number of
> errors - it's replacing a simple 'last error' facility for the hist
> triggers and making it a common facility for other things that have
> similar needs like Masami's kprobe_events errors.  For those purposes, I
> assumed it would suffice to simply be able to show that last 8 or some
> similar small number of errors and constantly recycle the slots.

Correct. I don't think the error log entry size over 16, it is too much
errors occur. User must check it before that. Or, we should push it
printk buffer.

> 
> Basically it just splits the page into 16 strings, 2 per error, one for
> the actual error text, the other for the command the user entered.  The
> struct event_log_err just overlays a struct on top of 2 strings just to
> make it easier to manage.
> 
> Anyway, because it is such a small number, and we start with a zeroed
> page, whenever we print the error log, we print all 16 strings even if
> we only have one error (2 strings).  The rest are NULL and print
> nothing.  We start with the tail, which could also be thought of as the
> 'oldest' or the 'first' error in the buffer and just cycle through them
> all.  Hope that clears up some of the other questions you had about how
> a non-full log gets printed, etc... 
> 
> > > +
> > > +struct event_log_err {
> > > + charerr[MAX_FILTER_STR_VAL];
> > > + charcmd[MAX_FILTER_STR_VAL];
> > > +};
> > 
> > I like the event_log_err idea, but the above can be shrunk to:
> > 
> > struct err_info {
> > u8  type; /* I can only imagine 254 types */
> > u8  pos;  /* MAX_FILTER_STR_VAR = 256 */
> > };
> > 
> > struct event_log_err {
> > struct err_info info;
> > charcmd[MAX_FILTER_STR_VAL];
> > };
> > 
> > There's no reason to put in a bunch of text that's going to be static
> > anyway. Have a lookup table like we do for filters.
> > 
> > +   log_err("Variable name not unique, need to use 
> > fully qualified name (%s) for variable: ", fqvar(system, event_name, 
> > var_name, true));
> > 
> 
> Hmm, most of the log_errs use printf strings that get expanded, so need
> a destination buffer, the event_log_err->err string, but I think I see
> what you're getting at - that we can get rid of the format strings
> altogether and make them static strings if we use the method of simply
> printing the static string and putting a caret where the error is as
> below.
> 
> > 
> > Instead of making the fqvar, find the location of the variable, and add:
> > 
> >  blah blah $var blah blah
> > ^
> >   Variable name not unique, need to use fully qualified name for variable
> > 
> 
> OK, if we can do this for every error type, then we can use the lookup
> table and the caret position instead of format strings.  Which means
> getting rid of the simple ring of strings, but whatever..

>From the viewpoint of kprobe events, I'm OK for either way.
I'll rewrite parser to get parsing position correctly.

Thanks!


-- 
Masami Hiramatsu 


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-13 Thread Steven Rostedt
On Fri, 13 Apr 2018 09:24:34 -0500
Tom Zanussi  wrote:

> Yeah, I agree - I'd rather get it right than get it in now.  I thought
> this made sense, and was based on input from Masami, which I may have
> misinterpreted, but I'll wait for some more ideas about the best way to
> do this.

Too bad we are not closer to November, as this would actually be a good
Plumbers topic. Maybe it's not that important and we should wait until
then. I'd like to get some brain storming ideas out before we decide on
anything, and this is something I believe is better done face to face
than over email.

-- Steve


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-13 Thread Steven Rostedt
On Fri, 13 Apr 2018 09:24:34 -0500
Tom Zanussi  wrote:

> Yeah, I agree - I'd rather get it right than get it in now.  I thought
> this made sense, and was based on input from Masami, which I may have
> misinterpreted, but I'll wait for some more ideas about the best way to
> do this.

Too bad we are not closer to November, as this would actually be a good
Plumbers topic. Maybe it's not that important and we should wait until
then. I'd like to get some brain storming ideas out before we decide on
anything, and this is something I believe is better done face to face
than over email.

-- Steve


Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-13 Thread Tom Zanussi
Hi Steve,

On Fri, 2018-04-13 at 09:45 -0400, Steven Rostedt wrote:
> On Thu, 12 Apr 2018 18:52:13 -0500
> Tom Zanussi  wrote:
> 
> > Hi Steve,
> > 
> > On Thu, 2018-04-12 at 18:20 -0400, Steven Rostedt wrote:
> > > On Thu, 12 Apr 2018 10:13:17 -0500
> > > Tom Zanussi  wrote:
> > >   
> > > > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > > > index 6fb46a0..f2dc7e6 100644
> > > > --- a/kernel/trace/trace.h
> > > > +++ b/kernel/trace/trace.h
> > > > @@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct 
> > > > file *file,
> > > > const char __user *buffer, size_t count, loff_t *ppos,
> > > > int (*createfn)(int, char**));
> > > >  
> > > > +extern void event_log_err(const char *loc, const char *cmd, const char 
> > > > *fmt,
> > > > + ...);
> > > > +
> > > >  /*
> > > >   * Normal trace_printk() and friends allocates special buffers
> > > >   * to do the manipulation, as well as saves the print formats
> > > > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > > > index 05c7172..fd02e22 100644
> > > > --- a/kernel/trace/trace_events.c
> > > > +++ b/kernel/trace/trace_events.c
> > > > @@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
> > > > return ret;
> > > >  }
> > > >  
> > > > +#define EVENT_LOG_ERRS_MAX (PAGE_SIZE / sizeof(struct 
> > > > event_log_err))  
> > >   
> > > > +#define EVENT_ERR_LOG_MASK (EVENT_LOG_ERRS_MAX - 1)  
> > > 
> > > BTW, the above only works if EVENT_LOG_ERRS_MAX is a power of two,
> > > which it's not guaranteed to be.
> > >   
> > 
> > My assumption was that we'd only ever need a page or two for the
> > error_log and so would always would be a power of two, since the size of
> > the struct event_log_err is 512.
> 
> Assumptions are not what we want to rely on. There should be something
> like:
> 
>   BUILD_BUG_ON(EVENT_LOG_ERRS_MAX & EVENT_ERR_LOG_MASK);
> 
> Which would guarantee that your assumption is correct otherwise the
> kernel wont build.
> 

OK.

> 
> > 
> > Anyway, I should probably have put comments about all this in the code,
> > and I will, but the way it works kind of assumes a very small number of
> > errors - it's replacing a simple 'last error' facility for the hist
> > triggers and making it a common facility for other things that have
> > similar needs like Masami's kprobe_events errors.  For those purposes, I
> > assumed it would suffice to simply be able to show that last 8 or some
> > similar small number of errors and constantly recycle the slots.
> 
> The errors are still in the files that have the errors right? Perhaps
> just have a file that lists the files that contain errors. That way if
> something goes wrong, you can examine that file and then look at the
> file that contains the error?
> 

No, that's part of the motivation for this change - currently there is
just one last 'last error', the output tacked onto whichever event's
hist file you read (normally this would be the one you just got the
error for, but doesn't have to be) - there isn't a last error per event.
Masami of course found this unintuitive, which it is, I agree, and
wanted a single file (error_log) to look into for the last error.  In
addition, it should have a logging interface that any trace event
command could use, such as kprobe_events.

> And I'm not sure it being in the events directory is the best place
> either, especially, if you plan to have it handle kprobe_events because
> that's not in the events directory.
> 

Yeah, I put it there because it's associated with trace events - putting
it in tracing/ would imply that it's meant for ftrace in general (which
maybe it should be but this isn't).  Actually I'm not sure kprobe_events
shouldn't be in tracing/events too..

> > 
> > Basically it just splits the page into 16 strings, 2 per error, one for
> > the actual error text, the other for the command the user entered.  The
> > struct event_log_err just overlays a struct on top of 2 strings just to
> > make it easier to manage.
> > 
> > Anyway, because it is such a small number, and we start with a zeroed
> > page, whenever we print the error log, we print all 16 strings even if
> > we only have one error (2 strings).  The rest are NULL and print
> > nothing.  We start with the tail, which could also be thought of as the
> > 'oldest' or the 'first' error in the buffer and just cycle through them
> > all.  Hope that clears up some of the other questions you had about how
> > a non-full log gets printed, etc...
> 
> OK, I was thinking a NULL entry would return NULL, but we are
> returning a pointer to NULL. That's where I missed it.
>  
> > 
> > > > +
> > > > +struct event_log_err {
> > > > +   charerr[MAX_FILTER_STR_VAL];
> > > > +   charcmd[MAX_FILTER_STR_VAL];
> > > > +};  
> > > 
> > > I like the event_log_err idea, but the above 

Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-13 Thread Tom Zanussi
Hi Steve,

On Fri, 2018-04-13 at 09:45 -0400, Steven Rostedt wrote:
> On Thu, 12 Apr 2018 18:52:13 -0500
> Tom Zanussi  wrote:
> 
> > Hi Steve,
> > 
> > On Thu, 2018-04-12 at 18:20 -0400, Steven Rostedt wrote:
> > > On Thu, 12 Apr 2018 10:13:17 -0500
> > > Tom Zanussi  wrote:
> > >   
> > > > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > > > index 6fb46a0..f2dc7e6 100644
> > > > --- a/kernel/trace/trace.h
> > > > +++ b/kernel/trace/trace.h
> > > > @@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct 
> > > > file *file,
> > > > const char __user *buffer, size_t count, loff_t *ppos,
> > > > int (*createfn)(int, char**));
> > > >  
> > > > +extern void event_log_err(const char *loc, const char *cmd, const char 
> > > > *fmt,
> > > > + ...);
> > > > +
> > > >  /*
> > > >   * Normal trace_printk() and friends allocates special buffers
> > > >   * to do the manipulation, as well as saves the print formats
> > > > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > > > index 05c7172..fd02e22 100644
> > > > --- a/kernel/trace/trace_events.c
> > > > +++ b/kernel/trace/trace_events.c
> > > > @@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
> > > > return ret;
> > > >  }
> > > >  
> > > > +#define EVENT_LOG_ERRS_MAX (PAGE_SIZE / sizeof(struct 
> > > > event_log_err))  
> > >   
> > > > +#define EVENT_ERR_LOG_MASK (EVENT_LOG_ERRS_MAX - 1)  
> > > 
> > > BTW, the above only works if EVENT_LOG_ERRS_MAX is a power of two,
> > > which it's not guaranteed to be.
> > >   
> > 
> > My assumption was that we'd only ever need a page or two for the
> > error_log and so would always would be a power of two, since the size of
> > the struct event_log_err is 512.
> 
> Assumptions are not what we want to rely on. There should be something
> like:
> 
>   BUILD_BUG_ON(EVENT_LOG_ERRS_MAX & EVENT_ERR_LOG_MASK);
> 
> Which would guarantee that your assumption is correct otherwise the
> kernel wont build.
> 

OK.

> 
> > 
> > Anyway, I should probably have put comments about all this in the code,
> > and I will, but the way it works kind of assumes a very small number of
> > errors - it's replacing a simple 'last error' facility for the hist
> > triggers and making it a common facility for other things that have
> > similar needs like Masami's kprobe_events errors.  For those purposes, I
> > assumed it would suffice to simply be able to show that last 8 or some
> > similar small number of errors and constantly recycle the slots.
> 
> The errors are still in the files that have the errors right? Perhaps
> just have a file that lists the files that contain errors. That way if
> something goes wrong, you can examine that file and then look at the
> file that contains the error?
> 

No, that's part of the motivation for this change - currently there is
just one last 'last error', the output tacked onto whichever event's
hist file you read (normally this would be the one you just got the
error for, but doesn't have to be) - there isn't a last error per event.
Masami of course found this unintuitive, which it is, I agree, and
wanted a single file (error_log) to look into for the last error.  In
addition, it should have a logging interface that any trace event
command could use, such as kprobe_events.

> And I'm not sure it being in the events directory is the best place
> either, especially, if you plan to have it handle kprobe_events because
> that's not in the events directory.
> 

Yeah, I put it there because it's associated with trace events - putting
it in tracing/ would imply that it's meant for ftrace in general (which
maybe it should be but this isn't).  Actually I'm not sure kprobe_events
shouldn't be in tracing/events too..

> > 
> > Basically it just splits the page into 16 strings, 2 per error, one for
> > the actual error text, the other for the command the user entered.  The
> > struct event_log_err just overlays a struct on top of 2 strings just to
> > make it easier to manage.
> > 
> > Anyway, because it is such a small number, and we start with a zeroed
> > page, whenever we print the error log, we print all 16 strings even if
> > we only have one error (2 strings).  The rest are NULL and print
> > nothing.  We start with the tail, which could also be thought of as the
> > 'oldest' or the 'first' error in the buffer and just cycle through them
> > all.  Hope that clears up some of the other questions you had about how
> > a non-full log gets printed, etc...
> 
> OK, I was thinking a NULL entry would return NULL, but we are
> returning a pointer to NULL. That's where I missed it.
>  
> > 
> > > > +
> > > > +struct event_log_err {
> > > > +   charerr[MAX_FILTER_STR_VAL];
> > > > +   charcmd[MAX_FILTER_STR_VAL];
> > > > +};  
> > > 
> > > I like the event_log_err idea, but the above can be shrunk to:
> > > 
> > > struct err_info {
> > >   

Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-13 Thread Steven Rostedt
On Thu, 12 Apr 2018 18:52:13 -0500
Tom Zanussi  wrote:

> Hi Steve,
> 
> On Thu, 2018-04-12 at 18:20 -0400, Steven Rostedt wrote:
> > On Thu, 12 Apr 2018 10:13:17 -0500
> > Tom Zanussi  wrote:
> >   
> > > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > > index 6fb46a0..f2dc7e6 100644
> > > --- a/kernel/trace/trace.h
> > > +++ b/kernel/trace/trace.h
> > > @@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct file 
> > > *file,
> > >   const char __user *buffer, size_t count, loff_t *ppos,
> > >   int (*createfn)(int, char**));
> > >  
> > > +extern void event_log_err(const char *loc, const char *cmd, const char 
> > > *fmt,
> > > +   ...);
> > > +
> > >  /*
> > >   * Normal trace_printk() and friends allocates special buffers
> > >   * to do the manipulation, as well as saves the print formats
> > > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > > index 05c7172..fd02e22 100644
> > > --- a/kernel/trace/trace_events.c
> > > +++ b/kernel/trace/trace_events.c
> > > @@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
> > >   return ret;
> > >  }
> > >  
> > > +#define EVENT_LOG_ERRS_MAX   (PAGE_SIZE / sizeof(struct 
> > > event_log_err))  
> >   
> > > +#define EVENT_ERR_LOG_MASK   (EVENT_LOG_ERRS_MAX - 1)  
> > 
> > BTW, the above only works if EVENT_LOG_ERRS_MAX is a power of two,
> > which it's not guaranteed to be.
> >   
> 
> My assumption was that we'd only ever need a page or two for the
> error_log and so would always would be a power of two, since the size of
> the struct event_log_err is 512.

Assumptions are not what we want to rely on. There should be something
like:

BUILD_BUG_ON(EVENT_LOG_ERRS_MAX & EVENT_ERR_LOG_MASK);

Which would guarantee that your assumption is correct otherwise the
kernel wont build.


> 
> Anyway, I should probably have put comments about all this in the code,
> and I will, but the way it works kind of assumes a very small number of
> errors - it's replacing a simple 'last error' facility for the hist
> triggers and making it a common facility for other things that have
> similar needs like Masami's kprobe_events errors.  For those purposes, I
> assumed it would suffice to simply be able to show that last 8 or some
> similar small number of errors and constantly recycle the slots.

The errors are still in the files that have the errors right? Perhaps
just have a file that lists the files that contain errors. That way if
something goes wrong, you can examine that file and then look at the
file that contains the error?

And I'm not sure it being in the events directory is the best place
either, especially, if you plan to have it handle kprobe_events because
that's not in the events directory.

> 
> Basically it just splits the page into 16 strings, 2 per error, one for
> the actual error text, the other for the command the user entered.  The
> struct event_log_err just overlays a struct on top of 2 strings just to
> make it easier to manage.
> 
> Anyway, because it is such a small number, and we start with a zeroed
> page, whenever we print the error log, we print all 16 strings even if
> we only have one error (2 strings).  The rest are NULL and print
> nothing.  We start with the tail, which could also be thought of as the
> 'oldest' or the 'first' error in the buffer and just cycle through them
> all.  Hope that clears up some of the other questions you had about how
> a non-full log gets printed, etc...

OK, I was thinking a NULL entry would return NULL, but we are
returning a pointer to NULL. That's where I missed it.
 
> 
> > > +
> > > +struct event_log_err {
> > > + charerr[MAX_FILTER_STR_VAL];
> > > + charcmd[MAX_FILTER_STR_VAL];
> > > +};  
> > 
> > I like the event_log_err idea, but the above can be shrunk to:
> > 
> > struct err_info {
> > u8  type; /* I can only imagine 254 types */
> > u8  pos;  /* MAX_FILTER_STR_VAR = 256 */
> > };
> > 
> > struct event_log_err {
> > struct err_info info;
> > charcmd[MAX_FILTER_STR_VAL];
> > };
> > 
> > There's no reason to put in a bunch of text that's going to be static
> > anyway. Have a lookup table like we do for filters.
> > 
> > +   log_err("Variable name not unique, need to use 
> > fully qualified name (%s) for variable: ", fqvar(system, event_name, 
> > var_name, true));
> >   
> 
> Hmm, most of the log_errs use printf strings that get expanded, so need
> a destination buffer, the event_log_err->err string, but I think I see
> what you're getting at - that we can get rid of the format strings
> altogether and make them static strings if we use the method of simply
> printing the static string and putting a caret where the error is as
> below.
> 
> > 
> > Instead of making the fqvar, find the location of the variable, and add:
> > 
> > 

Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-13 Thread Steven Rostedt
On Thu, 12 Apr 2018 18:52:13 -0500
Tom Zanussi  wrote:

> Hi Steve,
> 
> On Thu, 2018-04-12 at 18:20 -0400, Steven Rostedt wrote:
> > On Thu, 12 Apr 2018 10:13:17 -0500
> > Tom Zanussi  wrote:
> >   
> > > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > > index 6fb46a0..f2dc7e6 100644
> > > --- a/kernel/trace/trace.h
> > > +++ b/kernel/trace/trace.h
> > > @@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct file 
> > > *file,
> > >   const char __user *buffer, size_t count, loff_t *ppos,
> > >   int (*createfn)(int, char**));
> > >  
> > > +extern void event_log_err(const char *loc, const char *cmd, const char 
> > > *fmt,
> > > +   ...);
> > > +
> > >  /*
> > >   * Normal trace_printk() and friends allocates special buffers
> > >   * to do the manipulation, as well as saves the print formats
> > > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > > index 05c7172..fd02e22 100644
> > > --- a/kernel/trace/trace_events.c
> > > +++ b/kernel/trace/trace_events.c
> > > @@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
> > >   return ret;
> > >  }
> > >  
> > > +#define EVENT_LOG_ERRS_MAX   (PAGE_SIZE / sizeof(struct 
> > > event_log_err))  
> >   
> > > +#define EVENT_ERR_LOG_MASK   (EVENT_LOG_ERRS_MAX - 1)  
> > 
> > BTW, the above only works if EVENT_LOG_ERRS_MAX is a power of two,
> > which it's not guaranteed to be.
> >   
> 
> My assumption was that we'd only ever need a page or two for the
> error_log and so would always would be a power of two, since the size of
> the struct event_log_err is 512.

Assumptions are not what we want to rely on. There should be something
like:

BUILD_BUG_ON(EVENT_LOG_ERRS_MAX & EVENT_ERR_LOG_MASK);

Which would guarantee that your assumption is correct otherwise the
kernel wont build.


> 
> Anyway, I should probably have put comments about all this in the code,
> and I will, but the way it works kind of assumes a very small number of
> errors - it's replacing a simple 'last error' facility for the hist
> triggers and making it a common facility for other things that have
> similar needs like Masami's kprobe_events errors.  For those purposes, I
> assumed it would suffice to simply be able to show that last 8 or some
> similar small number of errors and constantly recycle the slots.

The errors are still in the files that have the errors right? Perhaps
just have a file that lists the files that contain errors. That way if
something goes wrong, you can examine that file and then look at the
file that contains the error?

And I'm not sure it being in the events directory is the best place
either, especially, if you plan to have it handle kprobe_events because
that's not in the events directory.

> 
> Basically it just splits the page into 16 strings, 2 per error, one for
> the actual error text, the other for the command the user entered.  The
> struct event_log_err just overlays a struct on top of 2 strings just to
> make it easier to manage.
> 
> Anyway, because it is such a small number, and we start with a zeroed
> page, whenever we print the error log, we print all 16 strings even if
> we only have one error (2 strings).  The rest are NULL and print
> nothing.  We start with the tail, which could also be thought of as the
> 'oldest' or the 'first' error in the buffer and just cycle through them
> all.  Hope that clears up some of the other questions you had about how
> a non-full log gets printed, etc...

OK, I was thinking a NULL entry would return NULL, but we are
returning a pointer to NULL. That's where I missed it.
 
> 
> > > +
> > > +struct event_log_err {
> > > + charerr[MAX_FILTER_STR_VAL];
> > > + charcmd[MAX_FILTER_STR_VAL];
> > > +};  
> > 
> > I like the event_log_err idea, but the above can be shrunk to:
> > 
> > struct err_info {
> > u8  type; /* I can only imagine 254 types */
> > u8  pos;  /* MAX_FILTER_STR_VAR = 256 */
> > };
> > 
> > struct event_log_err {
> > struct err_info info;
> > charcmd[MAX_FILTER_STR_VAL];
> > };
> > 
> > There's no reason to put in a bunch of text that's going to be static
> > anyway. Have a lookup table like we do for filters.
> > 
> > +   log_err("Variable name not unique, need to use 
> > fully qualified name (%s) for variable: ", fqvar(system, event_name, 
> > var_name, true));
> >   
> 
> Hmm, most of the log_errs use printf strings that get expanded, so need
> a destination buffer, the event_log_err->err string, but I think I see
> what you're getting at - that we can get rid of the format strings
> altogether and make them static strings if we use the method of simply
> printing the static string and putting a caret where the error is as
> below.
> 
> > 
> > Instead of making the fqvar, find the location of the variable, and add:
> > 
> >  blah blah $var blah blah
> > ^
> >   

Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-12 Thread Tom Zanussi
Hi Steve,

On Thu, 2018-04-12 at 18:20 -0400, Steven Rostedt wrote:
> On Thu, 12 Apr 2018 10:13:17 -0500
> Tom Zanussi  wrote:
> 
> > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > index 6fb46a0..f2dc7e6 100644
> > --- a/kernel/trace/trace.h
> > +++ b/kernel/trace/trace.h
> > @@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct file 
> > *file,
> > const char __user *buffer, size_t count, loff_t *ppos,
> > int (*createfn)(int, char**));
> >  
> > +extern void event_log_err(const char *loc, const char *cmd, const char 
> > *fmt,
> > + ...);
> > +
> >  /*
> >   * Normal trace_printk() and friends allocates special buffers
> >   * to do the manipulation, as well as saves the print formats
> > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > index 05c7172..fd02e22 100644
> > --- a/kernel/trace/trace_events.c
> > +++ b/kernel/trace/trace_events.c
> > @@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
> > return ret;
> >  }
> >  
> > +#define EVENT_LOG_ERRS_MAX (PAGE_SIZE / sizeof(struct event_log_err))
> 
> > +#define EVENT_ERR_LOG_MASK (EVENT_LOG_ERRS_MAX - 1)
> 
> BTW, the above only works if EVENT_LOG_ERRS_MAX is a power of two,
> which it's not guaranteed to be.
> 

My assumption was that we'd only ever need a page or two for the
error_log and so would always would be a power of two, since the size of
the struct event_log_err is 512.

Anyway, I should probably have put comments about all this in the code,
and I will, but the way it works kind of assumes a very small number of
errors - it's replacing a simple 'last error' facility for the hist
triggers and making it a common facility for other things that have
similar needs like Masami's kprobe_events errors.  For those purposes, I
assumed it would suffice to simply be able to show that last 8 or some
similar small number of errors and constantly recycle the slots.

Basically it just splits the page into 16 strings, 2 per error, one for
the actual error text, the other for the command the user entered.  The
struct event_log_err just overlays a struct on top of 2 strings just to
make it easier to manage.

Anyway, because it is such a small number, and we start with a zeroed
page, whenever we print the error log, we print all 16 strings even if
we only have one error (2 strings).  The rest are NULL and print
nothing.  We start with the tail, which could also be thought of as the
'oldest' or the 'first' error in the buffer and just cycle through them
all.  Hope that clears up some of the other questions you had about how
a non-full log gets printed, etc... 

> > +
> > +struct event_log_err {
> > +   charerr[MAX_FILTER_STR_VAL];
> > +   charcmd[MAX_FILTER_STR_VAL];
> > +};
> 
> I like the event_log_err idea, but the above can be shrunk to:
> 
> struct err_info {
>   u8  type; /* I can only imagine 254 types */
>   u8  pos;  /* MAX_FILTER_STR_VAR = 256 */
> };
> 
> struct event_log_err {
>   struct err_info info;
>   charcmd[MAX_FILTER_STR_VAL];
> };
> 
> There's no reason to put in a bunch of text that's going to be static
> anyway. Have a lookup table like we do for filters.
> 
> + log_err("Variable name not unique, need to use 
> fully qualified name (%s) for variable: ", fqvar(system, event_name, 
> var_name, true));
> 

Hmm, most of the log_errs use printf strings that get expanded, so need
a destination buffer, the event_log_err->err string, but I think I see
what you're getting at - that we can get rid of the format strings
altogether and make them static strings if we use the method of simply
printing the static string and putting a caret where the error is as
below.

> 
> Instead of making the fqvar, find the location of the variable, and add:
> 
>  blah blah $var blah blah
> ^
>   Variable name not unique, need to use fully qualified name for variable
> 

OK, if we can do this for every error type, then we can use the lookup
table and the caret position instead of format strings.  Which means
getting rid of the simple ring of strings, but whatever..

> > +
> > +static char *event_err_log;
> > +static unsigned int event_err_log_tail;
> > +
> > +struct event_log_err *get_event_log_err(void)
> > +{
> > +   struct event_log_err *err;
> > +   char *errpos;
> > +
> > +   if (!event_err_log) {
> > +   event_err_log = (char *)get_zeroed_page(GFP_KERNEL);
> > +   if (!event_err_log)
> > +   return NULL;
> > +   }
> > +
> > +   errpos = event_err_log + event_err_log_tail * sizeof(*err);
> > +   err = (struct event_log_err *)errpos;
> > +
> > +   event_err_log_tail = (event_err_log_tail + 1) & EVENT_ERR_LOG_MASK;
> 
> So you add errors one after the other:
> 
> First error:
> 
>  err1,NULL,NULL,NULL,...
>   ^
>  tail
> 
> Second error:
> 
>  

Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-12 Thread Tom Zanussi
Hi Steve,

On Thu, 2018-04-12 at 18:20 -0400, Steven Rostedt wrote:
> On Thu, 12 Apr 2018 10:13:17 -0500
> Tom Zanussi  wrote:
> 
> > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > index 6fb46a0..f2dc7e6 100644
> > --- a/kernel/trace/trace.h
> > +++ b/kernel/trace/trace.h
> > @@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct file 
> > *file,
> > const char __user *buffer, size_t count, loff_t *ppos,
> > int (*createfn)(int, char**));
> >  
> > +extern void event_log_err(const char *loc, const char *cmd, const char 
> > *fmt,
> > + ...);
> > +
> >  /*
> >   * Normal trace_printk() and friends allocates special buffers
> >   * to do the manipulation, as well as saves the print formats
> > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > index 05c7172..fd02e22 100644
> > --- a/kernel/trace/trace_events.c
> > +++ b/kernel/trace/trace_events.c
> > @@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
> > return ret;
> >  }
> >  
> > +#define EVENT_LOG_ERRS_MAX (PAGE_SIZE / sizeof(struct event_log_err))
> 
> > +#define EVENT_ERR_LOG_MASK (EVENT_LOG_ERRS_MAX - 1)
> 
> BTW, the above only works if EVENT_LOG_ERRS_MAX is a power of two,
> which it's not guaranteed to be.
> 

My assumption was that we'd only ever need a page or two for the
error_log and so would always would be a power of two, since the size of
the struct event_log_err is 512.

Anyway, I should probably have put comments about all this in the code,
and I will, but the way it works kind of assumes a very small number of
errors - it's replacing a simple 'last error' facility for the hist
triggers and making it a common facility for other things that have
similar needs like Masami's kprobe_events errors.  For those purposes, I
assumed it would suffice to simply be able to show that last 8 or some
similar small number of errors and constantly recycle the slots.

Basically it just splits the page into 16 strings, 2 per error, one for
the actual error text, the other for the command the user entered.  The
struct event_log_err just overlays a struct on top of 2 strings just to
make it easier to manage.

Anyway, because it is such a small number, and we start with a zeroed
page, whenever we print the error log, we print all 16 strings even if
we only have one error (2 strings).  The rest are NULL and print
nothing.  We start with the tail, which could also be thought of as the
'oldest' or the 'first' error in the buffer and just cycle through them
all.  Hope that clears up some of the other questions you had about how
a non-full log gets printed, etc... 

> > +
> > +struct event_log_err {
> > +   charerr[MAX_FILTER_STR_VAL];
> > +   charcmd[MAX_FILTER_STR_VAL];
> > +};
> 
> I like the event_log_err idea, but the above can be shrunk to:
> 
> struct err_info {
>   u8  type; /* I can only imagine 254 types */
>   u8  pos;  /* MAX_FILTER_STR_VAR = 256 */
> };
> 
> struct event_log_err {
>   struct err_info info;
>   charcmd[MAX_FILTER_STR_VAL];
> };
> 
> There's no reason to put in a bunch of text that's going to be static
> anyway. Have a lookup table like we do for filters.
> 
> + log_err("Variable name not unique, need to use 
> fully qualified name (%s) for variable: ", fqvar(system, event_name, 
> var_name, true));
> 

Hmm, most of the log_errs use printf strings that get expanded, so need
a destination buffer, the event_log_err->err string, but I think I see
what you're getting at - that we can get rid of the format strings
altogether and make them static strings if we use the method of simply
printing the static string and putting a caret where the error is as
below.

> 
> Instead of making the fqvar, find the location of the variable, and add:
> 
>  blah blah $var blah blah
> ^
>   Variable name not unique, need to use fully qualified name for variable
> 

OK, if we can do this for every error type, then we can use the lookup
table and the caret position instead of format strings.  Which means
getting rid of the simple ring of strings, but whatever..

> > +
> > +static char *event_err_log;
> > +static unsigned int event_err_log_tail;
> > +
> > +struct event_log_err *get_event_log_err(void)
> > +{
> > +   struct event_log_err *err;
> > +   char *errpos;
> > +
> > +   if (!event_err_log) {
> > +   event_err_log = (char *)get_zeroed_page(GFP_KERNEL);
> > +   if (!event_err_log)
> > +   return NULL;
> > +   }
> > +
> > +   errpos = event_err_log + event_err_log_tail * sizeof(*err);
> > +   err = (struct event_log_err *)errpos;
> > +
> > +   event_err_log_tail = (event_err_log_tail + 1) & EVENT_ERR_LOG_MASK;
> 
> So you add errors one after the other:
> 
> First error:
> 
>  err1,NULL,NULL,NULL,...
>   ^
>  tail
> 
> Second error:
> 
>  err1,err2,NULL,NULL,...
>^

Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-12 Thread Steven Rostedt
On Thu, 12 Apr 2018 10:13:17 -0500
Tom Zanussi  wrote:

> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 6fb46a0..f2dc7e6 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct file 
> *file,
>   const char __user *buffer, size_t count, loff_t *ppos,
>   int (*createfn)(int, char**));
>  
> +extern void event_log_err(const char *loc, const char *cmd, const char *fmt,
> +   ...);
> +
>  /*
>   * Normal trace_printk() and friends allocates special buffers
>   * to do the manipulation, as well as saves the print formats
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 05c7172..fd02e22 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
>   return ret;
>  }
>  
> +#define EVENT_LOG_ERRS_MAX   (PAGE_SIZE / sizeof(struct event_log_err))

> +#define EVENT_ERR_LOG_MASK   (EVENT_LOG_ERRS_MAX - 1)

BTW, the above only works if EVENT_LOG_ERRS_MAX is a power of two,
which it's not guaranteed to be.

> +
> +struct event_log_err {
> + charerr[MAX_FILTER_STR_VAL];
> + charcmd[MAX_FILTER_STR_VAL];
> +};

I like the event_log_err idea, but the above can be shrunk to:

struct err_info {
u8  type; /* I can only imagine 254 types */
u8  pos;  /* MAX_FILTER_STR_VAR = 256 */
};

struct event_log_err {
struct err_info info;
charcmd[MAX_FILTER_STR_VAL];
};

There's no reason to put in a bunch of text that's going to be static
anyway. Have a lookup table like we do for filters.

+   log_err("Variable name not unique, need to use 
fully qualified name (%s) for variable: ", fqvar(system, event_name, var_name, 
true));


Instead of making the fqvar, find the location of the variable, and add:

 blah blah $var blah blah
^
  Variable name not unique, need to use fully qualified name for variable

> +
> +static char *event_err_log;
> +static unsigned int event_err_log_tail;
> +
> +struct event_log_err *get_event_log_err(void)
> +{
> + struct event_log_err *err;
> + char *errpos;
> +
> + if (!event_err_log) {
> + event_err_log = (char *)get_zeroed_page(GFP_KERNEL);
> + if (!event_err_log)
> + return NULL;
> + }
> +
> + errpos = event_err_log + event_err_log_tail * sizeof(*err);
> + err = (struct event_log_err *)errpos;
> +
> + event_err_log_tail = (event_err_log_tail + 1) & EVENT_ERR_LOG_MASK;

So you add errors one after the other:

First error:

 err1,NULL,NULL,NULL,...
  ^
 tail

Second error:

 err1,err2,NULL,NULL,...
   ^
  tail

Third error:

 err1,err2,err3,NULL,
^
   tail

> +
> + return err;
> +}
> +
> +/**
> + * event_log_err - write an error to the trace event error log
> + * @loc: A string describing where the error occurred
> + * @cmd: The trace event command that caused the error
> + * @fmt: snprintf format string
> + * @...: variable length list of snprintf args
> + *
> + * Writes an error into tracing/events/error_log of the form:
> + *
> + * ERROR(): 
> + *   Command: 
> + *
> + * tracing/events/error_log is a small log file containing the last
> + * EVENT_LOG_ERRS_MAX errors (8).  Memory for the error log isn't
> + * allocated unless there has been a trace event error, and the error
> + * log can be cleared and have its memory freed by writing the empty
> + * string in truncation mode to it i.e. echo > error_log.
> + *
> + * Must be called with event_mutex held.
> + */
> +void event_log_err(const char *loc, const char *cmd, const char *fmt, ...)
> +{
> + struct event_log_err *err;
> + va_list args;
> + int len;
> +
> + err = get_event_log_err();
> + if (!err)
> + return;
> +
> + snprintf(err->cmd, MAX_FILTER_STR_VAL,"\n  Command: %s\n", cmd);
> +
> + len = snprintf(err->err, MAX_FILTER_STR_VAL, "ERROR(%s): ", loc);
> + if (len >= MAX_FILTER_STR_VAL)
> + return;
> +
> + va_start(args, fmt);
> + vsnprintf(err->err + len, MAX_FILTER_STR_VAL - len, fmt, args);
> + va_end(args);
> +}
> +
> +static void clear_event_err_log(void)
> +{
> + free_page((long unsigned int)event_err_log);
> + event_err_log_tail = 0;
> + event_err_log = NULL;
> +}
> +
> +static void *event_err_log_inc(loff_t *pos)
> +{
> + struct event_log_err *err = NULL;
> + char *errpos = NULL;
> + int i = *pos;
> +
> + ++*pos;
> +
> + if (i >= EVENT_LOG_ERRS_MAX)
> + return NULL;
> +
> + i += event_err_log_tail;
> + i &= EVENT_ERR_LOG_MASK;
> +
> + errpos = event_err_log + (i * sizeof(*err));
> + err = (struct event_log_err *)errpos;

Now I'm confused. i += tail, 

Re: [PATCH 2/6] tracing: Add trace event error log

2018-04-12 Thread Steven Rostedt
On Thu, 12 Apr 2018 10:13:17 -0500
Tom Zanussi  wrote:

> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 6fb46a0..f2dc7e6 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct file 
> *file,
>   const char __user *buffer, size_t count, loff_t *ppos,
>   int (*createfn)(int, char**));
>  
> +extern void event_log_err(const char *loc, const char *cmd, const char *fmt,
> +   ...);
> +
>  /*
>   * Normal trace_printk() and friends allocates special buffers
>   * to do the manipulation, as well as saves the print formats
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 05c7172..fd02e22 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
>   return ret;
>  }
>  
> +#define EVENT_LOG_ERRS_MAX   (PAGE_SIZE / sizeof(struct event_log_err))

> +#define EVENT_ERR_LOG_MASK   (EVENT_LOG_ERRS_MAX - 1)

BTW, the above only works if EVENT_LOG_ERRS_MAX is a power of two,
which it's not guaranteed to be.

> +
> +struct event_log_err {
> + charerr[MAX_FILTER_STR_VAL];
> + charcmd[MAX_FILTER_STR_VAL];
> +};

I like the event_log_err idea, but the above can be shrunk to:

struct err_info {
u8  type; /* I can only imagine 254 types */
u8  pos;  /* MAX_FILTER_STR_VAR = 256 */
};

struct event_log_err {
struct err_info info;
charcmd[MAX_FILTER_STR_VAL];
};

There's no reason to put in a bunch of text that's going to be static
anyway. Have a lookup table like we do for filters.

+   log_err("Variable name not unique, need to use 
fully qualified name (%s) for variable: ", fqvar(system, event_name, var_name, 
true));


Instead of making the fqvar, find the location of the variable, and add:

 blah blah $var blah blah
^
  Variable name not unique, need to use fully qualified name for variable

> +
> +static char *event_err_log;
> +static unsigned int event_err_log_tail;
> +
> +struct event_log_err *get_event_log_err(void)
> +{
> + struct event_log_err *err;
> + char *errpos;
> +
> + if (!event_err_log) {
> + event_err_log = (char *)get_zeroed_page(GFP_KERNEL);
> + if (!event_err_log)
> + return NULL;
> + }
> +
> + errpos = event_err_log + event_err_log_tail * sizeof(*err);
> + err = (struct event_log_err *)errpos;
> +
> + event_err_log_tail = (event_err_log_tail + 1) & EVENT_ERR_LOG_MASK;

So you add errors one after the other:

First error:

 err1,NULL,NULL,NULL,...
  ^
 tail

Second error:

 err1,err2,NULL,NULL,...
   ^
  tail

Third error:

 err1,err2,err3,NULL,
^
   tail

> +
> + return err;
> +}
> +
> +/**
> + * event_log_err - write an error to the trace event error log
> + * @loc: A string describing where the error occurred
> + * @cmd: The trace event command that caused the error
> + * @fmt: snprintf format string
> + * @...: variable length list of snprintf args
> + *
> + * Writes an error into tracing/events/error_log of the form:
> + *
> + * ERROR(): 
> + *   Command: 
> + *
> + * tracing/events/error_log is a small log file containing the last
> + * EVENT_LOG_ERRS_MAX errors (8).  Memory for the error log isn't
> + * allocated unless there has been a trace event error, and the error
> + * log can be cleared and have its memory freed by writing the empty
> + * string in truncation mode to it i.e. echo > error_log.
> + *
> + * Must be called with event_mutex held.
> + */
> +void event_log_err(const char *loc, const char *cmd, const char *fmt, ...)
> +{
> + struct event_log_err *err;
> + va_list args;
> + int len;
> +
> + err = get_event_log_err();
> + if (!err)
> + return;
> +
> + snprintf(err->cmd, MAX_FILTER_STR_VAL,"\n  Command: %s\n", cmd);
> +
> + len = snprintf(err->err, MAX_FILTER_STR_VAL, "ERROR(%s): ", loc);
> + if (len >= MAX_FILTER_STR_VAL)
> + return;
> +
> + va_start(args, fmt);
> + vsnprintf(err->err + len, MAX_FILTER_STR_VAL - len, fmt, args);
> + va_end(args);
> +}
> +
> +static void clear_event_err_log(void)
> +{
> + free_page((long unsigned int)event_err_log);
> + event_err_log_tail = 0;
> + event_err_log = NULL;
> +}
> +
> +static void *event_err_log_inc(loff_t *pos)
> +{
> + struct event_log_err *err = NULL;
> + char *errpos = NULL;
> + int i = *pos;
> +
> + ++*pos;
> +
> + if (i >= EVENT_LOG_ERRS_MAX)
> + return NULL;
> +
> + i += event_err_log_tail;
> + i &= EVENT_ERR_LOG_MASK;
> +
> + errpos = event_err_log + (i * sizeof(*err));
> + err = (struct event_log_err *)errpos;

Now I'm confused. i += tail, so on *pos = 0, and tail = 3, 

[PATCH 2/6] tracing: Add trace event error log

2018-04-12 Thread Tom Zanussi
Introduce a new trace event file, tracing/events/error_log, for trace
event commands to log non-critical errors.  This is useful for
allowing more complex commands such as hist trigger and kprobe_event
commands to point out specifically where something may have gone wrong
without forcing them to resort to more ad hoc methods such as tacking
error messages onto existing output files.

To log a trace event error, call the event_log_err() function with the
event_mutex held, passing it a location string describing where it
came from e.g. kprobe_events or system:event, the command that caused
the error, and a format string and variable-length number of
corresponding snprintf args.

Reading the log displays the last (currently) 8 errors logged in the
following format:

  ERROR(): 
Command: 

Memory for the error log isn't allocated unless there has been a trace
event error, and the error log can be cleared and have its memory
freed by writing the empty string in truncation mode to it:

  # echo > error_log.

Signed-off-by: Tom Zanussi 
Suggested-by: Masami Hiramatsu 
---
 kernel/trace/trace.h|   3 +
 kernel/trace/trace_events.c | 172 
 2 files changed, 175 insertions(+)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 6fb46a0..f2dc7e6 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct file *file,
const char __user *buffer, size_t count, loff_t *ppos,
int (*createfn)(int, char**));
 
+extern void event_log_err(const char *loc, const char *cmd, const char *fmt,
+ ...);
+
 /*
  * Normal trace_printk() and friends allocates special buffers
  * to do the manipulation, as well as saves the print formats
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 05c7172..fd02e22 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
return ret;
 }
 
+#define EVENT_LOG_ERRS_MAX (PAGE_SIZE / sizeof(struct event_log_err))
+#define EVENT_ERR_LOG_MASK (EVENT_LOG_ERRS_MAX - 1)
+
+struct event_log_err {
+   charerr[MAX_FILTER_STR_VAL];
+   charcmd[MAX_FILTER_STR_VAL];
+};
+
+static char *event_err_log;
+static unsigned int event_err_log_tail;
+
+struct event_log_err *get_event_log_err(void)
+{
+   struct event_log_err *err;
+   char *errpos;
+
+   if (!event_err_log) {
+   event_err_log = (char *)get_zeroed_page(GFP_KERNEL);
+   if (!event_err_log)
+   return NULL;
+   }
+
+   errpos = event_err_log + event_err_log_tail * sizeof(*err);
+   err = (struct event_log_err *)errpos;
+
+   event_err_log_tail = (event_err_log_tail + 1) & EVENT_ERR_LOG_MASK;
+
+   return err;
+}
+
+/**
+ * event_log_err - write an error to the trace event error log
+ * @loc: A string describing where the error occurred
+ * @cmd: The trace event command that caused the error
+ * @fmt: snprintf format string
+ * @...: variable length list of snprintf args
+ *
+ * Writes an error into tracing/events/error_log of the form:
+ *
+ * ERROR(): 
+ *   Command: 
+ *
+ * tracing/events/error_log is a small log file containing the last
+ * EVENT_LOG_ERRS_MAX errors (8).  Memory for the error log isn't
+ * allocated unless there has been a trace event error, and the error
+ * log can be cleared and have its memory freed by writing the empty
+ * string in truncation mode to it i.e. echo > error_log.
+ *
+ * Must be called with event_mutex held.
+ */
+void event_log_err(const char *loc, const char *cmd, const char *fmt, ...)
+{
+   struct event_log_err *err;
+   va_list args;
+   int len;
+
+   err = get_event_log_err();
+   if (!err)
+   return;
+
+   snprintf(err->cmd, MAX_FILTER_STR_VAL,"\n  Command: %s\n", cmd);
+
+   len = snprintf(err->err, MAX_FILTER_STR_VAL, "ERROR(%s): ", loc);
+   if (len >= MAX_FILTER_STR_VAL)
+   return;
+
+   va_start(args, fmt);
+   vsnprintf(err->err + len, MAX_FILTER_STR_VAL - len, fmt, args);
+   va_end(args);
+}
+
+static void clear_event_err_log(void)
+{
+   free_page((long unsigned int)event_err_log);
+   event_err_log_tail = 0;
+   event_err_log = NULL;
+}
+
+static void *event_err_log_inc(loff_t *pos)
+{
+   struct event_log_err *err = NULL;
+   char *errpos = NULL;
+   int i = *pos;
+
+   ++*pos;
+
+   if (i >= EVENT_LOG_ERRS_MAX)
+   return NULL;
+
+   i += event_err_log_tail;
+   i &= EVENT_ERR_LOG_MASK;
+
+   errpos = event_err_log + (i * sizeof(*err));
+   err = (struct event_log_err *)errpos;
+
+   return err;
+}
+
+static void *event_err_log_seq_start(struct seq_file *m, loff_t *pos)
+{
+   

[PATCH 2/6] tracing: Add trace event error log

2018-04-12 Thread Tom Zanussi
Introduce a new trace event file, tracing/events/error_log, for trace
event commands to log non-critical errors.  This is useful for
allowing more complex commands such as hist trigger and kprobe_event
commands to point out specifically where something may have gone wrong
without forcing them to resort to more ad hoc methods such as tacking
error messages onto existing output files.

To log a trace event error, call the event_log_err() function with the
event_mutex held, passing it a location string describing where it
came from e.g. kprobe_events or system:event, the command that caused
the error, and a format string and variable-length number of
corresponding snprintf args.

Reading the log displays the last (currently) 8 errors logged in the
following format:

  ERROR(): 
Command: 

Memory for the error log isn't allocated unless there has been a trace
event error, and the error log can be cleared and have its memory
freed by writing the empty string in truncation mode to it:

  # echo > error_log.

Signed-off-by: Tom Zanussi 
Suggested-by: Masami Hiramatsu 
---
 kernel/trace/trace.h|   3 +
 kernel/trace/trace_events.c | 172 
 2 files changed, 175 insertions(+)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 6fb46a0..f2dc7e6 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct file *file,
const char __user *buffer, size_t count, loff_t *ppos,
int (*createfn)(int, char**));
 
+extern void event_log_err(const char *loc, const char *cmd, const char *fmt,
+ ...);
+
 /*
  * Normal trace_printk() and friends allocates special buffers
  * to do the manipulation, as well as saves the print formats
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 05c7172..fd02e22 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
return ret;
 }
 
+#define EVENT_LOG_ERRS_MAX (PAGE_SIZE / sizeof(struct event_log_err))
+#define EVENT_ERR_LOG_MASK (EVENT_LOG_ERRS_MAX - 1)
+
+struct event_log_err {
+   charerr[MAX_FILTER_STR_VAL];
+   charcmd[MAX_FILTER_STR_VAL];
+};
+
+static char *event_err_log;
+static unsigned int event_err_log_tail;
+
+struct event_log_err *get_event_log_err(void)
+{
+   struct event_log_err *err;
+   char *errpos;
+
+   if (!event_err_log) {
+   event_err_log = (char *)get_zeroed_page(GFP_KERNEL);
+   if (!event_err_log)
+   return NULL;
+   }
+
+   errpos = event_err_log + event_err_log_tail * sizeof(*err);
+   err = (struct event_log_err *)errpos;
+
+   event_err_log_tail = (event_err_log_tail + 1) & EVENT_ERR_LOG_MASK;
+
+   return err;
+}
+
+/**
+ * event_log_err - write an error to the trace event error log
+ * @loc: A string describing where the error occurred
+ * @cmd: The trace event command that caused the error
+ * @fmt: snprintf format string
+ * @...: variable length list of snprintf args
+ *
+ * Writes an error into tracing/events/error_log of the form:
+ *
+ * ERROR(): 
+ *   Command: 
+ *
+ * tracing/events/error_log is a small log file containing the last
+ * EVENT_LOG_ERRS_MAX errors (8).  Memory for the error log isn't
+ * allocated unless there has been a trace event error, and the error
+ * log can be cleared and have its memory freed by writing the empty
+ * string in truncation mode to it i.e. echo > error_log.
+ *
+ * Must be called with event_mutex held.
+ */
+void event_log_err(const char *loc, const char *cmd, const char *fmt, ...)
+{
+   struct event_log_err *err;
+   va_list args;
+   int len;
+
+   err = get_event_log_err();
+   if (!err)
+   return;
+
+   snprintf(err->cmd, MAX_FILTER_STR_VAL,"\n  Command: %s\n", cmd);
+
+   len = snprintf(err->err, MAX_FILTER_STR_VAL, "ERROR(%s): ", loc);
+   if (len >= MAX_FILTER_STR_VAL)
+   return;
+
+   va_start(args, fmt);
+   vsnprintf(err->err + len, MAX_FILTER_STR_VAL - len, fmt, args);
+   va_end(args);
+}
+
+static void clear_event_err_log(void)
+{
+   free_page((long unsigned int)event_err_log);
+   event_err_log_tail = 0;
+   event_err_log = NULL;
+}
+
+static void *event_err_log_inc(loff_t *pos)
+{
+   struct event_log_err *err = NULL;
+   char *errpos = NULL;
+   int i = *pos;
+
+   ++*pos;
+
+   if (i >= EVENT_LOG_ERRS_MAX)
+   return NULL;
+
+   i += event_err_log_tail;
+   i &= EVENT_ERR_LOG_MASK;
+
+   errpos = event_err_log + (i * sizeof(*err));
+   err = (struct event_log_err *)errpos;
+
+   return err;
+}
+
+static void *event_err_log_seq_start(struct seq_file *m, loff_t *pos)
+{
+   mutex_lock(_mutex);
+
+   return