Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-24 Thread Arnaldo Carvalho de Melo
On Mon, Jun 23, 2014 at 03:41:39PM +0400, Stanislav Fomichev wrote:
> > > But we then need to predefine many probes for decoding to work in the form of
> > > func:offset, and then play catch-up with all the kernel changes.
> > > Or am I missing something important here?

> > No you don't.

> > If we want to disturb the system in the least way possible, we need to
> > tag along the copying from userspace of those pointers, so that we get
> > them fresh and just stash it in our ring buffer and get out of the way
> > quickly.

> I just thought maybe you have some grand plan in mind about automagically
> adding probes so argument tracing works transparently. I like the
> approach though.

First we use what we have in place, then we optimize it.
 
> > Almost a year ago, and it still works, now let's see the cset you mention...
> > 
> > [acme@zoo linux]$ git describe c4ad8f98bef77c7356aa6a9ad9188a6acc6b849d
> > v3.14-rc1-14-gc4ad8f98bef7
> > [acme@zoo linux]$
> > [root@zoo ~]# uname -r
> > 3.15.0-rc8+
> > 
> > Humm, what is the problem?

> I thought that result->name was actually set on the 65th line of
> getname_flags, so the above commit would move it to the 66th. But that's not
> the case, sorry for the confusion.
 
> > [1] And I feel like all of tools/perf/ is just that, reference
> > implementations, but hopefully done in such a way that may well be
> > useful as-is :-)

> I'd like perf to be a go-to tool for all kinds of performance analysis,

Yay, and you're working towards that, thanks!

> not just a reference implementation. I believe nobody looks at this
> reference, and we end up with tools like https://github.com/draios/sysdig

Never heard about it, will take a look, thanks for the pointer.

> which do their own events, ring buffer, etc.

There are several out there :)

- Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-24 Thread Namhyung Kim
Hi Arnaldo and Stanislav,

On Fri, 20 Jun 2014 10:21:05 -0300, Arnaldo Carvalho de Melo wrote:
> On Fri, Jun 20, 2014 at 02:49:42PM +0400, Stanislav Fomichev wrote:
>> This patch series adds support for pagefault tracing to the 'perf trace' command.
>> It seems this feature was planned by Namhyung Kim
>> (http://events.linuxfoundation.org/images/stories/pdf/klf2012_n_kim.pdf page 17/28)
>> but I couldn't find any prior patches/discussion and started from scratch.
>
> Just to clarify here, those slides came from slides I made and in turn
> the whole idea about pagefaults tracing I got from the trace prototype
> that Thomas Gleixner implemented in his 'trace'  utility, described
> here:
>
>   Announcing a new utility: 'trace'
>   http://lwn.net/Articles/415728/

Right, I asked Arnaldo to suggest some cool topics to introduce at
KLF 2012 and that was it. I had nothing to do with the features. :)

Keep up the nice work, guys!

Thanks,
Namhyung


Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-23 Thread David Ahern

On 6/20/14, 9:24 AM, Arnaldo Carvalho de Melo wrote:

> Right now it is too simple, but I was starting to work (when you jumped
> right in with your work making me stop and go on testing/reviewing :) )
> on making it more generic so that we could defer pretty printing the
> arguments from sys_enter to sys_exit, when, by then, we would already
> have an association of a user level pointer in some specific thread to
> its contents.
>
> This will allow us to resolve the pathname pointer in things like
> open() (i.e. not just after that, in the fd syscalls (write, etc)) and
> as well as any other pointer of interest.
>
> By librarizing 'builtin-probe.c', that now uses lots of global
> variables, etc, we would be able to insert probes where we want them to
> capture the contents of pointers, check if the probes are already in
> place, use just the ones that we managed to insert (i.e. that were not
> invalid because the places where we wanted them to be were changed
> across kernel releases, etc).
>
> I.e. no need for actual tracepoints from day one, just wannabe
> tracepoints using whatever probe inserting gizmo the kprobes_tracer used
> by 'perf probe' now thinks it's best to use.
>
> Combine that with using DWARF descriptions (that could be pre-cached
> into something like CTF (the DTrace kind of CTF) or similar) like pahole
> does and we would mostly automatically do all this work of prettifying
> syscall parameters.
>
> /handwave

That was so much handwaving you could keep cool at a World Cup game. :-)

David



Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-23 Thread Stanislav Fomichev
> > But we then need to predefine many probes for decoding to work in the form of
> > func:offset, and then play catch-up with all the kernel changes.
> > Or am I missing something important here?
> 
> No you don't.
> 
> If we want to disturb the system in the least way possible, we need to
> tag along the copying from userspace of those pointers, so that we get
> them fresh and just stash it in our ring buffer and get out of the way
> quickly.
I just thought maybe you have some grand plan in mind about automagically
adding probes so argument tracing works transparently. I like the
approach though.

> Almost a year ago, and it still works, now let's see the cset you mention...
> 
> [acme@zoo linux]$ git describe c4ad8f98bef77c7356aa6a9ad9188a6acc6b849d
> v3.14-rc1-14-gc4ad8f98bef7
> [acme@zoo linux]$
> [root@zoo ~]# uname -r
> 3.15.0-rc8+
> 
> Humm, what is the problem?
I thought that result->name was actually set on the 65th line of
getname_flags, so the above commit would move it to the 66th. But that's not
the case, sorry for the confusion.

> [1] And I feel like all of tools/perf/ is just that, reference
> implementations, but hopefully done in such a way that may well be
> useful as-is :-)
I'd like perf to be a go-to tool for all kinds of performance analysis,
not just a reference implementation. I believe nobody looks at this
reference, and we end up with tools like https://github.com/draios/sysdig
which do their own events, ring buffer, etc.


Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Arnaldo Carvalho de Melo
On Fri, Jun 20, 2014 at 08:18:59PM +0400, Stanislav Fomichev wrote:
> > Hey, haven't you seen the vfs_getname probe? Idea is to hook on where
> > the relevant copy_from_user is done and insert that into the ring
> > buffer, as we already do for mapping fd -> pathname.

> I saw it but didn't actually try because it needs all the debugging
> stuff enabled and in place.

Touché, more on that below...
 
> > I.e. no need for actual tracepoints from day one, just wannabe
> > tracepoints using whatever probe inserting gizmo the kprobes_tracer used
> > by 'perf probe' now thinks it's best to use.

> But we then need to predefine many probes for decoding to work in the form of
> func:offset, and then play catch-up with all the kernel changes.
> Or am I missing something important here?

No you don't.

If we want to disturb the system in the least way possible, we need to
tag along the copying from userspace of those pointers, so that we get
them fresh and just stash it in our ring buffer and get out of the way
quickly.
 
> > For now try:
> > 
> >   perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
> >   trace
> > 
> > And look at how it manages to decode fds.

> I will try, but does 65 still work after 
> c4ad8f98bef77c7356aa6a9ad9188a6acc6b849d? :-)

Well, when I prototyped this[1] the idea was that in some areas there is
not that much code flux, so before committing to any kind of new
interface, be it tracepoints or something else, we may well just use
'perf probe' to get what we need, and this was done in...

  commit 75b757ca90469e990e6901f4a9497fe4161f7f5a
  Author: Arnaldo Carvalho de Melo 
  Date:   Tue Sep 24 11:04:32 2013 -0300

Almost a year ago, and it still works, now let's see the cset you mention...

[acme@zoo linux]$ git describe c4ad8f98bef77c7356aa6a9ad9188a6acc6b849d
v3.14-rc1-14-gc4ad8f98bef7
[acme@zoo linux]$
[root@zoo ~]# uname -r
3.15.0-rc8+

Humm, what is the problem?
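
For reference, the 'git describe' string above reads as: nearest tag, number of commits on top of that tag, abbreviated hash. A minimal sketch of that check in Python follows. The helper names are made up for illustration, and comparing only major.minor is a simplifying assumption (it presumes the commit went in through mainline during the 3.14 cycle, so any 3.15-rc kernel carries it):

```python
import re


def parse_git_describe(text):
    """Split 'TAG-N-gHASH' into (tag, commits_since_tag, abbreviated_hash)."""
    m = re.fullmatch(r"(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)", text.strip())
    if m is None:
        raise ValueError(f"not a git-describe string: {text!r}")
    return m.group("tag"), int(m.group("count")), m.group("sha")


def kernel_version_tuple(release):
    """Turn 'v3.14-rc1' or '3.15.0-rc8+' into (major, minor) for a coarse comparison."""
    major, minor = re.match(r"v?(\d+)\.(\d+)", release).groups()
    return int(major), int(minor)


tag, count, sha = parse_git_describe("v3.14-rc1-14-gc4ad8f98bef7")
running = "3.15.0-rc8+"  # from 'uname -r' above
print(tag, count, sha)
print("running kernel is newer than the tag:",
      kernel_version_tuple(running) > kernel_version_tuple(tag))
```

So the cset in question sits 14 commits after v3.14-rc1, and the probe at getname_flags:65 is being tried on a kernel that already contains it.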

[root@zoo ~]# perf probe -V getname_flags:65
Available variables at getname_flags:65
        @
                char*   filename
                int     len
                int*    empty
                long int        max
                struct filename*        result
[root@zoo ~]#
[root@zoo ~]# perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
Added new event:
  probe:vfs_getname    (on getname_flags:65 with pathname=result->name:string)

You can now use it in all perf tools, such as:

        perf record -e probe:vfs_getname -aR sleep 1

[root@zoo ~]# perf record -e probe:vfs_getname -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.133 MB perf.data (~49505 samples) ]
[root@zoo ~]# perf evlist
probe:vfs_getname
[root@zoo ~]# perf evlist -v
probe:vfs_getname: sample_freq=1, type: 2, config: 1317, size: 96, sample_type: 
IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, 
sample_id_all: 1, exclude_guest: 1
[root@zoo ~]# perf script 
      perf 11255 [003] 156054.623210: probe:vfs_getname: (811c2e43) pathname="/home/acme/libexec/perf-core/sleep"
      perf 11255 [003] 156054.624759: probe:vfs_getname: (811c2e43) pathname="/usr/lib64/qt-3.3/bin/sleep"
      perf 11255 [003] 156054.624782: probe:vfs_getname: (811c2e43) pathname="/usr/lib64/ccache/sleep"
      perf 11255 [003] 156054.624794: probe:vfs_getname: (811c2e43) pathname="/usr/local/sbin/sleep"
      perf 11255 [003] 156054.624809: probe:vfs_getname: (811c2e43) pathname="/usr/local/bin/sleep"
      perf 11255 [003] 156054.624818: probe:vfs_getname: (811c2e43) pathname="/sbin/sleep"
      perf 11255 [003] 156054.625017: probe:vfs_getname: (811c2e43) pathname="/bin/sleep"
     sleep 11255 [002] 156054.626093: probe:vfs_getname: (811c2e43) pathname="/etc/ld.so.preload"
     sleep 11255 [002] 156054.626114: probe:vfs_getname: (811c2e43) pathname="/etc/ld.so.cache"
     sleep 11255 [002] 156054.626159: probe:vfs_getname: (811c2e43) pathname="/lib64/libc.so.6"
     sleep 11255 [002] 156054.626751: probe:vfs_getname: (811c2e43) pathname="/usr/lib/locale/locale-archive"
goa-daemon  2082 [003] 156054.955138: probe:vfs_getname: (811c2e43) pathname="/etc/localtime"
goa-daemon  2082 [003] 156054.955573: probe:vfs_getname: (811c2e43) pathname="/etc/localtime"
[root@zoo ~]# 
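
To post-process output like the above outside of perf, something along these lines could work. It is only a sketch: the regular expression is an assumption based on the lines shown here (one event per line, a comm without embedded spaces, a single pathname="..." field), not a description of a stable 'perf script' format, and parse_vfs_getname is a made-up name.

```python
import re

# Fields as they appear in the sample above:
#   comm pid [cpu] timestamp: event: (address) pathname="..."
EVENT_RE = re.compile(
    r"^\s*(?P<comm>\S+)\s+(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<time>\d+\.\d+):\s+(?P<event>\S+):\s+"
    r"\((?P<addr>[0-9a-f]+)\)\s+pathname=\"(?P<pathname>[^\"]*)\"\s*$"
)


def parse_vfs_getname(line):
    """Return a dict for one event line, or None if the line does not match."""
    m = EVENT_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["cpu"] = int(fields["cpu"])
    fields["time"] = float(fields["time"])
    return fields


sample = ('sleep 11255 [002] 156054.626159: probe:vfs_getname: '
          '(811c2e43) pathname="/lib64/libc.so.6"')
event = parse_vfs_getname(sample)
print(event["comm"], event["pid"], event["pathname"])
```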

Best possible way to do this? Guess not, but I'm looking at it from a tooling
perspective, i.e. using what is available rather than adding requirements
to the kernel or toolchain. That we can do after we prototype in the best way
possible with existing facilities.

- Arnaldo

[1] And I feel like all of tools/perf/ is just that, reference implementations,
but hopefully done in such a way that may well be useful as-is :-)

Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Stanislav Fomichev
> Hey, haven't you seen the vfs_getname probe? Idea is to hook on where
> the relevant copy_from_user is done and insert that into the ring
> buffer, as we already do for mapping fd -> pathname.
I saw it but didn't actually try because it needs all the debugging
stuff enabled and in place.

> I.e. no need for actual tracepoints from day one, just wannabe
> tracepoints using whatever probe inserting gizmo the kprobes_tracer used
> by 'perf probe' now thinks it's best to use.
But we then need to predefine many probes for decoding to work in the form of
func:offset, and then play catch-up with all the kernel changes.
Or am I missing something important here?

> For now try:
> 
>   perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
>   trace
> 
> And look at how it manages to decode fds.
I will try, but does 65 still work after 
c4ad8f98bef77c7356aa6a9ad9188a6acc6b849d? :-)


Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Arnaldo Carvalho de Melo
On Fri, Jun 20, 2014 at 07:03:18PM +0400, Stanislav Fomichev wrote:
> > Just to clarify here, those slides came from slides I made and in turn
> > the whole idea about pagefaults tracing I got from the trace prototype
> > that Thomas Gleixner implemented in his 'trace'  utility, described
> > here:

> >   Announcing a new utility: 'trace'
> >   http://lwn.net/Articles/415728/

> > The comments section has lots of interesting ideas, some you may find
> > interesting to implement :-)

> > There is a branch in my tree with the branch tglx did his work on:

> > https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=tmp.perf/trace2

> Wow, thanks, I tried to search lkml for any presence of
> patches/discussion about these slides, but couldn't find anything, thanks for
> pointing it out.
 
> I really like the 'blocking/preempted' indication and of course I miss
> pointer decoding.
 
> Did anyone really think about decoding pointers and how we can
> implement it (like dumping them upon entering a syscall and then
> using inside the perf trace?)?

Hey, haven't you seen the vfs_getname probe? Idea is to hook on where
the relevant copy_from_user is done and insert that into the ring
buffer, as we already do for mapping fd -> pathname.
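
As a rough illustration of the fd -> pathname idea (not the actual builtin-trace.c code; the class and method names are hypothetical, and keying the table by thread id is a simplifying assumption), the bookkeeping could look like this in Python:

```python
class FdTable:
    """Associates file descriptors with pathnames seen via a vfs_getname-style probe."""

    def __init__(self):
        self.pending = {}  # tid -> pathname most recently copied from userspace
        self.paths = {}    # (tid, fd) -> pathname

    def on_vfs_getname(self, tid, pathname):
        # The probe fires while the open() syscall is still in flight.
        self.pending[tid] = pathname

    def on_open_exit(self, tid, ret):
        # A non-negative return value is the new fd; bind it to the stashed name.
        pathname = self.pending.pop(tid, None)
        if ret >= 0 and pathname is not None:
            self.paths[(tid, ret)] = pathname

    def describe_fd(self, tid, fd):
        path = self.paths.get((tid, fd))
        return f"{fd}<{path}>" if path else str(fd)


table = FdTable()
table.on_vfs_getname(11255, "/etc/ld.so.cache")
table.on_open_exit(11255, 3)
print(table.describe_fd(11255, 3))
print(table.describe_fd(11255, 4))
```

Later fd syscalls (read, write, close, ...) can then print the fd together with the pathname it was opened from, while unknown fds fall back to the bare number.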

Right now it is too simple, but I was starting to work (when you jumped
right in with your work making me stop and go on testing/reviewing :) )
on making it more generic so that we could defer pretty printing the
arguments from sys_enter to sys_exit, when, by then, we would already
have an association of a user level pointer in some specific thread to
its contents.

This will allow us to resolve the pathname pointer in things like
open() (i.e. not just after that, in the fd syscalls (write, etc)) and
as well as any other pointer of interest.
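
A minimal sketch of deferring the pretty printing to sys_exit, in Python. Everything here is illustrative: the class, the (value, is_pointer) argument encoding and the output format are assumptions made for the example, not what 'perf trace' does today.

```python
class DeferredSyscallPrinter:
    """Hold sys_enter arguments until sys_exit, when pointer contents are known."""

    def __init__(self):
        self.entered = {}  # tid -> (syscall name, [(value, is_pointer), ...])
        self.strings = {}  # tid -> {user pointer: contents captured by a probe}

    def sys_enter(self, tid, name, args):
        self.entered[tid] = (name, args)
        self.strings[tid] = {}

    def pointer_contents(self, tid, ptr, contents):
        # Called from the probe that tags along the copy from userspace.
        self.strings.setdefault(tid, {})[ptr] = contents

    def sys_exit(self, tid, ret):
        name, args = self.entered.pop(tid)
        known = self.strings.pop(tid, {})
        shown = []
        for value, is_pointer in args:
            if is_pointer and value in known:
                shown.append(f'"{known[value]}"')  # resolved pointer
            elif is_pointer:
                shown.append(hex(value))           # nothing captured, show the address
            else:
                shown.append(str(value))
        return f"{name}({', '.join(shown)}) = {ret}"


printer = DeferredSyscallPrinter()
printer.sys_enter(11255, "open", [(0x7FFD1000, True), (0, False)])
printer.pointer_contents(11255, 0x7FFD1000, "/etc/ld.so.cache")
print(printer.sys_exit(11255, 3))
```

The point of the design is that nothing is formatted at sys_enter, so the line printed at sys_exit can already contain the string the pointer referred to.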

By librarizing 'builtin-probe.c', that now uses lots of global
variables, etc, we would be able to insert probes where we want them to
capture the contents of pointers, check if the probes are already in
place, use just the ones that we managed to insert (i.e. that were not
invalid because the places where we wanted them to be were changed
across kernel releases, etc).
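
The "check what is already in place, then use only what we managed to insert" logic could be sketched as below (Python). The function and its parameters are hypothetical; 'insert' stands in for whatever actually places the probe, e.g. a wrapper around 'perf probe', and the getname_flags:66 alternative is an invented fallback used purely as an example.

```python
def install_probes(candidates, existing, insert):
    """Return the names of the probes that are usable.

    candidates: dict of probe name -> list of definitions to try, in order
    existing:   set of probe names that are already in place
    insert:     callable(definition) -> bool, True if the probe was accepted
    """
    usable = []
    for name, definitions in candidates.items():
        if name in existing:
            usable.append(name)  # already there, nothing to do
            continue
        # any() stops at the first definition that the kernel accepts.
        if any(insert(definition) for definition in definitions):
            usable.append(name)
    return usable


candidates = {
    "vfs_getname": [
        "vfs_getname=getname_flags:66 pathname=result->name:string",  # invented fallback
        "vfs_getname=getname_flags:65 pathname=result->name:string",
    ],
    "bogus": ["bogus=no_such_function"],
}


def fake_insert(definition):
    # Stand-in for the real insertion: pretend only line 65 is valid on this kernel.
    return "getname_flags:65 " in definition


print(install_probes(candidates, existing=set(), insert=fake_insert))
```

Probes that could not be placed are simply left out, and the tool degrades to printing raw pointers for those arguments.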

I.e. no need for actual tracepoints from day one, just wannabe
tracepoints using whatever probe inserting gizmo the kprobes_tracer used
by 'perf probe' now thinks it's best to use.

Combine that with using DWARF descriptions (that could be pre-cached
into something like CTF (the DTrace kind of CTF) or similar) like pahole
does and we would mostly automatically do all this work of prettifying
syscall parameters.



:-)

For now try:

  perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
  trace

And look at how it manages to decode fds.

- Arnaldo


Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Stanislav Fomichev
> Just to clarify here, those slides came from slides I made and in turn
> the whole idea about pagefaults tracing I got from the trace prototype
> that Thomas Gleixner implemented in his 'trace'  utility, described
> here:
> 
>   Announcing a new utility: 'trace'
>   http://lwn.net/Articles/415728/
> 
> The comments section has lots of interesting ideas, some you may find
> interesting to implement :-)
> 
> There is a branch in my tree with the branch tglx did his work on:
> 
> https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=tmp.perf/trace2
Wow, thanks, I tried to search lkml for any presence of
patches/discussion about these slides, but couldn't find anything, thanks for
pointing it out.

I really like the 'blocking/preempted' indication and of course I miss
pointer decoding.

Did anyone really think about decoding pointers and how we can
implement it (like dumping them upon entering a syscall and then
using inside the perf trace?)?


Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Arnaldo Carvalho de Melo
On Fri, Jun 20, 2014 at 02:49:42PM +0400, Stanislav Fomichev wrote:
> This patch series adds support for pagefault tracing to the 'perf trace' command.
> It seems this feature was planned by Namhyung Kim
> (http://events.linuxfoundation.org/images/stories/pdf/klf2012_n_kim.pdf page 17/28)
> but I couldn't find any prior patches/discussion and started from scratch.

Just to clarify here, those slides came from slides I made and in turn
the whole idea about pagefaults tracing I got from the trace prototype
that Thomas Gleixner implemented in his 'trace'  utility, described
here:

  Announcing a new utility: 'trace'
  http://lwn.net/Articles/415728/

The comments section has lots of interesting ideas, some you may find
interesting to implement :-)

There is a branch in my tree with the branch tglx did his work on:

https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=tmp.perf/trace2

There you can take a look and compare what you're doing to what he did.

Now I'll go thru your current patches and will cherry pick whatever I
think is OK already, and will try and provide comments for whatever I
think needs more work.

- Arnaldo
 
> The first three patches add the feature and options to enable faults and disable
> syscalls.
> The last two patches add events caching (like it's done in perf kvm), so that
> we don't get fault events prior to mmap/comm events (makes sense only
> for live mode).
> 
> This is just a proof-of-concept, and I'd like to get some comments about
> where and what I got wrong and what additional useful information I can
> expose in the trace.
> 
> v2:
>   - added more info to the changelogs
>   - reworked options (-f -> -F, --pgfaults -> --pf=[all|min|maj])
>   - separated tracepoint_handler changes into additional patch
>   - separated record/replay into additional patch
>   - other fixes pointed out by Arnaldo Carvalho de Melo
> 
> Stanislav Fomichev (7):
>   perf trace: add perf_event parameter to tracepoint_handler
>   perf trace: add support for pagefault tracing
>   perf trace: add pagefaults record and replay support
>   perf trace: add pagefault statistics
>   perf trace: add possibility to switch off syscall events
>   perf kvm: move perf_kvm__mmap_read into session utils
>   perf trace: add events cache
> 
>  tools/perf/Documentation/perf-trace.txt |  19 ++
>  tools/perf/builtin-kvm.c                |  88 +---
>  tools/perf/builtin-trace.c              | 350 ++--
>  tools/perf/util/session.c               |  85
>  tools/perf/util/session.h               |   5 +
>  5 files changed, 357 insertions(+), 190 deletions(-)
> 
> -- 
> 1.8.3.2


[PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Stanislav Fomichev
This patch series adds support for pagefault tracing to the 'perf trace' command.
It seems this feature was planned by Namhyung Kim
(http://events.linuxfoundation.org/images/stories/pdf/klf2012_n_kim.pdf page 17/28)
but I couldn't find any prior patches/discussion and started from scratch.

The first three patches add the feature and options to enable faults and disable
syscalls.
The last two patches add events caching (like it's done in perf kvm), so that
we don't get fault events prior to mmap/comm events (makes sense only
for live mode).
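
To make the ordering problem concrete: in live mode events are read from several ring buffers, so a fault can be seen before the mmap/comm event that is needed to resolve it. A minimal sketch of a timestamp-ordered cache follows (Python; the EventCache name and its flush policy are assumptions for illustration, not the code in this series or in perf kvm):

```python
import heapq
import itertools


class EventCache:
    """Buffer events and hand them out oldest first."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: equal timestamps keep arrival order

    def push(self, timestamp, event):
        heapq.heappush(self._heap, (timestamp, next(self._seq), event))

    def flush(self, up_to):
        """Yield cached events with timestamp <= up_to, in timestamp order."""
        while self._heap and self._heap[0][0] <= up_to:
            timestamp, _, event = heapq.heappop(self._heap)
            yield timestamp, event


cache = EventCache()
cache.push(30, "pagefault")  # read first, but happened last
cache.push(10, "comm")
cache.push(20, "mmap")
cache.push(40, "pagefault (too recent, stays cached)")
for timestamp, event in cache.flush(up_to=30):
    print(timestamp, event)
```

Only events older than a point that every buffer has already passed are flushed, so the comm and mmap events are processed before the fault that depends on them.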

This is just a proof-of-concept, and I'd like to get some comments about
where and what I got wrong and what additional useful information I can
expose in the trace.

v2:
  - added more info to the changelogs
  - reworked options (-f -> -F, --pgfaults -> --pf=[all|min|maj])
  - separated tracepoint_handler changes into additional patch
  - separated record/replay into additional patch
  - other fixes pointed out by Arnaldo Carvalho de Melo

Stanislav Fomichev (7):
  perf trace: add perf_event parameter to tracepoint_handler
  perf trace: add support for pagefault tracing
  perf trace: add pagefaults record and replay support
  perf trace: add pagefault statistics
  perf trace: add possibility to switch off syscall events
  perf kvm: move perf_kvm__mmap_read into session utils
  perf trace: add events cache

 tools/perf/Documentation/perf-trace.txt |  19 ++
 tools/perf/builtin-kvm.c                |  88 +---
 tools/perf/builtin-trace.c              | 350 ++--
 tools/perf/util/session.c               |  85
 tools/perf/util/session.h               |   5 +
 5 files changed, 357 insertions(+), 190 deletions(-)

-- 
1.8.3.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Stanislav Fomichev
This patch series adds support for pagefaults tracing to 'perf trace' command.
It seems this feature was planned by Namhyung Kim 
(http://events.linuxfoundation.org/images/stories/pdf/klf2012_n_kim.pdf page 
17/28)
but I couldn't find any prior patches/discussion and started from scratch.

First three patches add the feature and options to enable faults and disable
syscalls.
Two last patches add events caching (like it's done in the perf kvm), so that
we don't get fault events prior to mmap/comm events (makes sense only
for live mode).

This is just a proof-of-concept, and I'd like to get some comments about
where and what I got wrong and what additional useful information I can
expose in the trace.

v2:
  - added more info to the changelogs
  - reworked options (-f - -F, --pgfaults - --pf=[all|min|maj])
  - separated tracepoint_handler changes into additional patch
  - separated record/replay into additional patch
  - other fixes pointed out by Arnaldo Carvalho de Melo

Stanislav Fomichev (7):
  perf trace: add perf_event parameter to tracepoint_handler
  perf trace: add support for pagefault tracing
  perf trace: add pagefaults record and replay support
  perf trace: add pagefault statistics
  perf trace: add possibility to switch off syscall events
  perf kvm: move perf_kvm__mmap_read into session utils
  perf trace: add events cache

 tools/perf/Documentation/perf-trace.txt |  19 ++
 tools/perf/builtin-kvm.c|  88 +---
 tools/perf/builtin-trace.c  | 350 ++--
 tools/perf/util/session.c   |  85 
 tools/perf/util/session.h   |   5 +
 5 files changed, 357 insertions(+), 190 deletions(-)

-- 
1.8.3.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Arnaldo Carvalho de Melo
Em Fri, Jun 20, 2014 at 02:49:42PM +0400, Stanislav Fomichev escreveu:
 This patch series adds support for pagefaults tracing to 'perf trace' command.
 It seems this feature was planned by Namhyung Kim 
 (http://events.linuxfoundation.org/images/stories/pdf/klf2012_n_kim.pdf page 
 17/28)
 but I couldn't find any prior patches/discussion and started from scratch.

Just to clarify here, those slides came from slides I made and in turn
the whole idea about pagefaults tracing I got from the trace prototype
that Thomas Gleixner implemented in his 'trace'  utility, described
here:

  Announcing a new utility: 'trace'
  http://lwn.net/Articles/415728/

The comments section has lots of interesting ideas, some you may find
interesting to implement :-)

There is a branch in my tree with the branch tglx did his work on:

https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=tmp.perf/trace2

There you can take a look and compare what you're doing to what he did.

Now I'll go through your current patches and cherry-pick whatever I
think is already OK, and will try to provide comments on whatever I
think needs more work.

- Arnaldo
 
> First three patches add the feature and options to enable faults and disable
> syscalls.
> Two last patches add events caching (like it's done in the perf kvm), so that
> we don't get fault events prior to mmap/comm events (makes sense only
> for live mode).
>
> This is just a proof-of-concept, and I'd like to get some comments about
> where and what I got wrong and what additional useful information I can
> expose in the trace.
>
> v2:
>   - added more info to the changelogs
>   - reworked options (-f -> -F, --pgfaults -> --pf=[all|min|maj])
>   - separated tracepoint_handler changes into additional patch
>   - separated record/replay into additional patch
>   - other fixes pointed out by Arnaldo Carvalho de Melo
>
> Stanislav Fomichev (7):
>   perf trace: add perf_event parameter to tracepoint_handler
>   perf trace: add support for pagefault tracing
>   perf trace: add pagefaults record and replay support
>   perf trace: add pagefault statistics
>   perf trace: add possibility to switch off syscall events
>   perf kvm: move perf_kvm__mmap_read into session utils
>   perf trace: add events cache
>
>  tools/perf/Documentation/perf-trace.txt |  19 ++
>  tools/perf/builtin-kvm.c                |  88 +---
>  tools/perf/builtin-trace.c              | 350 ++--
>  tools/perf/util/session.c               |  85 
>  tools/perf/util/session.h               |   5 +
>  5 files changed, 357 insertions(+), 190 deletions(-)
>
> -- 
> 1.8.3.2


Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Stanislav Fomichev
> Just to clarify here, those slides came from slides I made and in turn
> the whole idea about pagefaults tracing I got from the trace prototype
> that Thomas Gleixner implemented in his 'trace'  utility, described
> here:
>
>   Announcing a new utility: 'trace'
>   http://lwn.net/Articles/415728/
>
> The comments section has lots of interesting ideas, some you may find
> interesting to implement :-)
>
> There is a branch in my tree with the branch tglx did his work on:
>
> https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=tmp.perf/trace2
Wow, thanks, I tried to search lkml for any presence of
patches/discussion about these slides, but couldn't find anything, thanks for
pointing it out.

I really like 'blocking/preempted' indication and of course I miss
pointers decoding.

Did anyone really think about decoding pointers and how we can
implement it (like dumping them upon entering a syscall and then
using inside the perf trace?)?


Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Arnaldo Carvalho de Melo
Em Fri, Jun 20, 2014 at 07:03:18PM +0400, Stanislav Fomichev escreveu:
> > Just to clarify here, those slides came from slides I made and in turn
> > the whole idea about pagefaults tracing I got from the trace prototype
> > that Thomas Gleixner implemented in his 'trace'  utility, described
> > here:
> >
> >   Announcing a new utility: 'trace'
> >   http://lwn.net/Articles/415728/
> >
> > The comments section has lots of interesting ideas, some you may find
> > interesting to implement :-)
> >
> > There is a branch in my tree with the branch tglx did his work on:
> >
> > https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=tmp.perf/trace2

> Wow, thanks, I tried to search lkml for any presence of
> patches/discussion about these slides, but couldn't find anything, thanks for
> pointing it out.
>
> I really like 'blocking/preempted' indication and of course I miss
> pointers decoding.
>
> Did anyone really think about decoding pointers and how we can
> implement it (like dumping them upon entering a syscall and then
> using inside the perf trace?)?

Hey, haven't you seen the vfs_getname probe? Idea is to hook on where
the relevant copy_from_user is done and insert that into the ring
buffer, as we already do for mapping fd -> pathname.

Right now it is too simple, but I was starting to work (when you jumped
right in with your work making me stop and go on testing/reviewing :) )
on making it more generic so that we could defer pretty printing the
arguments from sys_enter to sys_exit, when, by then, we would already
have an association of a user level pointer in some specific thread to
its contents.

This will allow us to resolve the pathname pointer in things like
open() (i.e. not just after that, in the fd syscalls (write, etc)), as
well as any other pointer of interest.
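
To make the sys_enter/sys_exit deferral concrete, here is a minimal sketch
in Python (not the C of builtin-trace.c). The class and its method names
are hypothetical, made up for illustration; none of this is an existing
perf interface. It only shows the bookkeeping: keep the raw arguments at
sys_enter, remember per thread what a probe captured for each user
pointer, and format at sys_exit.

```python
from collections import defaultdict


class DeferredTrace:
    """Toy model: hold sys_enter args until sys_exit, resolve pointers late."""

    def __init__(self):
        # tid -> {user pointer -> string captured by a probe like vfs_getname}
        self.ptr_contents = defaultdict(dict)
        # tid -> (syscall name, raw args) for the syscall currently in flight
        self.pending = {}

    def on_sys_enter(self, tid, name, args):
        # Do not print yet: pointer contents are unknown at this point.
        self.pending[tid] = (name, list(args))

    def on_probe(self, tid, ptr, contents):
        # Fired where the kernel copies the data from userspace.
        self.ptr_contents[tid][ptr] = contents

    def on_sys_exit(self, tid, ret):
        name, args = self.pending.pop(tid)
        known = self.ptr_contents.pop(tid, {})
        # Arguments with captured contents are shown decoded, the rest in hex.
        shown = [repr(known[a]) if a in known else hex(a) for a in args]
        return f"{name}({', '.join(shown)}) = {ret}"


trace = DeferredTrace()
trace.on_sys_enter(42, "open", [0x7FFD1000, 0])
trace.on_probe(42, 0x7FFD1000, "/etc/passwd")
print(trace.on_sys_exit(42, 3))
```

The association is keyed by thread because the same pointer value can
mean different things in different threads, and it is dropped at
sys_exit so stale contents are not reused by the next syscall.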

By librarizing 'builtin-probe.c', that now uses lots of global
variables, etc, we would be able to insert probes where we want them to
capture the contents of pointers, check if the probes are already in
place, use just the ones that we managed to insert (i.e. that were not
invalid because the places where we wanted them to be were changed
across kernel releases, etc).
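
A rough sketch of that "use just the ones we managed to insert" logic,
again in Python and only as an illustration. `try_insert` is a
hypothetical callback standing in for whatever a librarized
builtin-probe.c would expose; no such function exists today.

```python
def usable_probes(wanted, existing, try_insert):
    """Return the set of probe names that can be used for decoding.

    wanted:     dict of probe name -> 'perf probe' style definition
    existing:   set of probe names already in place
    try_insert: hypothetical callable(definition) -> bool, True on success
    """
    usable = set()
    for name, definition in wanted.items():
        if name in existing:
            usable.add(name)  # already there, nothing to insert
        elif try_insert(definition):
            usable.add(name)  # the definition still matches this kernel
        # otherwise: the code moved, decode without this probe
    return usable


wanted = {
    "vfs_getname": "vfs_getname=getname_flags:65 pathname=result->name:string",
}
# Pretend the insertion fails, e.g. because line 65 moved in this kernel.
print(usable_probes(wanted, existing=set(), try_insert=lambda d: False))
```

A tool built this way degrades to printing raw pointers for whatever it
could not probe, instead of failing outright.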

I.e. no need for actual tracepoints from day one, just wannabe
tracepoints using whatever probe inserting gizmo the kprobes_tracer used
by 'perf probe' now thinks it's best to use.

Combine that with using DWARF descriptions (that could be pre-cached
into something like CTF (the DTrace kind of CTF) or similar) like pahole
does and we would mostly automatically do all this work of prettifying
syscall parameters.

/handwave

:-)

For now try:

  perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
  trace

And look at how it manages to decode fds.

- Arnaldo


Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Stanislav Fomichev
> Hey, haven't you seen the vfs_getname probe? Idea is to hook on where
> the relevant copy_from_user is done and insert that into the ring
> buffer, as we already do for mapping fd -> pathname.
I saw it but didn't actually try because it needs all the debugging
stuff enabled and in place.

> I.e. no need for actual tracepoints from day one, just wannabe
> tracepoints using whatever probe inserting gizmo the kprobes_tracer used
> by 'perf probe' now thinks it's best to use.
But we then need to predefine many probes for decoding to work in the form of
func:offset, and then play catch-up with all the kernel changes.
Or I miss something important here?

> For now try:
>
>   perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
>   trace
>
> And look at how it manages to decode fds.
I will try, but does 65 still work after 
c4ad8f98bef77c7356aa6a9ad9188a6acc6b849d? :-)


Re: [PATCH v2 0/7] perf trace pagefaults

2014-06-20 Thread Arnaldo Carvalho de Melo
Em Fri, Jun 20, 2014 at 08:18:59PM +0400, Stanislav Fomichev escreveu:
> > Hey, haven't you seen the vfs_getname probe? Idea is to hook on where
> > the relevant copy_from_user is done and insert that into the ring
> > buffer, as we already do for mapping fd -> pathname.

> I saw it but didn't actually try because it needs all the debugging
> stuff enabled and in place.

Touché, more on that below...
 
> > I.e. no need for actual tracepoints from day one, just wannabe
> > tracepoints using whatever probe inserting gizmo the kprobes_tracer used
> > by 'perf probe' now thinks it's best to use.

> But we then need to predefine many probes for decoding to work in the form of
> func:offset, and then play catch-up with all the kernel changes.
> Or I miss something important here?

No you don't.

If we want to disturb the system in the least way possible, we need to
tag along the copying from userspace of those pointers, so that we get
them fresh and just stash it in our ring buffer and get out of the way
quickly.
 
> > For now try:
> >
> >   perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
> >   trace
> >
> > And look at how it manages to decode fds.

> I will try, but does 65 still work after
> c4ad8f98bef77c7356aa6a9ad9188a6acc6b849d? :-)

Well, when I prototyped this[1] the idea was that in some areas there is
not that much code flux, so before committing to any kind of new
interface, be it tracepoints or something else, we may well just use
'perf probe' to get what we need, and this was done in...

  commit 75b757ca90469e990e6901f4a9497fe4161f7f5a
  Author: Arnaldo Carvalho de Melo a...@redhat.com
  Date:   Tue Sep 24 11:04:32 2013 -0300

Almost a year ago, and it still works, now lets see the cset you mention...

[acme@zoo linux]$ git describe c4ad8f98bef77c7356aa6a9ad9188a6acc6b849d
v3.14-rc1-14-gc4ad8f98bef7
[acme@zoo linux]$
[root@zoo ~]# uname -r
3.15.0-rc8+

Humm, what is the problem?

[root@zoo ~]# perf probe -V getname_flags:65
Available variables at getname_flags:65
        @<getname_flags+227>
                char*   filename
                int     len
                int*    empty
                long int        max
                struct filename*        result
[root@zoo ~]#
[root@zoo ~]# perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
Added new event:
  probe:vfs_getname    (on getname_flags:65 with pathname=result->name:string)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_getname -aR sleep 1

[root@zoo ~]# perf record -e probe:vfs_getname -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.133 MB perf.data (~49505 samples) ]
[root@zoo ~]# perf evlist
probe:vfs_getname
[root@zoo ~]# perf evlist -v
probe:vfs_getname: sample_freq=1, type: 2, config: 1317, size: 96, sample_type: 
IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, 
sample_id_all: 1, exclude_guest: 1
[root@zoo ~]# perf script 
        perf 11255 [003] 156054.623210: probe:vfs_getname: (811c2e43) pathname=/home/acme/libexec/perf-core/sleep
        perf 11255 [003] 156054.624759: probe:vfs_getname: (811c2e43) pathname=/usr/lib64/qt-3.3/bin/sleep
        perf 11255 [003] 156054.624782: probe:vfs_getname: (811c2e43) pathname=/usr/lib64/ccache/sleep
        perf 11255 [003] 156054.624794: probe:vfs_getname: (811c2e43) pathname=/usr/local/sbin/sleep
        perf 11255 [003] 156054.624809: probe:vfs_getname: (811c2e43) pathname=/usr/local/bin/sleep
        perf 11255 [003] 156054.624818: probe:vfs_getname: (811c2e43) pathname=/sbin/sleep
        perf 11255 [003] 156054.625017: probe:vfs_getname: (811c2e43) pathname=/bin/sleep
       sleep 11255 [002] 156054.626093: probe:vfs_getname: (811c2e43) pathname=/etc/ld.so.preload
       sleep 11255 [002] 156054.626114: probe:vfs_getname: (811c2e43) pathname=/etc/ld.so.cache
       sleep 11255 [002] 156054.626159: probe:vfs_getname: (811c2e43) pathname=/lib64/libc.so.6
       sleep 11255 [002] 156054.626751: probe:vfs_getname: (811c2e43) pathname=/usr/lib/locale/locale-archive
  goa-daemon  2082 [003] 156054.955138: probe:vfs_getname: (811c2e43) pathname=/etc/localtime
  goa-daemon  2082 [003] 156054.955573: probe:vfs_getname: (811c2e43) pathname=/etc/localtime
[root@zoo ~]# 
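
For anyone wanting to post-process such records outside perf, a small
Python sketch follows. It assumes each record is on a single line (a
mail archive may wrap them) and that the command name contains no
spaces. The regular expression is an assumption fitted to the fields
shown in this session, not a documented perf format.

```python
import re

# comm, pid, [cpu], timestamp, event name, (address), then the probe argument
RECORD = re.compile(
    r"^\s*(?P<comm>\S+)\s+(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<time>\d+\.\d+):\s+(?P<event>[\w:]+):\s+"
    r"\((?P<addr>[0-9a-f]+)\)\s+pathname=(?P<pathname>.*)$"
)


def parse_record(line):
    """Return a dict for one vfs_getname record, or None if it doesn't match."""
    m = RECORD.match(line)
    if m is None:
        return None
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "cpu": int(m["cpu"]),
        "event": m["event"],
        # Some perf versions quote string arguments; drop the quotes if present.
        "pathname": m["pathname"].strip().strip('"'),
    }


line = ("   sleep 11255 [002] 156054.626093: probe:vfs_getname: "
        "(811c2e43) pathname=/etc/ld.so.preload")
print(parse_record(line))
```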

Best possible way to do this? Guess not, but I'm looking at it from a tooling
perspective, i.e. using what is available rather than adding requirements
to the kernel or toolchain; that we can do after we prototype in the best way
possible with existing facilities.

- Arnaldo

[1] And I feel like all of tools/perf/ is just that, reference implementations,
but hopefully done in such a way that they may well be useful as-is :-)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to