Re: [iovisor-dev] [LPC] User Space Dynamic Tracing

2016-12-20 Thread Joe Lawrence via iovisor-dev
On 12/18/2016 01:54 PM, Suchakra wrote:

> I spent some time researching dynamic instrumentation for tracing a
> few years back. We started with a ptrace-based approach and developed
> kaji (https://github.com/5kg/kaji) as a demo for inserting precompiled
> lttng tracepoints. This was also tested with dyninst. Here are some
> observations (http://www.dorsal.polymtl.ca/fr/system/files/11Dec2013.pdf).
> Here is also some investigation on what happens underneath
> (https://github.com/tuxology/lttng-ubench/blob/master/analysis/instr/dyninst_kaji).
> 
> In addition, some other things to look out for would be :
> 1. Fast Tracepoints in GDB that use a similar trampoline approach
> 2. SystemTap's Stapdyn approach which uses dyninst
> (https://research.cs.wisc.edu/htcondor/HTCondorWeek2013/paradyn-slides/stone-stapdyn.pdf)
> 3. DynamoRIO (http://www.dynamorio.org/)

Thanks for the links, I'll check those out.

> Also, some folks I know have complained about Dyninst's huge memory
> consumption in a production simulation. I have not investigated that
> myself, but I can if we are heading in this direction. Also, it may be
> worthwhile to discuss whether "spin-your-own" is more beneficial than an
> available and tested framework. For example, Dyninst does recursive
> trampoline checks and some other basic safety checks on snippets
> before inserting them. It seems overkill in some places, but sometimes
> it may add to robustness.

Using an existing framework that has already thought through many of the
pitfalls would definitely help.  Dyninst keeps popping up in various
discussions (esp. in tracing circles), so it will probably be the first
one I investigate further.

>>> Are the tracing folks discussing userspace upstream on the mailing lists
>>> yet?  If so, I'd like to participate :)
>>
>> sorry for delay and thanks a lot for all the links.
>> starting next year we're planning to invest into building
>> a prototype where we can inject the code into remote process,
>> fixup usdt-s in the remote process to point to injected code
>> and let this injected code interact with master process via
>> shared memory.
>> The goal is to build an alternative to uprobe+bpf at much higher speed.
> 
> I am really interested in this now that academic life is getting over :)
> Maybe we can try for a Dyninst based demo which can be quick to implement?
> 
>>> libcompel - execute arbitrary code in a context of a foreign process
>>> https://criu.org/Compel
> 
> Wow this is cool. I did not know about this. Time to play with this!

A few more links if you're interested:

Katana: An ELF/DWARF Manipulation Tool with Hotpatching Capabilities
http://katana.nongnu.org/doc/katana.html

Living on the Edge: Rapid-Toggling Probes with Cross-Modification on x86
http://www.cs.indiana.edu/~rrnewton/papers/pldi16-crossmod.pdf

-- Joe
___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [LPC] User Space Dynamic Tracing

2016-12-20 Thread Suchakra via iovisor-dev
Hi,

> Suchakra, was this the same dyninst as http://www.dyninst.org/dyninst? Just
> curious.

Yes. That's the one. I plan to work on a "mutator" prototype very soon
to test if this is a good idea. Dyninst also allows modifying and
rewriting instrumented binaries on disk. They also provide
a language called DynC which allows writing snippets in a C-like
syntax. Here are some manuals to read:
http://www.dyninst.org/manuals/dyninstAPI

There is also a small demo:
http://www.paradyn.org/tracetool.html I don't know if it's compatible
with the new Dyninst APIs, but it gives you a picture of what it's capable of.
Another approach we could consider is the compile-time -pg flag that
uftrace/ftrace rely on (but it's not true dynamic instrumentation).

--
Suchakra


Re: [iovisor-dev] [LPC] User Space Dynamic Tracing

2016-12-20 Thread Brendan Gregg via iovisor-dev
G'Day,

On Sun, Dec 18, 2016 at 10:54 AM, Suchakra wrote:

> [...]

> >> I added the patches/ directory to the project to try out this approach
> >> in applying a few CVE fixes.  I built a base patching shared library
> >> that reads and applies a patch description array.  New patched code is
> >> also included in this shared library.  The patching mechanism applies a
> >> code-trampoline at given locations routing the binary away from the old
> >> code and into the newly loaded patched code.
> >>
> >> It's only a proof-of-concept, but I did record a few .gif terminal
> >> sessions to show what's possible:
>

Great! Lowering overhead should open the door to new possibilities.


>
> I spent some time researching dynamic instrumentation for tracing a
> few years back. We started with a ptrace-based approach and developed
> kaji (https://github.com/5kg/kaji) as a demo for inserting precompiled
> lttng tracepoints. This was also tested with dyninst. Here are some
> observations (http://www.dorsal.polymtl.ca/fr/system/files/11Dec2013.pdf).
> Here is also some investigation on what happens underneath
> (https://github.com/tuxology/lttng-ubench/blob/master/analysis/instr/dyninst_kaji).
>
> In addition, some other things to look out for would be :
> 1. Fast Tracepoints in GDB that use a similar trampoline approach
> 2. SystemTap's Stapdyn approach which uses dyninst
> (https://research.cs.wisc.edu/htcondor/HTCondorWeek2013/paradyn-slides/stone-stapdyn.pdf)
> 3. DynamoRIO (http://www.dynamorio.org/)
>
> Also, some folks I know have complained about Dyninst's huge memory
> consumption in a production simulation. I have not investigated that
> myself, but I can if we are heading in this direction. Also, it may be
> worthwhile to discuss whether "spin-your-own" is more beneficial than an
> available and tested framework. For example, Dyninst does recursive
> trampoline checks and some other basic safety checks on snippets
> before inserting them. It seems overkill in some places, but sometimes
> it may add to robustness.
>

Suchakra, was this the same dyninst as http://www.dyninst.org/dyninst? Just
curious.

Brendan


Re: [iovisor-dev] [LPC] User Space Dynamic Tracing

2016-12-19 Thread Alex Reece via iovisor-dev
> starting next year we're planning to invest into building
> a prototype where we can inject the code into remote process,
> fixup usdt-s in the remote process to point to injected code
> and let this injected code interact with master process via
> shared memory.
> The goal is to build an alternative to uprobe+bpf at much higher speed.

I'm definitely interested in learning more and contributing if possible --
our use case is that we have a database with USDT probes and would want to
dynamically trace production workloads, but cannot assume that we are
running very modern kernels.
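
For readers unfamiliar with how USDT probes are found in a binary: each
probe is recorded as an ELF note (owner "stapsdt") whose descriptor holds
three addresses followed by three NUL-terminated strings. The sketch below
decodes one such descriptor. It assumes a 64-bit little-endian target and
the layout documented by SystemTap; the function name and the sample probe
are hypothetical and only for illustration.

```python
import struct
from typing import NamedTuple


class UsdtProbe(NamedTuple):
    location: int   # address of the probe's nop instruction
    base: int       # link-time address of .stapsdt.base
    semaphore: int  # address of the probe semaphore, 0 if none
    provider: str
    name: str
    args: str


def parse_stapsdt_desc(desc: bytes, addr_size: int = 8) -> UsdtProbe:
    """Decode the descriptor of one little-endian stapsdt note."""
    fmt = "<3Q" if addr_size == 8 else "<3I"
    header = struct.calcsize(fmt)
    location, base, semaphore = struct.unpack_from(fmt, desc, 0)
    # Three NUL-terminated strings follow the addresses.
    provider, name, args = desc[header:].split(b"\x00")[:3]
    return UsdtProbe(location, base, semaphore,
                     provider.decode(), name.decode(), args.decode())


# Hypothetical probe, hand-built for illustration only.
sample = (struct.pack("<3Q", 0x401136, 0x402010, 0)
          + b"mydb\x00query__start\x00-4@%edi\x00")
probe = parse_stapsdt_desc(sample)
print(probe.provider, probe.name, hex(probe.location))
```

The location field is the address a tracer would patch, whether the
mechanism is a kernel uprobe or injected user-space code.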

~Alex



Re: [iovisor-dev] [LPC] User Space Dynamic Tracing

2016-12-18 Thread Suchakra via iovisor-dev
Hi Joe,

>> What I did was fork a project called linux-inject here:
>>
>>   https://github.com/joe-lawrence/linux-inject
>>
>> This project demonstrates the use of ptrace to attach and then force the
>> target to call __libc_dlopen_mode (this lives in libc unlike the dlopen
>> wrapper in libdl) effectively injecting a given shared object into that
>> program.

Thanks for your work on linux-inject. It's really cool! You have
already noted the ptrace caveat.
In addition, processes which are security hardened with seccomp etc.
may not allow themselves to be ptraced.
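
A rough pre-flight check along these lines might look like the sketch
below. It reads two things Linux exposes: the Yama ptrace_scope sysctl
and the Seccomp field of /proc/<pid>/status. Treating Yama as one of the
relevant restrictions is an assumption here, not something stated in the
thread, and the function names are hypothetical. Neither value proves
that injection will work; they only flag likely obstacles.

```python
YAMA_SCOPES = {
    0: "classic: any process with the same uid may attach",
    1: "restricted: only an ancestor or a declared tracer may attach",
    2: "admin-only: CAP_SYS_PTRACE required",
    3: "no attach: ptrace attach disabled",
}

SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text: str) -> str:
    """Return the seccomp mode named in /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Seccomp:"):
            return SECCOMP_MODES.get(int(line.split()[1]), "unknown")
    return "unknown"  # field not present in the given text


def yama_scope(text: str) -> str:
    """Describe the contents of /proc/sys/kernel/yama/ptrace_scope."""
    return YAMA_SCOPES.get(int(text.strip()), "unknown")


def preflight(pid: int) -> dict:
    """Best-effort read of both settings for a live process."""
    report = {}
    try:
        with open("/proc/sys/kernel/yama/ptrace_scope") as f:
            report["yama"] = yama_scope(f.read())
    except OSError:
        report["yama"] = "unavailable"
    try:
        with open(f"/proc/{pid}/status") as f:
            report["seccomp"] = seccomp_mode(f.read())
    except OSError:
        report["seccomp"] = "unavailable"
    return report
```

Calling preflight(pid) before attaching gives a quick hint whether the
ptrace route is even worth trying on a given host.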

>> I added the patches/ directory to the project to try out this approach
>> in applying a few CVE fixes.  I built a base patching shared library
>> that reads and applies a patch description array.  New patched code is
>> also included in this shared library.  The patching mechanism applies a
>> code-trampoline at given locations routing the binary away from the old
>> code and into the newly loaded patched code.
>>
>> It's only a proof-of-concept, but I did record a few .gif terminal
>> sessions to show what's possible:

I spent some time researching dynamic instrumentation for tracing a
few years back. We started with a ptrace-based approach and developed
kaji (https://github.com/5kg/kaji) as a demo for inserting precompiled
lttng tracepoints. This was also tested with dyninst. Here are some
observations (http://www.dorsal.polymtl.ca/fr/system/files/11Dec2013.pdf).
Here is also some investigation on what happens underneath
(https://github.com/tuxology/lttng-ubench/blob/master/analysis/instr/dyninst_kaji).

In addition, some other things to look out for would be :
1. Fast Tracepoints in GDB that use a similar trampoline approach
2. SystemTap's Stapdyn approach which uses dyninst
(https://research.cs.wisc.edu/htcondor/HTCondorWeek2013/paradyn-slides/stone-stapdyn.pdf)
3. DynamoRIO (http://www.dynamorio.org/)
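
To make the trampoline idea in item 1 concrete, here is a minimal sketch
of the arithmetic involved, assuming x86-64 and the 5-byte "jmp rel32"
form. The addresses and helper names are hypothetical, and this is not
how any of the tools above are actually implemented.

```python
import struct

JMP_REL32_LEN = 5  # opcode 0xE9 plus a 32-bit signed displacement


def encode_jmp_rel32(site: int, target: int) -> bytes:
    """Encode `jmp target` for placement at address `site`."""
    # The displacement is relative to the end of the jump instruction.
    disp = target - (site + JMP_REL32_LEN)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is out of rel32 range from site")
    return b"\xe9" + struct.pack("<i", disp)


def patch(code: bytes, offset: int, base: int, target: int) -> bytes:
    """Return a copy of `code` with a jump to `target` written at `offset`."""
    jmp = encode_jmp_rel32(base + offset, target)
    if offset + len(jmp) > len(code):
        raise ValueError("not enough room for a 5-byte jump")
    return code[:offset] + jmp + code[offset + len(jmp):]


# Hypothetical addresses: a 5-byte nop at 0x401000, new code at 0x402000.
nop5 = bytes.fromhex("0f1f440000")
patched = patch(nop5, 0, 0x401000, 0x402000)
print(patched.hex())
```

A real implementation also has to relocate whatever instructions the
jump overwrites and write the bytes safely while other threads may be
executing them, which is where a tested framework earns its keep.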

Also, some folks I know have complained about Dyninst's huge memory
consumption in a production simulation. I have not investigated that
myself, but I can if we are heading in this direction. Also, it may be
worthwhile to discuss whether "spin-your-own" is more beneficial than an
available and tested framework. For example, Dyninst does recursive
trampoline checks and some other basic safety checks on snippets
before inserting them. It seems overkill in some places, but sometimes
it may add to robustness.

>> Are the tracing folks discussing userspace upstream on the mailing lists
>> yet?  If so, I'd like to participate :)
>
> sorry for delay and thanks a lot for all the links.
> starting next year we're planning to invest into building
> a prototype where we can inject the code into remote process,
> fixup usdt-s in the remote process to point to injected code
> and let this injected code interact with master process via
> shared memory.
> The goal is to build an alternative to uprobe+bpf at much higher speed.

I am really interested in this now that academic life is getting over :)
Maybe we can try for a Dyninst based demo which can be quick to implement?

>> libcompel - execute arbitrary code in a context of a foreign process
>> https://criu.org/Compel

Wow this is cool. I did not know about this. Time to play with this!

--
Suchakra


Re: [iovisor-dev] [LPC] User Space Dynamic Tracing

2016-12-18 Thread Alexei Starovoitov via iovisor-dev
On Mon, Nov 14, 2016 at 7:34 AM, Joe Lawrence wrote:
> On 11/11/2016 09:41 PM, Alexei Starovoitov wrote:
>> On Thu, Nov 10, 2016 at 7:36 AM, Joe Lawrence wrote:
>>> On 11/09/2016 11:11 PM, Alexei Starovoitov wrote:
>>>> On Wed, Nov 9, 2016 at 8:59 AM, Joe Lawrence wrote:
>>>>> Hi Alexei -- I was at the Plumber's conference last week, but
>>>>> unfortunately missed your talk on User Space Dynamic Tracing...  A
>>>>> colleague mentioned that you had presented a slide deck --  I'd be
>>>>> interested in checking that out if it was still available (email or
>>>>> uploaded to LPC site).
>>>>
>>>> There was only one slide to illustrate the goal.
>>>> Not very useful without hearing the discussion that we had.
>>>> Here it is anyway.
>>>> What is your particular interest in the tracing?
>>>>
>>>> Thanks
>>>>
>>>
>>> Thanks for attaching the slides... My interest was from the POV of
>>> userspace live patching.  A Red Hat colleague, Josh Poimboeuf, was at
>>> your presentation and suggested touching base with you.
>>>
>>> Our group at Red Hat primarily works on kpatch, which uses the ftrace
>>> infrastructure to hook functions... Perhaps a similar technique would be
>>> possible given a userspace tracing facility.
>>>
>>> In our early efforts to do userspace live patching, I've been tinkering
>>> with a ptrace-shared library injection-code trampoline approach that
>>> looks similar to the "Install program, binary blob" parts of your third
>>> slide.
>>
>> interesting. could you share the link to this lib?
>
> What I did was fork a project called linux-inject here:
>
>   https://github.com/joe-lawrence/linux-inject
>
> This project demonstrates the use of ptrace to attach and then force the
> target to call __libc_dlopen_mode (this lives in libc unlike the dlopen
> wrapper in libdl) effectively injecting a given shared object into that
> program.

looks great!

> I added the patches/ directory to the project to try out this approach
> in applying a few CVE fixes.  I built a base patching shared library
> that reads and applies a patch description array.  New patched code is
> also included in this shared library.  The patching mechanism applies a
> code-trampoline at given locations routing the binary away from the old
> code and into the newly loaded patched code.
>
> It's only a proof-of-concept, but I did record a few .gif terminal
> sessions to show what's possible:
>
>
> https://raw.githubusercontent.com/joe-lawrence/linux-inject/master/patches/demo.gif
>
> https://raw.githubusercontent.com/joe-lawrence/linux-inject/master/patches/ghost_demo.gif
>
> https://raw.githubusercontent.com/joe-lawrence/linux-inject/master/patches/cves.gif
>
>> we're trying to reuse as much as possible.
>> Like for shmem ring buffer we're thinking to reuse lttng per-cpu
>> ring buffer mechanism after rseq patches land.
>> For code injections many different ideas were proposed.
>> Including dyninst, but it looks to be an overkill.
>
> I'm still reading up on dyninst (and its PatchAPI) ... it's a lot more
> developed than my PoC, but offers a much richer API than I envisioned
> for livepatching.  That said, it's documented, tested, etc. so might
> still be useful for the livepatching use case.
>
>> I'm thinking something like ptrace but inode based like uprobe,
>> so that 'patch' of the program can be setup before the program
>> starts.
>
> This is similar to the discussion we had with Masami at one of the LPC
> evening events.  He didn't get too far into details, but suggested a
> uprobes approach where we may be able to stitch in replacement pages for
> target tasks.  Some kind of inode approach would be interesting as well
> to solve new program invocations.
>
> Are the tracing folks discussing userspace upstream on the mailing lists
> yet?  If so, I'd like to participate :)

sorry for delay and thanks a lot for all the links.
starting next year we're planning to invest into building
a prototype where we can inject the code into remote process,
fixup usdt-s in the remote process to point to injected code
and let this injected code interact with master process via
shared memory.
The goal is to build an alternative to uprobe+bpf at much higher speed.

Hence cc-ing all interested folks.
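
To illustrate the shared-memory part only, below is a toy sketch of a
single-producer, single-consumer ring buffer over a shared mapping, with
the forked child standing in for the traced process and the parent for
the master process. The layout, slot size and class name are made up for
the example; it says nothing about the eventual design, and a real
version would need proper memory barriers and per-cpu buffers.

```python
import mmap
import os
import struct

HEADER = struct.Struct("<II")  # head (next write), tail (next read)
SLOT = struct.Struct("<Q")     # one fixed-size 8-byte record per slot
SLOTS = 8


class Ring:
    """Single-producer, single-consumer ring over a shared buffer."""

    def __init__(self, buf):
        self.buf = buf

    def push(self, value: int) -> bool:
        head, tail = HEADER.unpack_from(self.buf, 0)
        if (head + 1) % SLOTS == tail:
            return False  # full: drop rather than block the traced process
        SLOT.pack_into(self.buf, HEADER.size + head * SLOT.size, value)
        # Publish the record only after its payload is written.
        struct.pack_into("<I", self.buf, 0, (head + 1) % SLOTS)
        return True

    def pop(self):
        head, tail = HEADER.unpack_from(self.buf, 0)
        if head == tail:
            return None  # empty
        (value,) = SLOT.unpack_from(self.buf, HEADER.size + tail * SLOT.size)
        struct.pack_into("<I", self.buf, 4, (tail + 1) % SLOTS)
        return value


# Anonymous MAP_SHARED mapping, inherited across fork().
shm = mmap.mmap(-1, HEADER.size + SLOTS * SLOT.size)
ring = Ring(shm)

pid = os.fork()
if pid == 0:
    # Child plays the traced process: its "probes" push event ids.
    for event in (101, 102, 103):
        ring.push(event)
    os._exit(0)

os.waitpid(pid, 0)
# Parent plays the tracer and drains the buffer.
events = []
while (value := ring.pop()) is not None:
    events.append(value)
print(events)
```

The point of the layout is that the producer never makes a syscall or
traps into the kernel to record an event, which is where the speedup
over uprobe+bpf would have to come from.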

> libcompel - execute arbitrary code in a context of a foreign process
> https://criu.org/Compel

excellent find! this one looks the most advanced and ready to use.
Could you please share your finding with them?
Are you going to continue investing into your:
https://github.com/joe-lawrence/linux-inject
and how does it compare?

Thanks!