subject:"Re\: \[Qemu\-devel\] \[PATCH 00\/13\] instrument\: Add basic event instrumentation"

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-29 Thread Peter Maydell

On 26 August 2017 at 01:02, Emilio G. Cota  wrote:
> An additional "nice to have" would be:
>
> * Allow inlining of TCG code by the instrumenter. Example use case:
>   the instrumenter wants to increment a counter every time a
>   basic block is executed. Instead of calling a callback function on every 
> block's
>   execution, we could just have a translation-time callback to emit at the 
> beginning
>   of the translated block the counter increment. This would be much faster, 
> and
>   is something that all other tools (e.g. DynamoRIO/Pin) implement.

This is a feature I would strongly prefer us not to implement.
It exposes too much of QEMU's internals (ie TCG) to the
instrumentation, and it would be pretty complicated to use.

thanks
-- PMM

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-25 Thread Emilio G. Cota

On Thu, Aug 03, 2017 at 12:54:57 +0100, Stefan Hajnoczi wrote:
> > > Please post an example of the API you'd like.
> > 
> > In my opinion, the instrumentation support in this series provides an API 
> > that
> > works in the opposite way you're suggesting (let's ignore the fact that it's
> > built on tracing events).
> > 
> > When QEMU loads an instrumentation library (which can happen at any time), 
> > some
> > initialization function is called on the library so that it can establish 
> > what
> > events to instrument. This also has the advantage that a user can hook into 
> > a
> > running QEMU instance at any time to perform some instrumentation.
> > 
> > I think this is the bare minimum necessary to make it work, and has the 
> > upside
> > of being completely orthogonal to the libqemu approach. We could reuse most 
> > of
> > the stable instrumentation API there too, except for the instrumentation 
> > code
> > initialization.
> > 
> > That being said, the libqemu approach *might* make it a bit easier to 
> > provide an
> > API for things such as "run for this many instructions and return control to
> > instrumentor", but I don't think that's mandatory for a first prototype 
> > (and can
> > definitely be implemented using both approaches).
> 
> The main concern I have is that a feature for loading shared libraries
> and hooking QEMU will be abused.  This can be mostly solved by offering
> only a stable API without the ability to hook trace events.  There are
> still ways to abuse this and that's why I prefer the libqemu approach.
> 
> The libqemu approach also avoids the "how do I enable my
> instrumentation?" step for new users because they just need to run their
> program after compiling it.  The workflow is simpler.

Widely used instrumentation tools, such as DynamoRIO and Pin, follow the
same approach Lluis took here, i.e. your plugin is loaded at run-time by the 
tool.
So from a usability viewpoint I don't think this approach would be 
controversial.

> As a next step I suggest defining the stable instrumentation API and
> dropping the tracing hooks.  We can still work out whether shared
> library loading or libqemu is okay later but the stable API is most
> important.

I agree with dropping the tracing hooks. However, I wonder whether we'd
want to add another field to struct TranslationBlock to keep track of
the dynamic instrumentation state, or just reuse trace_vcpu_dstate.

I gravitate towards the former approach--would be nice to have the
instrumentation code completely separate from the tracing code, and
hopefully not rely on python nor trace-events files :-)

Emilio

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-25 Thread Emilio G. Cota

On Fri, Jul 28, 2017 at 19:05:43 +0300, Lluís Vilanova wrote:
> As for the (minimum) requirements I've collected:
> 
> * Peek at register values and guest memory.
> * Enumerate guest cpus.
> * Control when events are raised to minimize overheads (e.g., avoid generating
>   TCG code to trace a guest code event we don't care about just now).
> * When a guest code event is raised at translation time, let instrumentor 
> decide
>   if it needs to be raised at execution time too (e.g., might be decided on
>   instruction opcode)
> * Minimum overhead when instrumenting (I think the proposed fifo/socket 
> approach
>   does not meet this; point 2 helps a lot here, which is what the current
>   tracing code does)
> * Let instrumentor know when TBs are being freed (i.e., to free internal 
> per-TB
>   structures; I have a patch queued adding this event).
> 
> Nice to have:
> 
> * Avoid recompilation for each new instrumentation logic.
> * Poke at register values and guest memory.
> * [synchronous for sure] Let user annotate guest programs to raise additional
>   events with guest-specified information (the hypertrace series I sent).
> * [synchronous for sure] Make guest breakpoints/watchpoints raise an event 
> (for
>   when we cannot annotate the guest code; I have some patches).
> * Trigger QEMU's event tracing routine when the instrumentation code
>   decides. This is what this series does now, as then lazy instrumentors don't
>   need to write their own tracing-to-disk code. Otherwise, per-event tracing
>   states would need to be independent of instrumentation states (but the logic
>   is exactly the same).
> * Let instrumentor define additional arguments to execution-time events during
>   translation-time (e.g., to quickly index into an instrumentor struct when 
> the
>   execution-time event comes that references info collected during 
> translation).
> * Attach an instrumentor-specified value to each guest CPU (e.g., pointing to
>   the instrumentor's cpu state). Might be less necessary if the point above is
>   met (if we only look at overall performance).

Agreed.

An additional "nice to have" would be:

* Allow inlining of TCG code by the instrumenter. Example use case:
  the instrumenter wants to increment a counter every time a
  basic block is executed. Instead of calling a callback function on every 
block's
  execution, we could just have a translation-time callback to emit at the 
beginning
  of the translated block the counter increment. This would be much faster, and
  is something that all other tools (e.g. DynamoRIO/Pin) implement.

E.

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-03 Thread Stefan Hajnoczi

On Wed, Aug 02, 2017 at 06:19:29PM +0300, Lluís Vilanova wrote:
> Stefan Hajnoczi writes:
> 
> > On Wed, Aug 02, 2017 at 12:10:14PM +0100, Peter Maydell wrote:
> >> On 2 August 2017 at 12:04, Stefan Hajnoczi  wrote:
> >> > On Tue, Aug 01, 2017 at 02:54:29PM +0100, Peter Maydell wrote:
> >> >> and I don't need the TCG engine to be a library to do that...
> >> >
> >> > You do need TCG APIs if you want TCG-level instrumentation, tuning
> >> > options, callbacks, etc.
> >> 
> >> I need an API; that doesn't necessarily look like the kind
> >> of API you want to be able to embed the TCG engine into
> >> other things, I think.
> >> 
> >> >> I agree that we want to provide something that is at least
> >> >> closer to a stable API than "just expose trace events",
> >> >> though.
> >> >
> >> > libqemu has at least three parts:
> >> >
> >> > 1. VM API (i.e. qemu_init(argc, argv), qemu_run(), qemu_vcpu_get_reg32())
> >> > 2. TCG engine
> >> > 3. Device models
> >> >
> >> > Like I said in my email, start with what matters for the instrumentation
> >> > use case (VM API at a minimum to control guest execution).  Other people
> >> > can flesh out the other parts later, as needed.
> >> >
> >> > Other attempts to provide a stable API will be essentially the same
> >> > thing as libqemu.
> >> 
> >> I don't think this is the case -- you could have a stable
> >> instrumentation API without it looking anything like
> >> libqemu. In particular I don't think you need to have
> >> something that sits at the top level and says 'run'.
> >> 
> >> In particular I think that pulling TCG out of QEMU
> >> is an enormous and painful undertaking that you just
> >> don't need to do at all to allow this kind of
> >> instrumentation API.
> 
> > Please post an example of the API you'd like.
> 
> In my opinion, the instrumentation support in this series provides an API that
> works in the opposite way you're suggesting (let's ignore the fact that it's
> built on tracing events).
> 
> When QEMU loads an instrumentation library (which can happen at any time), 
> some
> initialization function is called on the library so that it can establish what
> events to instrument. This also has the advantage that a user can hook into a
> running QEMU instance at any time to perform some instrumentation.
> 
> I think this is the bare minimum necessary to make it work, and has the upside
> of being completely orthogonal to the libqemu approach. We could reuse most of
> the stable instrumentation API there too, except for the instrumentation code
> initialization.
> 
> That being said, the libqemu approach *might* make it a bit easier to provide 
> an
> API for things such as "run for this many instructions and return control to
> instrumentor", but I don't think that's mandatory for a first prototype (and 
> can
> definitely be implemented using both approaches).

The main concern I have is that a feature for loading shared libraries
and hooking QEMU will be abused.  This can be mostly solved by offering
only a stable API without the ability to hook trace events.  There are
still ways to abuse this and that's why I prefer the libqemu approach.

The libqemu approach also avoids the "how do I enable my
instrumentation?" step for new users because they just need to run their
program after compiling it.  The workflow is simpler.

As a next step I suggest defining the stable instrumentation API and
dropping the tracing hooks.  We can still work out whether shared
library loading or libqemu is okay later but the stable API is most
important.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-02 Thread Lluís Vilanova

Stefan Hajnoczi writes:

> On Wed, Aug 02, 2017 at 12:10:14PM +0100, Peter Maydell wrote:
>> On 2 August 2017 at 12:04, Stefan Hajnoczi  wrote:
>> > On Tue, Aug 01, 2017 at 02:54:29PM +0100, Peter Maydell wrote:
>> >> and I don't need the TCG engine to be a library to do that...
>> >
>> > You do need TCG APIs if you want TCG-level instrumentation, tuning
>> > options, callbacks, etc.
>> 
>> I need an API; that doesn't necessarily look like the kind
>> of API you want to be able to embed the TCG engine into
>> other things, I think.
>> 
>> >> I agree that we want to provide something that is at least
>> >> closer to a stable API than "just expose trace events",
>> >> though.
>> >
>> > libqemu has at least three parts:
>> >
>> > 1. VM API (i.e. qemu_init(argc, argv), qemu_run(), qemu_vcpu_get_reg32())
>> > 2. TCG engine
>> > 3. Device models
>> >
>> > Like I said in my email, start with what matters for the instrumentation
>> > use case (VM API at a minimum to control guest execution).  Other people
>> > can flesh out the other parts later, as needed.
>> >
>> > Other attempts to provide a stable API will be essentially the same
>> > thing as libqemu.
>> 
>> I don't think this is the case -- you could have a stable
>> instrumentation API without it looking anything like
>> libqemu. In particular I don't think you need to have
>> something that sits at the top level and says 'run'.
>> 
>> In particular I think that pulling TCG out of QEMU
>> is an enormous and painful undertaking that you just
>> don't need to do at all to allow this kind of
>> instrumentation API.

> Please post an example of the API you'd like.

In my opinion, the instrumentation support in this series provides an API that
works in the opposite way you're suggesting (let's ignore the fact that it's
built on tracing events).

When QEMU loads an instrumentation library (which can happen at any time), some
initialization function is called on the library so that it can establish what
events to instrument. This also has the advantage that a user can hook into a
running QEMU instance at any time to perform some instrumentation.

I think this is the bare minimum necessary to make it work, and has the upside
of being completely orthogonal to the libqemu approach. We could reuse most of
the stable instrumentation API there too, except for the instrumentation code
initialization.

That being said, the libqemu approach *might* make it a bit easier to provide an
API for things such as "run for this many instructions and return control to
instrumentor", but I don't think that's mandatory for a first prototype (and can
definitely be implemented using both approaches).

Cheers,
  Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-02 Thread Stefan Hajnoczi

On Wed, Aug 02, 2017 at 12:10:14PM +0100, Peter Maydell wrote:
> On 2 August 2017 at 12:04, Stefan Hajnoczi  wrote:
> > On Tue, Aug 01, 2017 at 02:54:29PM +0100, Peter Maydell wrote:
> >> and I don't need the TCG engine to be a library to do that...
> >
> > You do need TCG APIs if you want TCG-level instrumentation, tuning
> > options, callbacks, etc.
> 
> I need an API; that doesn't necessarily look like the kind
> of API you want to be able to embed the TCG engine into
> other things, I think.
> 
> >> I agree that we want to provide something that is at least
> >> closer to a stable API than "just expose trace events",
> >> though.
> >
> > libqemu has at least three parts:
> >
> > 1. VM API (i.e. qemu_init(argc, argv), qemu_run(), qemu_vcpu_get_reg32())
> > 2. TCG engine
> > 3. Device models
> >
> > Like I said in my email, start with what matters for the instrumentation
> > use case (VM API at a minimum to control guest execution).  Other people
> > can flesh out the other parts later, as needed.
> >
> > Other attempts to provide a stable API will be essentially the same
> > thing as libqemu.
> 
> I don't think this is the case -- you could have a stable
> instrumentation API without it looking anything like
> libqemu. In particular I don't think you need to have
> something that sits at the top level and says 'run'.
> 
> In particular I think that pulling TCG out of QEMU
> is an enormous and painful undertaking that you just
> don't need to do at all to allow this kind of
> instrumentation API.

Please post an example of the API you'd like.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-02 Thread Peter Maydell

On 2 August 2017 at 12:04, Stefan Hajnoczi  wrote:
> On Tue, Aug 01, 2017 at 02:54:29PM +0100, Peter Maydell wrote:
>> and I don't need the TCG engine to be a library to do that...
>
> You do need TCG APIs if you want TCG-level instrumentation, tuning
> options, callbacks, etc.

I need an API; that doesn't necessarily look like the kind
of API you want to be able to embed the TCG engine into
other things, I think.

>> I agree that we want to provide something that is at least
>> closer to a stable API than "just expose trace events",
>> though.
>
> libqemu has at least three parts:
>
> 1. VM API (i.e. qemu_init(argc, argv), qemu_run(), qemu_vcpu_get_reg32())
> 2. TCG engine
> 3. Device models
>
> Like I said in my email, start with what matters for the instrumentation
> use case (VM API at a minimum to control guest execution).  Other people
> can flesh out the other parts later, as needed.
>
> Other attempts to provide a stable API will be essentially the same
> thing as libqemu.

I don't think this is the case -- you could have a stable
instrumentation API without it looking anything like
libqemu. In particular I don't think you need to have
something that sits at the top level and says 'run'.

In particular I think that pulling TCG out of QEMU
is an enormous and painful undertaking that you just
don't need to do at all to allow this kind of
instrumentation API.

thanks
-- PMM

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-02 Thread Stefan Hajnoczi

On Fri, Jul 28, 2017 at 07:21:04PM +0300, Lluís Vilanova wrote:
> Stefan Hajnoczi writes:
> 
> > On Thu, Jul 27, 2017 at 11:40:17AM +0100, Peter Maydell wrote:
> >> On 27 July 2017 at 11:32, Stefan Hajnoczi  wrote:
> >> > On Wed, Jul 26, 2017 at 03:44:39PM +0300, Lluís Vilanova wrote:
> >> >> And why exactly is this a threat? Because it can be used to "extend" 
> >> >> QEMU
> >> >> without touching its sources? Is this a realistic threat? (it's a 
> >> >> rather brittle
> >> >> way to do it, so I'm not sure it's practical)
> >> >
> >> > Unfortunately it is a problem.  I recently came across a product that
> >> > was using LD_PRELOAD= to "integrate" with QEMU.  People really abuse
> >> > these interfaces instead of integrating their features cleanly into
> >> > QEMU.
> >> 
> >> ...if people who want to do this kind of thing already can and
> >> do use LD_PRELOAD for it, I don't think we should worry too much
> >> about trying to make the instrumentation plugin API bulletproof
> >> against similar abuse.
> >> 
> >> > I see the use cases that Peter has been describing and am sure we can
> >> > come up with good solutions.  What I care about is that it doesn't allow
> >> > loading a .so that connects to arbitrary trace events.
> >> 
> >> That said, I agree that we don't really need an arbitrary-trace-event
> >> setup here, and we should probably design our API so that it isn't
> >> handing the trace plugin hooks pointers into QEMU's internals.
> >> We want an API that makes it easy for people to do things based on
> >> changes of the guest binary's state (registers, insns, etc etc)
> >> and which makes it hard for them to accidentally trip themselves up
> >> (eg by prodding around in QEMU internal data structures).
> >> This will have the secondary benefit that it's unlikely that future
> >> changes to QEMU will break plugin code.
> >> 
> >> >> As a side note, I find instrumentation to be most useful for guest code 
> >> >> events,
> >> >> which mostly contain non-pointer values (except for the CPUState*).
> >> 
> >> For instance we definitely should not be passing a CPUState* to
> >> any plugin function.
> 
> > The gdbstub protocol has relevant features for accessing guest memory,
> > registers, etc.  Perhaps a set of QEMU-specific events can be added
> > (e.g. tb generated) so it's possible to instrument and control the
> > guest from an instrumentation program (written in any language).
> 
> > Perhaps there is a fundamental reason why this isn't possible due to the
> > protocol design, because using gdbstub halts all vcpus, etc.  I don't
> > know.
> 
> > Do you think this is an interesting direction?  It definitely seems like
> > a powerful approach though performance would be less than running native
> > code inside the QEMU process.
> 
> That's the same approach someone else dubbed as using a fifo with 
> "synchronous"
> events, right? I have some measurements on this using a pipe, and overheads 
> are
> 1000x to 2300x for each communication event (compared to a function call, and
> depending on whether each process/thread is pinned to the same or different
> CPU).

You are right.  I understand the need for native code without
interprocess communication now.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-02 Thread Stefan Hajnoczi

On Tue, Aug 01, 2017 at 02:54:29PM +0100, Peter Maydell wrote:
> On 1 August 2017 at 14:48, Stefan Hajnoczi  wrote:
> > Thanks for sharing the requirements.  A stable API is necessary for
> > providing these features.
> >
> > We're essentially talking about libqemu.  That means QEMU in library
> > form with an API for JIT engine, reverse engineering, instrumentation,
> > etc tasks.
> 
> > Maintaining libqemu will take ongoing effort and no one has committed.
> > The last discussion about libqemu was here:
> > https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg04847.html
> 
> That thread seems to be focused on trying to extract TCG
> from the rest of QEMU, which definitely isn't a requirement
> for instrumentation, and I would suggest is something
> of a distraction from it.

You are right, it's a different use case.  I just wanted to reference
previous discussion about libqemu.  There have been more in the past for
different use cases.

> I want to be able to say "just
> instrument this setup that QEMU already provides as a
> board model", not have to write a driver that duplicates
> all the work vl.c and our board models do for us today,

Calling qemu_init(argc, argv) and qemu_run() isn't too onerous.  For the
price of that you get an environment where we can offer stable APIs.

> and I don't need the TCG engine to be a library to do that...

You do need TCG APIs if you want TCG-level instrumentation, tuning
options, callbacks, etc.

> I agree that we want to provide something that is at least
> closer to a stable API than "just expose trace events",
> though.

libqemu has at least three parts:

1. VM API (i.e. qemu_init(argc, argv), qemu_run(), qemu_vcpu_get_reg32())
2. TCG engine
3. Device models

Like I said in my email, start with what matters for the instrumentation
use case (VM API at a minimum to control guest execution).  Other people
can flesh out the other parts later, as needed.

Other attempts to provide a stable API will be essentially the same
thing as libqemu.

Stefan

signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-01 Thread Peter Maydell

On 1 August 2017 at 14:48, Stefan Hajnoczi  wrote:
> Thanks for sharing the requirements.  A stable API is necessary for
> providing these features.
>
> We're essentially talking about libqemu.  That means QEMU in library
> form with an API for JIT engine, reverse engineering, instrumentation,
> etc tasks.

> Maintaining libqemu will take ongoing effort and no one has committed.
> The last discussion about libqemu was here:
> https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg04847.html

That thread seems to be focused on trying to extract TCG
from the rest of QEMU, which definitely isn't a requirement
for instrumentation, and I would suggest is something
of a distraction from it. I want to be able to say "just
instrument this setup that QEMU already provides as a
board model", not have to write a driver that duplicates
all the work vl.c and our board models do for us today,
and I don't need the TCG engine to be a library to do that...

I agree that we want to provide something that is at least
closer to a stable API than "just expose trace events",
though.

thanks
-- PMM

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-01 Thread Stefan Hajnoczi

On Fri, Jul 28, 2017 at 07:05:43PM +0300, Lluís Vilanova wrote:
> Daniel P Berrange writes:
> 
> > On Fri, Jul 28, 2017 at 02:41:19PM +0100, Peter Maydell wrote:
> >> On 28 July 2017 at 14:34, Stefan Hajnoczi  wrote:
> >> > Lluís/Peter: What are the requirements for instrumentation code
> >> > interacting with the running QEMU instance?  simpletrace is
> >> > asynchronous, meaning it does not wait for anyone handle the trace event
> >> > before continuing execution, and is therefore not suitable for
> >> > SystemTap-style scripts that can interact with the program while
> >> > handling a trace event.
> >> 
> >> I think you'd probably want synchronous -- it's pretty helpful
> >> to be able to say "register a trace event hook that doesn't
> >> fire very often, and use that to get to the region of
> >> execution that's of interest to you, then enable more hooks
> >> to get more detail at that point". (For instance, "wait til
> >> we've executed 5,000,000 instructions, then turn on the
> >> tracing of all instruction execution, register modification
> >> and memory accesses".)
> 
> > Currently simpletrace probes have a fixed action when they are enabled,
> > namely to print state to the trace log file. Perhaps we can make the
> > action more flexible, if we create a more formal protocol for simpletrace
> > to let it talk over a UNIX socket. By default it could send probe data
> > asynchronously as now, but you could mark probes such that they require
> > a synchronous ACK, thus pausing execution until that ACK is received
> > from the instrumenting program.
> 
> > For that to be useful, we would need to have allow probes to be turned
> > on/off via this trace socket, since the normal HMP/QMP monitor execution
> > would be blocked while this probe is running.
> 
> This sounds like much more work to me for much lower performance (copy through
> fifo/socket & process synchronization). Especially if many interesting events
> become "synchronous". Also, with MTTCG using fifos becomes more complex on 
> both
> sides (select on the client & synchronization of events between streams on 
> both
> sides).
> 
> As a performance example, I've used the approach of this series to perform 
> some
> non-trivial instrumentation and I can get performance on par with PIN (a
> state-of-the-art tool from intel) on most SPEC benchmarks.
> 
> Honestly, I would try to see if there's a GPL-preserving way to allow native
> instrumentation libraries to run inside QEMU (either dynamic or static, that's
> less important to me).
> 
> As for the (minimum) requirements I've collected:
> 
> * Peek at register values and guest memory.
> * Enumerate guest cpus.
> * Control when events are raised to minimize overheads (e.g., avoid generating
>   TCG code to trace a guest code event we don't care about just now).
> * When a guest code event is raised at translation time, let instrumentor 
> decide
>   if it needs to be raised at execution time too (e.g., might be decided on
>   instruction opcode)
> * Minimum overhead when instrumenting (I think the proposed fifo/socket 
> approach
>   does not meet this; point 2 helps a lot here, which is what the current
>   tracing code does)
> * Let instrumentor know when TBs are being freed (i.e., to free internal 
> per-TB
>   structures; I have a patch queued adding this event).
> 
> Nice to have:
> 
> * Avoid recompilation for each new instrumentation logic.
> * Poke at register values and guest memory.
> * [synchronous for sure] Let user annotate guest programs to raise additional
>   events with guest-specified information (the hypertrace series I sent).
> * [synchronous for sure] Make guest breakpoints/watchpoints raise an event 
> (for
>   when we cannot annotate the guest code; I have some patches).
> * Trigger QEMU's event tracing routine when the instrumentation code
>   decides. This is what this series does now, as then lazy instrumentors don't
>   need to write their own tracing-to-disk code. Otherwise, per-event tracing
>   states would need to be independent of instrumentation states (but the logic
>   is exactly the same).
> * Let instrumentor define additional arguments to execution-time events during
>   translation-time (e.g., to quickly index into an instrumentor struct when 
> the
>   execution-time event comes that references info collected during 
> translation).
> * Attach an instrumentor-specified value to each guest CPU (e.g., pointing to
>   the instrumentor's cpu state). Might be less necessary if the point above is
>   met (if we only look at overall performance).

Thanks for sharing the requirements.  A stable API is necessary for
providing these features.

We're essentially talking about libqemu.  That means QEMU in library
form with an API for JIT engine, reverse engineering, instrumentation,
etc tasks.

The user must write a driver program:

  #include 
  #include 

  static void my_syscall(QemuVcpu *vcpu, uint64_t num, uint64_t arg1, ...)
  {
  if

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-08-01 Thread Stefan Hajnoczi

On Fri, Jul 28, 2017 at 07:14:33PM +0300, Lluís Vilanova wrote:
> Daniel P Berrange writes:
> 
> > On Fri, Jul 28, 2017 at 02:34:30PM +0100, Stefan Hajnoczi wrote:
> >> On Thu, Jul 27, 2017 at 04:45:35PM +0100, Daniel P. Berrange wrote:
> >> > On Thu, Jul 27, 2017 at 04:33:01PM +0100, Peter Maydell wrote:
> >> > > On 27 July 2017 at 16:21, Daniel P. Berrange  
> >> > > wrote:
> >> > > > On Thu, Jul 27, 2017 at 11:54:29AM +0100, Peter Maydell wrote:
> >> > > >> That said, yes, I was going to ask if we could do this via
> >> > > >> leveraging the tracepoint infrastructure and whatever scripting
> >> > > >> facilities it provides. Are there any good worked examples of
> >> > > >> this sort of thing? Can you do it as an ordinary non-root user?
> >> > > >
> >> > > > Do you have a particular thing you'd like to see an example of ?
> >> > > >
> >> > > > To dynamically probe a function which doesn't have a tracepoint
> >> > > > defined you can do:
> >> > > >
> >> > > > probe process("/usr/bin/qemu-x86_64").function("helper_syscall") {
> >> > > >   printf("syscall stasrt\n")
> >> > > > }
> >> > > >
> >> > > > but getting access to the function args is not as easy as with
> >> > > > pre-defined tracepoints.
> >> > > 
> >> > > How do I go about actually running that script? What I
> >> > > have in mind by "worked example" is something like a blog
> >> > > post that says "ok, here's a problem, we want to find out
> >> > > what QEMU is doing in situation X, here's how you do this
> >> > > with $TRACING_THINGY" and generally steps you through how
> >> > > it works assuming you know nothing at all about whatever
> >> > > the tracing facility you're using is.
> >> > 
> >> > Ok, so something like this example that I wrote for libvirt a
> >> > while back then
> >> > 
> >> >   
> >> > https://www.berrange.com/posts/2011/11/30/watching-the-libvirt-rpc-protocol-using-systemtap/
> >> > 
> >> > 
> >> > > > You can't typically run this as root,
> >> > > 
> >> > > Do you mean "non-root" ?
> >> > 
> >> > Sigh, yes, of course.
> >> > 
> >> > > > however, I don't think that's a
> >> > > > huge issue, because most QEMU deployments are not running as your own
> >> > > > user account anyway, so you can't directly interact with them no
> >> > > > matter what.
> >> > > 
> >> > > It is important, because almost all uses of TCG QEMU are
> >> > > running it from the command line as non-root normal users,
> >> > > especially if they're trying to debug what's going on with a
> >> > > guest binary. So any tracing solution for this kind of usecase
> >> > > must work without requiring root access, I think.
> >> > 
> >> > None of the Linux integrated tracing tools allow direct non-root access
> >> > afaik. systemtap has ability to launch probes as non-root, via a 
> >> > privileged
> >> > daemon, but it is restricted to probe scripts that the administrator has
> >> > pre-defined.
> >> 
> >> One exception is gdb's static userspace probes support.  If you can run
> >> gdb on QEMU then you can trace the same events as SystemTap.  I have
> >> never tried this GDB feature:
> >> 
> >> https://sourceware.org/gdb/onlinedocs/gdb/Static-Probe-Points.html
> >> 
> >> It should work out of the box if your distro builds QEMU with the
> >> 'dtrace' backend enabled.
> 
> > Wow, that's great to learn about. It does indeed work !
> 
> > If you knew alot about ptrace() you could probably build something
> > that use ptrace() and these probe points to call your dynamic
> > instrumentation code with reasonable low overheads.
> 
> I don't think so. Ptrace traps into the kernel and stops the process while a
> separate process decides what to do. That's between 3 and 4 orders of 
> magnitude
> slower than calling an instrumentor function.

Dan might be referring to dynamic patching a jump to the instrumentation
function.

A static userspace probe is a single nop instruction (plus metadata
stored in a separate ELF section).  Using ptrace you can binary patch
the nop instruction.

Unfortunately a single nop instruction cannot hold most x86
instructions.  uprobes places a breakpoint instruction (INT $3 - 0xcc)
there.  That works because it's just one byte.

This technique would be way out of scope for qemu.git but perhaps
perf(1) or a stand-alone tool could implement it.  There are libraries
for binary patching like http://www.dyninst.org/dyninst.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-28 Thread Lluís Vilanova

Stefan Hajnoczi writes:

> On Thu, Jul 27, 2017 at 11:40:17AM +0100, Peter Maydell wrote:
>> On 27 July 2017 at 11:32, Stefan Hajnoczi  wrote:
>> > On Wed, Jul 26, 2017 at 03:44:39PM +0300, Lluís Vilanova wrote:
>> >> And why exactly is this a threat? Because it can be used to "extend" QEMU
>> >> without touching its sources? Is this a realistic threat? (it's a rather 
>> >> brittle
>> >> way to do it, so I'm not sure it's practical)
>> >
>> > Unfortunately it is a problem.  I recently came across a product that
>> > was using LD_PRELOAD= to "integrate" with QEMU.  People really abuse
>> > these interfaces instead of integrating their features cleanly into
>> > QEMU.
>> 
>> ...if people who want to do this kind of thing already can and
>> do use LD_PRELOAD for it, I don't think we should worry too much
>> about trying to make the instrumentation plugin API bulletproof
>> against similar abuse.
>> 
>> > I see the use cases that Peter has been describing and am sure we can
>> > come up with good solutions.  What I care about is that it doesn't allow
>> > loading a .so that connects to arbitrary trace events.
>> 
>> That said, I agree that we don't really need an arbitrary-trace-event
>> setup here, and we should probably design our API so that it isn't
>> handing the trace plugin hooks pointers into QEMU's internals.
>> We want an API that makes it easy for people to do things based on
>> changes of the guest binary's state (registers, insns, etc etc)
>> and which makes it hard for them to accidentally trip themselves up
>> (eg by prodding around in QEMU internal data structures).
>> This will have the secondary benefit that it's unlikely that future
>> changes to QEMU will break plugin code.
>> 
>> >> As a side note, I find instrumentation to be most useful for guest code 
>> >> events,
>> >> which mostly contain non-pointer values (except for the CPUState*).
>> 
>> For instance we definitely should not be passing a CPUState* to
>> any plugin function.

> The gdbstub protocol has relevant features for accessing guest memory,
> registers, etc.  Perhaps a set of QEMU-specific events can be added
> (e.g. tb generated) so it's possible to instrument and control the
> guest from an instrumentation program (written in any language).

> Perhaps there is a fundamental reason why this isn't possible due to the
> protocol design, because using gdbstub halts all vcpus, etc.  I don't
> know.

> Do you think this is an interesting direction?  It definitely seems like
> a powerful approach though performance would be less than running native
> code inside the QEMU process.

That's the same approach someone else dubbed as using a fifo with "synchronous"
events, right? I have some measurements on this using a pipe, and overheads are
1000x to 2300x for each communication event (compared to a function call, and
depending on whether each process/thread is pinned to the same or different
CPU).


Cheers,
  Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-28 Thread Lluís Vilanova

Daniel P Berrange writes:

> On Fri, Jul 28, 2017 at 02:34:30PM +0100, Stefan Hajnoczi wrote:
>> On Thu, Jul 27, 2017 at 04:45:35PM +0100, Daniel P. Berrange wrote:
>> > On Thu, Jul 27, 2017 at 04:33:01PM +0100, Peter Maydell wrote:
>> > > On 27 July 2017 at 16:21, Daniel P. Berrange  wrote:
>> > > > On Thu, Jul 27, 2017 at 11:54:29AM +0100, Peter Maydell wrote:
>> > > >> That said, yes, I was going to ask if we could do this via
>> > > >> leveraging the tracepoint infrastructure and whatever scripting
>> > > >> facilities it provides. Are there any good worked examples of
>> > > >> this sort of thing? Can you do it as an ordinary non-root user?
>> > > >
>> > > > Do you have a particular thing you'd like to see an example of ?
>> > > >
>> > > > To dynamically probe a function which doesn't have a tracepoint
>> > > > defined you can do:
>> > > >
>> > > > probe process("/usr/bin/qemu-x86_64").function("helper_syscall") {
>> > > >   printf("syscall stasrt\n")
>> > > > }
>> > > >
>> > > > but getting access to the function args is not as easy as with
>> > > > pre-defined tracepoints.
>> > > 
>> > > How do I go about actually running that script? What I
>> > > have in mind by "worked example" is something like a blog
>> > > post that says "ok, here's a problem, we want to find out
>> > > what QEMU is doing in situation X, here's how you do this
>> > > with $TRACING_THINGY" and generally steps you through how
>> > > it works assuming you know nothing at all about whatever
>> > > the tracing facility you're using is.
>> > 
>> > Ok, so something like this example that I wrote for libvirt a
>> > while back then
>> > 
>> >   
>> > https://www.berrange.com/posts/2011/11/30/watching-the-libvirt-rpc-protocol-using-systemtap/
>> > 
>> > 
>> > > > You can't typically run this as root,
>> > > 
>> > > Do you mean "non-root" ?
>> > 
>> > Sigh, yes, of course.
>> > 
>> > > > however, I don't think that's a
>> > > > huge issue, because most QEMU deployments are not running as your own
>> > > > user account anyway, so you can't directly interact with them no
>> > > > matter what.
>> > > 
>> > > It is important, because almost all uses of TCG QEMU are
>> > > running it from the command line as non-root normal users,
>> > > especially if they're trying to debug what's going on with a
>> > > guest binary. So any tracing solution for this kind of usecase
>> > > must work without requiring root access, I think.
>> > 
>> > None of the Linux integrated tracing tools allow direct non-root access
>> > afaik. systemtap has ability to launch probes as non-root, via a privileged
>> > daemon, but it is restricted to probe scripts that the administrator has
>> > pre-defined.
>> 
>> One exception is gdb's static userspace probes support.  If you can run
>> gdb on QEMU then you can trace the same events as SystemTap.  I have
>> never tried this GDB feature:
>> 
>> https://sourceware.org/gdb/onlinedocs/gdb/Static-Probe-Points.html
>> 
>> It should work out of the box if your distro builds QEMU with the
>> 'dtrace' backend enabled.

> Wow, that's great to learn about. It does indeed work !

> If you knew alot about ptrace() you could probably build something
> that use ptrace() and these probe points to call your dynamic
> instrumentation code with reasonable low overheads.

I don't think so. Ptrace traps into the kernel and stops the process while a
separate process decides what to do. That's between 3 and 4 orders of magnitude
slower than calling an instrumentor function.


Cheers,
  Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-28 Thread Lluís Vilanova

Daniel P Berrange writes:

> On Fri, Jul 28, 2017 at 02:41:19PM +0100, Peter Maydell wrote:
>> On 28 July 2017 at 14:34, Stefan Hajnoczi  wrote:
>> > Lluís/Peter: What are the requirements for instrumentation code
>> > interacting with the running QEMU instance?  simpletrace is
>> > asynchronous, meaning it does not wait for anyone handle the trace event
>> > before continuing execution, and is therefore not suitable for
>> > SystemTap-style scripts that can interact with the program while
>> > handling a trace event.
>> 
>> I think you'd probably want synchronous -- it's pretty helpful
>> to be able to say "register a trace event hook that doesn't
>> fire very often, and use that to get to the region of
>> execution that's of interest to you, then enable more hooks
>> to get more detail at that point". (For instance, "wait til
>> we've executed 5,000,000 instructions, then turn on the
>> tracing of all instruction execution, register modification
>> and memory accesses".)

> Currently simpletrace probes have a fixed action when they are enabled,
> namely to print state to the trace log file. Perhaps we can make the
> action more flexible, if we create a more formal protocol for simpletrace
> to let it talk over a UNIX socket. By default it could send probe data
> asynchronously as now, but you could mark probes such that they require
> a synchronous ACK, thus pausing execution until that ACK is received
> from the instrumenting program.

> For that to be useful, we would need to have allow probes to be turned
> on/off via this trace socket, since the normal HMP/QMP monitor execution
> would be blocked while this probe is running.

This sounds like much more work to me for much lower performance (copy through
fifo/socket & process synchronization). Especially if many interesting events
become "synchronous". Also, with MTTCG using fifos becomes more complex on both
sides (select on the client & synchronization of events between streams on both
sides).

As a performance example, I've used the approach of this series to perform some
non-trivial instrumentation and I can get performance on par with PIN (a
state-of-the-art tool from intel) on most SPEC benchmarks.

Honestly, I would try to see if there's a GPL-preserving way to allow native
instrumentation libraries to run inside QEMU (either dynamic or static, that's
less important to me).

As for the (minimum) requirements I've collected:

* Peek at register values and guest memory.
* Enumerate guest cpus.
* Control when events are raised to minimize overheads (e.g., avoid generating
  TCG code to trace a guest code event we don't care about just now).
* When a guest code event is raised at translation time, let instrumentor decide
  if it needs to be raised at execution time too (e.g., might be decided on
  instruction opcode)
* Minimum overhead when instrumenting (I think the proposed fifo/socket approach
  does not meet this; point 2 helps a lot here, which is what the current
  tracing code does)
* Let instrumentor know when TBs are being freed (i.e., to free internal per-TB
  structures; I have a patch queued adding this event).

Nice to have:

* Avoid recompilation for each new instrumentation logic.
* Poke at register values and guest memory.
* [synchronous for sure] Let user annotate guest programs to raise additional
  events with guest-specified information (the hypertrace series I sent).
* [synchronous for sure] Make guest breakpoints/watchpoints raise an event (for
  when we cannot annotate the guest code; I have some patches).
* Trigger QEMU's event tracing routine when the instrumentation code
  decides. This is what this series does now, as then lazy instrumentors don't
  need to write their own tracing-to-disk code. Otherwise, per-event tracing
  states would need to be independent of instrumentation states (but the logic
  is exactly the same).
* Let instrumentor define additional arguments to execution-time events during
  translation-time (e.g., to quickly index into an instrumentor struct when the
  execution-time event comes that references info collected during translation).
* Attach an instrumentor-specified value to each guest CPU (e.g., pointing to
  the instrumentor's cpu state). Might be less necessary if the point above is
  met (if we only look at overall performance).

Cheers,
  Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-28 Thread Lluís Vilanova

Stefan Hajnoczi writes:

> On Thu, Jul 27, 2017 at 04:45:35PM +0100, Daniel P. Berrange wrote:
>> On Thu, Jul 27, 2017 at 04:33:01PM +0100, Peter Maydell wrote:
>> > On 27 July 2017 at 16:21, Daniel P. Berrange  wrote:
>> > > On Thu, Jul 27, 2017 at 11:54:29AM +0100, Peter Maydell wrote:
>> > >> That said, yes, I was going to ask if we could do this via
>> > >> leveraging the tracepoint infrastructure and whatever scripting
>> > >> facilities it provides. Are there any good worked examples of
>> > >> this sort of thing? Can you do it as an ordinary non-root user?
>> > >
>> > > Do you have a particular thing you'd like to see an example of ?
>> > >
>> > > To dynamically probe a function which doesn't have a tracepoint
>> > > defined you can do:
>> > >
>> > > probe process("/usr/bin/qemu-x86_64").function("helper_syscall") {
>> > >   printf("syscall stasrt\n")
>> > > }
>> > >
>> > > but getting access to the function args is not as easy as with
>> > > pre-defined tracepoints.
>> > 
>> > How do I go about actually running that script? What I
>> > have in mind by "worked example" is something like a blog
>> > post that says "ok, here's a problem, we want to find out
>> > what QEMU is doing in situation X, here's how you do this
>> > with $TRACING_THINGY" and generally steps you through how
>> > it works assuming you know nothing at all about whatever
>> > the tracing facility you're using is.
>> 
>> Ok, so something like this example that I wrote for libvirt a
>> while back then
>> 
>> https://www.berrange.com/posts/2011/11/30/watching-the-libvirt-rpc-protocol-using-systemtap/
>> 
>> 
>> > > You can't typically run this as root,
>> > 
>> > Do you mean "non-root" ?
>> 
>> Sigh, yes, of course.
>> 
>> > > however, I don't think that's a
>> > > huge issue, because most QEMU deployments are not running as your own
>> > > user account anyway, so you can't directly interact with them no
>> > > matter what.
>> > 
>> > It is important, because almost all uses of TCG QEMU are
>> > running it from the command line as non-root normal users,
>> > especially if they're trying to debug what's going on with a
>> > guest binary. So any tracing solution for this kind of usecase
>> > must work without requiring root access, I think.
>> 
>> None of the Linux integrated tracing tools allow direct non-root access
>> afaik. systemtap has ability to launch probes as non-root, via a privileged
>> daemon, but it is restricted to probe scripts that the administrator has
>> pre-defined.

> One exception is gdb's static userspace probes support.  If you can run
> gdb on QEMU then you can trace the same events as SystemTap.  I have
> never tried this GDB feature:

>   https://sourceware.org/gdb/onlinedocs/gdb/Static-Probe-Points.html

> It should work out of the box if your distro builds QEMU with the
> 'dtrace' backend enabled.

I tried it once long time ago and didn't work for me. Maybe I just missed
something.


Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-28 Thread Daniel P. Berrange

On Fri, Jul 28, 2017 at 02:41:19PM +0100, Peter Maydell wrote:
> On 28 July 2017 at 14:34, Stefan Hajnoczi  wrote:
> > Lluís/Peter: What are the requirements for instrumentation code
> > interacting with the running QEMU instance?  simpletrace is
> > asynchronous, meaning it does not wait for anyone handle the trace event
> > before continuing execution, and is therefore not suitable for
> > SystemTap-style scripts that can interact with the program while
> > handling a trace event.
> 
> I think you'd probably want synchronous -- it's pretty helpful
> to be able to say "register a trace event hook that doesn't
> fire very often, and use that to get to the region of
> execution that's of interest to you, then enable more hooks
> to get more detail at that point". (For instance, "wait til
> we've executed 5,000,000 instructions, then turn on the
> tracing of all instruction execution, register modification
> and memory accesses".)

Currently simpletrace probes have a fixed action when they are enabled,
namely to print state to the trace log file. Perhaps we can make the
action more flexible, if we create a more formal protocol for simpletrace
to let it talk over a UNIX socket. By default it could send probe data
asynchronously as now, but you could mark probes such that they require
a synchronous ACK, thus pausing execution until that ACK is received
from the instrumenting program.

For that to be useful, we would need to have allow probes to be turned
on/off via this trace socket, since the normal HMP/QMP monitor execution
would be blocked while this probe is running.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-28 Thread Daniel P. Berrange

On Fri, Jul 28, 2017 at 02:34:30PM +0100, Stefan Hajnoczi wrote:
> On Thu, Jul 27, 2017 at 04:45:35PM +0100, Daniel P. Berrange wrote:
> > On Thu, Jul 27, 2017 at 04:33:01PM +0100, Peter Maydell wrote:
> > > On 27 July 2017 at 16:21, Daniel P. Berrange  wrote:
> > > > On Thu, Jul 27, 2017 at 11:54:29AM +0100, Peter Maydell wrote:
> > > >> That said, yes, I was going to ask if we could do this via
> > > >> leveraging the tracepoint infrastructure and whatever scripting
> > > >> facilities it provides. Are there any good worked examples of
> > > >> this sort of thing? Can you do it as an ordinary non-root user?
> > > >
> > > > Do you have a particular thing you'd like to see an example of ?
> > > >
> > > > To dynamically probe a function which doesn't have a tracepoint
> > > > defined you can do:
> > > >
> > > > probe process("/usr/bin/qemu-x86_64").function("helper_syscall") {
> > > >   printf("syscall stasrt\n")
> > > > }
> > > >
> > > > but getting access to the function args is not as easy as with
> > > > pre-defined tracepoints.
> > > 
> > > How do I go about actually running that script? What I
> > > have in mind by "worked example" is something like a blog
> > > post that says "ok, here's a problem, we want to find out
> > > what QEMU is doing in situation X, here's how you do this
> > > with $TRACING_THINGY" and generally steps you through how
> > > it works assuming you know nothing at all about whatever
> > > the tracing facility you're using is.
> > 
> > Ok, so something like this example that I wrote for libvirt a
> > while back then
> > 
> >   
> > https://www.berrange.com/posts/2011/11/30/watching-the-libvirt-rpc-protocol-using-systemtap/
> > 
> > 
> > > > You can't typically run this as root,
> > > 
> > > Do you mean "non-root" ?
> > 
> > Sigh, yes, of course.
> > 
> > > > however, I don't think that's a
> > > > huge issue, because most QEMU deployments are not running as your own
> > > > user account anyway, so you can't directly interact with them no
> > > > matter what.
> > > 
> > > It is important, because almost all uses of TCG QEMU are
> > > running it from the command line as non-root normal users,
> > > especially if they're trying to debug what's going on with a
> > > guest binary. So any tracing solution for this kind of usecase
> > > must work without requiring root access, I think.
> > 
> > None of the Linux integrated tracing tools allow direct non-root access
> > afaik. systemtap has ability to launch probes as non-root, via a privileged
> > daemon, but it is restricted to probe scripts that the administrator has
> > pre-defined.
> 
> One exception is gdb's static userspace probes support.  If you can run
> gdb on QEMU then you can trace the same events as SystemTap.  I have
> never tried this GDB feature:
> 
>   https://sourceware.org/gdb/onlinedocs/gdb/Static-Probe-Points.html
> 
> It should work out of the box if your distro builds QEMU with the
> 'dtrace' backend enabled.

Wow, that's great to learn about. It does indeed work !

If you knew alot about ptrace() you could probably build something
that use ptrace() and these probe points to call your dynamic
instrumentation code with reasonable low overheads.

> > That pretty much leaves re-building QEMU, LD_PRELOADS, or something
> > ptrace(), or qemu's built-in simpletrace feature, as the remaining
> > options.  We have a scripts/simpletrace.py that lets you load a
> > trace file into python and process it, but as written that's aimed
> > as post-processing a tracefile you've previously collected.
> > 
> > It would be desirable to write a more advanced simpletrace python
> > module that could collect & process the trace data live, and also
> > interact with the qemu monitor to change what events are enabled
> > dynamically.  Basically we'd need a way for the simpletrace backend
> > to output its data to a fifo, instead of creating an file on disk,
> > then you could dynanically consume it.
> 
> That would be interesting, I know Alex Bennee has wrangled with large
> (~10 GB?) simpletrace files and it's not a pleasant experience :).
> 
> Lluís/Peter: What are the requirements for instrumentation code
> interacting with the running QEMU instance?  simpletrace is
> asynchronous, meaning it does not wait for anyone handle the trace event
> before continuing execution, and is therefore not suitable for
> SystemTap-style scripts that can interact with the program while
> handling a trace event.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-28 Thread Stefan Hajnoczi

On Thu, Jul 27, 2017 at 11:40:17AM +0100, Peter Maydell wrote:
> On 27 July 2017 at 11:32, Stefan Hajnoczi  wrote:
> > On Wed, Jul 26, 2017 at 03:44:39PM +0300, Lluís Vilanova wrote:
> >> And why exactly is this a threat? Because it can be used to "extend" QEMU
> >> without touching its sources? Is this a realistic threat? (it's a rather 
> >> brittle
> >> way to do it, so I'm not sure it's practical)
> >
> > Unfortunately it is a problem.  I recently came across a product that
> > was using LD_PRELOAD= to "integrate" with QEMU.  People really abuse
> > these interfaces instead of integrating their features cleanly into
> > QEMU.
> 
> ...if people who want to do this kind of thing already can and
> do use LD_PRELOAD for it, I don't think we should worry too much
> about trying to make the instrumentation plugin API bulletproof
> against similar abuse.
> 
> > I see the use cases that Peter has been describing and am sure we can
> > come up with good solutions.  What I care about is that it doesn't allow
> > loading a .so that connects to arbitrary trace events.
> 
> That said, I agree that we don't really need an arbitrary-trace-event
> setup here, and we should probably design our API so that it isn't
> handing the trace plugin hooks pointers into QEMU's internals.
> We want an API that makes it easy for people to do things based on
> changes of the guest binary's state (registers, insns, etc etc)
> and which makes it hard for them to accidentally trip themselves up
> (eg by prodding around in QEMU internal data structures).
> This will have the secondary benefit that it's unlikely that future
> changes to QEMU will break plugin code.
> 
> >> As a side note, I find instrumentation to be most useful for guest code 
> >> events,
> >> which mostly contain non-pointer values (except for the CPUState*).
> 
> For instance we definitely should not be passing a CPUState* to
> any plugin function.

The gdbstub protocol has relevant features for accessing guest memory,
registers, etc.  Perhaps a set of QEMU-specific events can be added
(e.g. tb generated) so it's possible to instrument and control the
guest from an instrumentation program (written in any language).

Perhaps there is a fundamental reason why this isn't possible due to the
protocol design, because using gdbstub halts all vcpus, etc.  I don't
know.

Do you think this is an interesting direction?  It definitely seems like
a powerful approach though performance would be less than running native
code inside the QEMU process.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-28 Thread Peter Maydell

On 28 July 2017 at 14:34, Stefan Hajnoczi  wrote:
> Lluís/Peter: What are the requirements for instrumentation code
> interacting with the running QEMU instance?  simpletrace is
> asynchronous, meaning it does not wait for anyone handle the trace event
> before continuing execution, and is therefore not suitable for
> SystemTap-style scripts that can interact with the program while
> handling a trace event.

I think you'd probably want synchronous -- it's pretty helpful
to be able to say "register a trace event hook that doesn't
fire very often, and use that to get to the region of
execution that's of interest to you, then enable more hooks
to get more detail at that point". (For instance, "wait til
we've executed 5,000,000 instructions, then turn on the
tracing of all instruction execution, register modification
and memory accesses".)

thanks
-- PMM

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-28 Thread Stefan Hajnoczi

On Thu, Jul 27, 2017 at 04:45:35PM +0100, Daniel P. Berrange wrote:
> On Thu, Jul 27, 2017 at 04:33:01PM +0100, Peter Maydell wrote:
> > On 27 July 2017 at 16:21, Daniel P. Berrange  wrote:
> > > On Thu, Jul 27, 2017 at 11:54:29AM +0100, Peter Maydell wrote:
> > >> That said, yes, I was going to ask if we could do this via
> > >> leveraging the tracepoint infrastructure and whatever scripting
> > >> facilities it provides. Are there any good worked examples of
> > >> this sort of thing? Can you do it as an ordinary non-root user?
> > >
> > > Do you have a particular thing you'd like to see an example of ?
> > >
> > > To dynamically probe a function which doesn't have a tracepoint
> > > defined you can do:
> > >
> > > probe process("/usr/bin/qemu-x86_64").function("helper_syscall") {
> > >   printf("syscall stasrt\n")
> > > }
> > >
> > > but getting access to the function args is not as easy as with
> > > pre-defined tracepoints.
> > 
> > How do I go about actually running that script? What I
> > have in mind by "worked example" is something like a blog
> > post that says "ok, here's a problem, we want to find out
> > what QEMU is doing in situation X, here's how you do this
> > with $TRACING_THINGY" and generally steps you through how
> > it works assuming you know nothing at all about whatever
> > the tracing facility you're using is.
> 
> Ok, so something like this example that I wrote for libvirt a
> while back then
> 
>   
> https://www.berrange.com/posts/2011/11/30/watching-the-libvirt-rpc-protocol-using-systemtap/
> 
> 
> > > You can't typically run this as root,
> > 
> > Do you mean "non-root" ?
> 
> Sigh, yes, of course.
> 
> > > however, I don't think that's a
> > > huge issue, because most QEMU deployments are not running as your own
> > > user account anyway, so you can't directly interact with them no
> > > matter what.
> > 
> > It is important, because almost all uses of TCG QEMU are
> > running it from the command line as non-root normal users,
> > especially if they're trying to debug what's going on with a
> > guest binary. So any tracing solution for this kind of usecase
> > must work without requiring root access, I think.
> 
> None of the Linux integrated tracing tools allow direct non-root access
> afaik. systemtap has ability to launch probes as non-root, via a privileged
> daemon, but it is restricted to probe scripts that the administrator has
> pre-defined.

One exception is gdb's static userspace probes support.  If you can run
gdb on QEMU then you can trace the same events as SystemTap.  I have
never tried this GDB feature:

  https://sourceware.org/gdb/onlinedocs/gdb/Static-Probe-Points.html

It should work out of the box if your distro builds QEMU with the
'dtrace' backend enabled.

> That pretty much leaves re-building QEMU, LD_PRELOADS, or something
> ptrace(), or qemu's built-in simpletrace feature, as the remaining
> options.  We have a scripts/simpletrace.py that lets you load a
> trace file into python and process it, but as written that's aimed
> as post-processing a tracefile you've previously collected.
> 
> It would be desirable to write a more advanced simpletrace python
> module that could collect & process the trace data live, and also
> interact with the qemu monitor to change what events are enabled
> dynamically.  Basically we'd need a way for the simpletrace backend
> to output its data to a fifo, instead of creating an file on disk,
> then you could dynanically consume it.

That would be interesting, I know Alex Bennee has wrangled with large
(~10 GB?) simpletrace files and it's not a pleasant experience :).

Lluís/Peter: What are the requirements for instrumentation code
interacting with the running QEMU instance?  simpletrace is
asynchronous, meaning it does not wait for anyone handle the trace event
before continuing execution, and is therefore not suitable for
SystemTap-style scripts that can interact with the program while
handling a trace event.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-27 Thread Lluís Vilanova

Daniel P Berrange writes:

> On Thu, Jul 27, 2017 at 11:54:29AM +0100, Peter Maydell wrote:
>> On 27 July 2017 at 11:43, Daniel P. Berrange  wrote:
>> > Maybe I'm missing something, but aren't all these things
>> > already possible via either the statically defined tracepoints
>> > QEMU exposes, or by placing dynamic tracepoints on arbitrary
>> > code locations using dtrace/systemtap/lttng-ust.
>> 
>> Last time I looked we didn't have tracepoints on most of
>> the events you'd be interested in.
>> 
>> That said, yes, I was going to ask if we could do this via
>> leveraging the tracepoint infrastructure and whatever scripting
>> facilities it provides. Are there any good worked examples of
>> this sort of thing? Can you do it as an ordinary non-root user?

> Do you have a particular thing you'd like to see an example of ?

> To dynamically probe a function which doesn't have a tracepoint
> defined you can do:

> probe process("/usr/bin/qemu-x86_64").function("helper_syscall") {
>   printf("syscall stasrt\n")
> }

> but getting access to the function args is not as easy as with
> pre-defined tracepoints.

> You can't typically run this as root, however, I don't think that's a
> huge issue, because most QEMU deployments are not running as your own
> user account anyway, so you can't directly interact with them no
> matter what.

> If the goal is to be easy to instrument without havig to rebuild
> QEMU, then I think using one of the existing trace backends is
> the best viable option, as those are already enabled by distros.
> I find it very unlikely that Fedora/RHEL would ever enable a
> trace backend that lets you load arbitrary code into the QEMU
> process, so you'd be back to having to rebuild QEMU again even
> with that approach.

My idea is to use the guest code tracing events to perform some complex analysis
of the executed guest code. E.g., tracing executed instructions to a file to
calculate SimPoints [1], analysing guest instructions and memory accesses to
track information flow or feeding that info to a processor simulator. In fact,
I've used my non-upstream version of QEMU+instrumentation to do the first two
and a few others.

My problem with dtrace is efficiency of the instrumentation code, and the lack
of APIs to perform a few more complex analysis like peek/poke on guest memory,
registers or even control event tracing states. I'm not proposing to have such
APIs on QEMU upstream (unless there's interest), but they are easy enough for me
to maintain for my particular needs if we're talking about the instrumentation
support in this series.

So Stefan, could you elaborate on your proposal of having to build the
instrumentation code into QEMU? I would prefer a dynamic library if we can craft
it in a way that ensures the proper license restrictions, but a "static library"
could work just as well for me (I initially had a branch with both solutions,
but it got too tedious to rebase).

Thanks,
  Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-27 Thread Daniel P. Berrange

On Thu, Jul 27, 2017 at 04:33:01PM +0100, Peter Maydell wrote:
> On 27 July 2017 at 16:21, Daniel P. Berrange  wrote:
> > On Thu, Jul 27, 2017 at 11:54:29AM +0100, Peter Maydell wrote:
> >> That said, yes, I was going to ask if we could do this via
> >> leveraging the tracepoint infrastructure and whatever scripting
> >> facilities it provides. Are there any good worked examples of
> >> this sort of thing? Can you do it as an ordinary non-root user?
> >
> > Do you have a particular thing you'd like to see an example of ?
> >
> > To dynamically probe a function which doesn't have a tracepoint
> > defined you can do:
> >
> > probe process("/usr/bin/qemu-x86_64").function("helper_syscall") {
> >   printf("syscall stasrt\n")
> > }
> >
> > but getting access to the function args is not as easy as with
> > pre-defined tracepoints.
> 
> How do I go about actually running that script? What I
> have in mind by "worked example" is something like a blog
> post that says "ok, here's a problem, we want to find out
> what QEMU is doing in situation X, here's how you do this
> with $TRACING_THINGY" and generally steps you through how
> it works assuming you know nothing at all about whatever
> the tracing facility you're using is.

Ok, so something like this example that I wrote for libvirt a
while back then

  
https://www.berrange.com/posts/2011/11/30/watching-the-libvirt-rpc-protocol-using-systemtap/


> > You can't typically run this as root,
> 
> Do you mean "non-root" ?

Sigh, yes, of course.

> > however, I don't think that's a
> > huge issue, because most QEMU deployments are not running as your own
> > user account anyway, so you can't directly interact with them no
> > matter what.
> 
> It is important, because almost all uses of TCG QEMU are
> running it from the command line as non-root normal users,
> especially if they're trying to debug what's going on with a
> guest binary. So any tracing solution for this kind of usecase
> must work without requiring root access, I think.

None of the Linux integrated tracing tools allow direct non-root access
afaik. systemtap has ability to launch probes as non-root, via a privileged
daemon, but it is restricted to probe scripts that the administrator has
pre-defined.

That pretty much leaves re-building QEMU, LD_PRELOADS, or something
ptrace(), or qemu's built-in simpletrace feature, as the remaining
options.  We have a scripts/simpletrace.py that lets you load a
trace file into python and process it, but as written that's aimed
as post-processing a tracefile you've previously collected.

It would be desirable to write a more advanced simpletrace python
module that could collect & process the trace data live, and also
interact with the qemu monitor to change what events are enabled
dynamically.  Basically we'd need a way for the simpletrace backend
to output its data to a fifo, instead of creating an file on disk,
then you could dynanically consume it.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-27 Thread Peter Maydell

On 27 July 2017 at 16:21, Daniel P. Berrange  wrote:
> On Thu, Jul 27, 2017 at 11:54:29AM +0100, Peter Maydell wrote:
>> That said, yes, I was going to ask if we could do this via
>> leveraging the tracepoint infrastructure and whatever scripting
>> facilities it provides. Are there any good worked examples of
>> this sort of thing? Can you do it as an ordinary non-root user?
>
> Do you have a particular thing you'd like to see an example of ?
>
> To dynamically probe a function which doesn't have a tracepoint
> defined you can do:
>
> probe process("/usr/bin/qemu-x86_64").function("helper_syscall") {
>   printf("syscall stasrt\n")
> }
>
> but getting access to the function args is not as easy as with
> pre-defined tracepoints.

How do I go about actually running that script? What I
have in mind by "worked example" is something like a blog
post that says "ok, here's a problem, we want to find out
what QEMU is doing in situation X, here's how you do this
with $TRACING_THINGY" and generally steps you through how
it works assuming you know nothing at all about whatever
the tracing facility you're using is.

> You can't typically run this as root,

Do you mean "non-root" ?

> however, I don't think that's a
> huge issue, because most QEMU deployments are not running as your own
> user account anyway, so you can't directly interact with them no
> matter what.

It is important, because almost all uses of TCG QEMU are
running it from the command line as non-root normal users,
especially if they're trying to debug what's going on with a
guest binary. So any tracing solution for this kind of usecase
must work without requiring root access, I think.

I think the users for this are pretty much completely
distinct from anybody who would use the term "deployment"
for their usage of QEMU :-)

thanks
-- PMM

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-27 Thread Daniel P. Berrange

On Thu, Jul 27, 2017 at 11:54:29AM +0100, Peter Maydell wrote:
> On 27 July 2017 at 11:43, Daniel P. Berrange  wrote:
> > Maybe I'm missing something, but aren't all these things
> > already possible via either the statically defined tracepoints
> > QEMU exposes, or by placing dynamic tracepoints on arbitrary
> > code locations using dtrace/systemtap/lttng-ust.
> 
> Last time I looked we didn't have tracepoints on most of
> the events you'd be interested in.
> 
> That said, yes, I was going to ask if we could do this via
> leveraging the tracepoint infrastructure and whatever scripting
> facilities it provides. Are there any good worked examples of
> this sort of thing? Can you do it as an ordinary non-root user?

Do you have a particular thing you'd like to see an example of ?

To dynamically probe a function which doesn't have a tracepoint
defined you can do:

probe process("/usr/bin/qemu-x86_64").function("helper_syscall") {
  printf("syscall stasrt\n")
}

but getting access to the function args is not as easy as with
pre-defined tracepoints.

You can't typically run this as root, however, I don't think that's a
huge issue, because most QEMU deployments are not running as your own
user account anyway, so you can't directly interact with them no
matter what.

If the goal is to be easy to instrument without havig to rebuild
QEMU, then I think using one of the existing trace backends is
the best viable option, as those are already enabled by distros.
I find it very unlikely that Fedora/RHEL would ever enable a
trace backend that lets you load arbitrary code into the QEMU
process, so you'd be back to having to rebuild QEMU again even
with that approach.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-27 Thread Lluís Vilanova

Peter Maydell writes:

> On 27 July 2017 at 11:43, Daniel P. Berrange  wrote:
>> Maybe I'm missing something, but aren't all these things
>> already possible via either the statically defined tracepoints
>> QEMU exposes, or by placing dynamic tracepoints on arbitrary
>> code locations using dtrace/systemtap/lttng-ust.

> Last time I looked we didn't have tracepoints on most of
> the events you'd be interested in.

> That said, yes, I was going to ask if we could do this via
> leveraging the tracepoint infrastructure and whatever scripting
> facilities it provides. Are there any good worked examples of
> this sort of thing? Can you do it as an ordinary non-root user?

The hypertrace series I just sent contains an example that uses dtrace to
trigger changes on the tracing state of events based on information from other
events [1].

AFAIK it requires root privileges, and it's much less efficient than loading the
instrumentation library proposed in this series (see that there's also an API to
control event tracing).

[1] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg08458.html

Cheers,
  Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-27 Thread Peter Maydell

On 27 July 2017 at 11:43, Daniel P. Berrange  wrote:
> Maybe I'm missing something, but aren't all these things
> already possible via either the statically defined tracepoints
> QEMU exposes, or by placing dynamic tracepoints on arbitrary
> code locations using dtrace/systemtap/lttng-ust.

Last time I looked we didn't have tracepoints on most of
the events you'd be interested in.

That said, yes, I was going to ask if we could do this via
leveraging the tracepoint infrastructure and whatever scripting
facilities it provides. Are there any good worked examples of
this sort of thing? Can you do it as an ordinary non-root user?

thanks
-- PMM

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-27 Thread Daniel P. Berrange

On Wed, Jul 26, 2017 at 12:49:00PM +0100, Peter Maydell wrote:
> On 26 July 2017 at 12:26, Stefan Hajnoczi  wrote:
> > On Tue, Jul 25, 2017 at 02:30:06PM +0100, Peter Maydell wrote:
> >> Is your proposal that my-instrumentation.c gets compiled into
> >> QEMU at this point? That doesn't seem like a great idea to
> >> me, because it means you can only use this tracing if you
> >> build QEMU yourself, and distros won't enable it and so
> >> lots of our users won't have convenient access to it.
> >> I'd much rather see a cleanly designed plugin interface
> >> (which we should be able to implement in a manner that doesn't
> >> let the plugin monkey patch arbitrary parts of QEMU beyond
> >> what can already be achieved via LD_PRELOAD).
> >
> > Trace events are not a stable public API.  They change between releases
> > and need to be interpreted together with the QEMU source code.
> >
> > It doesn't make sense to be doing instrumentation inside the QEMU
> > process but not want to compile QEMU.
> >
> > What is the use case?
> 
> I want the continuous stream of people who come along
> and want to do interesting things (or even just straightforward
> things like "show me all the memory accesses") with tracing
> their guest binary to be able to do so without having to
> rebuild QEMU. We don't require people to rebuild QEMU
> themselves to attach gdb to it, and tracing of guest
> behaviour is a similar "I just want to introspect what
> my guest is doing" use case.
> 
> I think this is distinct from use of the trace API
> for qemu-internal events as used in a lot of the C
> code.
> 
> Some simple use cases:
>  * a plugin which prints out instruction execution traces
> by hooking into "insn executed" and printing something
> helpful (you can extend this into producing traces
> in whatever standard format some other program wants
> to inhale, but at a minimum it would be something we
> could point people at that makes more sense than our
> current "only print insns at translate time" debug);
> it's also easy to add bells and whistles for "start
> tracing at time T" and so on
>  * a plugin that make cachegrind-style guesses about
> cache usage by tracking memory accesses and simulating
> cache effects
>  * a plugin that prints all the system calls the
> guest makes (by hooking into "insn executed" and
> printing info when the insn is an SVC)
>  * trivial statistics like "count executed instructions
> and branches taken"

Maybe I'm missing something, but aren't all these things
already possible via either the statically defined tracepoints
QEMU exposes, or by placing dynamic tracepoints on arbitrary
code locations using dtrace/systemtap/lttng-ust. 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-27 Thread Peter Maydell

On 27 July 2017 at 11:32, Stefan Hajnoczi  wrote:
> On Wed, Jul 26, 2017 at 03:44:39PM +0300, Lluís Vilanova wrote:
>> And why exactly is this a threat? Because it can be used to "extend" QEMU
>> without touching its sources? Is this a realistic threat? (it's a rather 
>> brittle
>> way to do it, so I'm not sure it's practical)
>
> Unfortunately it is a problem.  I recently came across a product that
> was using LD_PRELOAD= to "integrate" with QEMU.  People really abuse
> these interfaces instead of integrating their features cleanly into
> QEMU.

...if people who want to do this kind of thing already can and
do use LD_PRELOAD for it, I don't think we should worry too much
about trying to make the instrumentation plugin API bulletproof
against similar abuse.

> I see the use cases that Peter has been describing and am sure we can
> come up with good solutions.  What I care about is that it doesn't allow
> loading a .so that connects to arbitrary trace events.

That said, I agree that we don't really need an arbitrary-trace-event
setup here, and we should probably design our API so that it isn't
handing the trace plugin hooks pointers into QEMU's internals.
We want an API that makes it easy for people to do things based on
changes of the guest binary's state (registers, insns, etc etc)
and which makes it hard for them to accidentally trip themselves up
(eg by prodding around in QEMU internal data structures).
This will have the secondary benefit that it's unlikely that future
changes to QEMU will break plugin code.

>> As a side note, I find instrumentation to be most useful for guest code 
>> events,
>> which mostly contain non-pointer values (except for the CPUState*).

For instance we definitely should not be passing a CPUState* to
any plugin function.

thanks
-- PMM

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-27 Thread Stefan Hajnoczi

On Wed, Jul 26, 2017 at 03:44:39PM +0300, Lluís Vilanova wrote:
> Stefan Hajnoczi writes:
> 
> > On Tue, Jul 25, 2017 at 06:11:43PM +0300, Lluís Vilanova wrote:
> >> Peter Maydell writes:
> >> 
> >> > On 25 July 2017 at 14:19, Stefan Hajnoczi  wrote:
> >> >> Instead I suggest adding a trace backend generates calls to registered
> >> >> "callback" functions:
> >> >> 
> >> >> $ cat >my-instrumentation.c
> >> >> #include "trace/control.h"
> >> >> 
> >> >> static void my_cpu_in(unsigned int addr, char size, unsigned int val)
> >> >> {
> >> >> printf("my_cpu_in\n");
> >> >> }
> >> >> 
> >> >> static void my_init(void)
> >> >> {
> >> >> trace_register_event_callback("cpu_in", my_cpu_in);
> >> >> trace_enable_events("cpu_in");
> >> >> }
> >> >> trace_init(my_init);
> >> >> 
> >> >> $ ./configure --enable-trace-backends=log,callback && make -j4
> >> >> 
> >> >> This is still a clean interface that allows instrumentation code to be
> >> >> kept separate from the trace event call sites.
> >> >> 
> >> >> The instrumentation code gets compiled into QEMU, but that shouldn't be
> >> >> a huge burden since QEMU's Makefiles only recompile changed source
> >> >> files (only the first build is slow).
> >> 
> >> > Is your proposal that my-instrumentation.c gets compiled into
> >> > QEMU at this point? That doesn't seem like a great idea to
> >> > me, because it means you can only use this tracing if you
> >> > build QEMU yourself, and distros won't enable it and so
> >> > lots of our users won't have convenient access to it.
> >> > I'd much rather see a cleanly designed plugin interface
> >> > (which we should be able to implement in a manner that doesn't
> >> > let the plugin monkey patch arbitrary parts of QEMU beyond
> >> > what can already be achieved via LD_PRELOAD).
> >> 
> >> Just to be clear, what do you both mean by monkey-patching?
> >> 
> >> * Accessing unintended symbols in QEMU from the library (and from there
> >> modifying QEMU's behavior).
> >> * QEMU using symbols on the library instead of its own just because they 
> >> have
> >> the same name.
> >> 
> >> As I said, the former can be accomplished by compiling QEMU with
> >> "-fvisibility=hidden".
> >> 
> >> The latter is already achieved by using dlopen with RTLD_LOCAL (the 
> >> default).
> 
> > Instrumentation functions are invoked in the same memory space as QEMU,
> > so pointer arguments can be modified.
> 
> > Think of all the void *s arguments we trace.  Instrumentation code can
> > access those structs, hijack function pointers, etc.  No symbols are
> > required.
> 
> And why exactly is this a threat? Because it can be used to "extend" QEMU
> without touching its sources? Is this a realistic threat? (it's a rather 
> brittle
> way to do it, so I'm not sure it's practical)

Unfortunately it is a problem.  I recently came across a product that
was using LD_PRELOAD= to "integrate" with QEMU.  People really abuse
these interfaces instead of integrating their features cleanly into
QEMU.

I see the use cases that Peter has been describing and am sure we can
come up with good solutions.  What I care about is that it doesn't allow
loading a .so that connects to arbitrary trace events.

> If that's the case, forcing instrumentation libraries to be exclusively GPL
> seems like a solution to me, just as GPL protects QEMU itself.
> 
> Do we agree on that? Or did I miss something else?

How the license applies depends on how user instrumentation code
interfaces with QEMU.  We probably need to ask the QEMU project's
lawyers for confirmation.  I don't think it's as simple as saying "the
interface will be protected by GPL".

Let's discuss the use cases and how instrumentation should work in
another subthread first.  Then towards the end we can come back to GPL
again.

> As a side note, I find instrumentation to be most useful for guest code 
> events,
> which mostly contain non-pointer values (except for the CPUState*).

That's great.  I'll read the entire series to get a feel for what
instrumentation code needs to be able to do.  I'm wondering if a stable
public API is possible that does not leave room for abuse.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-26 Thread Lluís Vilanova

Stefan Hajnoczi writes:

> On Tue, Jul 25, 2017 at 06:11:43PM +0300, Lluís Vilanova wrote:
>> Peter Maydell writes:
>> 
>> > On 25 July 2017 at 14:19, Stefan Hajnoczi  wrote:
>> >> Instead I suggest adding a trace backend generates calls to registered
>> >> "callback" functions:
>> >> 
>> >> $ cat >my-instrumentation.c
>> >> #include "trace/control.h"
>> >> 
>> >> static void my_cpu_in(unsigned int addr, char size, unsigned int val)
>> >> {
>> >> printf("my_cpu_in\n");
>> >> }
>> >> 
>> >> static void my_init(void)
>> >> {
>> >> trace_register_event_callback("cpu_in", my_cpu_in);
>> >> trace_enable_events("cpu_in");
>> >> }
>> >> trace_init(my_init);
>> >> 
>> >> $ ./configure --enable-trace-backends=log,callback && make -j4
>> >> 
>> >> This is still a clean interface that allows instrumentation code to be
>> >> kept separate from the trace event call sites.
>> >> 
>> >> The instrumentation code gets compiled into QEMU, but that shouldn't be
>> >> a huge burden since QEMU's Makefiles only recompile changed source
>> >> files (only the first build is slow).
>> 
>> > Is your proposal that my-instrumentation.c gets compiled into
>> > QEMU at this point? That doesn't seem like a great idea to
>> > me, because it means you can only use this tracing if you
>> > build QEMU yourself, and distros won't enable it and so
>> > lots of our users won't have convenient access to it.
>> > I'd much rather see a cleanly designed plugin interface
>> > (which we should be able to implement in a manner that doesn't
>> > let the plugin monkey patch arbitrary parts of QEMU beyond
>> > what can already be achieved via LD_PRELOAD).
>> 
>> Just to be clear, what do you both mean by monkey-patching?
>> 
>> * Accessing unintended symbols in QEMU from the library (and from there
>> modifying QEMU's behavior).
>> * QEMU using symbols on the library instead of its own just because they have
>> the same name.
>> 
>> As I said, the former can be accomplished by compiling QEMU with
>> "-fvisibility=hidden".
>> 
>> The latter is already achieved by using dlopen with RTLD_LOCAL (the default).

> Instrumentation functions are invoked in the same memory space as QEMU,
> so pointer arguments can be modified.

> Think of all the void *s arguments we trace.  Instrumentation code can
> access those structs, hijack function pointers, etc.  No symbols are
> required.

And why exactly is this a threat? Because it can be used to "extend" QEMU
without touching its sources? Is this a realistic threat? (it's a rather brittle
way to do it, so I'm not sure it's practical)

If that's the case, forcing instrumentation libraries to be exclusively GPL
seems like a solution to me, just as GPL protects QEMU itself.

Do we agree on that? Or did I miss something else?

As a side note, I find instrumentation to be most useful for guest code events,
which mostly contain non-pointer values (except for the CPUState*).


Cheers,
  Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-26 Thread Lluís Vilanova

Stefan Hajnoczi writes:

> On Tue, Jul 25, 2017 at 05:47:08PM +0300, Lluís Vilanova wrote:
>> Stefan Hajnoczi writes:
>> 
>> > On Mon, Jul 24, 2017 at 08:02:24PM +0300, Lluís Vilanova wrote:
>> >> This series adds a basic interface to instrument tracing events and 
>> >> control
>> >> their tracing state.
>> >> 
>> >> The instrumentation code is dynamically loaded into QEMU (either when it 
>> >> starts
>> >> or later using its remote control interfaces).
>> >> 
>> >> All events can be instrumented, but the instrumentable events must be 
>> >> explicitly
>> >> specified at configure time.
>> >> 
>> >> Signed-off-by: Lluís Vilanova 
>> 
>> > Hi Lluís,
>> > I'm concerned that the shared library interface will be abused to monkey
>> > patch code into QEMU far beyond instrumentation use cases and/or avoid
>> > the responsibilities of the GPL license.
>> 
>> > Instead I suggest adding a trace backend generates calls to registered
>> > "callback" functions:
>> 
>> >   $ cat >my-instrumentation.c
>> >   #include "trace/control.h"
>> 
>> >   static void my_cpu_in(unsigned int addr, char size, unsigned int val)
>> >   {
>> >   printf("my_cpu_in\n");
>> >   }
>> 
>> >   static void my_init(void)
>> >   {
>> >   trace_register_event_callback("cpu_in", my_cpu_in);
>> >   trace_enable_events("cpu_in");
>> >   }
>> >   trace_init(my_init);
>> 
>> >   $ ./configure --enable-trace-backends=log,callback && make -j4
>> 
>> > This is still a clean interface that allows instrumentation code to be
>> > kept separate from the trace event call sites.
>> 
>> > The instrumentation code gets compiled into QEMU, but that shouldn't be
>> > a huge burden since QEMU's Makefiles only recompile changed source
>> > files (only the first build is slow).
>> 
>> > Does this alternative sound reasonable to you?
>> 
>> You mean to add a user-provided .c file to QEMU at compile-time? (I'm 
>> assuming
>> we can keep the "user API" proposed in this series, instead of the one you
>> showed).
>> 
>> First, a user might want to provide more than just a .c, so we might have to
>> accept a directory that produces a library that is included into QEMU at link
>> time (a bit more complicated to do portably).
>> 
>> Second, the user can still do the same actions you want to shield from,
>> regardless of whether it's a dynamically loaded library (i.e., access any
>> fuction in QEMU).
>> 
>> What I propose to do instead is:
>> 
>> * For the monkey-patch part, we can limit symbol resolution to the
>> instrumentation API functions when loading the library (e.g., compile QEMU
>> with -fvisibility=hidden).
>> 
>> * For the license part, that is a legal issue that can be handled by the API
>> header license, right? (the "public" headers I added are GPL, not
>> LGPL). Besides, if only the intended API is available, I'm not sure if that
>> matters (e.g., we don't care about the license of a dtrace script, since it
>> only has the API provided by QEMU+dtrace).
>> 
>> This would be similar to Linux's module support; only selected functions are
>> available to modules, and we could add a license check (e.g., 
>> QI_LICENSE("GPL")
>> must be on the instrumentation library or it won't be loaded).

> Proprietary Linux kernel modules are controversial and some still
> consider them license violations - especially when an "open source" shim
> module is used to interface between proprietary code and the kernel.

> What is the use case for this instrumentation?

As I said, we can require instrumentation libraries to be GPL (and nothing
else), so no proprietary code is possible without a license violation (then
we're on the same field as if someone ships a modified QEMU binary, which
technically is just as doable).

As for use-cases, see Peter's email and my response.


Cheers,
  Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-26 Thread Lluís Vilanova

Peter Maydell writes:

> On 26 July 2017 at 12:26, Stefan Hajnoczi  wrote:
>> On Tue, Jul 25, 2017 at 02:30:06PM +0100, Peter Maydell wrote:
>>> Is your proposal that my-instrumentation.c gets compiled into
>>> QEMU at this point? That doesn't seem like a great idea to
>>> me, because it means you can only use this tracing if you
>>> build QEMU yourself, and distros won't enable it and so
>>> lots of our users won't have convenient access to it.
>>> I'd much rather see a cleanly designed plugin interface
>>> (which we should be able to implement in a manner that doesn't
>>> let the plugin monkey patch arbitrary parts of QEMU beyond
>>> what can already be achieved via LD_PRELOAD).
>> 
>> Trace events are not a stable public API.  They change between releases
>> and need to be interpreted together with the QEMU source code.
>> 
>> It doesn't make sense to be doing instrumentation inside the QEMU
>> process but not want to compile QEMU.
>> 
>> What is the use case?

> I want the continuous stream of people who come along
> and want to do interesting things (or even just straightforward
> things like "show me all the memory accesses") with tracing
> their guest binary to be able to do so without having to
> rebuild QEMU. We don't require people to rebuild QEMU
> themselves to attach gdb to it, and tracing of guest
> behaviour is a similar "I just want to introspect what
> my guest is doing" use case.

> I think this is distinct from use of the trace API
> for qemu-internal events as used in a lot of the C
> code.

> Some simple use cases:
>  * a plugin which prints out instruction execution traces
> by hooking into "insn executed" and printing something
> helpful (you can extend this into producing traces
> in whatever standard format some other program wants
> to inhale, but at a minimum it would be something we
> could point people at that makes more sense than our
> current "only print insns at translate time" debug);
> it's also easy to add bells and whistles for "start
> tracing at time T" and so on
>  * a plugin that make cachegrind-style guesses about
> cache usage by tracking memory accesses and simulating
> cache effects
>  * a plugin that prints all the system calls the
> guest makes (by hooking into "insn executed" and
> printing info when the insn is an SVC)
>  * trivial statistics like "count executed instructions
> and branches taken"

> (I also think that we should aim for that API for
> things like guest register changes and guest memory
> changes to not be particularly unstable even if we
> allow ourselves to make non-back-compatible changes
> to it.)

That's what I was thinking about (i.e., try to keep a minimally useful and
stable set of guest_* tracepoints). And I guess this also answer's Stefan's
other email about instrumentation use-cases.

Cheers,
  Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-26 Thread Peter Maydell

On 26 July 2017 at 12:26, Stefan Hajnoczi  wrote:
> On Tue, Jul 25, 2017 at 02:30:06PM +0100, Peter Maydell wrote:
>> Is your proposal that my-instrumentation.c gets compiled into
>> QEMU at this point? That doesn't seem like a great idea to
>> me, because it means you can only use this tracing if you
>> build QEMU yourself, and distros won't enable it and so
>> lots of our users won't have convenient access to it.
>> I'd much rather see a cleanly designed plugin interface
>> (which we should be able to implement in a manner that doesn't
>> let the plugin monkey patch arbitrary parts of QEMU beyond
>> what can already be achieved via LD_PRELOAD).
>
> Trace events are not a stable public API.  They change between releases
> and need to be interpreted together with the QEMU source code.
>
> It doesn't make sense to be doing instrumentation inside the QEMU
> process but not want to compile QEMU.
>
> What is the use case?

I want the continuous stream of people who come along
and want to do interesting things (or even just straightforward
things like "show me all the memory accesses") with tracing
their guest binary to be able to do so without having to
rebuild QEMU. We don't require people to rebuild QEMU
themselves to attach gdb to it, and tracing of guest
behaviour is a similar "I just want to introspect what
my guest is doing" use case.

I think this is distinct from use of the trace API
for qemu-internal events as used in a lot of the C
code.

Some simple use cases:
 * a plugin which prints out instruction execution traces
by hooking into "insn executed" and printing something
helpful (you can extend this into producing traces
in whatever standard format some other program wants
to inhale, but at a minimum it would be something we
could point people at that makes more sense than our
current "only print insns at translate time" debug);
it's also easy to add bells and whistles for "start
tracing at time T" and so on
 * a plugin that make cachegrind-style guesses about
cache usage by tracking memory accesses and simulating
cache effects
 * a plugin that prints all the system calls the
guest makes (by hooking into "insn executed" and
printing info when the insn is an SVC)
 * trivial statistics like "count executed instructions
and branches taken"

(I also think that we should aim for that API for
things like guest register changes and guest memory
changes to not be particularly unstable even if we
allow ourselves to make non-back-compatible changes
to it.)

thanks
-- PMM

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-26 Thread Stefan Hajnoczi

On Tue, Jul 25, 2017 at 05:47:08PM +0300, Lluís Vilanova wrote:
> Stefan Hajnoczi writes:
> 
> > On Mon, Jul 24, 2017 at 08:02:24PM +0300, Lluís Vilanova wrote:
> >> This series adds a basic interface to instrument tracing events and control
> >> their tracing state.
> >> 
> >> The instrumentation code is dynamically loaded into QEMU (either when it 
> >> starts
> >> or later using its remote control interfaces).
> >> 
> >> All events can be instrumented, but the instrumentable events must be 
> >> explicitly
> >> specified at configure time.
> >> 
> >> Signed-off-by: Lluís Vilanova 
> 
> > Hi Lluís,
> > I'm concerned that the shared library interface will be abused to monkey
> > patch code into QEMU far beyond instrumentation use cases and/or avoid
> > the responsibilities of the GPL license.
> 
> > Instead I suggest adding a trace backend generates calls to registered
> > "callback" functions:
> 
> >   $ cat >my-instrumentation.c
> >   #include "trace/control.h"
> 
> >   static void my_cpu_in(unsigned int addr, char size, unsigned int val)
> >   {
> >   printf("my_cpu_in\n");
> >   }
> 
> >   static void my_init(void)
> >   {
> >   trace_register_event_callback("cpu_in", my_cpu_in);
> >   trace_enable_events("cpu_in");
> >   }
> >   trace_init(my_init);
> 
> >   $ ./configure --enable-trace-backends=log,callback && make -j4
> 
> > This is still a clean interface that allows instrumentation code to be
> > kept separate from the trace event call sites.
> 
> > The instrumentation code gets compiled into QEMU, but that shouldn't be
> > a huge burden since QEMU's Makefiles only recompile changed source
> > files (only the first build is slow).
> 
> > Does this alternative sound reasonable to you?
> 
> You mean to add a user-provided .c file to QEMU at compile-time? (I'm assuming
> we can keep the "user API" proposed in this series, instead of the one you
> showed).
> 
> First, a user might want to provide more than just a .c, so we might have to
> accept a directory that produces a library that is included into QEMU at link
> time (a bit more complicated to do portably).
> 
> Second, the user can still do the same actions you want to shield from,
> regardless of whether it's a dynamically loaded library (i.e., access any
> fuction in QEMU).
> 
> What I propose to do instead is:
> 
> * For the monkey-patch part, we can limit symbol resolution to the
>   instrumentation API functions when loading the library (e.g., compile QEMU
>   with -fvisibility=hidden).
> 
> * For the license part, that is a legal issue that can be handled by the API
>   header license, right? (the "public" headers I added are GPL, not
>   LGPL). Besides, if only the intended API is available, I'm not sure if that
>   matters (e.g., we don't care about the license of a dtrace script, since it
>   only has the API provided by QEMU+dtrace).
> 
> This would be similar to Linux's module support; only selected functions are
> available to modules, and we could add a license check (e.g., 
> QI_LICENSE("GPL")
> must be on the instrumentation library or it won't be loaded).

Proprietary Linux kernel modules are controversial and some still
consider them license violations - especially when an "open source" shim
module is used to interface between proprietary code and the kernel.

What is the use case for this instrumentation?

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-26 Thread Stefan Hajnoczi

On Tue, Jul 25, 2017 at 02:30:06PM +0100, Peter Maydell wrote:
> On 25 July 2017 at 14:19, Stefan Hajnoczi  wrote:
> > Instead I suggest adding a trace backend generates calls to registered
> > "callback" functions:
> >
> >   $ cat >my-instrumentation.c
> >   #include "trace/control.h"
> >
> >   static void my_cpu_in(unsigned int addr, char size, unsigned int val)
> >   {
> >   printf("my_cpu_in\n");
> >   }
> >
> >   static void my_init(void)
> >   {
> >   trace_register_event_callback("cpu_in", my_cpu_in);
> >   trace_enable_events("cpu_in");
> >   }
> >   trace_init(my_init);
> >
> >   $ ./configure --enable-trace-backends=log,callback && make -j4
> >
> > This is still a clean interface that allows instrumentation code to be
> > kept separate from the trace event call sites.
> >
> > The instrumentation code gets compiled into QEMU, but that shouldn't be
> > a huge burden since QEMU's Makefiles only recompile changed source
> > files (only the first build is slow).
> 
> Is your proposal that my-instrumentation.c gets compiled into
> QEMU at this point? That doesn't seem like a great idea to
> me, because it means you can only use this tracing if you
> build QEMU yourself, and distros won't enable it and so
> lots of our users won't have convenient access to it.
> I'd much rather see a cleanly designed plugin interface
> (which we should be able to implement in a manner that doesn't
> let the plugin monkey patch arbitrary parts of QEMU beyond
> what can already be achieved via LD_PRELOAD).

Trace events are not a stable public API.  They change between releases
and need to be interpreted together with the QEMU source code.

It doesn't make sense to be doing instrumentation inside the QEMU
process but not want to compile QEMU.

What is the use case?

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-26 Thread Stefan Hajnoczi

On Tue, Jul 25, 2017 at 06:11:43PM +0300, Lluís Vilanova wrote:
> Peter Maydell writes:
> 
> > On 25 July 2017 at 14:19, Stefan Hajnoczi  wrote:
> >> Instead I suggest adding a trace backend generates calls to registered
> >> "callback" functions:
> >> 
> >> $ cat >my-instrumentation.c
> >> #include "trace/control.h"
> >> 
> >> static void my_cpu_in(unsigned int addr, char size, unsigned int val)
> >> {
> >> printf("my_cpu_in\n");
> >> }
> >> 
> >> static void my_init(void)
> >> {
> >> trace_register_event_callback("cpu_in", my_cpu_in);
> >> trace_enable_events("cpu_in");
> >> }
> >> trace_init(my_init);
> >> 
> >> $ ./configure --enable-trace-backends=log,callback && make -j4
> >> 
> >> This is still a clean interface that allows instrumentation code to be
> >> kept separate from the trace event call sites.
> >> 
> >> The instrumentation code gets compiled into QEMU, but that shouldn't be
> >> a huge burden since QEMU's Makefiles only recompile changed source
> >> files (only the first build is slow).
> 
> > Is your proposal that my-instrumentation.c gets compiled into
> > QEMU at this point? That doesn't seem like a great idea to
> > me, because it means you can only use this tracing if you
> > build QEMU yourself, and distros won't enable it and so
> > lots of our users won't have convenient access to it.
> > I'd much rather see a cleanly designed plugin interface
> > (which we should be able to implement in a manner that doesn't
> > let the plugin monkey patch arbitrary parts of QEMU beyond
> > what can already be achieved via LD_PRELOAD).
> 
> Just to be clear, what do you both mean by monkey-patching?
> 
> * Accessing unintended symbols in QEMU from the library (and from there
>   modifying QEMU's behavior).
> * QEMU using symbols on the library instead of its own just because they have
>   the same name.
> 
> As I said, the former can be accomplished by compiling QEMU with
> "-fvisibility=hidden".
> 
> The latter is already achieved by using dlopen with RTLD_LOCAL (the default).

Instrumentation functions are invoked in the same memory space as QEMU,
so pointer arguments can be modified.

Think of all the void *s arguments we trace.  Instrumentation code can
access those structs, hijack function pointers, etc.  No symbols are
required.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-25 Thread Lluís Vilanova

Peter Maydell writes:

> On 25 July 2017 at 14:19, Stefan Hajnoczi  wrote:
>> Instead I suggest adding a trace backend generates calls to registered
>> "callback" functions:
>> 
>> $ cat >my-instrumentation.c
>> #include "trace/control.h"
>> 
>> static void my_cpu_in(unsigned int addr, char size, unsigned int val)
>> {
>> printf("my_cpu_in\n");
>> }
>> 
>> static void my_init(void)
>> {
>> trace_register_event_callback("cpu_in", my_cpu_in);
>> trace_enable_events("cpu_in");
>> }
>> trace_init(my_init);
>> 
>> $ ./configure --enable-trace-backends=log,callback && make -j4
>> 
>> This is still a clean interface that allows instrumentation code to be
>> kept separate from the trace event call sites.
>> 
>> The instrumentation code gets compiled into QEMU, but that shouldn't be
>> a huge burden since QEMU's Makefiles only recompile changed source
>> files (only the first build is slow).

> Is your proposal that my-instrumentation.c gets compiled into
> QEMU at this point? That doesn't seem like a great idea to
> me, because it means you can only use this tracing if you
> build QEMU yourself, and distros won't enable it and so
> lots of our users won't have convenient access to it.
> I'd much rather see a cleanly designed plugin interface
> (which we should be able to implement in a manner that doesn't
> let the plugin monkey patch arbitrary parts of QEMU beyond
> what can already be achieved via LD_PRELOAD).

Just to be clear, what do you both mean by monkey-patching?

* Accessing unintended symbols in QEMU from the library (and from there
  modifying QEMU's behavior).
* QEMU using symbols on the library instead of its own just because they have
  the same name.

As I said, the former can be accomplished by compiling QEMU with
"-fvisibility=hidden".

The latter is already achieved by using dlopen with RTLD_LOCAL (the default).


Cheers,
  Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-25 Thread Lluís Vilanova

Stefan Hajnoczi writes:

> On Mon, Jul 24, 2017 at 08:02:24PM +0300, Lluís Vilanova wrote:
>> This series adds a basic interface to instrument tracing events and control
>> their tracing state.
>> 
>> The instrumentation code is dynamically loaded into QEMU (either when it 
>> starts
>> or later using its remote control interfaces).
>> 
>> All events can be instrumented, but the instrumentable events must be 
>> explicitly
>> specified at configure time.
>> 
>> Signed-off-by: Lluís Vilanova 

> Hi Lluís,
> I'm concerned that the shared library interface will be abused to monkey
> patch code into QEMU far beyond instrumentation use cases and/or avoid
> the responsibilities of the GPL license.

> Instead I suggest adding a trace backend generates calls to registered
> "callback" functions:

>   $ cat >my-instrumentation.c
>   #include "trace/control.h"

>   static void my_cpu_in(unsigned int addr, char size, unsigned int val)
>   {
>   printf("my_cpu_in\n");
>   }

>   static void my_init(void)
>   {
>   trace_register_event_callback("cpu_in", my_cpu_in);
>   trace_enable_events("cpu_in");
>   }
>   trace_init(my_init);

>   $ ./configure --enable-trace-backends=log,callback && make -j4

> This is still a clean interface that allows instrumentation code to be
> kept separate from the trace event call sites.

> The instrumentation code gets compiled into QEMU, but that shouldn't be
> a huge burden since QEMU's Makefiles only recompile changed source
> files (only the first build is slow).

> Does this alternative sound reasonable to you?

You mean to add a user-provided .c file to QEMU at compile-time? (I'm assuming
we can keep the "user API" proposed in this series, instead of the one you
showed).

First, a user might want to provide more than just a .c, so we might have to
accept a directory that produces a library that is included into QEMU at link
time (a bit more complicated to do portably).

Second, the user can still do the same actions you want to shield from,
regardless of whether it's a dynamically loaded library (i.e., access any
fuction in QEMU).

What I propose to do instead is:

* For the monkey-patch part, we can limit symbol resolution to the
  instrumentation API functions when loading the library (e.g., compile QEMU
  with -fvisibility=hidden).

* For the license part, that is a legal issue that can be handled by the API
  header license, right? (the "public" headers I added are GPL, not
  LGPL). Besides, if only the intended API is available, I'm not sure if that
  matters (e.g., we don't care about the license of a dtrace script, since it
  only has the API provided by QEMU+dtrace).

This would be similar to Linux's module support; only selected functions are
available to modules, and we could add a license check (e.g., QI_LICENSE("GPL")
must be on the instrumentation library or it won't be loaded).

Thanks,
  Lluis

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-25 Thread Peter Maydell

On 25 July 2017 at 14:19, Stefan Hajnoczi  wrote:
> Instead I suggest adding a trace backend generates calls to registered
> "callback" functions:
>
>   $ cat >my-instrumentation.c
>   #include "trace/control.h"
>
>   static void my_cpu_in(unsigned int addr, char size, unsigned int val)
>   {
>   printf("my_cpu_in\n");
>   }
>
>   static void my_init(void)
>   {
>   trace_register_event_callback("cpu_in", my_cpu_in);
>   trace_enable_events("cpu_in");
>   }
>   trace_init(my_init);
>
>   $ ./configure --enable-trace-backends=log,callback && make -j4
>
> This is still a clean interface that allows instrumentation code to be
> kept separate from the trace event call sites.
>
> The instrumentation code gets compiled into QEMU, but that shouldn't be
> a huge burden since QEMU's Makefiles only recompile changed source
> files (only the first build is slow).

Is your proposal that my-instrumentation.c gets compiled into
QEMU at this point? That doesn't seem like a great idea to
me, because it means you can only use this tracing if you
build QEMU yourself, and distros won't enable it and so
lots of our users won't have convenient access to it.
I'd much rather see a cleanly designed plugin interface
(which we should be able to implement in a manner that doesn't
let the plugin monkey patch arbitrary parts of QEMU beyond
what can already be achieved via LD_PRELOAD).

thanks
-- PMM

Re: [Qemu-devel] [PATCH 00/13] instrument: Add basic event instrumentation

2017-07-25 Thread Stefan Hajnoczi

On Mon, Jul 24, 2017 at 08:02:24PM +0300, Lluís Vilanova wrote:
> This series adds a basic interface to instrument tracing events and control
> their tracing state.
> 
> The instrumentation code is dynamically loaded into QEMU (either when it 
> starts
> or later using its remote control interfaces).
> 
> All events can be instrumented, but the instrumentable events must be 
> explicitly
> specified at configure time.
> 
> Signed-off-by: Lluís Vilanova 

Hi Lluís,
I'm concerned that the shared library interface will be abused to monkey
patch code into QEMU far beyond instrumentation use cases and/or avoid
the responsibilities of the GPL license.

Instead I suggest adding a trace backend generates calls to registered
"callback" functions:

  $ cat >my-instrumentation.c
  #include "trace/control.h"

  static void my_cpu_in(unsigned int addr, char size, unsigned int val)
  {
  printf("my_cpu_in\n");
  }

  static void my_init(void)
  {
  trace_register_event_callback("cpu_in", my_cpu_in);
  trace_enable_events("cpu_in");
  }
  trace_init(my_init);

  $ ./configure --enable-trace-backends=log,callback && make -j4

This is still a clean interface that allows instrumentation code to be
kept separate from the trace event call sites.

The instrumentation code gets compiled into QEMU, but that shouldn't be
a huge burden since QEMU's Makefiles only recompile changed source
files (only the first build is slow).

Does this alternative sound reasonable to you?

Stefan

signature.asc
Description: PGP signature

41 matches

Mail list logo