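You'd be well advised to play around with a standard build of
Valgrind, with "--tool=none --trace-flags=10001000 --trace-notbelow=0",
to get some idea of what you'll be able to get from vex.  Those flags
print the IR for each block twice: straight after conversion from
machine code, and again after tree building.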

> I've spent a bit of time looking at libvex.h and libvex_ir.h, and it
> seems as though this should be quite straightforward... as far as I
> understand it I can include libvex.h, set up the structs according to
> the architecture I am using, load an elf into memory and then call
> LibVEX_Translate with the appropriate VexTranslateArgs to translate a
> chunk of code. However, this does seem to be somewhat too good to be
> true, so I am wondering if I am missing something.

Well, it's more complex than that, I think.  Some things to think about:

* vex as it stands doesn't have a way to give you the IR it generates,
  as it is set up to translate from machine code back to machine code.
  It would be easy to modify it to do that, though.

* at what point in the process do you want the IR?  The initial
  translation of each instruction is done independently, which means
  the semantics of each are precisely expressed in IR.  But the
  translations often contain a lot of redundant junk which can
  be folded out by ir_opt.c when the translations of a whole block
  of instructions are combined.  The effect is so marked that you'll
  often have a hard time relating the post-iropt version to the
  initial version.

* the initial translations contain lots of calls to helper functions
  for condition code evaluation.  These are mostly but not completely
  folded out in the post-iropt version.

* there are also calls to "dirty helper functions", for stuff which
  can't easily be expressed in IR.  You'll need to cope with those
  somehow.

* vex translates traces (short superblocks), so you'll wind up with
  duplicate translations of block tails, and you'll need to figure
  out what to do about them.  Also, because such block tails are
  preceded by different instructions, the post-iropt versions of the
  duplicates will in general differ: each is optimised in a
  different initial context.

* loading ELF files into memory is not what I'd call a barrel of
  laughs.  Once you've done that, you'll need to figure out which
  bits of the file contain code, so you can offer those up to vex.

* if you're loading ELF files yourself, you'll probably need to
  process relocations yourself too, else the jump, call and global
  data addresses will be nonsense.
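
To make the "which bits contain code" step a bit more concrete, here
is a minimal sketch in C.  It assumes a 64-bit ELF image in host byte
order, already read into a buffer, and it treats every SHT_PROGBITS
section flagged SHF_EXECINSTR as code.  The names CodeRange and
find_code_ranges are made up for illustration; nothing here is part
of vex or Valgrind, and a real loader would also need to handle
32-bit files, stripped section tables and so on.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One region of the file image that holds machine code.
   (Hypothetical type, for illustration only.) */
typedef struct {
    uint64_t vaddr;   /* address the code expects to run at */
    uint64_t offset;  /* byte offset of the code within the image */
    uint64_t size;    /* length in bytes */
} CodeRange;

/* Scan an ELF64 image and record every executable PROGBITS section.
   Stores at most 'max' ranges in 'out' and returns how many were
   stored, or -1 if the image does not look like a usable ELF64 file. */
int find_code_ranges(const unsigned char *img, size_t len,
                     CodeRange *out, int max)
{
    Elf64_Ehdr eh;
    if (len < sizeof eh) return -1;
    memcpy(&eh, img, sizeof eh);

    /* Check the magic number, the class and the section table bounds. */
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) return -1;
    if (eh.e_ident[EI_CLASS] != ELFCLASS64) return -1;
    if (eh.e_shentsize != sizeof(Elf64_Shdr)) return -1;
    if (eh.e_shoff > len) return -1;
    if ((uint64_t)eh.e_shnum * sizeof(Elf64_Shdr) > len - eh.e_shoff)
        return -1;

    int n = 0;
    for (unsigned i = 0; i < eh.e_shnum; i++) {
        Elf64_Shdr sh;
        memcpy(&sh, img + eh.e_shoff + (size_t)i * sizeof sh, sizeof sh);

        /* Keep only sections with file contents that are executable. */
        if (sh.sh_type != SHT_PROGBITS) continue;
        if (!(sh.sh_flags & SHF_EXECINSTR)) continue;

        /* Reject sections that point outside the image. */
        if (sh.sh_offset > len || sh.sh_size > len - sh.sh_offset)
            return -1;

        if (n == max) break;
        out[n].vaddr  = sh.sh_addr;
        out[n].offset = sh.sh_offset;
        out[n].size   = sh.sh_size;
        n++;
    }
    return n;
}
```

Each CodeRange then gives you a (guest address, bytes) pair of the
kind you would hand to the translator, one chunk at a time.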

Most of this can probably be worked around, but it requires careful
attention to myriad details.
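
As one example of those details, here is a rough sketch of the
simplest relocation case on x86-64: R_X86_64_RELATIVE, where the
64-bit word at r_offset is set to load base plus addend.  It assumes
the image was linked at base 0 (as for a shared object or PIE), that
mem[0] corresponds to that base, and that you already have the RELA
table in hand.  The function name apply_relative_relocs is
hypothetical, and every other relocation type is merely counted, not
handled.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Apply the R_X86_64_RELATIVE entries of a RELA table to a loaded
   image.  'mem' is the buffer holding the image and 'load_base' is
   the address the image is deemed to live at.  Returns the number of
   entries skipped because their type is not handled here, or -1 if
   an entry points outside the image. */
int apply_relative_relocs(unsigned char *mem, size_t mem_len,
                          uint64_t load_base,
                          const Elf64_Rela *rela, size_t count)
{
    int skipped = 0;
    for (size_t i = 0; i < count; i++) {
        /* Only the base-relative type is handled in this sketch. */
        if (ELF64_R_TYPE(rela[i].r_info) != R_X86_64_RELATIVE) {
            skipped++;
            continue;
        }

        /* The patched word must lie wholly inside the image. */
        uint64_t off = rela[i].r_offset;
        if (off > mem_len || mem_len - off < sizeof(uint64_t))
            return -1;

        /* New value = load base + addend, stored at r_offset. */
        uint64_t value = load_base + (uint64_t)rela[i].r_addend;
        memcpy(mem + off, &value, sizeof value);
    }
    return skipped;
}
```

A nonzero return value means there are relocations still unresolved
(symbol-based ones, for instance), so any addresses that depend on
them will still be nonsense.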

> issues are due to the interpreted nature of the dynamic analysis in
> valgrind (please correct me if I'm wrong) and that VEX itself is
> probably quite efficient. Thanks!

The JIT itself isn't grossly inefficient, but it isn't super-fast
either.  The main emphasis is on correctness, instrumentability and
the performance of the generated code, rather than on the speed of
translation itself.

J
