Re: [Qemu-devel] How to make shadow memory for a process? and how to trace the data propation from the instruction level in QEMU?

Lluís Mon, 15 Nov 2010 04:01:58 -0800

Mulyadi Santosa writes:

>> Yes, I have read that paper, it’s wonderful!
>> 
>> Besides Argos, the BitBlaze group, led by Dawn Song at Berkeley, has
>> achieved great success in taint analysis. The website about their
>> dynamic analysis work (called TEMU) can be found at:
>> http://bitblaze.cs.berkeley.edu/temu.html
>> 
>> And TEMU is now open-source.


> Thanks for sharing that...it's new stuff for me. So, why don't you
> just pick TEMU and improve it instead of...uhm...sorry if I am wrong,
> working from scratch? After all, I believe Argos and TEMU (and
> maybe other similar projects) share common code here and there.

> But ehm...CMIIW, it seems TEMU is based on Qemu 0.9.x, right? So
> it uses.... sorry I forgot the name, the scheme where the generated
> code is mostly constructed from small code fragments generated by gcc.
> Now, it is qemu which generates the code by itself. So, a lot of
> things have changed (substantially).

I haven't read the TEMU work, but from the problem description I think
you want something similar to "Practical Taint-Based Protection using
Demand Emulation" or many others (I remember reading some of them a few
years ago at the ISCA, MICRO and/or ASPLOS conferences).


>> Yes. For each process’s memory space A, I wanna make a shadow memory B. The
>> shadow memory is used to store the tag of data. In other words, if addr in
>> memory A is tainted, then the corresponding byte in B should be marked to
>> indicate that addr in A is tainted.

The main question here is: at what granularity do you want to track
taint? Bytes? Words? Pages? This will greatly influence which approach
is best.
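
To make the granularity trade-off concrete, here is a minimal sketch of a
sparse shadow memory. It assumes 4 KiB pages and one tag byte per
2^GRAN_SHIFT guest bytes; every name in it (ShadowMem, shadow_set,
shadow_get, GRAN_SHIFT, ...) is made up for illustration and is not a QEMU
API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parameters, not taken from QEMU. */
#define PAGE_SHIFT    12                 /* assume 4 KiB guest pages */
#define GRAN_SHIFT    0                  /* 0 = byte, 2 = 32-bit word, 12 = page */
#define TAGS_PER_PAGE (1u << (PAGE_SHIFT - GRAN_SHIFT))
#define DIR_SIZE      1024               /* buckets in the shadow-page directory */

typedef struct ShadowPage {
    uint64_t page_no;                    /* guest virtual page number */
    struct ShadowPage *next;             /* bucket chaining */
    uint8_t tags[TAGS_PER_PAGE];         /* one tag per tracked unit */
} ShadowPage;

typedef struct {
    ShadowPage *dir[DIR_SIZE];
} ShadowMem;

/* Look up the shadow page covering addr; allocate it on demand if asked. */
static ShadowPage *shadow_find(ShadowMem *sm, uint64_t addr, int create)
{
    uint64_t page_no = addr >> PAGE_SHIFT;
    ShadowPage **slot = &sm->dir[page_no % DIR_SIZE];

    for (ShadowPage *p = *slot; p != NULL; p = p->next) {
        if (p->page_no == page_no) {
            return p;
        }
    }
    if (!create) {
        return NULL;
    }
    ShadowPage *p = calloc(1, sizeof(*p));
    assert(p != NULL);
    p->page_no = page_no;
    p->next = *slot;
    *slot = p;
    return p;
}

/* Index of the tag that covers addr inside its shadow page. */
static size_t tag_index(uint64_t addr)
{
    return (size_t)((addr & ((1u << PAGE_SHIFT) - 1)) >> GRAN_SHIFT);
}

static void shadow_set(ShadowMem *sm, uint64_t addr, uint8_t tag)
{
    /* Only allocate a shadow page when something actually becomes tainted. */
    ShadowPage *p = shadow_find(sm, addr, tag != 0);
    if (p != NULL) {
        p->tags[tag_index(addr)] = tag;
    }
}

static uint8_t shadow_get(ShadowMem *sm, uint64_t addr)
{
    ShadowPage *p = shadow_find(sm, addr, 0);
    return p != NULL ? p->tags[tag_index(addr)] : 0;
}
```

With GRAN_SHIFT at 0 each guest byte gets its own tag; raising it to 2 or
12 shrinks the shadow by 4x or 4096x, at the cost of marking whole words or
pages as tainted.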

Now that I think of it, you could use the tracing points I sent for
guest virtual memory accesses, and instrument them instead of calling a
file-tracing backend (this should provide a hook for an arbitrary
granularity). Then, simply also keep track of address-space changes so
that your instrumentation code always knows when to activate propagation.

This, together with the optimization I sent for dynamic control of trace
generation in TCG emulation code, should get you on track.

Of course, you would still need to modify all register-accessing
instructions to propagate information passing through the register set.
For that, maybe you could start with the "fetch" tracing/instrumentation
point I sent a long time ago, which keeps track of general-purpose
register usage/definition on x86 (although I'm sure I missed some usages
due to the decoding complexity in x86).
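
To illustrate what propagating through the register set means, here is a
toy set of propagation rules. It assumes 8 general-purpose registers, a
16-bit toy address space and one tag byte per register and per memory
byte; the function names are hypothetical, and a real x86 implementation
would need many more cases (flags, partial registers, implicit operands).

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_REGS = 8, TOY_MEM_SIZE = 1 << 16 };

typedef struct {
    uint8_t reg[NUM_REGS];       /* one taint tag per register */
    uint8_t mem[TOY_MEM_SIZE];   /* one taint tag per byte of toy memory */
} TaintState;

/* dst = src : the destination takes the source's tag. */
static void taint_mov(TaintState *s, int dst, int src)
{
    assert(dst >= 0 && dst < NUM_REGS && src >= 0 && src < NUM_REGS);
    s->reg[dst] = s->reg[src];
}

/* dst = immediate : constants are untainted. */
static void taint_mov_imm(TaintState *s, int dst)
{
    assert(dst >= 0 && dst < NUM_REGS);
    s->reg[dst] = 0;
}

/* dst = dst op src : the result is tainted if either operand is. */
static void taint_alu(TaintState *s, int dst, int src)
{
    assert(dst >= 0 && dst < NUM_REGS && src >= 0 && src < NUM_REGS);
    s->reg[dst] |= s->reg[src];
}

/* dst = load(addr, size) : union of the tags of all bytes read. */
static void taint_load(TaintState *s, int dst, uint16_t addr, unsigned size)
{
    assert(dst >= 0 && dst < NUM_REGS);
    uint8_t tag = 0;
    for (unsigned i = 0; i < size; i++) {
        tag |= s->mem[(uint16_t)(addr + i)];
    }
    s->reg[dst] = tag;
}

/* store(addr, size) = src : every byte written takes the register's tag. */
static void taint_store(TaintState *s, uint16_t addr, unsigned size, int src)
{
    assert(src >= 0 && src < NUM_REGS);
    for (unsigned i = 0; i < size; i++) {
        s->mem[(uint16_t)(addr + i)] = s->reg[src];
    }
}
```

These rules only follow direct data flow; they ignore control-flow and
address dependencies.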


>> The guest OS collects “higher” semantics
>> from the OS level, and QEMU collects “lower” semantics from the
>> instruction level. A combination of both semantics is necessary in the
>> analysis process.

> The question is, in a situation where malware has already compromised
> "the higher semantic", could we trust the analysis?

Beware, I've read about exactly this kind of scheme at previous top-tier
conferences (but I think the tests were using an architectural simulator,
so it's not for a current production environment).

I've found it :)

     Secure program execution via dynamic information flow tracking
     ASPLOS 2004


>> The question is: how to communicate between the QEMU and the guest OS, so
>> that they can cooperate with each other?

A few choices here, but you should first decide whether the communication
needs only control signals, shared memory storage, or both:
  * virtual device : If you need some kind of storage that the guest OS
    must access, you could look at the ivshmem device
  * backdoor instruction : It's the simplest option; I recently sent a
    patch series with two different implementations for x86.


Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth
