ftp://bochs.com/pub/freemware/prescan-2000_0222a.tar.gz

[from the README in the tarball]

This is the logic for the scan-before-execute technique
we have in our virtualization paper.  For now, it is a
standalone program, so that we can easily test it with
various cases, to see how accurately it performs.

It is important that this code work accurately; otherwise
we may lose control of execution and fail to properly
virtualize instructions, or cause other unpleasant side effects.

If you are interested in this kind of thing, please
help out by testing this code with as many different
cases as you can think of.  The prescan code is very
small, as instruction decoding is done in a different
function.  So now is a great time to dive in, help out,
and understand how this works, before it gets integrated
into the rest of FreeMWare.

Instructions for compiling and using:
=====================================

Type make in the top level directory, and
then change to tests/ and type make to assemble
the test cases.

The command line format is:

  prescan  filename  offset  is32

Filename is the *.bin file from tests/.  Offset is the
page offset at which to start loading this file, generally
the same as the assembly origin.  Is32 is 0 for
16-bit code, and 1 for 32-bit code.

From the top level directory, to run a test case named
'xyz.bin' which is 32bit with origin at 0, you might type:

  prescan  tests/xyz.bin 0 1
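
As an illustration of the three arguments described above,
here is a minimal sketch of how they might be validated.
It is not taken from the tarball: struct prescan_args and
parse_args() are hypothetical names, and whether the real
program accepts hexadecimal offsets or rejects offsets
beyond the page is an assumption.

```c
#include <errno.h>
#include <stdlib.h>

#define PAGE_SIZE 4096ul            /* size of an x86 page */

/* Hypothetical container for the command line; not from prescan.c. */
struct prescan_args {
    const char   *filename;         /* the *.bin file from tests/ */
    unsigned long offset;           /* page offset to load the file at */
    int           is32;             /* 0 = 16-bit code, 1 = 32-bit code */
};

/* Returns 0 on success, -1 if the arguments are malformed. */
static int parse_args(int argc, char **argv, struct prescan_args *out)
{
    char *end;

    if (argc != 4)
        return -1;
    out->filename = argv[1];

    /* Base 0 lets strtoul() accept decimal, octal and 0x... input
     * (an assumption about what the tool should allow). */
    errno = 0;
    out->offset = strtoul(argv[2], &end, 0);
    if (errno != 0 || end == argv[2] || *end != '\0')
        return -1;
    if (out->offset >= PAGE_SIZE)   /* must fall inside the page */
        return -1;

    /* is32 must be exactly "0" or "1". */
    if ((argv[3][0] != '0' && argv[3][0] != '1') || argv[3][1] != '\0')
        return -1;
    out->is32 = (argv[3][0] == '1');
    return 0;
}
```

Called with the example above (tests/xyz.bin 0 1), this
fills in offset 0 and is32 1; an offset of 4096 or more
is rejected because it no longer lies within the page.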

If you create a new test, add it to tests/ and include
an entry in tests/Makefile for it.  Just add it to
the 'tests:' line.  There aren't many tests yet.

The algorithm:
==============

The interesting part of the algorithm is in prescan.c,
in the function prescan().  Please check it out.  The
actual fetch-decoding of an instruction is done in
fetchdecode.c.  This is just code from bochs, pared
down, since we don't need to do as much.  It can
be streamlined more later.

Anyhow, the algorithm in prescan() is pretty short.  I'd
like some feedback on it, especially if there are edge
cases I missed etc.  It iterates through a consecutive
sequence of instructions starting at the given page
offset, marking bits in the code meta cache accordingly,
until a terminal condition is reached.

For local relative branches, if the branch target is within
the same page, prescan() is called recursively on the
branch target as well.  MAX_PRESCAN_DEPTH controls the
maximum recursion level.
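
The scan loop and the recursion described above can be
sketched roughly as follows.  This is a simplified
illustration, not the code in prescan.c: the META_* bits,
the insn structure, the toy decode() (which knows only a
handful of real one-byte opcodes, standing in for
fetchdecode.c), the value given to MAX_PRESCAN_DEPTH and
the exact set of terminal conditions are all assumptions
made for the example.

```c
#include <stdint.h>

#define PAGE_SIZE          4096u
#define MAX_PRESCAN_DEPTH  4       /* name from the README; value assumed */

/* Hypothetical meta cache bits: one byte of flags per page offset. */
#define META_START       0x01u     /* an instruction starts at this offset */
#define META_VIRTUALIZE  0x02u     /* that instruction must be virtualized */

enum kind { K_PLAIN, K_JCC, K_JMP, K_RET, K_VIRT, K_UNKNOWN };

struct insn { enum kind kind; unsigned len; int rel; };

/* Toy stand-in for the real fetch-decode; handles only a few opcodes. */
static struct insn decode(const uint8_t *page, unsigned off, int is32)
{
    struct insn i = { K_UNKNOWN, 0, 0 };
    uint8_t op = page[off];

    if (op == 0x90)      { i.kind = K_PLAIN; i.len = 1; }             /* NOP */
    else if (op == 0xC3) { i.kind = K_RET;   i.len = 1; }             /* RET */
    else if (op == 0xF4) { i.kind = K_VIRT;  i.len = 1; }             /* HLT */
    else if (op == 0xB8) { i.kind = K_PLAIN; i.len = is32 ? 5 : 3; }  /* MOV eAX, imm */
    else if (op == 0xEB || op == 0x74) {                              /* JMP / JE rel8 */
        i.kind = (op == 0xEB) ? K_JMP : K_JCC;
        i.len = 2;
        if (off + 1 < PAGE_SIZE) {
            i.rel = page[off + 1];
            if (i.rel >= 128)
                i.rel -= 256;                 /* sign-extend the 8-bit offset */
        }
    }
    if (i.len == 0 || off + i.len > PAGE_SIZE)
        i.kind = K_UNKNOWN;                   /* undecodable or crosses the page end */
    return i;
}

static void prescan(const uint8_t *page, uint8_t *meta,
                    unsigned off, int is32, int depth)
{
    if (depth > MAX_PRESCAN_DEPTH)
        return;

    while (off < PAGE_SIZE) {
        if (meta[off] & META_START)
            return;                           /* already scanned from here */

        struct insn i = decode(page, off, is32);
        if (i.kind == K_UNKNOWN)
            return;                           /* the sketch simply stops here */

        meta[off] |= META_START;
        if (i.kind == K_VIRT) {
            meta[off] |= META_VIRTUALIZE;
            return;                           /* terminal: needs virtualization */
        }
        if (i.kind == K_RET)
            return;                           /* terminal: target not known statically */

        if (i.kind == K_JCC || i.kind == K_JMP) {
            long target = (long)off + (long)i.len + i.rel;
            if (target >= 0 && target < (long)PAGE_SIZE)
                prescan(page, meta, (unsigned)target, is32, depth + 1);
            if (i.kind == K_JMP)
                return;                       /* no fall-through after JMP */
        }
        off += i.len;
    }
}
```

For example, scanning the bytes 74 01 B8 90 90 90 C3 C3 as
32-bit code from offset 0 marks instruction starts at
offsets 0, 2 and 7 along the fall-through path, and at
3, 4, 5 and 6 along the branch path.  The second group
lies inside the 5-byte MOV at offset 2, so keeping one
start bit per offset lets both decodings coexist in this
sketch.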

Overlapping instructions are handled, or at least should be.

There are some decent comments in the code.

Currently, things are set up for a plain Pentium instruction
set.  Eventually, we will need some code that detects the
capabilities of the CPU, and modifies things accordingly, so
that we can scan additional instructions as well.

No scanned code is actually executed yet.  We could do
some limited execution, but that would be better done
when we merge this with the rest of the FreeMWare tree.

-Kevin
