On Sat, Dec 23, 2000 at 05:46:23PM -0500, Kevin Lawton wrote:
> [EMAIL PROTECTED] wrote:
>
> > Yes, a page fault is still time consuming. However, once a page of
> > code has been translated, the code that calls that page no longer
> > needs to fault for any future invocations.
> >
> > In other words, the scheme I outlined would result in a number of
> > faults proportional to the size of the code, whereas the current
> > scheme results in a number of faults proportional to the running time
> > of the code.
> >
> > Since, for the majority of programs, the majority of running time is
> > spent in a few critical loops and functions, I think there would be a
> > large benefit to a translate once, run natively "many times" scheme.
>
>
> This doesn't work either. How do you differentiate what is code
> in a page, data, or junk due to code alignment?
>
> -Kevin
I fear my last response didn't fully express my idea.

Consider a page of code that contains three functions: foo, bar, and
baz.

    0x0000 : foo:
             i1
             i2
             ...
    0x0200 : bar:
             x1
             x2
             ...
    0x0280 : return
             ...
    0x0300 : baz:
             y1
             y2
             ...

To start with, the function bar is called. Since the page in question
has never been executed before, the page tables do not permit the code
to be executed, and plex intercepts the 'call' instruction. Plex then
proceeds to translate the area of the page starting at 0x0200. It
successfully translates all the instructions between 0x0200 and 0x0280
within the page (the bar function), and then fills the rest of the
translated code page (0x0000-0x01ff and everything after the return at
0x0280, up to 0x0fff) with the one-byte INT3 instruction.
After this has been completed, future calls to bar can happen without
any translation.
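
To make the fill step concrete, here is a minimal sketch in Python. It
is only a model of the bookkeeping, not plex code: "translation" is a
plain byte copy, the translated code is assumed to sit at the same
offsets as the original, and the names new_translated_page and
translate_range are hypothetical.

```python
INT3 = 0xCC          # one-byte x86 breakpoint opcode
PAGE_SIZE = 0x1000   # 4 KiB page, offsets 0x0000-0x0fff


def new_translated_page():
    """Return a translated-code page in which every byte is INT3."""
    return bytearray([INT3] * PAGE_SIZE)


def translate_range(guest_page, shadow_page, start, end):
    """Copy guest bytes [start, end] (inclusive) into the translated page.

    A real translator would rewrite instructions; a plain copy is an
    assumption made here to keep the sketch short.
    """
    shadow_page[start:end + 1] = guest_page[start:end + 1]


# Hypothetical guest page: NOP (0x90) stands in for real instructions.
guest = bytearray([0x90] * PAGE_SIZE)
shadow = new_translated_page()

# First call to bar: translate 0x0200-0x0280, leave INT3 everywhere else.
translate_range(guest, shadow, 0x0200, 0x0280)
```

After this runs, only the bar range holds real code; every other offset
in the translated page still traps.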
Later on, a call is made to the foo function. Since this area of the
translated page was filled with INT3 instructions, the call itself
succeeds, but the first instruction executed is an INT3, so an
interrupt occurs and plex is invoked. Plex can then determine that the
interrupt was the result of a jump into a partially translated page
and proceeds to fill in the foo function. After foo has been
translated and the translated code page has been updated to contain
both the translated foo and bar code (with INT3 still occupying baz
and any other untranslated areas of the page), execution resumes.
At this point, both foo and bar can execute seamlessly. baz cannot yet,
but as soon as an attempt is made to run it, an interrupt will occur
and plex can do its work.
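
The whole sequence can be modelled in a few lines of Python. This is a
sketch under stated assumptions, not how plex works: the class
ShadowPage and its call method are hypothetical, translation is again a
plain copy, and the end offsets of foo and baz are made up because the
example only gives their start addresses. A real monitor would also
need to tell filler INT3 bytes apart from INT3 instructions in the
guest code itself.

```python
INT3 = 0xCC
PAGE_SIZE = 0x1000

# Inclusive (start, end) offsets. Only bar's range comes from the example;
# the ends of foo and baz are assumptions for illustration.
FUNCTIONS = {
    "foo": (0x0000, 0x01FF),
    "bar": (0x0200, 0x0280),
    "baz": (0x0300, 0x03FF),
}


class ShadowPage:
    """Translated-code page that is filled in lazily, one function at a time."""

    def __init__(self, guest):
        self.guest = guest
        self.shadow = bytearray([INT3] * PAGE_SIZE)
        self.faults = 0

    def call(self, name):
        start, end = FUNCTIONS[name]
        if self.shadow[start] == INT3:
            # Landed on filler: the monitor is invoked and translates
            # this function (modelled as a copy), then execution resumes.
            self.faults += 1
            self.shadow[start:end + 1] = self.guest[start:end + 1]
        # Otherwise the function is already translated and runs natively.


page = ShadowPage(bytearray([0x90] * PAGE_SIZE))  # 0x90 = NOP stand-in
for _ in range(1000):
    page.call("bar")
    page.call("foo")
faults_after_loop = page.faults   # one fault per function, not per call

page.call("baz")
faults_after_baz = page.faults
```

In this model the 2000 calls to foo and bar cost two faults in total,
and baz adds a third on its first call, which is the "proportional to
the size of the code" behaviour I was describing.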
This example was simplistic (I don't consider jumps and branches
within bar) but I think the flood-filling INT3 trick can still work
within the functions themselves.
As I stated earlier, I'm not an expert in assembly, emulation,
compilers, etc. I accept that I may be missing an important step, but
I do not see what that step may be. :-) Please let me know if you see
this as a possible algorithm, or if there is an omission that I did
not think of.
-Kevin