Re: [fonc] Reverse OMeta and Emulation

Gerry J Tue, 22 Jun 2010 17:55:05 -0700

Hmm, i didn't explain myself properly at all.
Probably what i wrote is orthogonal to Casey's idea.
Picking up on where Casey wrote


"transform representations of various "heights" ultimately into machine code"

I was thinking of starting higher up than machine code and finishing with something that could run reasonably well under emulation.
The idea: move old code across to be compatible with fonc++-- when it emerges.

The -- meaning if it's statically typed now it would be then.

In the context of going from source to M/C (for a different real or virtual machine, or to another source language, which is not exactly what Casey raised):

Then

To leverage a common starting point or two, why not take the C compiler IR (or any GNU language compatible IR, or LLVM as an alternative starting point) as the abstract machine, /starting from/ the C <whatever language> IR and representing it in /either/ some other (new, higher level) language or a (real or virtual) machine code language.
So if there is a pattern such as a "forall loop" that Ometa or FermaT can identify, then you could either prepare a virtual machine ISA or indeed /up-level/ it to something compatible with a typed version of what emerges from fonc.
Of course you would have to have the language spec for fonc++ before you set out on a particular case.
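
A rough sketch of what "identify a loop pattern and up-level it" could look like, in Python rather than Ometa or FermaT. The tuple-based IR and the name uplevel_counted_loop are made up for illustration; they do not correspond to the GCC or LLVM IR, and the matcher only handles one fixed shape of counted loop.

```python
def uplevel_counted_loop(ir):
    """Return ("for", var, start, limit, body) if `ir` is a simple counted
    loop in the toy IR below, otherwise None.

    Expected shape (all op names are hypothetical):
        ("set", var, start)
        ("label", top)
        ("if_ge_goto", var, limit, end)
        ...body...
        ("add", var, var, 1)
        ("goto", top)
        ("label", end)
    """
    if len(ir) < 6:
        return None
    init, top, test = ir[0], ir[1], ir[2]
    step, back, end = ir[-3], ir[-2], ir[-1]
    body = ir[3:-3]

    if init[0] != "set" or top[0] != "label" or test[0] != "if_ge_goto":
        return None
    var = init[1]

    ok = (
        test[1] == var
        and end[0] == "label"
        and test[3] == end[1]
        and step == ("add", var, var, 1)
        and back == ("goto", top[1])
        # The body must not touch the counter or contain control flow.
        and not any(op[0] in ("set", "add") and op[1] == var for op in body)
        and not any(op[0] in ("label", "goto", "if_ge_goto") for op in body)
    )
    if not ok:
        return None
    return ("for", var, init[2], test[2], body)


low_level = [
    ("set", "i", 0),
    ("label", "L1"),
    ("if_ge_goto", "i", "n", "L2"),
    ("call", "f", "i"),
    ("add", "i", "i", 1),
    ("goto", "L1"),
    ("label", "L2"),
]
high_level = uplevel_counted_loop(low_level)
```

Anything that does not fit the shape exactly is left alone (None), so the low-level form can still be emitted for a virtual machine ISA instead.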


Maybe i dug a hole for myself. Oh well. :-[

Regards,
Gerry Jensen
02 9713 6004



BGB wrote:

my effort had not gone nearly so high "up" the abstraction tree, but instead operated in a space more like an abstracted x86 machine.
moving to a much higher level model, such as that of GCC IR or LLVM IR, would likely be difficult to pull off effectively starting from "real" machine code, such as x86.
what I had done was essentially just partly inverting several of the low-level stages in the process:
assembly, since my assembler was mostly data-driven, disassembly is not difficult using essentially the same data;
partly abstracting over matters of word-size and opcode argument forms;
...
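
A small sketch of the "same data in both directions" point, in Python. The table layout and function names are hypothetical and are not BGB's assembler; the byte values are the usual one-byte 32-bit x86 encodings for nop, ret, push r32 and pop r32.

```python
# Register numbering used by the one-byte push/pop encodings.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# Hypothetical table layout: (mnemonic, base opcode byte, has register operand).
OPCODES = [
    ("nop", 0x90, False),
    ("ret", 0xC3, False),
    ("push", 0x50, True),   # 0x50 + register number
    ("pop", 0x58, True),    # 0x58 + register number
]


def assemble(line):
    """Encode one instruction such as 'push ebp' into bytes."""
    parts = line.split()
    for name, base, has_reg in OPCODES:
        if name == parts[0]:
            if has_reg:
                return bytes([base + REGS.index(parts[1])])
            return bytes([base])
    raise ValueError("unknown mnemonic: " + parts[0])


def disassemble(code):
    """Decode bytes back into text by walking the same table in reverse."""
    out = []
    for byte in code:
        for name, base, has_reg in OPCODES:
            if has_reg and base <= byte < base + 8:
                out.append(name + " " + REGS[byte - base])
                break
            if not has_reg and byte == base:
                out.append(name)
                break
        else:
            raise ValueError("unknown opcode byte: " + hex(byte))
    return out
```

Adding a row to OPCODES extends both directions at once, which is the appeal of keeping the assembler data-driven.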

then, it was this partially-abstracted form which was interpreted.
this was at a similar level of abstraction to that in my lower-level codegen (namely dealing with registers and values as handles), rather than making it all the way back up to a target-neutral IR.
converting to a higher-level IR would likely require something analogous to a compiler+optimizer, namely to translate these decoded instructions into generic IR sequences, and then try to optimize away all the cruft which doesn't matter (such as all the "eflags" magic for sequences which don't actually care about eflags).
...
the "eflags" issue is mostly because, for example, in x86 nearly every conventional opcode modifies eflags, but in the majority of cases, these changed flags are irrelevant (however, a forward scan and bit-masking could likely allow for detecting cases where the modified flags are known to be irrelevant).
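
One way the forward scan with bit-masking might look, sketched in Python. The FLAG_EFFECTS table and live_flags are illustrative names only, the table covers just a handful of mnemonics, and the scan assumes a single straight-line block; anything still unresolved at the end of the block is conservatively treated as needed. The mask values are the real eflags bit positions for the six status flags.

```python
# Status flag bits as they sit in eflags.
CF, PF, AF, ZF, SF, OF = 0x001, 0x004, 0x010, 0x040, 0x080, 0x800
ALL = CF | PF | AF | ZF | SF | OF

# Hypothetical, tiny table: mnemonic -> (flags read, flags written).
FLAG_EFFECTS = {
    "mov": (0, 0),
    "add": (0, ALL),
    "sub": (0, ALL),
    "cmp": (0, ALL),
    "inc": (0, ALL & ~CF),   # inc leaves CF untouched
    "adc": (CF, ALL),
    "jz": (ZF, 0),
}


def live_flags(block, index):
    """Mask of flags written by block[index] that a later op may still read."""
    _, written = FLAG_EFFECTS[block[index]]
    pending = written        # flags whose fate is not yet known
    live = 0
    for mnemonic in block[index + 1:]:
        reads, writes = FLAG_EFFECTS[mnemonic]
        live |= reads & pending      # read before being overwritten: needed
        pending &= ~writes           # overwritten: no longer our concern
        if pending == 0:
            return live
    # Fell off the end of the block: assume the remainder is needed.
    return live | pending
```

With this table, live_flags(["add", "mov", "sub"], 0) is 0, so the flag computation for that add could be dropped, while in ["add", "inc", "adc"] only CF from the add survives, because inc overwrites everything else before adc reads the carry.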
also, x86 includes a small number of "very complex" opcodes, such as "cpuid", which could be awkward if trying to produce an entirely generic IR (since cpuid changes its behavior and results depending on the values contained in certain registers), ...
this level of translation though is likely to either rule out or hinder the use of self-modifying code, since SMC would essentially invalidate previously translated sequences.
(in my case I had dealt with SMC simply by flushing the entire opcode cache, which in this case was essentially just a big hash-table holding "opcode" structures).
ideally, with a more complex "decode" process, the process of flushing on SMC could be done cheaply and incrementally, rather than, say, essentially having to recompile an app in memory continuously simply as it happens to be self-modifying.
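
A sketch contrasting the two flushing strategies, in Python. DecodeCache, its decode callback, and the 4096-byte PAGE_SIZE are assumptions for illustration, not the structure of the emulator described above. The incremental path here works per page and ignores instructions that straddle a page boundary, which a real implementation would have to handle.

```python
PAGE_SIZE = 4096  # assumed page granularity for invalidation


class DecodeCache:
    def __init__(self, decode):
        self.decode = decode   # hypothetical callback: address -> decoded op
        self.ops = {}          # address -> decoded op (the "big hash table")
        self.by_page = {}      # page number -> set of cached addresses

    def lookup(self, address):
        """Return the decoded op at `address`, decoding it on a miss."""
        op = self.ops.get(address)
        if op is None:
            op = self.decode(address)
            self.ops[address] = op
            page = address // PAGE_SIZE
            self.by_page.setdefault(page, set()).add(address)
        return op

    def flush_all(self):
        """Simple strategy: any write to code discards everything."""
        self.ops.clear()
        self.by_page.clear()

    def on_write(self, address):
        """Incremental strategy: discard only ops cached from the written page."""
        for cached in self.by_page.pop(address // PAGE_SIZE, ()):
            del self.ops[cached]
```

The extra by_page index is the cost of incremental flushing: a write into one page leaves decoded ops from every other page in place, so a program that modifies itself occasionally does not force a full re-decode each time.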
luckily, most executable code is marked as read-only, and SMC cases are fairly rare, and so attempts at SMC are more often grounds for a simulated GPF, rather than grounds for flushing the decode cache.
static translation, however, is likely to exclude the possibility of SMC.


or such...
----- Original Message -----
From: "Monty Zukowski" <mo...@codetransform.com>
To: "Fundamentals of New Computing" <fonc@vpri.org>
Sent: Tuesday, June 22, 2010 8:37 AM
Subject: Re: [fonc] Reverse OMeta and Emulation
GNU C was explicitly designed to make its intermediate representation
hard to work with.  LLVM is a more practical choice.

Monty

On Mon, Jun 21, 2010 at 6:02 PM, Gerry J <geral...@tpg.com.au> wrote:
You may find the concept of semantic slicing relevant:
http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf
There is software at:
http://www.cse.dmu.ac.uk/~mward/fermat.html

One possible path to explore is to take GNU C etc intermediate
representation of source as the "assembly language" of a VM and reverse from
that to a more portable VM, as in Squeak or Java.
Perhaps Ometa could be combined in some way with FermaT to recognise
patterns and port legacy code to a fonc VM ?

Regards,
Gerry Jensen
02 9713 6004
_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc

