Re: Blockade? (Re: [fonc] Reverse OMeta and Emulation)
Until early 2009 [1], FSF forbade anyone from creating a plug-in architecture for GCC. I mean, you could do it, but it would not be integrated upstream, since FSF policy is to have copyright assigned to them before they integrate your changes into the project. If you were to create a plug-in architecture for GCC and sign the copyright over to them, then they could just sit on the code until GCC evolved to the point where your plug-in architecture's interfaces no longer matched GCC's.

See the GCC Plug-ins Wiki page for the following ace quote:

*So, how do we permit plugins while prohibiting proprietary plugins, and how do we do it while staying within the bounds of copyright law which is the basis of the GPL?* [2]

In a nutshell, FSF dislikes DRM, so their solution is to create their own DRM system.

This political position is increasingly irrelevant. Computer tools are so clever now that GPL v2 is simply not strong enough to mean anything in the sense FSF originally intended. It is extremely sad that *compiler developers* do not understand that they can't truly complicate proprietary plug-ins simply through an unstable plug-in API. Once a shop has developed enough plug-ins and value, it will reach an inflection point where it makes more sense to create a module matching tool [3] that defines interface relations [4]. Compiler developers should understand this: a plug-in API primarily separates functionality (what the plug-in does) from integration (how the plug-in is connected to the core). At best, an unstable plug-in API can be a performance-overhead inconvenience, but with tools like distcc you can just spread the work across many cores.

It goes without saying that Richard Stallman is incredibly closed-minded and visionary. [5] has a picturesque definition [6]. See also my comments in [7] and Thomas Lord's follow-up in the same thread on that page.
[1] http://www.sdtimes.com/link/33218
[2] http://gcc.gnu.org/wiki/GCC_Plugins
[3] http://www.cl.cam.ac.uk/~srk31/research/papers/kell09mythical.pdf
[4] http://www.cl.cam.ac.uk/~srk31/research/papers/kell10component.pdf
[5] http://www.wired.com/magazine/2010/04/ff_hackers/all/1
[6] Time has not softened him. In our original interview, Stallman said, “I’m the last survivor of a dead culture. And I don’t really belong in the world anymore. And in some ways I feel I ought to be dead.” Now, meeting over Chinese food, he reaffirms this. “I have certainly wished I had killed myself when I was born,” he says. “In terms of effect on the world, it’s very good that I’ve lived. And so I guess, if I could go back in time and prevent my birth, I wouldn’t do it. But I sure wish I hadn’t had so much pain.”
[7] http://lambda-the-ultimate.org/node/3696#comment-52599

On Thu, Jun 24, 2010 at 1:23 AM, Casey Ransberger casey.obrie...@gmail.com wrote:

Whoa, okay. Have to ask. GCC has an intermediate representation that's intentionally hard to work with, and you're saying that Stallman did this as a political blockade? I was under the impression that Clang got started up because some folks found GCC to be too crufty, not too political. This doesn't seem to make sense to me. Maybe I'm missing some context? Can you cite your sources or elaborate on your point?

On Jun 23, 2010, at 2:25 PM, John Zabroski johnzabro...@gmail.com wrote:

This is not entirely true. Basile Starynkévitch has written GNU MELT [1] as a way to circumvent the hard-to-work-with internal representation of GCC by letting you create GCC plug-ins in a Lisp dialect. This basically side-steps the traditional political blockade set up by RMS. It is very clever, and starting to mature; Basile has fixed a number of issues in how he generates C code.

I'm not sure about Gerry Jensen's idea, e.g. how much effort, whether it is worth the effort, etc. Creating a VM for legacy code is also difficult, since it will run rather slowly unless you are a really good implementor (for example, the Hercules VM [2] for simulating IBM mainframes was rather slow, last I checked, due to how it has to intercept and translate instructions into the native architecture).

[1] http://gcc.gnu.org/wiki/MiddleEndLispTranslator
[2] http://www.hercules-390.org/

On Tue, Jun 22, 2010 at 11:37 AM, Monty Zukowski mo...@codetransform.com wrote:

GNU C was explicitly designed to make its intermediate representation hard to work with. LLVM is a more practical choice.

Monty

On Mon, Jun 21, 2010 at 6:02 PM, Gerry J geral...@tpg.com.au wrote:

You may find the concept of semantic slicing relevant: http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf

There is software at: http://www.cse.dmu.ac.uk/%7Emward/fermat.html
Re: [fonc] Reverse OMeta and Emulation
This is not entirely true. Basile Starynkévitch has written GNU MELT [1] as a way to circumvent the hard-to-work-with internal representation of GCC by letting you create GCC plug-ins in a Lisp dialect. This basically side-steps the traditional political blockade set up by RMS. It is very clever, and starting to mature; Basile has fixed a number of issues in how he generates C code.

I'm not sure about Gerry Jensen's idea, e.g. how much effort, whether it is worth the effort, etc. Creating a VM for legacy code is also difficult, since it will run rather slowly unless you are a really good implementor (for example, the Hercules VM [2] for simulating IBM mainframes was rather slow, last I checked, due to how it has to intercept and translate instructions into the native architecture).

[1] http://gcc.gnu.org/wiki/MiddleEndLispTranslator
[2] http://www.hercules-390.org/

On Tue, Jun 22, 2010 at 11:37 AM, Monty Zukowski mo...@codetransform.com wrote:

GNU C was explicitly designed to make its intermediate representation hard to work with. LLVM is a more practical choice.

Monty

On Mon, Jun 21, 2010 at 6:02 PM, Gerry J geral...@tpg.com.au wrote:

You may find the concept of semantic slicing relevant: http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf

There is software at: http://www.cse.dmu.ac.uk/~mward/fermat.html

One possible path to explore is to take the GNU C (etc.) intermediate representation of the source as the assembly language of a VM and reverse from that to a more portable VM, as in Squeak or Java. Perhaps OMeta could be combined in some way with FermaT to recognise patterns and port legacy code to a fonc VM?

Regards,
Gerry Jensen
02 9713 6004

___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc
Blockade? (Re: [fonc] Reverse OMeta and Emulation)
Whoa, okay. Have to ask. GCC has an intermediate representation that's intentionally hard to work with, and you're saying that Stallman did this as a political blockade? I was under the impression that Clang got started up because some folks found GCC to be too crufty, not too political. This doesn't seem to make sense to me. Maybe I'm missing some context? Can you cite your sources or elaborate on your point?

On Jun 23, 2010, at 2:25 PM, John Zabroski johnzabro...@gmail.com wrote:

This is not entirely true. Basile Starynkévitch has written GNU MELT [1] as a way to circumvent the hard-to-work-with internal representation of GCC by letting you create GCC plug-ins in a Lisp dialect. This basically side-steps the traditional political blockade set up by RMS. It is very clever, and starting to mature; Basile has fixed a number of issues in how he generates C code.

I'm not sure about Gerry Jensen's idea, e.g. how much effort, whether it is worth the effort, etc. Creating a VM for legacy code is also difficult, since it will run rather slowly unless you are a really good implementor (for example, the Hercules VM [2] for simulating IBM mainframes was rather slow, last I checked, due to how it has to intercept and translate instructions into the native architecture).

[1] http://gcc.gnu.org/wiki/MiddleEndLispTranslator
[2] http://www.hercules-390.org/

On Tue, Jun 22, 2010 at 11:37 AM, Monty Zukowski mo...@codetransform.com wrote:

GNU C was explicitly designed to make its intermediate representation hard to work with. LLVM is a more practical choice.
Monty

On Mon, Jun 21, 2010 at 6:02 PM, Gerry J geral...@tpg.com.au wrote:

You may find the concept of semantic slicing relevant: http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf

There is software at: http://www.cse.dmu.ac.uk/~mward/fermat.html

One possible path to explore is to take the GNU C (etc.) intermediate representation of the source as the assembly language of a VM and reverse from that to a more portable VM, as in Squeak or Java. Perhaps OMeta could be combined in some way with FermaT to recognise patterns and port legacy code to a fonc VM?

Regards,
Gerry Jensen
02 9713 6004
Re: [fonc] Reverse OMeta and Emulation
GNU C was explicitly designed to make its intermediate representation hard to work with. LLVM is a more practical choice.

Monty

On Mon, Jun 21, 2010 at 6:02 PM, Gerry J geral...@tpg.com.au wrote:

You may find the concept of semantic slicing relevant: http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf

There is software at: http://www.cse.dmu.ac.uk/~mward/fermat.html

One possible path to explore is to take the GNU C (etc.) intermediate representation of the source as the assembly language of a VM and reverse from that to a more portable VM, as in Squeak or Java. Perhaps OMeta could be combined in some way with FermaT to recognise patterns and port legacy code to a fonc VM?

Regards,
Gerry Jensen
02 9713 6004
Re: [fonc] Reverse OMeta and Emulation
my effort had not gone nearly so high up the abstraction tree, but instead operated in a space more like an abstracted x86 machine.

moving to a much higher level model, such as that of GCC IR or LLVM IR, would likely be difficult to pull off effectively starting from real machine code, such as x86.

what I had done was essentially just partly inverting several of the low-level stages in the process:
assembly: since my assembler was mostly data-driven, disassembly is not difficult using essentially the same data;
partly abstracting over matters of word-size and opcode argument forms;
...

then, it was this partially-abstracted form which was interpreted. this was at a similar level of abstraction to that in my lower-level codegen (namely dealing with registers and values as handles), rather than making it all the way back up to a target-neutral IR.

converting to a higher-level IR would likely require something analogous to a compiler+optimizer, namely to translate these decoded instructions into generic IR sequences, and then try to optimize away all the cruft which doesn't matter (such as all the eflags magic for sequences which don't actually care about eflags).
...

the eflags issue is mostly because, for example, in x86 nearly every conventional opcode modifies eflags, but in the majority of cases, these changed flags are irrelevant (however, a forward scan and bit-masking could likely allow for detecting cases where the modified flags are known to be irrelevant).

also, x86 includes a small number of very complex opcodes, such as cpuid, which could be awkward if trying to produce an entirely generic IR (since cpuid changes its behavior and results depending on the values contained in certain registers), ...

this level of translation though is likely to either rule out or hinder the use of self-modifying code, since SMC would essentially invalidate previously translated sequences.
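To make the invalidation problem concrete, here is a minimal sketch of a decode cache that is thrown away wholesale whenever a write lands on bytes that have already been decoded. It is an illustration under stated assumptions, not the interpreter discussed in this thread: `DecodedOp`, `DecodeCache`, `toy_decode` and the three-entry opcode table are hypothetical names invented for the example, and the decoder only understands a few single-byte opcodes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class DecodedOp:
    """Hypothetical decoded-instruction record (illustrative only)."""
    addr: int
    length: int
    mnemonic: str


# Toy one-byte decoder; real x86 decoding is far richer than this table.
TOY_OPCODES = {0x90: "nop", 0xC3: "ret", 0xF4: "hlt"}


def toy_decode(memory: bytearray, addr: int) -> DecodedOp:
    return DecodedOp(addr, 1, TOY_OPCODES.get(memory[addr], "db"))


class DecodeCache:
    """Cache of decoded ops keyed by guest address, flushed entirely on SMC."""

    def __init__(self, memory: bytearray,
                 decode: Callable[[bytearray, int], DecodedOp]):
        self.memory = memory
        self.decode = decode
        self.ops: Dict[int, DecodedOp] = {}
        self.decoded_bytes: Set[int] = set()  # addresses covered by cached ops
        self.flushes = 0

    def fetch(self, addr: int) -> DecodedOp:
        """Return the cached op at addr, decoding it on first use."""
        op = self.ops.get(addr)
        if op is None:
            op = self.decode(self.memory, addr)
            self.ops[addr] = op
            self.decoded_bytes.update(range(addr, addr + op.length))
        return op

    def write(self, addr: int, value: int) -> None:
        """Store a byte; if it hits decoded code, drop every cached op."""
        self.memory[addr] = value & 0xFF
        if addr in self.decoded_bytes:
            self.ops.clear()
            self.decoded_bytes.clear()
            self.flushes += 1
```

Writes to bytes that were never decoded leave the cache alone, so ordinary data stores stay cheap; only a store into already-decoded code pays for the full flush. An incremental scheme would instead drop just the ops overlapping the written address.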
(in my case I had dealt with SMC simply by flushing the entire opcode cache, which in this case was essentially just a big hash-table holding opcode structures).

ideally, with a more complex decode process, the process of flushing on SMC could be done cheaply and incrementally, rather than, say, essentially having to recompile an app in memory continuously simply as it happens to be self-modifying.

luckily, most executable code is marked as read-only, and SMC cases are fairly rare, and so attempts at SMC are more often grounds for a simulated GPF, rather than grounds for flushing the decode cache.

static translation, however, is likely to exclude the possibility of SMC.

or such...

- Original Message -
From: Monty Zukowski mo...@codetransform.com
To: Fundamentals of New Computing fonc@vpri.org
Sent: Tuesday, June 22, 2010 8:37 AM
Subject: Re: [fonc] Reverse OMeta and Emulation

GNU C was explicitly designed to make its intermediate representation hard to work with. LLVM is a more practical choice.

Monty

On Mon, Jun 21, 2010 at 6:02 PM, Gerry J geral...@tpg.com.au wrote:

You may find the concept of semantic slicing relevant: http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf

There is software at: http://www.cse.dmu.ac.uk/~mward/fermat.html

One possible path to explore is to take the GNU C (etc.) intermediate representation of the source as the assembly language of a VM and reverse from that to a more portable VM, as in Squeak or Java. Perhaps OMeta could be combined in some way with FermaT to recognise patterns and port legacy code to a fonc VM?

Regards,
Gerry Jensen
02 9713 6004
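On the eflags point in the reply above: the "forward scan and bit-masking" idea can be sketched roughly as below. This is only an illustration under assumptions. The `Insn` record and `live_flags` function are hypothetical names, the scan covers a single straight-line block, and anything still pending at the end of the block is conservatively treated as live. The bit positions and the per-instruction flag effects (ADD writes all six status flags, INC leaves CF alone, ADC reads CF, JZ reads ZF, MOV touches none) follow the x86 architecture.

```python
from dataclasses import dataclass

# Status-flag bit positions in the x86 EFLAGS register.
CF, PF, AF, ZF, SF, OF = 1 << 0, 1 << 2, 1 << 4, 1 << 6, 1 << 7, 1 << 11
ARITH = CF | PF | AF | ZF | SF | OF


@dataclass
class Insn:
    """Hypothetical decoded instruction: name plus flag read/write masks."""
    name: str
    reads: int = 0
    writes: int = 0


ADD = Insn("add", writes=ARITH)
INC = Insn("inc", writes=ARITH & ~CF)   # inc does not modify CF
ADC = Insn("adc", reads=CF, writes=ARITH)
MOV = Insn("mov")                       # mov does not touch the flags
JZ = Insn("jz", reads=ZF)


def live_flags(block, i):
    """Mask of flags written by block[i] that a later instruction may read.

    Scans forward through a straight-line block. A flag stops being pending
    once a later instruction overwrites it; whatever is still pending at the
    end of the block is assumed live, since the successor is unknown here.
    """
    pending = block[i].writes
    live = 0
    for later in block[i + 1:]:
        live |= pending & later.reads   # observed before being overwritten
        pending &= ~later.writes        # overwritten: no longer from block[i]
        if pending == 0:
            return live
    return live | pending
```

For the block `[ADD, MOV, ADD, JZ]` the first `add` gets a mask of zero, so a translator could skip computing its flags entirely; for `[ADD, INC, ADC]` only `CF` from the first `add` survives, because `inc` rewrites everything else before `adc` reads the carry.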
Re: [fonc] Reverse OMeta and Emulation
You may find the concept of semantic slicing relevant: http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf

There is software at: http://www.cse.dmu.ac.uk/~mward/fermat.html

One possible path to explore is to take the GNU C (etc.) intermediate representation of the source as the assembly language of a VM and reverse from that to a more portable VM, as in Squeak or Java. Perhaps OMeta could be combined in some way with FermaT to recognise patterns and port legacy code to a fonc VM?

Regards,
Gerry Jensen
02 9713 6004
Re: [fonc] Reverse OMeta and Emulation
can't say exactly...

but, before, I wrote an interpreter for x86 (basically, like an emulator, but it doesn't bother with faking the entire system).

but, in this case, I had basically used some logic from an assembler of mine to essentially disassemble the machine code into a higher-level virtual ISA (basically, it works like an assembler and compiler lower-end, but in reverse).

this allowed a slightly simpler model on which to base the interpreter logic, rather than having to deal with all the issues of exact opcode encodings and various instruction forms at the level of the interpreter. in a few cases, certain groups of opcodes are handled as single larger units as well, ...

(I guess it is more common to use big nested-switch statements to write such an interpreter, but I figured that this would be a painful experience, and so I didn't do so).

actually, in the process I also noted that using linked-lists and function-pointers for opcode dispatch was cheaper than using a switch statement, and so this is how this interpreter is structured.

likely the harder part in writing a full system emulator would be simulating all of the various pieces of hardware, ... which may be attached (as well as re-implementing probably certain pieces of firmware/... which may be present in the original machine), for which documentation may be sparse or absent, ...

or such...

- Original Message -
From: Casey Ransberger casey.obrie...@gmail.com
To: Fundamentals of New Computing fonc@vpri.org
Sent: Sunday, June 20, 2010 11:45 AM
Subject: [fonc] Reverse OMeta and Emulation

First, I'd like to apologize for my last message to this list. I had intended to send it directly as a reply to Jecel's comments, but replied to the whole list by accident. I wouldn't ordinarily waste bandwidth on idle griping like that: mea culpa.

On to the topic of the subject line... I've been thinking about learning computing history by doing. Emulation is expensive (in human terms.)
Emulators are very complex, precision beasts. So many emulators are started and not finished, in part because commercial applications can be limited (leaving developers money-starved for their time,) and in part because the scope of the work is fairly large.

An idea occurred to me: I was thinking about an example I'd seen from the FONC literature about using OMeta to transform representations of various heights ultimately into machine code, and I wondered: would it be possible to use the limited reversibility aspect of OMeta to reduce the complexity of an emulator by (to some extent?) translating machine code for one architecture to another?

As I think it's fair to say that an emulator is a sort of virtual machine, is there something to be learned about VM complexity or portability here too?

Just a thought! :)
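The dispatch scheme described in the reply above (decoded opcodes held in a linked list, each carrying a pointer to its handler, instead of one big switch) can be sketched as follows. This is a rough illustration under assumptions, not the interpreter from the thread: `Op`, `Machine`, `link`, `run` and the three handlers are hypothetical names, and the two-register machine stands in for a real x86 state. Being Python, it shows only the structure; it says nothing about the speed difference reported for the original implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Op:
    """One decoded instruction: a handler plus operands and a next link."""
    handler: Callable[["Machine", "Op"], Optional["Op"]]
    dst: str = ""
    src: str = ""
    imm: int = 0
    next: Optional["Op"] = None


@dataclass
class Machine:
    regs: Dict[str, int] = field(
        default_factory=lambda: {"eax": 0, "ecx": 0})


def do_mov_imm(m: Machine, op: Op) -> Optional[Op]:
    m.regs[op.dst] = op.imm & 0xFFFFFFFF
    return op.next


def do_add(m: Machine, op: Op) -> Optional[Op]:
    m.regs[op.dst] = (m.regs[op.dst] + m.regs[op.src]) & 0xFFFFFFFF
    return op.next


def do_halt(m: Machine, op: Op) -> Optional[Op]:
    return None  # returning no successor stops the run loop


def link(ops: List[Op]) -> Op:
    """Chain the ops into a singly linked list and return its head."""
    for cur, nxt in zip(ops, ops[1:]):
        cur.next = nxt
    return ops[0]


def run(m: Machine, op: Optional[Op]) -> None:
    """Dispatch loop: each handler does its work and returns the next op."""
    while op is not None:
        op = op.handler(m, op)
```

A short usage example: linking `mov eax, 40`, `mov ecx, 2`, `add eax, ecx`, `halt` and running the chain leaves 42 in `eax`. Because each handler returns its successor, a branch handler could simply return a different `Op` rather than `op.next`.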