Re: Blockade? (Re: [fonc] Reverse OMeta and Emulation)

2010-06-24 Thread John Zabroski
Until early 2009 [1], FSF forbade anyone from creating a plug-in
architecture for GCC.  I mean, you could do it, but it would not be
integrated upstream, since FSF policy is to have copyright assigned to them
before they integrate your changes into the project.  If you were to create
a plug-in architecture for GCC and sign the copyright over to them, then
they could just sit on the code until GCC evolved to the point where your
plug-in architecture's interfaces no longer matched GCC's.  See the GCC
Plug-ins Wiki page for the following ace quote: *So, how do we permit
plugins while prohibiting proprietary plugins, and how do we do it while
staying within the bounds of copyright law which is the basis of the
GPL?*[2]  In a nutshell, FSF dislikes DRM, so their solution is to create
their own DRM system.

This political position is increasingly irrelevant.  Computer
tools are so clever now that GPL v2 is simply not strong enough to mean
anything in the sense FSF originally intended.  It is extremely sad that
*compiler developers* do not understand that they can't truly hinder
proprietary plug-ins simply by keeping the plug-in API unstable.  Once a shop
has developed enough plug-ins and value, it will reach an inflection point
where it makes more sense to create a module matching tool [3]
that defines interface relations [4].  Compiler developers should understand
this: a plug-in API primarily separates functionality (what the plug-in
does) from integration (how the plug-in is connected to the core).  At best,
an unstable plug-in API adds the inconvenience of rebuild overhead, but
with tools like distcc you can just distribute compilation across many
machines.
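
To make the functionality/integration split concrete, here is a minimal
sketch of the kind of shim a matching tool might generate.  It assumes a
hypothetical host that renamed a hook from on_function to visit_function.
The class names, hook names, and relation table are all invented for
illustration; this is not GCC's plug-in API, nor the notation used in [3]
or [4].

```python
class OldPlugin:
    """Plug-in written against the (hypothetical) old host interface."""

    def on_function(self, name):
        return "saw " + name


class Adapter:
    """Presents an old plug-in under the names the new host expects.

    `relation` maps new hook names to old ones; names that are not listed
    are passed through unchanged.
    """

    def __init__(self, plugin, relation):
        self._plugin = plugin
        self._relation = relation

    def __getattr__(self, name):
        # Only reached for attributes the adapter itself does not define,
        # so every hook lookup is redirected to the wrapped plug-in.
        old_name = self._relation.get(name, name)
        return getattr(self._plugin, old_name)


# The "interface relation": new hook name -> old hook name (assumed names).
RELATION = {"visit_function": "on_function"}
adapted = Adapter(OldPlugin(), RELATION)
```

The plug-in's functionality is untouched; only the integration layer
(the relation table) has to track the host's API changes.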

It goes without saying that Richard Stallman is incredibly closed-minded and
visionary. [5] has a picturesque description [6].

See also my comments in [7] and Thomas Lord's follow-up on the same page in
the thread.

[1] http://www.sdtimes.com/link/33218
[2] http://gcc.gnu.org/wiki/GCC_Plugins
[3] http://www.cl.cam.ac.uk/~srk31/research/papers/kell09mythical.pdf
[4] http://www.cl.cam.ac.uk/~srk31/research/papers/kell10component.pdf
[5] http://www.wired.com/magazine/2010/04/ff_hackers/all/1
[6] Time has not softened him. In our original interview, Stallman said,
“I’m the last survivor of a dead culture. And I don’t really belong in the
world anymore. And in some ways I feel I ought to be dead.” Now, meeting
over Chinese food, he reaffirms this. “I have certainly wished I had killed
myself when I was born,” he says. “In terms of effect on the world, it’s
very good that I’ve lived. And so I guess, if I could go back in time and
prevent my birth, I wouldn’t do it. But I sure wish I hadn’t had so much
pain.”
[7] http://lambda-the-ultimate.org/node/3696#comment-52599

On Thu, Jun 24, 2010 at 1:23 AM, Casey Ransberger
casey.obrie...@gmail.com wrote:

 Whoa, okay. Have to ask. GCC has an intermediate representation that's
 intentionally hard to work with, and you're saying that Stallman did this as
 a political blockade?

 I was under the impression that Clang got started up because some folks
 found GCC to be too crufty, not too political. This doesn't seem to make
 sense to me. Maybe I'm missing some context? Can you cite your sources or
 elaborate on your point?

 On Jun 23, 2010, at 2:25 PM, John Zabroski johnzabro...@gmail.com wrote:

 This is not entirely true.  Basile Starynkévitch has written GNU MELT [1]
 as a way to circumvent the hard-to-work-with internal representation of GCC
 by letting you create GCC plug-ins in a Lisp dialect.  This basically
 side-steps the traditional political blockade set up by RMS.  It is very
 clever, and starting to mature; Basile has fixed a number of issues in how
 he generates C code.

 I'm not sure about Gerry Jensen's idea, e.g. how much effort, whether it is
 worth the effort, etc.

 Creating a VM for legacy code is also difficult, since it will run rather
 slowly unless you are a really good implementor (for example, the Hercules VM
 [2] for simulating IBM mainframes was rather slow last I checked, due to how
 it has to intercept and translate instructions into the native
 architecture).

 [1] http://gcc.gnu.org/wiki/MiddleEndLispTranslator
 [2] http://www.hercules-390.org/

 On Tue, Jun 22, 2010 at 11:37 AM, Monty Zukowski mo...@codetransform.com
 wrote:

 GNU C was explicitly designed to make its intermediate representation
 hard to work with.  LLVM is a more practical choice.

 Monty

 On Mon, Jun 21, 2010 at 6:02 PM, Gerry J geral...@tpg.com.au wrote:
  You may find the concept of semantic slicing relevant:
  http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf
  There is software at:
  http://www.cse.dmu.ac.uk/~mward/fermat.html
 

Re: [fonc] Reverse OMeta and Emulation

2010-06-23 Thread John Zabroski
This is not entirely true.  Basile Starynkévitch has written GNU MELT [1] as
a way to circumvent the hard-to-work-with internal representation of GCC by
letting you create GCC plug-ins in a Lisp dialect.  This basically
side-steps the traditional political blockade set up by RMS.  It is very
clever, and starting to mature; Basile has fixed a number of issues in how
he generates C code.

I'm not sure about Gerry Jensen's idea, e.g. how much effort, whether it is
worth the effort, etc.

Creating a VM for legacy code is also difficult, since it will run rather
slowly unless you are a really good implementor (for example, the Hercules VM
[2] for simulating IBM mainframes was rather slow last I checked, due to how
it has to intercept and translate instructions into the native
architecture).

[1] http://gcc.gnu.org/wiki/MiddleEndLispTranslator
[2] http://www.hercules-390.org/

On Tue, Jun 22, 2010 at 11:37 AM, Monty Zukowski mo...@codetransform.com wrote:

 GNU C was explicitly designed to make its intermediate representation
 hard to work with.  LLVM is a more practical choice.

 Monty

 On Mon, Jun 21, 2010 at 6:02 PM, Gerry J geral...@tpg.com.au wrote:
  You may find the concept of semantic slicing relevant:
  http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf
  There is software at:
  http://www.cse.dmu.ac.uk/~mward/fermat.html
 
  One possible path to explore is to take GNU C etc intermediate
  representation of source as the assembly language of a VM and reverse from
  that to a more portable VM, as in Squeak or Java.
  Perhaps Ometa could be combined in some way with FermaT to recognise
  patterns and port legacy code to a fonc VM ?
 
  Regards,
  Gerry Jensen
  02 9713 6004
 
 
 
 
 
 
 
 
 
 
  ___
  fonc mailing list
  fonc@vpri.org
  http://vpri.org/mailman/listinfo/fonc
 



Blockade? (Re: [fonc] Reverse OMeta and Emulation)

2010-06-23 Thread Casey Ransberger
Whoa, okay. Have to ask. GCC has an intermediate representation that's 
intentionally hard to work with, and you're saying that Stallman did this as a 
political blockade?

I was under the impression that Clang got started up because some folks found 
GCC to be too crufty, not too political. This doesn't seem to make sense to me. 
Maybe I'm missing some context? Can you cite your sources or elaborate on your 
point?

On Jun 23, 2010, at 2:25 PM, John Zabroski johnzabro...@gmail.com wrote:

 This is not entirely true.  Basile Starynkévitch has written GNU MELT [1] as
 a way to circumvent the hard-to-work-with internal representation of GCC by
 letting you create GCC plug-ins in a Lisp dialect.  This basically side-steps
 the traditional political blockade set up by RMS.  It is very clever, and
 starting to mature; Basile has fixed a number of issues in how he generates C
 code.
 
 I'm not sure about Gerry Jensen's idea, e.g. how much effort, whether it is 
 worth the effort, etc.
 
 Creating a VM for legacy code is also difficult, since it will run rather
 slowly unless you are a really good implementor (for example, the Hercules VM
 [2] for simulating IBM mainframes was rather slow last I checked, due to how
 it has to intercept and translate instructions into the native architecture).
 
 [1] http://gcc.gnu.org/wiki/MiddleEndLispTranslator
 [2] http://www.hercules-390.org/
 
 On Tue, Jun 22, 2010 at 11:37 AM, Monty Zukowski mo...@codetransform.com 
 wrote:
 GNU C was explicitly designed to make its intermediate representation
 hard to work with.  LLVM is a more practical choice.
 
 Monty
 
 On Mon, Jun 21, 2010 at 6:02 PM, Gerry J geral...@tpg.com.au wrote:
  You may find the concept of semantic slicing relevant:
  http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf
  There is software at:
  http://www.cse.dmu.ac.uk/~mward/fermat.html
 
  One possible path to explore is to take GNU C etc intermediate
  representation of source as the assembly language of a VM and reverse from
  that to a more portable VM, as in Squeak or Java.
  Perhaps Ometa could be combined in some way with FermaT to recognise
  patterns and port legacy code to a fonc VM ?
 
  Regards,
  Gerry Jensen
  02 9713 6004
 
 
 
 
 
 
 
 
 
 


Re: [fonc] Reverse OMeta and Emulation

2010-06-22 Thread Monty Zukowski
GNU C was explicitly designed to make its intermediate representation
hard to work with.  LLVM is a more practical choice.

Monty

On Mon, Jun 21, 2010 at 6:02 PM, Gerry J geral...@tpg.com.au wrote:
 You may find the concept of semantic slicing relevant:
 http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf
 There is software at:
 http://www.cse.dmu.ac.uk/~mward/fermat.html

 One possible path to explore is to take GNU C etc intermediate
 representation of source as the assembly language of a VM and reverse from
 that to a more portable VM, as in Squeak or Java.
 Perhaps Ometa could be combined in some way with FermaT to recognise
 patterns and port legacy code to a fonc VM ?

 Regards,
 Gerry Jensen
 02 9713 6004












Re: [fonc] Reverse OMeta and Emulation

2010-06-22 Thread BGB
my effort had not gone nearly so high up the abstraction tree, but instead 
operated in a space more like an abstracted x86 machine.


moving to a much higher level model, such as that of GCC IR or LLVM IR, 
would likely be difficult to pull off effectively starting from real 
machine code, such as x86.


what I had done was essentially just partly inverting several of the 
low-level stages in the process:
assembly: since my assembler was mostly data-driven, disassembly is not 
difficult using essentially the same data;

partly abstracting over matters of word-size and opcode argument forms;
...

then, it was this partially-abstracted form which was interpreted.

this was at a similar level of abstraction to that in my lower-level codegen 
(namely dealing with registers and values as handles), rather than making it 
all the way back up to a target-neutral IR.
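
as a rough sketch of the "same data, both directions" idea (not BGB's actual 
code): one table drives both a tiny assembler and the matching disassembler. 
the five encodings used are real 32-bit x86 single-byte forms, but the table 
layout, the form tags and the function names are assumptions made up for 
this illustration.

```python
import struct

REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# (mnemonic, base opcode, operand form)
#   ""   : no operands
#   "r"  : register number added to the opcode byte
#   "ri" : register number added to the opcode byte, then a 32-bit immediate
TABLE = [
    ("nop",  0x90, ""),
    ("ret",  0xC3, ""),
    ("push", 0x50, "r"),
    ("pop",  0x58, "r"),
    ("mov",  0xB8, "ri"),
]


def assemble(mnemonic, *args):
    """Encode one instruction by looking the mnemonic up in TABLE."""
    for name, base, form in TABLE:
        if name != mnemonic:
            continue
        if form == "":
            return bytes([base])
        reg = REGS.index(args[0])
        if form == "r":
            return bytes([base + reg])
        return bytes([base + reg]) + struct.pack("<I", args[1])
    raise ValueError("unknown mnemonic: " + mnemonic)


def disassemble(code):
    """Decode bytes back into (mnemonic, operands...) tuples using TABLE."""
    out = []
    pos = 0
    while pos < len(code):
        byte = code[pos]
        for name, base, form in TABLE:
            span = 1 if form == "" else 8  # register forms cover 8 opcodes
            if base <= byte < base + span:
                break
        else:
            raise ValueError("unknown opcode: 0x%02X" % byte)
        pos += 1
        if form == "":
            out.append((name,))
        elif form == "r":
            out.append((name, REGS[byte - base]))
        else:
            (imm,) = struct.unpack_from("<I", code, pos)
            pos += 4
            out.append((name, REGS[byte - base], imm))
    return out
```

the decoded tuples are the sort of partially-abstracted form an interpreter 
could then walk, without caring about the exact byte encodings.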



converting to a higher-level IR would likely require something analogous to 
a compiler+optimizer, namely to translate these decoded instructions into 
generic IR sequences, and then try to optimize away all the cruft which 
doesn't matter (such as all the eflags magic for sequences which don't 
actually care about eflags).

...

the eflags issue is mostly because, for example, in x86 nearly every 
conventional opcode modifies eflags, but in the majority of cases, these 
changed flags are irrelevant (however, a forward scan and bit-masking could 
likely allow for detecting cases where the modified flags are known to be 
irrelevant).
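
a minimal sketch of such a forward scan, assuming each decoded opcode 
carries a mask of the flags it reads and a mask of the flags it writes (the 
tuple layout and the function name are invented here; the bit values are 
the real EFLAGS positions). anything still undecided at the end of the 
block is conservatively treated as needed.

```python
# Real EFLAGS bit positions for the six arithmetic status flags.
CF, PF, AF, ZF, SF, OF = 0x001, 0x004, 0x010, 0x040, 0x080, 0x800
ARITH = CF | PF | AF | ZF | SF | OF


def live_flags(block, index):
    """Return the flags written by block[index] that may still be read.

    Each op is a (name, reads_mask, writes_mask) tuple. Flags that are
    overwritten before any later op reads them are dead and need not be
    computed at all.
    """
    _, _, pending = block[index]
    live = 0
    for _, reads, writes in block[index + 1:]:
        live |= reads & pending   # read before being overwritten: needed
        pending &= ~writes        # overwritten from here on: stop tracking
        if not pending:
            return live
    # Fell off the end of the block: the successor is unknown, so assume
    # it may read whatever is still pending.
    return live | pending


block = [
    ("add", 0, ARITH),   # flags never read before cmp overwrites them
    ("mov", 0, 0),
    ("cmp", 0, ARITH),
    ("jz", ZF, 0),
]
carry_chain = [
    ("add", 0, ARITH),
    ("adc", CF, ARITH),  # reads only the carry, then rewrites everything
]
```

here `live_flags(block, 0)` comes out as 0, so the add needs no flag 
computation at all, while in `carry_chain` only CF has to be produced.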


also, x86 includes a small number of very complex opcodes, such as 
cpuid, which could be awkward if trying to produce an entirely generic IR 
(since cpuid changes its behavior and results depending on the values 
contained in certain registers), ...



this level of translation though is likely to either rule out or hinder the 
use of self-modifying code, since SMC would essentially invalidate 
previously translated sequences.


(in my case I had dealt with SMC simply by flushing the entire opcode cache, 
which in this case was essentially just a big hash-table holding opcode 
structures).


ideally, with a more complex decode process, the process of flushing on 
SMC could be done cheaply and incrementally, rather than, say, essentially 
having to recompile an app in memory continuously simply because it happens 
to be self-modifying.
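
one way the incremental variant might look, as a sketch only: keep a 
per-page index next to the hash table, so a write drops just the decoded 
opcodes overlapping the written page(s). the 4 KiB page size, the class and 
its method names are assumptions for illustration, not how my interpreter 
is actually structured.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages


def pages(addr, length):
    """Page numbers touched by the byte range [addr, addr + length)."""
    return range(addr >> PAGE_SHIFT, ((addr + length - 1) >> PAGE_SHIFT) + 1)


class DecodeCache:
    def __init__(self):
        self.ops = {}      # address -> (decoded op, encoded length)
        self.by_page = {}  # page number -> set of cached addresses

    def insert(self, addr, length, op):
        self.ops[addr] = (op, length)
        for page in pages(addr, length):
            self.by_page.setdefault(page, set()).add(addr)

    def lookup(self, addr):
        entry = self.ops.get(addr)
        return entry[0] if entry is not None else None

    def flush_all(self):
        """The simple strategy: throw everything away on any SMC."""
        self.ops.clear()
        self.by_page.clear()

    def invalidate_write(self, addr, size=1):
        """The incremental strategy: drop only ops on the written pages."""
        for page in pages(addr, size):
            for cached in self.by_page.pop(page, set()):
                entry = self.ops.pop(cached, None)
                if entry is None:
                    continue
                # An op may straddle a page boundary; unregister it from
                # the other pages it was indexed under as well.
                for other in pages(cached, entry[1]):
                    if other in self.by_page:
                        self.by_page[other].discard(cached)


cache = DecodeCache()
cache.insert(0x1000, 2, "a")
cache.insert(0x1FFF, 3, "b")  # straddles pages 1 and 2
cache.insert(0x3000, 1, "c")
cache.invalidate_write(0x2004)
```

after the write to 0x2004 only "b" is gone; "a" and "c" stay cached. this 
is page-granular, so it still over-invalidates within a page.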


luckily, most executable code is marked as read-only, and SMC cases are 
fairly rare, and so attempts at SMC are more often grounds for a simulated 
GPF, rather than grounds for flushing the decode cache.


static translation, however, is likely to exclude the possibility of SMC.


or such...


- Original Message - 
From: Monty Zukowski mo...@codetransform.com

To: Fundamentals of New Computing fonc@vpri.org
Sent: Tuesday, June 22, 2010 8:37 AM
Subject: Re: [fonc] Reverse OMeta and Emulation



GNU C was explicitly designed to make its intermediate representation
hard to work with.  LLVM is a more practical choice.

Monty

On Mon, Jun 21, 2010 at 6:02 PM, Gerry J geral...@tpg.com.au wrote:

You may find the concept of semantic slicing relevant:
http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf
There is software at:
http://www.cse.dmu.ac.uk/~mward/fermat.html

One possible path to explore is to take GNU C etc intermediate
representation of source as the assembly language of a VM and reverse from
that to a more portable VM, as in Squeak or Java.
Perhaps Ometa could be combined in some way with FermaT to recognise
patterns and port legacy code to a fonc VM ?

Regards,
Gerry Jensen
02 9713 6004






Re: [fonc] Reverse OMeta and Emulation

2010-06-21 Thread Gerry J

You may find the concept of semantic slicing relevant:
http://www.cse.dmu.ac.uk/~mward/martin/papers/csmr2005-t.pdf
There is software at:
http://www.cse.dmu.ac.uk/~mward/fermat.html

One possible path to explore is to take the intermediate representation 
of source used by GNU C etc. as the assembly language of a VM and reverse 
from that to a more portable VM, as in Squeak or Java.
Perhaps OMeta could be combined in some way with FermaT to recognise 
patterns and port legacy code to a fonc VM?


Regards,
Gerry Jensen
02 9713 6004














Re: [fonc] Reverse OMeta and Emulation

2010-06-20 Thread BGB

can't say exactly...


but, I previously wrote an interpreter for x86 (basically, like an emulator, 
but it doesn't bother with faking the entire system).


but, in this case, I had basically used some logic from an assembler of mine 
to essentially disassemble the machine code into a higher-level virtual 
ISA (basically, it works like an assembler and compiler lower-end, but in 
reverse). this allowed a slightly simpler model on which to base the 
interpreter logic, rather than having to deal with all the issues of exact 
opcode encodings and various instruction forms at the level of the 
interpreter.


in a few cases, certain groups of opcodes are handled as single larger units 
as well, ...


(I guess it is more common to use big nested-switch statements to write such 
an interpreter, but I figured that this would be a painful experience, and 
so I didn't do so).


actually, in the process I also noted that using linked-lists and 
function-pointers for opcode dispatch was cheaper than using a switch 
statement, and so this is how this interpreter is structured.
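
a sketch of that structure, for illustration only: each decoded opcode is a 
node holding a handler reference and a link to the next node, and the main 
loop just calls the handler and follows whatever node it returns. the 
interpreter being described is presumably C; Python is used here only to 
show the shape, so the speed comparison against a switch does not carry 
over, and the toy stack opcodes and all names below are made up.

```python
class Op:
    """One decoded opcode: a handler, an optional argument, a next link."""
    __slots__ = ("handler", "arg", "next")

    def __init__(self, handler, arg=None):
        self.handler = handler
        self.arg = arg
        self.next = None


class VM:
    def __init__(self):
        self.stack = []


# Each handler does its work and returns the next Op to run (None to stop),
# so branches would simply return a different node.
def op_push(vm, op):
    vm.stack.append(op.arg)
    return op.next


def op_add(vm, op):
    b = vm.stack.pop()
    a = vm.stack.pop()
    vm.stack.append(a + b)
    return op.next


def op_halt(vm, op):
    return None


def link(ops):
    """Chain a list of ops into a linked list and return its head."""
    for cur, nxt in zip(ops, ops[1:]):
        cur.next = nxt
    return ops[0]


def run(vm, op):
    # No switch on an opcode number: dispatch is one indirect call per op.
    while op is not None:
        op = op.handler(vm, op)


vm = VM()
run(vm, link([Op(op_push, 2), Op(op_push, 3), Op(op_add), Op(op_halt)]))
```

the decode step pays the cost of picking the handler once; after that, 
running the same code again needs no per-opcode decision at all.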



likely the harder part in writing a full system emulator would be simulating 
all of the various pieces of hardware which may be attached (as well as 
probably re-implementing certain pieces of firmware/... which may be present 
in the original machine), for which documentation may be sparse or absent, 
...


or such...



- Original Message - 
From: Casey Ransberger casey.obrie...@gmail.com

To: Fundamentals of New Computing fonc@vpri.org
Sent: Sunday, June 20, 2010 11:45 AM
Subject: [fonc] Reverse OMeta and Emulation



First, I'd like to apologize for my last message to this list. I had 
intended to send it directly as a reply to Jecel's comments, but replied to 
the whole list by accident. I wouldn't ordinarily waste bandwidth on idle 
griping like that: mea culpa.


On to the topic of the subject line...

I've been thinking about learning computing history by doing.

Emulation is expensive (in human terms.) Emulators are very complex, 
precision beasts. So many emulators are started and not finished, in part 
because commercial applications can be limited (leaving developers 
money-starved for their time,) and in part because the scope of the work is 
fairly large.


An idea occurred to me: I was thinking about an example I'd seen from the 
FONC literature about using OMeta to transform representations of various 
heights ultimately into machine code, and I wondered: would it be possible 
to use the limited reversibility aspect of OMeta to reduce the complexity of 
an emulator by (to some extent?) translating machine code for one 
architecture to another?


As I think it's fair to say that an emulator is a sort of virtual machine, 
is there something to be learned about VM complexity or portability here 
too?


Just a thought! :)