dunno if this is the kind of thing that is on-topic/of-interest here, I will see...

figured I would include it (this being from an email to someone I know):

---


I did start implementing some of the things I mentioned a few days ago (restructuring the MI mechanism, making initial plans for the implementation of delegation, ...).

I am wondering if a hybrid of class/instance and prototype OO (as I am imagining it) is at all interesting


It can be if you are breaking new ground. I am wondering about all the restrictions and problems/bugs such a language has. That's the first thing to look at.

   I am not sure if I am breaking new ground or not...


 I have not really seen anything in this area of late.

yeah, it is odd at least...


well, clearly at least some of the object system internals have become a little horrid...
   class/instance and delegation don't really like mixing.


I believe V8 uses classes shadowed behind objects. This is the approach I would probably take: if an object's methods or slots are altered in its definition, then a new cloned class is created with the modifications made to it, and the object's class pointer is then pointed to this new class.


This seems the most logical and sensible approach and could also support classes.
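a rough sketch of that "clone the class on modification" idea (my names, not V8's actual code; real implementations additionally cache these class transitions so repeated shapes share a class):

```python
# Sketch: classes are immutable slot layouts; adding a slot to an object
# produces a new cloned class and repoints the object's class pointer.

class HiddenClass:
    def __init__(self, layout=None):
        self.layout = dict(layout or {})     # slot name -> storage index

    def with_slot(self, name):
        new = HiddenClass(self.layout)       # clone the layout...
        new.layout[name] = len(new.layout)   # ...with the new slot appended
        return new

class Obj:
    def __init__(self, klass):
        self.klass = klass
        self.storage = []                    # values, indexed via the layout

    def add_slot(self, name, value):
        if name not in self.klass.layout:
            self.klass = self.klass.with_slot(name)  # repoint class pointer
            self.storage.append(value)
        else:
            self.storage[self.klass.layout[name]] = value

empty = HiddenClass()
p = Obj(empty)
p.add_slot("x", 1)   # p now points at a new class with layout {x: 0}
p.add_slot("y", 2)   # and then another with layout {x: 0, y: 1}
```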

yes.

I have anonymous classes, but I did not use them this way in my case, the main reason being that classes are not exactly free (they have a number of fields, tables, and sub-pieces). so, creating lots of specialized classes would waste lots of memory...

so, my trick is that the base class stores most of the fields (the class-defined fields), and a special side-object holds any new slots. in this way, the storage and performance overhead is not much worse than other uses of prototype objects; however, accessing prototype-object slots, in general, incurs worse costs than class slots (the slot has to be looked up, rather than being a few pointer operations).

granted though, much of this can be sped up via hash tables.
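a minimal sketch of this side-object trick (hypothetical names, not the actual implementation): class-defined fields live at fixed indices shared via the class, while runtime-added slots go into a lazily created hash table on the object.

```python
# Class-defined fields: fast indexed storage shared via the class layout.
# Runtime-added slots: a per-object side table (hash lookup, slower).

class Klass:
    def __init__(self, field_names):
        self.layout = {n: i for i, n in enumerate(field_names)}

class Obj:
    def __init__(self, klass):
        self.klass = klass                        # shared, fixed layout
        self.fields = [None] * len(klass.layout)  # fixed-slot storage
        self.side = None                          # side table, made on demand

    def get_slot(self, name):
        idx = self.klass.layout.get(name)
        if idx is not None:                       # class slot: index access
            return self.fields[idx]
        if self.side is not None and name in self.side:
            return self.side[name]                # prototype-style slot
        raise AttributeError(name)

    def set_slot(self, name, value):
        idx = self.klass.layout.get(name)
        if idx is not None:
            self.fields[idx] = value
        else:
            if self.side is None:
                self.side = {}                    # hash table for new slots
            self.side[name] = value

point_class = Klass(["x", "y"])
p = Obj(point_class)
p.set_slot("x", 3.0)   # lands in the fixed-slot array
p.set_slot("z", 4.0)   # not in the class layout: goes to the side table
```

the point being that no new class is created per modified object, at the cost of a hash lookup on the added slots.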


now, however, if one were going to make a whole lot of these objects (bunches of copies of a single tuple, ...), then the use of anonymous classes makes more sense.


the use of prototype features then makes more sense for dictionaries and dynamically constructed/modified objects.


actually, the more 'elegant' option would be to basically implement class/instance as an abstraction over a prototype system, exposing some of the internals at least as far as features go. my system is, however, going the reverse and hacking prototype features on top of a class-instance model.


I like V8's approach its the one I would take, it seems most logical for both prototype and class based approaches.

yes, ok.

well, I don't specify the internals, only that there are objects, that it is possible to add slots at runtime, and that these will not affect the base class.


there is a slight issue related to the semantics though, namely it would be possible to make the delegation model completely formal/provable, but at a certain cost: new slots/methods in delegated-to objects, beyond the delegated base class, would remain effectively hidden.

conventional prototype/delegate semantics will remain possible within the implementation, but at the cost of not being statically checkable.

       examples (again, hypothetical language):

       class A {
           double x, y;
           delegates double len() { return(sqrt(x*x+y*y)); }
       }

       class A1 extends A {
           double z;
           delegates double len() { return(sqrt(x*x+y*y+z*z)); }
       }

       class B {
           delegates A a;
           void B(A ra) { a=ra; }
       }

       B obj=new B(new A1());

       obj.x    //OK
       obj.y    //OK


don't you mean 'obj.a.x'? Or just 'class B { delegates A; ... }' ... or something?

   this is the whole thing with delegation.
the accesses see "through" 'a', as if the contents of class 'A' were incorporated into 'B', only that the state for 'A' is located in a different object. so, writing 'obj.a.x' is not needed (that would not be delegation, but it would be allowable), whereas leaving out the name would make it essentially no longer possible to reference or modify the delegates.

so, basically, the behavior of delegation is essentially analogous to runtime-assignable inheritance (this is actually the role it typically serves in many prototype-based languages, such as Self).
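a rough Python analogue of this "seeing through" behavior (my sketch, not the actual implementation): failed lookups on the outer object fall through to the delegate, while the delegate handle itself stays visible.

```python
import math

class A:
    def __init__(self):
        self.x = 0.0
        self.y = 0.0
    def len(self):
        return math.sqrt(self.x * self.x + self.y * self.y)

class B:
    def __init__(self, ra):
        self.a = ra                  # the delegate handle remains accessible
    def __getattr__(self, name):
        # only invoked when normal lookup fails: fall through to 'a',
        # as if A's contents were incorporated into B
        return getattr(self.a, name)

obj = B(A())
obj.a.x = 3.0    # explicit access via the handle is still allowed
obj.a.y = 4.0
obj.x            # resolves through the delegate; no 'obj.a.x' needed
obj.len()        # method found on the delegate, state lives in 'a'
```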


 Okay so you still have a handle to the delegate.


My only worries are name clashes: what happens if there are two delegates and a subscribing class that all have an 'x'? How do you resolve that logically?


I suppose at least you still have the handle and can explicitly access it and its sub objects.

yes, one can explicitly access it...

as for name lookup, it is usually resolved according to rules similar to Self:
first the current object is checked;
then any delegates are checked in turn.

at present, it first checks the first delegate, then the next, ...
the rule is that the first found match is used.

like Self, the lookup allows cyclic graphs (a stack is used to prevent infinite recursion), and like Self, multiple delegates are allowed per-object.
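the lookup rule above can be sketched roughly like so (hypothetical names; a 'seen' set plays the role of the stack that prevents infinite recursion, and slot values are assumed non-None for simplicity):

```python
# Check the object's own slots first, then each delegate in order;
# the first found match wins. Cyclic delegate graphs are allowed.

def lookup(obj, name, seen=None):
    if seen is None:
        seen = set()
    if id(obj) in seen:           # cycle guard: already visited
        return None
    seen.add(id(obj))
    if name in obj.slots:         # current object first
        return obj.slots[name]
    for d in obj.delegates:       # then delegates, in turn
        found = lookup(d, name, seen)
        if found is not None:
            return found
    return None

class ProtoObj:
    def __init__(self, slots=None, delegates=None):
        self.slots = slots or {}
        self.delegates = delegates or []

base = ProtoObj({"x": 1})
other = ProtoObj({"x": 99, "y": 2})
child = ProtoObj({}, [base, other])   # multiple delegates per object
base.delegates.append(child)          # cyclic graph, still terminates

lookup(child, "x")   # first delegate matched first, so 1 (not 99)
lookup(child, "y")   # found in the second delegate
lookup(child, "q")   # not found; the cycle does not loop forever
```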




so, basically I am trying to merge together the object models from Java and Self...


Yeah. V8 was developed by Lars Bak who did the Java HotSpot engine, and Self too.

may need to look into this...



       obj.z    //BAD, A1::z is not statically visible from B

       obj.len();    //OK, A1::z is provably visible from the method A1::len()


       now, a big question is one of interfaces:

       interface C {
           double x, y, z;
       }

       C obj1;

       obj1=obj;    //?, B not statically provable to conform to C
                    //?, B does not implement C (interface injection, of sorts...)

       obj1.x    //OK
       obj1.y    //OK
       obj1.z    //?, would work in implementation, but is not statically verifiable


these features could be made to work in practice, but they are theoretically unsafe and could be rejected by a verifier (likely the default case for 'secure' code). also possible would be to allow them, but to allow an exception to be thrown should the operation fail (at the VM level; the object system API currently does not throw exceptions).


fully dynamic prototype OO may be possible in the implementation by adding a feature similar to that used for implementing prototype objects: the addition of a virtual "super-interface" capable of interfacing with "any possible object" (in effect the interface would contain every possible slot and method it has seen).

a similar mechanism might be needed for the VM-level implementation of delegating methods (that or, as considered before, having an explicitly defined interface associated with the method).


I can imagine a few points of awkwardness though in trying to make ES3 efficiently operate on top of this system (by default ES3 may use a few "cheap tricks" to make itself work, but absent specific declarations and type annotations may be slower than is ideal).

some semantic restrictions may also be placed on objects created "ex nihilo" in ES3 land, namely that these objects will not be compatible with Java-land (likely even via interfaces), although it could be possible to make them accessible via an API.

so, the same object system could be used in both cases, but not all objects will be compatible due to language-level semantic differences.


a common superset language, such as likely ES4, could probably handle these cases though (such as being able to produce usable objects and utilize objects from both langs).

ES4 is going to be a damn sight easier to implement now. But I would still start with ES3.1.

   I would have to look into the specific differences...

   but, whichever one has static types and classes is worth consideration.
actually, my last major script language included a lot of things that have apparently been borrowed by ES4 (but, then again, I borrowed a lot of ideas from JavaScript and ActionScript as well, so I guess ES4 had them first in a way...).


ES3.1 is only 3 + things like object accessors (get/set) plus Object.freeze and lambda coding which will be used by the _new_ ES4 called "Harmony" to create classes.

ok, ok.
well, at least they added lambda...

now this brings up a thought:
I thought ES3 already had closures?...

will have to look into this...



 Here's Brendan Eich's email to the es-discuss list :-


     https://mail.mozilla.org/pipermail/es-discuss/2008-August/006837.html


 Just in case you did not read it.


 And here's the es3.1 drafts :-


     http://wiki.ecmascript.org/doku.php?id=es3.1:es3.1_proposal_working_draft


 Have not read them myself yet.


yes, I will probably need to look into all this...

for many things though a "least common denominator" approach may be needed.


       or such...


I am now going back to looking at separate languages in a framework much like .NET CLR & DLR but with Multiple Inheritance.


     I still continue to study ES4 RI for a while longer though.

   yeah...


I am writing a framework, but right at this point I don't intend to try to innovate much with languages, more rather to try to implement a framework that can handle several different languages.

as for .NET, well, they have achieved a good deal more goals in this direction, but as has often been noted, .NET brings a good deal more legal pain as well...


Mono seems all right legally; they are doing Moonlight to mirror Silverlight.

Mono itself has some legal issues.

it exists, but there are, I think, some MS license and patent issues in the mix (many parts of the .NET framework are under MS patents), so Mono exists more because MS allows it to exist; it is not entirely free of legal issues. I guess the current standing is that there is an agreement between MS and the Mono developers that MS will not sue over the patent issues.

I just don't really trust all this, I would much rather things stayed more in "clear water" (ie: well away from patented technologies).


I guess this is actually a notable reason for Mozilla focusing more attention on Tamarin, and generally not looking much into .NET or Mono...


 Moonlight is going to be in Firefox soonish, when it's ready.


ok.

I would suspect probably as a plugin rather than a core component.

I read some things before about all this, namely that I guess their official position was to develop and pursue Tamarin rather than Mono, largely on the grounds of concern over possible patent issues (ie: Mono may be ok, but Firefox may not be, since FF is the primary competitor of IE; so although Mono itself is safe, FF may fall outside the agreement if Mono were integrated as a core component, whereas with a plugin people can just stop using it without compromising the whole project...).




similarly, for this and several other reasons, I have focused more effort on using the JVM design as a base.


I never liked the JVM; the specs were not that great, or well defined, and it is too much like a toy machine for my liking. It only has single inheritance, and generics came late and were not that powerful, basically using type erasure I believe.

generics exist at the language level, and I have not looked into how they are implemented (the JVM, at the bytecode level, doesn't really care about many things).

now, granted the JVM itself is a much smaller and in many ways cruftier design, but it can be implemented in such a way that the interpreter runs fairly efficiently, whereas both .NET and AVM2 would require JIT to run efficiently (or, at least an internal bytecode rewrite pass).

the main difference is that the JVM fully specifies most types in the bytecode, and focuses more on their concrete representation (after all, this is why long and double are defined to require 2 stack elements, ...).

both .NET and AVM2 seem to define the bytecodes a little more abstractly, requiring either a dynamically typed interpreter (slow), or recompiling the bytecode into another representation (a lower-level bytecode, or full JIT).

so, as I view it, both MSIL and AVM2 are probably better as far as an IR goes, but JBC is better as far as an interpreted bytecode goes.

JIT more or less "levels the field" though, since any likely effective JIT will have to do much of the same things anyways (stack tracing, code restructuring, ...).



if I ever produce something really usable/useful, it would be worthwhile to have a fairly open and not "too" novel design...

if I can manage to get it implemented, and get several existing languages targeted to it, this would be a worthwhile goal I think...

as well, it is good to have an implementation and design that is clonable...


 You need good specs for that.


yes, my specs thus far are not so good, but I document things where I can.
sadly, most of my documentation is strewn about various text files, and it is not clear which parts are valid, which were just ideas, which are out of date, ...

at one point, I had a convention for this (reviewing files and adding special "marks" to the filenames), but I have not kept up with this...

sadly, properly documenting everything would be no small task...


as far as the JVM goes, the specs can hardly be called "good", but they are understandable if one puts some effort into it (and goes in search of the proper pieces of documentation...).


so, I am partly using it as a "base", but I am developing many of my own things as well (in its full form, my VM will use a somewhat modified bytecode, ...).

also, the major emphasis is not on the bytecode (as is the case with most VMs), rather a good deal of emphasis is placed on all the internal machinery and frameworks, for which I am trying to develop clean and usable APIs.

so, it is not a centralized "ivory tower" design, but is intended to be more of an open-ended toolbox of loosely coupled parts (much more like OpenGL or POSIX).

what it "is" is what you make from it, not something intrinsic in the "design"...


however, a bytecoded VM is itself a fairly powerful tool, and so is being developed to be added to the "toolbox"; many of the parts this VM will depend on are also independent and exposed elsewhere with their own APIs...

of course, the big catch is that all of this is C based, and so not of as much use if one doesn't like C, or wants a centrally-integrated "whole".


sadly, the JVM, like most other VMs in existence, is by default a very centralized and "ivory tower" design, but oh well, I can try to make it work as a component, and maybe generalize it so that it can interoperate nicely with my other stuff...

in most VMs, the "toolbox" AKA "framework" is built on top of the VM, whereas in this case the idea is that it "is" the VM, and that the design more "digs downwards" than builds upwards.


so, rather than getting all this flexibility by building yet another "layer of abstraction", one tries to dig through what already exists, to be able to handle all of this existing stuff (common APIs, ...) as one big, reflective piece of machinery.

for example, I would want ECMAScript, rather than having everything wrapped up nicely for itself, to be able to play in the bigger "outside world" of all the code that already exists (C land and similar).

a capable script may well have access to the same VM and machinery used to implement itself, and to the C compiler; so, for example, a piece of JS code could very well construct a piece of code that compiles itself to machine code and accesses C APIs, or, FWIW, is self-modifying.


now, granted, such capabilities are as much dangerous as they are useful, and so it is my eventual plan to integrate a fairly strict security model into everything as well (bytecode has this use too: strictly enforcing a security model within C code would be nearly impossible, but bytecode makes this kind of thing far more plausible without compromising the power of the system).

this is also another reason why static typing and formal checking are being pursued for many components (beyond performance and optimization reasons).


as well, bytecode may also allow higher performance, at least in terms of dynamic facilities, since it adds more capabilities for both separation of issues and optimization, vs a more traditional linear-compilation design (where nearly everything has to go through the whole compiler tower).

for example, the one-off results of an eval expression need not get compiled all the way down to machine code (wasting time and creating potentially non-collectable garbage), and, as well, bytecode fragments can be crafted explicitly (rather than having to go through preprocessing and grind through a whole bunch of contents from various system headers, ...).


so, in some ways, bytecode could also fill a similar role to what I am doing in many places with dynamically crafting specialized chunks of assembler (but at a much higher level of abstraction).

ok... granted, I would probably need an abstraction over the bytecode; for instance, it would be far more convenient to craft a chunk of PostScript-like code than a JBC class-file, but yeah...


so, yeah, FWIW I will probably need to add a specialized PostScript-like VM somewhere in the mix as well...

hmm, I could partly/temporarily redirect my efforts since this would be probably of more immediate use (and less work) than a JVM anyways...


a few notes:

I have used specialized PostScript-like languages in the past to good effect, only that PS is not good for a scripting language IME (too damn awkward, and later one forgets just what the hell their scripts were doing anyways).

internally, the upper and middle/lower stages of my C compiler communicate using a somewhat PS-influenced language as the IR, and in general a PS-like structure is very convenient for dynamically crafted code, only that it is rather poor for end-user scripts, where a more conventional (C-style) syntax is by far preferable.

however, as-is, my compiler IR is not particularly suitable for use in non-compiled dynamic crafting, where something far closer to proper PS is needed (non-modular, with an externally visible scope rather than having to declare everything, ...). oh yes, and it would likely be compiled to bytecode rather than necessarily going all the way down to machine code...

very likely, such a language will have visibility of the C toplevel and functions (my framework already has this, but for different reasons), and as such would need little or no explicit FFI (although granted external function calls would use a different syntax than internal block-based calls).

potentially, it would also make sense for it to have access to the object system.


note, further:
this would not halt or redirect the JVM effort (just as the JVM effort did not entirely halt my plans to get an ECMAScript VM working, this goal having somewhat influenced the design of the object system).

rather, it would be a much smaller effort than either of these 2, and will likely have much more immediately useful effects (aka: serving a similar role to, but being distinct from, specialized ASM crafting...).


or such...



_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc
