well, ok, I will say I am new to this list, but am hoping for interesting conversations. sorry if, being new here, I come across as an ignorant troll.

main reason:
well, mostly I am doing "my own thing", but if possible I would like to be operating within the confines of 'the community'. I will admit that at this point in time I am not really that much of a Java developer, so my familiarity with the community or with specific technologies in this area is limited.

basically, my purpose and efforts (general overview):
I have over the past few years implemented several languages and VMs of different varieties (typically dynamic). more recently (ok, over the last 1.5 years or so), I went and wrote a dynamic C compiler framework (basically, it allows taking C source and compiling it to machine code at runtime, making use of a number of hacks to allow dynamic relinking and relatively seamless integration with the host app).

C is good, yes, but by itself it does not do "everything". the sad thing is, although I can dynamically compile, C is not an ideal language for this. more so, there is "the problem of the headers": I know of no "good" way to escape header processing, meaning that there is typically some time overhead WRT dynamically compiling modules.

another sad point is that C, as is, is largely incapable of proper "eval" (of the sort typically done in JavaScript and friends). now, yes, one can compile some functions and then invoke them, but this sucks (and anything that can eval in some useful way is technically no longer C...).

it is also the case that at present dynamically-compiled C code is not garbage collected (ok, my framework has many "unorthodox" features and compiler extensions; among them, I make use of optional dynamic typing, and have a conservative concurrent garbage collector, ...).

but, otherwise, I have a decent chunk of C99 implemented and 'most' C code should work ok, so it is probably good enough.

also, generating native machine code is not always the optimal solution, as for many things bytecode is preferable (I will make the analogy of using a tank as one's main vehicle... it is hard-core and powerful, but not so good for short trips, and it really tears up one's lawn...).

recently, I had been looking to "absorb" Java support into my project as well, and more so, to make use of Java's bytecode format (JBC) as probably the primary bytecode format (namely, there are cases when using something "standard" makes sense, and I don't feel there is need for "yet another non-standard bytecode format"...). this will possibly allow leveraging some amount of existing stuff, and make it so that people can more readily make use of my stuff. basically, when needed my project could "emulate" a more traditional JVM, and as much as is reasonable remain compatible with other JVMs.

partly, I am now of the opinion that Java is better for some of my tasks than C is, but I still like mostly keeping C around (actually C and C++ are still my primary languages...). unlike C, Java is a lot better suited to dynamic loading and modular systems, and is also a lot easier to verify. unlike JavaScript, performance and garbage generation are likely to be far better (after all, for most things, the language is still statically typed).

I also intend to tightly integrate it with my existing framework, and actually use it for many other tasks as well (basically, offloading tasks not as well suited to native code generation). I may also use it as a target for JavaScript (I have an existing partial JS implementation, but targeting JBC would probably be a better and more general solution). a JIT compiler may also be added at some point... however, I will probably not implement JNI unless I have some good reason (I am considering alternative and more desirable options...).

progress thus far (JBC support):
well, most of the "more general" stuff exists within my framework already.
I have a class/instance system in place, which mostly overlaps with the JVM in terms of functionality (it is being implemented for this purpose; formerly I had been using a prototype-based object system, like in JavaScript, but this would not be very good for implementing a JVM performance-wise).

a lot of the core interpreter functionality has been written recently, but otherwise I have not had much free time for coding as of late... actually, I came up with the idea of writing a JVM several weeks ago, but haven't had much time to do so (thus far it has been less painful than expected, but other stuff uses up most of my time...).

I have yet to implement a class-loader or similar functionality. I had decided to implement the interpreter first, and then make the loader target it, rather than implementing the loader first and building an interpreter around it (this is what the JVM spec makes me think...). at this point, I have not actually tested any of the interpreter machinery (things are still very preliminary).

a minor complaint:
the JVM spec (or at least the one I found) makes some things rather annoying to figure out, such as which opcodes have arguments and what they are, what exactly each opcode does, ... an instruction listing similar to that found in the Intel docs (or many other processor-specific references) would make this less annoying. so, maybe a slightly more formal structure could help (dunno if anyone here is in a position to effect this though).

namely: parts of my interpreter I generate with tools, and it is preferable to go and fill out some tables, rather than have to dig around and figure things out.
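
to give a rough idea of the kind of table I mean, here is a minimal sketch in C. the struct layout and the names (OpInfo, op_table, op_lookup, op_length) are made up for illustration and are not taken from my actual code; only the opcode numbers and operand sizes of the few rows shown come from the JVM spec, and a real table would need all of them.

```c
#include <assert.h>
#include <stddef.h>

/* one row per opcode: opcode number, mnemonic, and the number of operand
   bytes that follow the opcode in the code stream.
   (hypothetical layout, for illustration only) */
typedef struct {
    unsigned char opcode;
    const char *name;
    int operand_bytes;
} OpInfo;

/* a handful of rows only; variable-length instructions such as
   tableswitch, lookupswitch and wide do not fit a fixed-count column
   and would need special handling. */
static const OpInfo op_table[] = {
    { 0x00, "nop",             0 },
    { 0x10, "bipush",          1 },
    { 0x11, "sipush",          2 },
    { 0x12, "ldc",             1 },
    { 0x15, "iload",           1 },
    { 0x60, "iadd",            0 },
    { 0xa7, "goto",            2 },
    { 0xb1, "return",          0 },
    { 0xb6, "invokevirtual",   2 },
    { 0xb9, "invokeinterface", 4 },
};

/* find the table row for an opcode, or NULL if it is not in the table */
const OpInfo *op_lookup(unsigned char opcode) {
    size_t i;
    for (i = 0; i < sizeof(op_table) / sizeof(op_table[0]); i++)
        if (op_table[i].opcode == opcode)
            return &op_table[i];
    return NULL;
}

/* total length in bytes of the instruction at code[pc],
   or -1 if the opcode is not in the table */
int op_length(const unsigned char *code, size_t pc) {
    const OpInfo *info;
    assert(code != NULL);
    info = op_lookup(code[pc]);
    return info ? 1 + info->operand_bytes : -1;
}
```

with something like this, a tool can walk a code array instruction by instruction using op_length, and the same rows can be used to generate a disassembler or the interpreter's decode step.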

for example, when I wrote an x86/x86-64 assembler before, I just sort of scanned through the instruction reference and transcribed all the stuff into my own table formats (and similar would also work with the PPC spec), but the JVM spec, as-is, requires a bit more digging (a lot of it is maybe more useful for people targeting the VM, but not as much for people writing one).

specific thoughts:
the idea of "invokedynamic" seems interesting. however, this does not seem to be a complete solution IMO (everything would be done via method invocation, which in many cases would not be ideal performance-wise, and would limit utilization of things we can know about dynamic operations).

for my efforts, I had considered the possibility of further extensions, in particular:
dynamically typed arithmetic, comparison, conversion, ... operations;
potentially, a dynamic type system like that in many dynamic languages (fixnums, flonums, lists, ...);
...

granted, this may not be a reasonable solution 'in general', since it would imply adding a good deal of functionality to existing VMs.

related to this, I had also considered adding features to better facilitate languages like C, such as the ability to make use of "unsafe" memory access and pointer operations (the purpose would be to allow "acceptably" targeting a C compiler to it while still retaining many of the capabilities of C, along with its ability to play along "acceptably" with its natively-compiled counterpart). actually, this much (even within my project) would likely require special permission to use.

I have actually for a fairly long time been mentally considering the idea of a multi-layered security model, where access to certain things is granted to certain code, but thus far I don't have any specific plans (this is especially the case for natively-compiled C, where any such policy is technically almost impossible to enforce...).
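
as an illustration of the dynamically typed arithmetic mentioned above, here is a minimal sketch in C of what a hypothetical "dynamic add" operation might do internally. everything here (Value, make_fixnum, make_flonum, dyn_add) is invented for the example, and a tagged struct is used only because it is easy to read; a real implementation would more likely pack the tag into the pointer or word.

```c
#include <assert.h>

/* hypothetical tagged value with just two numeric types */
typedef enum { T_FIXNUM, T_FLONUM } ValueTag;

typedef struct {
    ValueTag tag;
    union { long i; double f; } u;
} Value;

Value make_fixnum(long i)   { Value v; v.tag = T_FIXNUM; v.u.i = i; return v; }
Value make_flonum(double f) { Value v; v.tag = T_FLONUM; v.u.f = f; return v; }

/* widen either numeric type to double */
static double as_double(Value v) {
    assert(v.tag == T_FIXNUM || v.tag == T_FLONUM);
    return v.tag == T_FIXNUM ? (double)v.u.i : v.u.f;
}

/* what a hypothetical "dynamic add" opcode could do: check the tags
   inline, take the integer fast path when both sides are fixnums, and
   fall back to floating point otherwise. (overflow is ignored here.) */
Value dyn_add(Value a, Value b) {
    if (a.tag == T_FIXNUM && b.tag == T_FIXNUM)
        return make_fixnum(a.u.i + b.u.i);
    return make_flonum(as_double(a) + as_double(b));
}
```

the point of having this as an opcode rather than a method call is that the interpreter (or a later JIT) sees the tag check directly and can specialize on it.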

my idea for such extensions would be to find a hopefully unused opcode (or opcodes), and then use it/them as an extended opcode space. 254 and 255 are possible, but if there is any real possibility of "cooperation", a different opcode number may make sense (given 254 and 255 are reserved for implementation-dependent features), allowing other implementations to potentially implement some of these.

these would then be features the VM is not required to support (or allow), and so a piece of code could be safely rejected for using any of this functionality. that is, the opcodes would be "specified but optional".

maybe a kind of "extended opcode blocks", where opcodes can be added to specific blocks without risk of clashing with other blocks (basically, some people can "own" certain areas, and so if an opcode in their own block has not been specified yet, they know it is safe to specify it).

hypothetically:
240-243 <byte>: a group of 64 blocks of 16 opcodes
244-247 <short>: 1024 blocks of 256 opcodes

certain blocks could be assigned for "development by various implementors", and others could be regarded as "experimental".

these would differ from the opcodes 254 and 255, in that they could potentially be shared between implementations, and used as a form of "de-facto" extension mechanism (one implementation or group can specify opcodes, another can safely implement opcodes that the first has specified, and if they want/need to add new opcodes, they can have their own block).

as such, it would be implicitly assumed that once an opcode is specified, it can not be readily changed (only deprecated or replaced). however, experimental blocks could be free from this restriction (so, implementation A could develop features within their implementation, and group B could later decide to move them to a more permanent home).

or, maybe it is just the case that there is not enough activity to justify this?...

however, at this point I am mostly just asking for people's opinions on all this...
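
to make the hypothetical numbering above a bit more concrete, here is a rough sketch in C of how such prefixes could be decoded. how the prefix opcode and its operand combine into a block number is just my guess at one workable layout (4 prefixes x 256 byte values = 1024 = 64 blocks of 16; 4 prefixes x 65536 short values = 1024 blocks of 256), and the names ExtOp and ext_decode are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* result of decoding one extended opcode (hypothetical layout) */
typedef struct {
    int block;   /* which block the opcode belongs to, -1 if none */
    int index;   /* opcode number within that block, -1 if none */
    int length;  /* bytes consumed (prefix + operand), 0 if not extended */
} ExtOp;

/* decode the instruction at code[pc]; len is the size of the code array.
   returns length == 0 if code[pc] is not an extension prefix or the
   operand is truncated. */
ExtOp ext_decode(const unsigned char *code, size_t pc, size_t len) {
    ExtOp op = { -1, -1, 0 };
    unsigned long prefix, n;

    assert(code != NULL);
    if (pc >= len)
        return op;
    prefix = code[pc];

    if (prefix >= 240 && prefix <= 243) {
        /* <byte> form: 64 blocks of 16 opcodes */
        if (pc + 1 >= len)
            return op;
        n = (prefix - 240) * 256UL + code[pc + 1];
        op.block = (int)(n / 16);
        op.index = (int)(n % 16);
        op.length = 2;
    } else if (prefix >= 244 && prefix <= 247) {
        /* <short> form (big-endian, like other JVM operands):
           1024 blocks of 256 opcodes */
        if (pc + 2 >= len)
            return op;
        n = (prefix - 244) * 65536UL
          + ((unsigned long)code[pc + 1] << 8) + code[pc + 2];
        op.block = (int)(n / 256);
        op.index = (int)(n % 256);
        op.length = 3;
    }
    return op;
}
```

a VM that does not know (or does not allow) a given block can then reject the code at load time based on the block number alone, which is what would make the opcodes "specified but optional".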

_______________________________________________
mlvm-dev mailing list
[email protected]
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
