Added summary here - http://wiki.apache.org/harmony/JvmInJava
-- dims On 5/17/05, Steve Blackburn <[EMAIL PROTECTED]> wrote: > I appeal to everyone to get the facts on the Java-in-Java issue. > > It is an important issue and despite a number of comprehensive posts to > the list from a variety of writers, people still perpetuate ideas which > have been debunked on the list. > > Listreader wrote: > > >>A lot of people have expressed interest in a JVM written in Java providing > >>performance is adequate. There has been substantial evidence present to > >>show that it can be. > >> > >> > > > >I simply collected what I felt was the opinion of the majority of posts > >on the list thus far. Personally, I care little about the exact language > >as long as it is relevant and optimal for writing the compiler. > > > >Since Java lacks low-level memory management capabilities, and the > >JVM obviously needs to deal with these issues, I would be somewhat > >hesitant to write the full JVM in Java myself. > > > > > Please read through the earlier posts. > > 1. Java-in-Java VMs deal with low-level memory management through a > small set of type safe extensions, supported by the compiler, which > allow typed access to memory in a variety of ways. > > http://jikesrvm.sourceforge.net/api/org/vmmagic/unboxed/package-summary.html > > 2. Having written an extensive high performance memory management > toolkit, which includes a wide range of GC algorithms, and which can > outperform the standard glibc malloc(), I can assure you (as I and > others have stated in previous posts), that a Java-in-Java VM is not > encumbered by any such limitations. > > 3. The Java-in-Java VM has some performance *advantages* through the > lack of impedence mismatch between the supported language and the > implementation language (this is one of the reasons for 2. above). > > 4. There are a number of successful instances of this technology, > including OVM, Jikes RVM, and Bartok (a C# in C# VM at MSR which can > match the MS product VM on some benchmarks, despite having vastly less > resoures applied to it). > > If the community decides that it would be more helpful to build the VM > in C++ that's fine. I for one hope that the component-based architecture > will support a variety of implementation languages. > > But lets not make such a choice from a position of ignorance. > > --Steve > > > Hi Dmitry, > > > >> <constructive_interest> > > > > > > [...] > > > >> First one is of the chicken-vs-egg variety -- as the GC algorithm > >> written in Java executes, won't it generate garbage of its own, and > >> won't it then need to be stopped and the tiny little "real" garbage > >> collector run to clean up after it? I can only see two alternatives > >> -- either it is going to cleanup its own garbage, which would be > >> truly fantastacal... Or it will somehow not generate any garbage, > >> which I think is not realistic for a Java program... > > > > > > This is a very important issue. > > > > The short answer is as follows: > > a) Within the GC code itself we don't really use Java, we use > > a special subset of Java and a few extensions. > > b) We never call new() within the GC at runtime > > c) We try not to collect ourselves > > > > You will find the long answer buried in the source code and a somewhat > > out of date paper: > > > > http://cvs.sourceforge.net/viewcvs.py/jikesrvm/MMTk/ > > http://jikesrvm.sourceforge.net/api/org/vmmagic/unboxed/package-summary.html > > > > http://jikesrvm.sourceforge.net/api/org/vmmagic/pragma/package-summary.html > > > > http://cs.anu.edu.au/~Steve.Blackburn/pubs/abstracts.html#mmtk-icse-2004 > > > > I'll try to give a more succinct answer here: > > > > As for a), we essentially apply a few design patterns and idioms for > > correctness and performance (more on performance later). We don't use > > patterns that depend on allocating instances. In fact the only > > instances we create are per-thread metadata instances which drive the > > GC. These are allocated only when new threads are instantiated > > (actually these are per posix thread, Jikes RVM uses an N-M threading > > model). > > > > As for b), there is not much call for dynamic memory management within > > a GC. The exceptions are a) short-lived metadata such as work queues, > > and b) per-object metadata such as that associated with free lists and > > mark bits etc etc. We solve this by explicitly managing these special > > cases from within our own framework. We have a queue mechanism that > > works off raw memory and a mechanism for associating metadata with > > allocated space. The details are beyond the scope of this email. > > > > Actually c) is one of the hardest parts. It is essential that heap > > objects associated with the VM and the GC are not inadvertently > > collected. This requires some very careful thought (remembering that > > the compiler will place our *code* into the heap too!). > > > > As to whether this is feasible, its been done at least three times > > over. First in the original Jalapeno, then in GCTk (developed while I > > was at UMass) and now MMTk. Right now I am working with my students > > here to push the MMTk design even cleaner while not sacrificing > > performance---fun! > > > > So, can it perform? Well it is very hard to do apples to apples > > comparisons, but we measure the performance of our raw mechanisms with > > C implementations as a milestone and we do very well (by this I mean > > we can beat glibc's malloc for allocation performance, but this claim > > needs to be covered with caveats because it is very hard to make fair > > comparisons). So the raw mechanisms perform well. But then the > > software engineering benefits of Java come to the fore and our > > capacity to implement a toolkit and thus have a choice of many > > different GC algorithms gives us a real advantage (the GC > > mechanism/algorithm thing was the subject of a previous thread). > > > > I've glossed over a huge amount of important stuff (like how we get > > raw memory from the OS, how we introduce type safe pointers and object > > references, etc etc). > > > >> To summarize (and to get to the question already) - the point is that > >> language shapes thought. In other words, a program designed in Java > >> will naturally tend to be slower then a program designed in C, simply > >> because Java most naturally expresses slower designs then C does. And > >> the question is - does this agree with anyone elses experiences? And > >> if it does, is it a valid argument against using Java for the design > >> and implementation of the VM? > > > > > > OK so there is already at least one response to this, but let me add > > my experience. > > > > I am very focused on performance. The approach Perry Cheng and I took > > when writing the code for MMTk was very much that premature > > optimization is indeed the root of all evil. Moreover, we placed > > enormous faith in the optimizing compiler. The philosophy was to > > assume the optimizing compiler was smart enough to optimize around our > > coding abstractions, and then to do careful performance analysis after > > the fact and see where we were being let down. In some cases the > > compiler was improved to deal with our approach, other times we > > modified our approach. > > > > Over time we learned certain idioms which on one hand meant we tended > > to get reasonable performance first shot, but on the other may have > > undermined the natural Java style we started with. > > > > While I understand what you mean when you say: "a program designed in > > Java will naturally tend to be slower then a program designed in C", > > addressing that concern is one of the most important challenges of > > language implementation, and is why Java performance has improved so > > greatly over the past five years. > > > >> </constructive_interest> > > > > > > > > Cheers, > > > > --Steve > > > -- Davanum Srinivas - http://webservices.apache.org/~dims/
