Re: [general] DaCapo Benchmark Suite and Paper

2006-09-03 Thread Steve Blackburn

Geir Magnusson Jr. wrote:

Excellent!  lets get this as part of our test rig...

That sounds like a good idea.

I think DRL should be able to run all of the benchmarks.

Eclipse presents a minor hurdle since it expects a Sun-like organization 
of jre libs.  However, I've added a workaround for this for now (new to 
this release).  I'll be interested in working with DRL people on 
developing a better long-term solution.  Ideally we'll include in the 
benchmark a jre lib stub which eclipse will always build against, 
regardless of the JVM being tested.  This avoids the problem DRL was 
seeing and also ensures the test workload is not affected by the 
organization of the host JVM's libs.


For those who are curious: the eclipse benchmark includes performing 
some eclipse source builds.  In the case of our benchmark, this entails 
compiling parts of the eclipse sources.  These compilations need to be 
with respect to some jre libs.  Eclipse has its own expectation of where 
to find such libs and how they will be laid out.  Sun-based JVMs work 
fine, but many others (including DRL) will not.  IBM's j9 has a 
different layout, but works by virtue of special-casing within the 
eclipse codebase.


Cheers,

--Steve






Re: [modules] classloader/jit interface

2005-06-26 Thread Steve Blackburn

Hi Rafal,

My guess is that there are a dozen people on this list right now who, 
given a month, could build a simple VM from scratch, on their own.  
However, building something like ORP or Jikes RVM is an enormous task, 
well beyond the scope of any individual.


In other words, I don't think the challenge we collectively face is that 
of identifying the right abstractions for a simple VM---this is 
relatively simple.   My feeling is that the real task in front of us is 
designing and implementing a VM core which can support the ambitious 
goals of Harmony---this is a challenge.


As I've said in previous posts, I think that looking at clean, simple 
VMs such as Jam VM will be invaluable to this project.


However, clean abstractions made in the context of a simple VM go to 
pieces once the complexities of aggressive optimizations and pluggable 
generality come into the picture.  A great example of this is the 
difficulty of retrofitting GC maps into a VM.  GC maps are essential to 
high performance GC, but are often overlooked in simple VM designs, and 
yet retrofitting them is an enormous task.  Likewise object model 
optimizations, method dispatch, stack frame organization, etc., all 
wreak havoc on more basic abstractions.
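
(For the curious, here is a minimal, purely illustrative sketch of what 
a GC map records---the names are hypothetical and come from no particular 
VM.  For each safepoint in a compiled method, the map records which stack 
slots hold live references, so that an exact collector can enumerate its 
roots.)

  // Hypothetical sketch of a GC (stack) map: for each safepoint in a
  // compiled method, record which stack slots hold live references so
  // an exact collector can find its roots.  Illustrative names only.
  import java.util.BitSet;
  import java.util.HashMap;
  import java.util.Map;

  final class GCMap {
    // safepoint offset within the compiled code -> bitmap of stack slots
    // containing live object references at that point
    private final Map<Integer, BitSet> refSlotsAtSafepoint = new HashMap<>();

    void recordSafepoint(int codeOffset, BitSet liveReferenceSlots) {
      refSlotsAtSafepoint.put(codeOffset, (BitSet) liveReferenceSlots.clone());
    }

    // Called by the collector when it walks a stopped thread's stack.
    BitSet liveReferenceSlots(int codeOffset) {
      return refSlotsAtSafepoint.get(codeOffset);
    }
  }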


I don't think we're that far away from being able to build a new VM 
core.  We have so much at the table.  My hope is that our own VM core, 
with a simple interpreter or JIT and perhaps a naive GC, could be up and 
working in a matter of months.  I am also a strong 
believer in being prepared to throw early implementations away.  If we 
try to get our first implementation perfect first shot, we're sure never 
to have a first implementation.  On the other hand, we should certainly 
try to make the most of the experience we do have at hand, so that our 
first implementation is not too wildly off the mark.


A few more detailed responses:


- it is important to distinguish between innovation/research and good
engineering. (eg mono

Innovation/research and good engineering are absolutely not in 
opposition!!  Good engineering is KEY to good research.



well engineered framework
shall make research easier after all; that is why I'm rather a
C-camper

What is the connection between good engineering and the use of C?  I 
don't follow.



make sure to make it efficient as some operations (write
barriers, for example) are critical when it comes to performance;
 

Please see previous posts on this subject.  Fast barriers probably need 
to be written in Java or compiler IR, not C (this is because they are 
injected into user code and need to be optimized in context).
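
To make this concrete, here is a rough sketch of a generational write 
barrier written in plain Java.  The names, the address-range test and the 
remembered-set representation are illustrative assumptions rather than 
MMTk's actual code; the point is that the JIT inlines the fast path 
straight into the user's reference store and optimizes it in context.

  // Sketch only (hypothetical names, not MMTk's API): a generational
  // write barrier remembering pointers from the mature space into the
  // nursery.  The fast path is a couple of compares; only cross-
  // generation stores take the out-of-line slow path.
  final class WriteBarrier {
    private static final long NURSERY_START = 0x20000000L;  // assumed heap layout
    private static final long NURSERY_END   = 0x30000000L;

    // Remembered source slots; a real VM would use unsynchronized
    // thread-local buffers rather than one global deque.
    private static final java.util.ArrayDeque<Long> remset =
        new java.util.ArrayDeque<>();

    static void referenceWrite(long slotAddr, long newTarget) {
      if (newTarget >= NURSERY_START && newTarget < NURSERY_END
          && (slotAddr < NURSERY_START || slotAddr >= NURSERY_END)) {
        remember(slotAddr);   // slow path, rarely taken
      }
      // the actual store of newTarget into slotAddr happens here in a real VM
    }

    private static void remember(long slotAddr) {
      remset.addLast(slotAddr);
    }
  }

When this is inlined at the store site, the compiler can often prove one 
or both of those tests away, which is exactly the optimize-in-context 
effect that is lost if the barrier lives behind a C function call.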



- avoid prematurely sacrificing design for the sake of performance;
 


Agree 100%


- Java-C/C-Java (internal) interface doesn't necessarily have to be
slow;

Anything that involves a function call is slow in the context of many 
aspects of VM implementation.  This is fine for coarse grained 
interfaces, but unacceptable for fine grained interfaces such as the 
allocation and barrier paths.  Careful design should make the 
integration of C and Java modules possible (just avoid interface 
transitions on the fast path).



- having some kind of 'frankenVM' consisting of various pieces doesn't
have to be inherently bad; did Mono emerge this way or am I wrong ?
 

A VM that draws on the enormous investment at hand by taking a JIT from 
here and a memory manager from there is a very wise choice.



- someone has to build a 'big picture' and split it into parts so that
people can work on the details (don't start with one or two interfaces,
start with a big picture first: several main modules, how shall they
interact with other modules, where are potential problems); that HAS
to be done by a person with extensive experience in VM construction
(engineer rather than researcher); newbies like myself fall short in
this mainly because of not having dealt with details;
 

I agree with this very strongly, however I don't think it should be one 
person---it should be done collectively.



- sorry for being a bit offensive on researchers ;) it wasn't my
intention, I just think that we need to have a proven set of things
first, then may do some good research on it;
 

OK.  I'm offended ;-) 

In all seriousness, you need to understand that the state of the art in 
VM design and implementation is coming out of researchers and research 
labs.  Research and proven results are not and must not be mutually 
exclusive.  To the contrary!  As far as I know, as far as proven 
results go, the two best performing open VMs are ORP and Jikes RVM, both 
of which have come out of research labs.  Good engineering is at the 
heart of good research---I believe the opposite to be true too.


--Steve


Re: [modules] big picture

2005-06-25 Thread Steve Blackburn

Geir Magnusson Jr. wrote:


What is a sensical way to modularize the rest?


Key to the approach is precisely that we need not introspect within 
modules which we will not (initially) be implementing.  Thus we need not 
concern ourselves with the internal structure of the execution engine, 
or the memory manager if these are modules we plan to leverage from 
elsewhere.


If we focus on getting a core built quickly, then what matters  most is 
a) the interfaces between the core and other modules, and b) the 
internal modular structure of the core.


I think we'd all really like to see a core begin to take shape.  
Understanding its internal structure and its relationship to the outside 
world is the very first step.


--Steve


Re: [Legal] Requirements for Committers

2005-06-08 Thread Steve Blackburn

Dalibor Topic wrote:

Many people don't see the need to look at non-free software in 
general, and chances are pretty slim that anyone I know will ever get 
that bored and out of reading material to accept the 'Read only' 
license, for an example of a very funny non-free software license.


I have never looked at non-free implementations, but I am interested to 
know what this means for those of us who have extensive exposure to 
implementations such as Kaffe (GPL) or Jikes RVM (CPL).  My reading of 
it is that I can't work on any part of Harmony for which I am tainted by 
my Jikes RVM exposure without permission from the copyright holder of 
Jikes RVM.  Is that right?


--Steve


[arch] Modules and interfaces (was: How much of java.* ...)

2005-06-06 Thread Steve Blackburn

Geir Magnusson Jr. wrote:

Doesn't this imply that the GNU Classpath interface should add a  
second API that *it* should comply with for symmetry?  That way you  
don't get dependencies on GNU Classpath internals? 


I've been a bystander in this discussion as I know very little about the 
class library issues.  There were obviously a lot of concerns being 
discussed in this thread, but I'd just like to respond briefly to the 
above...


It brings to mind some of the portability/modularity issues we've been 
wrestling with (at great length) with MMTk and the interdependence 
between the memory manager and the VM.  Our solution is not earth 
shattering, but it evolved out of a very long struggle with issues like 
the one Geir is alluding to above.


In the end, our interface is not in the form of a simple API, but of 
reciprocal packages implemented on the MM and VM sides of the fence.   
So we have org.mmtk.vm, which captures all VM-specific dependencies, 
leaving the rest of MMTk's packages (the bulk of the code) strictly 
VM-neutral.


We have a generic template implementation of the org.mmtk.vm package 
which serves to define the interface, and against which a VM-neutral 
mmtk.jar is built.  Each VM then provides a VM-specific implementation 
of this package which binds into that VM's services (such as the way 
that VM identifies roots for collection, supports mmap(), defines the 
size of an address, or whatever...).


At this stage we only have one example in cvs, but there are two others 
in various states of development (jnode, which is actively being 
developed, and Rotor, which is a little out of date right now).


http://cvs.sourceforge.net/viewcvs.py/jikesrvm/MMTk/ext/vm/

(the stub directory defines the package abstractly, the JikesRVM 
directory has the Jikes RVM-specific implementation)
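
A much simplified sketch of the idea follows (the class and method names 
are hypothetical, not the real contents of org.mmtk.vm).  The VM-neutral 
side codes against an abstract class, and each VM supplies a concrete 
binding to its own services.

  // Lives on the memory-manager side; the VM-neutral mmtk.jar is built
  // against this abstract stub.
  abstract class VMInterface {
    /** Ask the host VM to enumerate its roots into the collector. */
    abstract void computeRoots(RootEnumerator sink);
    /** Width of an address on this VM/platform, in bytes (4 or 8). */
    abstract int addressWidth();
  }

  interface RootEnumerator {
    void reportRootSlot(long slotAddress);
  }

  // Lives on the VM side; binds the interface to this VM's internals.
  final class MyVMInterface extends VMInterface {
    @Override
    void computeRoots(RootEnumerator sink) {
      // walk this VM's thread stacks, statics, JNI handles, ... and
      // report each slot that may hold a reference
    }

    @Override
    int addressWidth() {
      return 8;  // e.g. a 64-bit build
    }
  }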


We want this to be symmetric, so that the VM has a similar arrangement 
whereby it can support various memory managers by having each of them 
implement some package.  We have not yet cleaned up this aspect of Jikes 
RVM, but it is on our short list of planned cleanups.  More generally, 
we want to use this model in a complete componentization of Jikes RVM.


--Steve


Re: JVM performance article

2005-06-04 Thread Steve Blackburn

Steven Gong wrote:


Is the sampling process done before running or during runtime?


The sampling is done at runtime.

(There is not much advantage in using anything other than full 
optimization for anything that is compiled ahead of time.  However, even 
ahead of time compiled methods, such as the boot image, can benefit from 
profile information gathered during previous runs).


If it's done 
during runtime, does it mean that some methods may be compiled several times 
by different leveled JIT?
 

In the case of the compile-only systems (Jikes RVM, JRockit, etc.), this 
means that when a method is first encountered at runtime, it gets 
baseline compiled (at very low cost by a very cheap compiler), and then 
may subsequently be opt-compiled.  Jikes RVM has three levels of 
optimization, O0, O1, and O2 (progressively more expensive and 
progressively more heavily optimized).  As Mike said, some cost-benefit 
analysis is done to determine whether recompilation is likely to be a win.
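
As a rough illustration of the kind of cost-benefit test involved (the 
real Jikes RVM model is more sophisticated, and the names and numbers 
below are invented): recompiling at a higher level pays off when the 
compile-time cost is outweighed by the time the method is expected to 
spend executing in the future, as projected from runtime samples.

  // Toy cost-benefit recompilation decision; illustrative figures only.
  final class RecompilationPolicy {
    // estimated speedup of each level over baseline: base, O0, O1, O2
    private static final double[] SPEEDUP      = {1.0, 2.0, 3.0, 3.5};
    // estimated compile cost per bytecode, arbitrary time units
    private static final double[] COMPILE_COST = {1.0, 10.0, 40.0, 120.0};

    static int chooseLevel(int currentLevel, double estimatedFutureTime,
                           int bytecodes) {
      int best = currentLevel;
      double bestTotal = estimatedFutureTime / SPEEDUP[currentLevel];
      for (int level = currentLevel + 1; level < SPEEDUP.length; level++) {
        double total = COMPILE_COST[level] * bytecodes        // pay to recompile
                     + estimatedFutureTime / SPEEDUP[level];  // then run faster
        if (total < bestTotal) {
          bestTotal = total;
          best = level;
        }
      }
      return best;  // recompile only if a higher level wins on balance
    }
  }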


A similar situation exists for systems which mix interpretation and 
compilation, except the first phase is interpretation...


This gradual tuning and focussing of compilation effort is aided by 
instrumentation gathered at runtime.  This allows these compilers to 
perform dynamic optimizations, some of which an ahead of time compiler 
could only perform with the aid of profiles, and others of which are 
generally not possible (microarchitecture-specific optimizations).


This is covered in much greater depth in the tutorials which are on the 
wiki.


--Steve



Re: Work items

2005-05-27 Thread Steve Blackburn

Raffaele Castagno wrote:


There's not only the need to start implementing code.

I'm (slowly) translating some of the wiki documentation to Italian, but 
someone could also create a webpage for the Incubator site, or sort 
alphabetically the People page, organize the reference documentation, or 
simply change the layout of the wiki to make it more accessible and good 
looking. These are tasks that anyone could take in charge, but that are 
somehow important anyway.


 


Absolutely!!!  This is invaluable to the project!

--Steve


Re: Work items

2005-05-27 Thread Steve Blackburn

Tom Tromey wrote:


Don't forget hacking on Classpath :-)
 


Gosh no!!  ;-)  Obviously that should very much be on the work list.

I suspect that for the work items at peer projects (classpath, mmtk, 
gcj, jikesrvm, etc.), where possible the worklist should just provide 
links to work items maintained by the associated project.  Dave Grove has 
started constructing a list for the opt compiler, so once that's 
available we can remove the jikes rvm items and replace them with a link 
to the jikes rvm list.



Steve . bytecode optimization [research]

Something interesting sort of related to this area is vmgen.  This
seems like a nice way to build interpreters.


Thanks for pointing that out!

This is actually the same work as the first item on the list I sent out 
(prototype backend generator), only I was unaware of vmgen (Anton 
Ertl).  I only knew about the paper with Gregg.  vmgen appears to be 
GPLed.  It looks interesting.


http://www.complang.tuwien.ac.at/anton/vmgen/
http://www.complang.tuwien.ac.at/anton/
http://www.complang.tuwien.ac.at/projects/backends.html

--Steve


Re: Terminology etc

2005-05-25 Thread Steve Blackburn

Hi Weldon,


One reason is that Harmony will need to plug in GCs other than MMTk.
 

Absolutely.  MMTk was designed from the outset with this in mind (at the 
time Jikes RVM already had another set of collectors).



Another reason is that in the long term the JVM's memory manager (GC)
probably ends up being merged with the OS's memory manager.

Hmmm.  This is not at all obvious to me.  I can imagine closer coupling 
of the VM and OS schedulers.  I understand why the GC may need to 
cooperate more with the OS than it currently does 
(http://www.cs.umass.edu/~emery/pubs/f034-hertz.pdf), but the interfaces 
required for that are thin and coarse grained.  I think merging the OS 
and VM memory managers is a big step and outside the immediate goals of 
Harmony.



 Most OS
kernels are written in ansi C today.  A Harmony migration path that
allows JVM/OS integration would serve this project well.
 

I think that's a long bow to draw as far as motivating the use of C!  
Harmony is intended to be OS-neutral as far as I know.  Thus I can't 
imagine it not working through some OS abstraction layer, which will 
make it irrelevant what language a particular OS kernel is written in.  
If we really want to be that forward looking, we should probably be 
writing in Java or C#, as that is where the cutting edge of OS research 
is headed anyway ;-) (http://jnode.org, 
http://research.microsoft.com/os/singularity/).



Interesting.  I remember hearing that free list allocators are useful
for embedded applications where RAM is constrained.  In the embedded
market, I remember hearing  an interpreter is preferred because the
footprint bloat of adding a JIT is unacceptable.  Also for many
embedded situations, the performance of the java code is not the top
concern.  In other words, inlining GC alloc and write barrier is not
useful to the embedded JVM marketplace.  Do you foresee or know of
uses other than embedded where a free list allocator is useful?  If
anyone in the embedded JVM market is reading the above,  can you
confirm/deny the above statements?
 

I agree with some of what you say.  However, another context is realtime 
work in performance critical contexts (this sort of work is happening in 
the leading commercial VMs right now).  Moreover, even in an embedded 
context, precompiled, optimized code (libraries for example) remains 
important.  I could go on at length about this, but in short, I think 
that having a system that is general enough to allow you to do that, 
whilst attaining a performance advantage to boot is a good idea.  I'm 
all for folks writing memory managers in C if they like.  I'm just 
pointing out that, as we both agree, performance critical elements of the 
memory manager will want to be expressed in Java (or bytecode or IR) 
regardless of what you write the rest of it in.



I am curious what inlining the entire free list allocator does to code
size.   I worry about code bloat.  Perhaps you can send me pointers to
the analysis.
 

I also worry about code bloat.  We have a porky regression test that 
checks some of this systematically.  As far as the free list goes, you 
need to realize that the compiler *statically* evaluates the sizeclass 
selection code I sent in my last email, reducing it all down to a simple 
compile-time constant which is used to index the free list.  There is no 
code bloat issue there---to the contrary, this removes the need for the 
instructions associated with a method dispatch and replaces them all with 
a simple constant.  Kathryn McKinley and I wrote a paper on this 
subject (in the context of write barriers).


http://cs.anu.edu.au/~Steve.Blackburn/pubs/abstracts.html#writebarrier-ismm-2002
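
For illustration, a toy version of the mechanism (the size-class 
boundaries and names below are invented, not MMTk's): when the allocation 
sequence is inlined at a site where the requested size is a compile-time 
constant, the whole selection chain folds away.

  // When `bytes' is a compile-time constant at the inlined allocation
  // site, getSizeClass() evaluates statically and allocFast() is left
  // with a constant index into the free-list array---no call, no branches.
  final class SizeClasses {
    static int getSizeClass(int bytes) {   // hypothetical boundaries
      if (bytes <= 16)  return 0;
      if (bytes <= 32)  return 1;
      if (bytes <= 64)  return 2;
      if (bytes <= 128) return 3;
      return 4;
    }

    static long allocFast(int bytes, long[] freeListHeads) {
      int sc = getSizeClass(bytes);   // constant after inlining
      long cell = freeListHeads[sc];  // pop the head of that size class's list
      // (slow path for an empty list omitted)
      return cell;
    }
  }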

Cheers,

--Steve


Re: Terminology etc

2005-05-25 Thread Steve Blackburn

Weldon Washburn wrote:


You advocate starting from a clean slate.
 

[Interpreting the above as a comment about the harmony project as a 
whole...]


That's not my position at all. 


http://mail-archives.apache.org/mod_mbox/incubator-harmony-dev/200505.mbox/[EMAIL
 PROTECTED]

I advocate a model where we identify what's at the table and leverage 
that as far as we can in building the Harmony VM.  We mustn't start with 
a clean slate as we have so much in front of us.  I've outlined a 
specific approach in the above email which involves seeding the project 
with existing VMs and concurrently building a new core or cores which 
will utilize existing (and new) components.


When I say identify what's at the table, I mean that very broadly.  I 
mean, everything from entire code bases, through to code for components, 
through to the availability of external components (the boehm collector 
for example, or someone else's JIT if it were pluggable), through to 
design ideas (how is it that JamVM is so compact? how does ovm do boot 
image stitching? What did Shudo learn when building his JIT?), through 
to mechanisms (how does JCVM do dynamic dispatch?).  Some of this is 
already appearing on the wiki.


My strong feeling is that as a community we have an extraordinary 
advantage of a vast amount of great work from a variety of backgrounds 
which we can freely integrate and build upon.  I feel uncomfortable 
talking too much about MMTk and Jikes RVM because I know that there is a 
phenomenal amount of interesting work out there which we have not started 
hearing about yet.  If Harmony does not leverage this extraordinary 
wealth of ideas, experience (and code), then it would be a missed 
opportunity of great proportions.


I believe that there is an enormous amount we can start working on right 
now, without feeling a need to start writing a core from scratch right 
now, be it in Java or C.  I hope that will emerge very soon.


Really, I think there's not a lot we disagree about.

Cheers,

--Steve


Terminology etc

2005-05-24 Thread Steve Blackburn

I thought it might be helpful to clarify some terminology and a few
technical issues.  Corrections/improvements/clarifications welcome ;-)

VM core

 The precise composition of the VM core is open to discussion and
 debate.  However, I think a safe, broad definition of it is that
 part of the VM which brings together the major components such as
 JITs, classloaders, scheduler, and GC.  It's the hub in the wheel
 and is responsible for the main VM bootstrap (bootstrapping the
 classloader, starting the scheduler, memory manager, compiler etc).

VM bootstrap

 The bootstrap of the VM has a number of elements to it, including
 gathering command line arguments, and starting the various
 components (above).


In the context of a Java-in-Java VM, the above is all written in Java.


VM boot image

 The boot image is an image of a VM heap constructed ahead of time
 and populated with Java objects including code objects corresponding
 to the VM core and other elements of the VM necessary for the VM
 bootstrap (all written in Java, compiled ahead of time, packaged
 into Java objects and composed into a boot image).  The boot image
 construction phase requires a working host VM (ideally the VM is
 self-hosting).

VM bootloader

  In the case of Jikes RVM, a dozen or so lines of assembler and a few
  lines of C are required to do the job of any boot loader---mmap a boot
  image and throw the instruction pointer into it.  It will also marshal
  argv and make it available to the VM core.  This is technically
  interesting, but actually pretty trivial and has little to do with the
  VM core (aside from ensuring the instruction pointer lands in a nice
  place within the boot image ;-)

OS interface

 The VM must talk to the OS (for file IO, signal handling, etc).
 There is not a whole lot to it, but a Java wrapper around OS
 functionality is required if the VM is java-in-java.  This wrapper
 is pretty trivial and one half of it will (by necessity) be written
 in C.
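
  To give a feel for what the Java half looks like, here is a minimal
  sketch (the names are hypothetical): a handful of static native
  wrappers, with the C half simply forwarding each one to the
  corresponding system call.

    // Hypothetical Java half of the OS interface.  The rest of the VM
    // calls only these wrappers, never libc directly, so porting to a new
    // OS means rewriting one small C file plus keeping this class stable.
    final class SysCall {
      static native long sysMMap(long start, long length, int prot, int flags);
      static native int  sysWrite(int fd, byte[] buf, int off, int len);
      static native void sysExit(int status);
    }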

I hope this brief sketch provides folks with a slightly clearer view
of what a java-in-java VM looks like, and some (tentative) terminology
we can use to ensure we're not talking at cross purposes.

Cheers,

--Steve



Re: Terminology etc

2005-05-24 Thread Steve Blackburn

A question (sorry): would it include realtime capabilities? 
Current
systems are sometimes real nightmares from this point of view and it is a
must have for embedded systems...
  

I certainly hope so... I don't believe it is beyond us to achieve that.
That is another area where the OVM experience may be particularly
helpful. As to the technical practicality of such a broad-ranging goal,
the j9 VM is as ambitious as this...

Cheers,

--Steve

Re: Other interesting papers and research

2005-05-24 Thread Steve Blackburn

[EMAIL PROTECTED] wrote:


They automatically build themselves
simple JIT backends (by extracting fragments produced by the ahead of
time compiler).  This sounds like a great way to achieve portability
while achieving better performance than a conventional interpreter.
   



I guess it's a bit better or just comparable with a good interpreter.
 

They say it is a lot better: "speedups of up to 1.87 over the fastest 
previous interpreter based technique, and performance comparable to 
simple native code compilers.  The effort required for retargeting our 
implementation from the 386 to the PPC architecture was less than a 
person day."



In 1998, I wrote such a JIT compiler, concatenating code fragments
generated by GCC for each JVM instruction.


Very interesting!


Unfortunately, the JIT was
slightly slower than an interpreter in Sun's Classic VM. The
interpreter was written in x86 assembly language and implements
dynamic stack caching with 2 registers and 3 states. It performs much
better than the previous interpreter written in C.

Then I rewrote the JIT.
 

It would be interesting to hear your perspective on Ertl and Gregg's 
approach.  Did they do something you had not done?  Do they have any 
particular insight?  You are in an excellent position to make a critical 
assessment of their work.



I am not very sure which is better for us, having a portable and so-so
baseline compiler or a good interpreter which is possibly less
portable than the compiler. There will be a trade off between memory
consumption, portability and so on.
 

Ideally we will have both as components and the capacity to choose 
either or both depending on the build we are targeting.


Cheers,

--Steve


Re: [arch] The Third Way : C/C++ and Java and lets go forward

2005-05-24 Thread Steve Blackburn

Archie Cobbs wrote:


That's a good idea... I've made a small start..

  http://wiki.apache.org/harmony/JVM_Implementation_Ideas


Excellent!

--Steve


vmmagic

2005-05-24 Thread Steve Blackburn

Perhaps it is worth saying a little more about vmmagic.

The basic idea is to provide the necessary extensions to Java for the
construction of a java-in-java virtual machine.  These fall into two
categories: pragmas (some for safety, some for performance) and
unboxed types.  In all cases, these are regular java expressions that
are noticed by the compiler as being special, intercepted and
compiled to reflect the defined semantics of vmmagic.  Thus, for
example, when the compiler sees an instance of Address.loadInt(), it
compiles it into a load instruction.

Aside from satisfying the need to directly access memory, these types
are used to enforce type safety, provide us with unsigned word type
and abstract over issues such as the size of an address (32/64).  The
latter is extremely valuable in Jikes RVM where we support both 32 and 
64 bit architectures.

I've written a few notes about the history of vmmagic below, but I
want to say right now that vmmagic has evolved and will continue to
evolve, particularly in light of new developments such as annotations
etc in J2SE 5.

Pragmas.

 
http://jikesrvm.sourceforge.net/api/org/vmmagic/pragma/package-summary.html


 The magic pragmas can in principle be scoped at least three ways.
 Per-class (overloading the implements keyword), per-method
 (overloading the throws keyword), and intra-method (overloading the
  try/catch keywords).  Presently we only use the first two.

 An example of a performance related pragma is InlinePragma, which is
 used to indicate that a method should always be inlined by the
 optimizing compiler.

 http://jikesrvm.sourceforge.net/api/org/vmmagic/pragma/InlinePragma.html

 An example of a correctness related pragma is UninterruptiblePragma
 which ensures that the compiler does not allow a method to be
 interrupted for thread switching (by omitting yield points).

 
http://jikesrvm.sourceforge.net/api/org/vmmagic/pragma/UninterruptiblePragma.html
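
  Roughly, pragma use looks like the following (a sketch only---the
  exact class names have varied between vmmagic releases, so treat this
  as illustrative rather than definitive):

    // Per-method pragmas overload the throws clause: the optimizing
    // compiler must inline this method and must not insert yield points.
    // (Sketch; exact pragma class names vary between releases.)
    import org.vmmagic.pragma.InlinePragma;
    import org.vmmagic.pragma.UninterruptiblePragma;

    public final class Barriers {
      public static void writeBarrier(Object src, Object tgt)
          throws InlinePragma, UninterruptiblePragma {
        // barrier fast path goes here
      }
    }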


Unboxed Types.

 
http://jikesrvm.sourceforge.net/api/org/vmmagic/unboxed/package-summary.html


 The Address type is used to provide raw memory access through its
 load<type>() and store<type>() methods.  It also provides a memory
 barrier abstraction with a prepare()/attempt() idiom.  Finally, it
 abstracts over address width (32/64).  Despite the appearance of a
 value field in the javadoc below, in fact the type is materialized
 by the compiler as a primitive 32/64 bit type, not an object, and
 its methods are reduced to the appropriate instructions (load, store
 etc), not method calls.

 http://jikesrvm.sourceforge.net/api/org/vmmagic/unboxed/Address.html
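
  In use it looks something like this (a sketch: the method names follow
  the pattern described above, but the exact names and signatures are my
  assumption and differ between releases):

    // Sketch of raw memory access via the unboxed Address type.  The
    // compiler materializes `slot' as a raw 32/64-bit value and reduces
    // each call to a load, a store, or an atomic update sequence.
    // (Method names assumed from the description above.)
    import org.vmmagic.unboxed.Address;

    public final class StatusWord {
      static int load(Address slot)           { return slot.loadInt(); }
      static void store(Address slot, int v)  { slot.store(v); }

      // Atomic update using the prepare()/attempt() idiom.
      static boolean compareAndSwap(Address slot, int oldVal, int newVal) {
        return slot.prepareInt() == oldVal && slot.attempt(oldVal, newVal);
      }
    }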

 The ObjectReference type is used to provide an abstraction over
 object identifiers (rather than using Address, which is somewhat
 like using void*).  In the case of Jikes RVM ObjectReferences are
 materialized as 32/64 bit primitive types with direct operations,
 with the toAddress() operator returning that same value (since Jikes
 RVM uses addresses as object identifiers).  However in a VM that
 used handles, an ObjectReference would map to a handle and
 toAddress() would dereference the handle.

 
http://jikesrvm.sourceforge.net/api/org/vmmagic/unboxed/ObjectReference.html



For more information I suggest you browse the java doc, and if brave,
venture into the Jikes RVM compilers and see how the magic is actually
implemented!

Brief history

  The use of magic dates back to the beginning of the Jikes RVM
  project (then known as Jalapeno).  I know of three major sources of
  refinement.  First the OVM group, and in particular Chapman Flack
  (Purdue), made significant improvements to the Jalapeno approach to
  magic. Perry Cheng (IBM Research) then wrote an initial version of
  the Address, Word and Extent classes and in doing so with Kris
  Venstermans (Ghent) and Dave Grove (IBM Research) was able to
  identify and fix all of the thousands of uses of int (for
  address) in the Jikes RVM codebase.  This was key to getting a 64
  bit port going.  Daniel Frampton (ANU) was responsible for
  packaging it as org.vmmagic, separating the unboxed and pragma
  types, and introducing new types including ObjectReference.  Most
  of the original pragma types were written by Stephen Fink and Dave
  Grove (IBM Research).


Cheers,

--Steve



Re: [arch] The Third Way : C/C++ and Java and lets go forward

2005-05-23 Thread Steve Blackburn
Lets get moving.  Comments? 



Respectfully, I think this would be a mistake.  I think it would be a 
major error to start coding a VM core until there was more clarity about 
what we are doing and what the core would require.


but rather my understanding that we'll need a small C/C++  kernel to 
host the modules, no matter how they are written, and this  is a way 
to get that going...


This is not the case Geir.

When a VM is built in Java, the only need for C/C++ is for direct 
interaction with the OS (one modest file of C code with interfaces to 
the most basic OS functionality), and for bootstrapping (another 
OS-specific file of C code plus about a dozen lines of assembler).  
That's it. The kernel of the VM can be entirely written in Java.  
Whether or not we chose to do that is another matter, but your comment 
above is technically incorrect, and therefore should not be the basis on 
which we start coding.


This misconception highlights why it is that I think we need a seeding 
process to gain some collective understanding before we start cutting 
code for a new VM core.  This requires some patience but I think will 
make the difference between us producing a) something that is free, runs 
OK, and is portable, and b) something that leverages the outstanding 
collective pool of ideas at the table (ovm, gcj, kaffe, joeq, jamvm, jc, 
orp, mudgevm, jikesrvm, etc etc) to deliver what I think could be the 
best performing, most exciting VM, free or non-free.


I am very excited about all of the technology that this project is 
bringing out.  I think JamVM looks outstanding, but I think it would be 
a serious error to take it as the core for Harmony.  It was not 
*designed* with our goals in mind.  We need to understand where the 
value in JamVM (and all other candidates) is, and then maximize our 
leverage on that in the Harmony VM, whether it be through an entire VM 
(unlikely), components (I hope so), designs (I am sure), or mechanisms 
(certainly).


I understand that it is important that we seize the enthusiasm of the 
list and start working, but respectfully, I think that cutting code for 
a VM kernel right now would be a bad mistake, one that might be 
gratifying in the short term but that is likely to lay the wrong 
foundation for what I think may become the most exciting VM project yet.


--Steve


Inventory of assets

2005-05-23 Thread Steve Blackburn

The wealth of ideas that are coming to the table is impressive.

I suggest that we establish some sort of inventory of our assets 
(presumably on the wiki).  I think that such a resource will be 
enormously helpful to a project like this one which is trying to bring 
together a huge pool of enthusiasm, prior work and outstanding 
research.  If this project could produce a VM that literally combined 
the best of all of the work at the table, it would be a formidable VM 
indeed.


By assets, I mean everything from the concrete (donated code), to the 
more abstract (resources such as papers like the one Andy posted to the 
list yesterday).  I think that if we maintain this as we go along, it 
could prove to be a really valuable resource.  To me this is our 
strength---so many different individuals bringing different expertise to 
the project, from code to ideas.  If we can catalog that wealth and 
therefore have it at our collective fingertips, I think the chances of 
this project capitalizing on those ideas and doing really exciting 
things will increase greatly.


I can imagine a tree like structure, with leaves on the tree being 
assets, each one described briefly---if code, then the level of 
availability and license, and perhaps a summary with pointers to more 
info.  If a paper, then a short summary and a pointer, etc etc etc.  For 
example, I think all of the VMs we're aware of should be leaves of a VM 
node in the tree.  The Boehm collector and MMTk are obvious leaves of 
the VM.GC node.  The paper Andy raised could be added to 
VM.interpreter.  It may in fact be a graph, with JamVM, for example, 
getting a mention under VM.interpreter as well as VM, and OVM appearing 
under both VM and VM.java-in-java, and so on.  In a relatively short period 
of time, this could become a great repository of the state of the art in 
VM design and implementation.


It may then be that the technical FAQ simply becomes a set of references 
to the inventory of ideas.


--Steve


Re: Threading

2005-05-22 Thread Steve Blackburn

The Jikes RVM experience is kind of interesting...

From the outset, one of the key goals of the project was to achieve 
much greater levels of scalability than the commercial VMs could deliver 
(BTW, the project was then known as Jalapeno).   The  design decision 
was to use a multiplexed threading model, where the VM scheduled its own 
green threads on top of posix threads, and multiple posix threads were 
supported.  One view of this was that it was pointless to have more than 
one posix thread per physical CPU (since multiple posix threads would 
only have to time slice anyway).  Under that world view, the JVM might 
be run on a 64-way SMP with 64 kernel threads onto which the user 
threads were mapped.  This resulted in a highly scalable system: one of 
the first big achievements of the project (many years ago now) was 
enormously better scalability than the commercial VMs on very large SMP 
boxes. 

I was discussing this recently and the view was put that really this 
level of scalability was probably not worth the various sacrifices 
associated with the approach (our load balancing leaves something to be 
desired, for example).  So as far as I know, most VMs these days just 
rely on posix style threads.  Of course in that case your scalability 
will largely depend on your underlying kernel threads implementation.


As a side note, I am working on a project with MITRE right now where 
we're implementing coroutine support in Jikes RVM so we can support 
massive numbers of coroutines (they're using this to run large scale 
scale simulations).  We've got the system pretty much working and can 
support  10 of these very light weight threads. This has been 
demoed at MITRE and far outscales the commercial VMs.   We achieve it 
with a simple variation of cactus stacks.  We expect that once 
completed, the MITRE work will be contributed back to Jikes RVM.


Incidentally, this is a good example of where James Gosling misses the 
point a little: MITRE got involved in Jikes RVM not because it is 
better than the Sun VM, but because it was OSS which meant they could 
fix a limitation (and redistribute the fix) that they observed in the 
commercial and non-commercial VMs alike.


--Steve


Re: [arch] VM Candidate : JikesRVM http://jikesrvm.sourceforge.net/

2005-05-22 Thread Steve Blackburn

Dan Lydick wrote:


From: [EMAIL PROTECTED]
 

2. There are no Java virtual machines period that are presently 
practical to run high volume production code.
 


I meant Java virtual machines written in Java...  sorry
   


Does anyone have any benchmarks on such designs?
As a hard-core real-time and device driver guy,
I am rather skeptical that this is anything else
but a conflict in requirements, runtime performance
in execution speed versus interpretability and/or
compilability of the runtime module.

But then again, I've only been working with Java
at all for less than four years :-)
 

The discussion has been rehashed a lot of times.  I'd like to see it put 
to rest (perhaps we need a technical FAQ?).  See the end of this email 
for a brief answer anyway...*


Previously you wrote:


- FIRST:  A basic JVM, written in C/C++, by some combination
   of contribution, new development, and extension

- SECOND:  JIT and/or other compile optimizations, still C/C++

- THIRD:  Other JVM speedups, in concert or concurrently
   with a JIT compiler, still C/C++.

- FOURTH:  Potential rewrite of key portions into Java.

 

Let's not lock ourselves into an implementation technology (C/C++ in 
your post here) without a deep understanding of the alternatives.  There 
are people at the table with that experience.


The seeding proposal is intended to address this very concern and 
acknowledge the reality that people will come to this project with 
different backgrounds and different experiences.  If we can learn from 
the wealth of technology on the table through the seeding process, I 
think we'll be able to make much more informed decisions about how we 
want the core/s of the Harmony VM to look.


Cheers,

--Steve

digression and rehash on java-in-java implementation technology

* In a nutshell, for a high performance VM, the single biggest issue is 
the quality of your compiler/s (which dictates the performance of user 
code)---this is true regardless of the implementation language.  Very 
little runtime is actually due to the execution of the VM per se 
(perhaps 10%) and that is primarily just the execution of the JIT and 
the GC.  The remaining 90% is user code (and libraries) whose 
performance rests largely on the quality of your JIT.  We've 
demonstrated the performance of MMTk and I've already said many times on 
this list why the lack of implementation/implemented language impedance 
helps us so much.  As for the JIT, we have the "eat your own dogfood" 
phenomenon: if the JIT is any good, then writing the JIT in Java won't be 
a performance problem.  We've seen this with the Jikes RVM project.  
Dave Grove has a work list as long as his arm of things we could do to 
improve in the optimizing compiler, but we lack the resources to get 
them done.  That is why Jikes RVM is currently lagging the commercial 
JITs where it once was competitive.   I think this is why the harmony 
project holds a lot of promise: it will sharpen the community's focus 
and lead to much better targeting of resources.


We don't have any published performance comparisons.  I can only tell 
you that for a long time our comparison point was the IBM product VM (on 
PPC) and for a while we were competitive.  Since we no longer have full 
time IBM staff working on the project (they've been diverted to the 
product ;-), we've started to slip.  The story on the IA32 is less 
pretty mainly because our primary focus for much of the time was on the 
PPC, so a lot of trivial (but time consuming) IA32 optimizations are yet 
to be done.


As to Andy's comment to which you were responding, there is nothing 
about the Java-in-Java VM design which is preventing it from being used 
to run high volume production code.  Jikes RVM performs very well in 
this context and that is exactly where it first made its mark 
(outscaling commercial VMs running JBB on large SMP boxes).  The two 
primary issues holding it back today are a) keeping the compiler up to 
speed with the commercial VMs, and b) completeness of the VM.  To the 
extent that either are limitations, they are due to resource constraints 
and have nothing whatsoever to do with the Java-in-Java implementation 
technology.  To the contrary: in my view there is no way we'd have 
achieved as much as we have with the resources we've had if we did not 
have the considerable software engineering advantages of a strongly 
typed language.


Having said all that, the looming performance issue is locality, which 
depends on many more subtle issues and places a lot more pressure on the 
memory management subsystem (the locality properties of the allocator, 
the capacity of a copying collector to improve locality, etc, etc).


Fortunately we're not writing device drivers here, but if that's what 
you're into, you might find this (http://jnode.sourceforge.net/portal/) 
interesting, and this (http://research.microsoft.com/os/singularity/).


Re: Other interesting papers and research

2005-05-22 Thread Steve Blackburn

Archie Cobbs wrote:


[EMAIL PROTECTED] wrote:


The approach of using C Compiler generated code rather than writing a
full compiler appeals to me:
http://www.csc.uvic.ca/~csc586a/papers/ertlgregg04.pdf

I am curious on how well the approach performs compared to existing 
JITs.


I'm admittedly biased, but the approach of using the C compiler has
some good benefits, mainly in portability.


As far as I can tell, the technical insight in this paper has nothing to 
do with C per se. It has to do with having a portable ahead of time 
compiler (be it C or Java).  The idea of leveraging a portable ahead of 
time compiler is something that all interpreters do.  The insight here 
is to do it far more aggressively.  They automatically build themselves 
simple JIT backends (by extracting fragments produced by the ahead of 
time compiler).  This sounds like a great way to achieve portability 
while achieving better performance than a conventional interpreter.


So long as we have a portable java WAT compiler at our disposal (gcj), I 
think we can apply this neat idea independent of whether we're using C, 
C++ or Java (or fortran for that matter).


--Steve


Re: Harmony and JamVM (Re: JIRA and SVN)

2005-05-21 Thread Steve Blackburn
JamVM sounds very interesting.  A fast lightweight interpreter has at 
least two attractions:


a) portability (this is true regardless of the implementation language, 
the point is that an interpreter does not require a new compiler backend 
to be written for each architecture)

b) compactness

If the lightweight interpreter were a pluggable component, or if we 
could use JamVM as a prototype for a pluggable lightweight interpreter, 
I think that would be a real asset.


Rob, what do you think?  Could the JamVM interpreter be extracted as a 
pluggable module?  Could the JamVM interpreter be used as a prototype for 
a new componentized implementation (will your insights and lessons 
translate well into a new implementation, or are they mostly 
JamVM-specific?)


Is the GC exact or conservative?

In other words, how can Harmony make the most of JamVM's most valuable 
ideas?


Cheers,

--Steve


Re: [arch] VM Candidate : JikesRVM http://jikesrvm.sourceforge.net/

2005-05-20 Thread Steve Blackburn
Weldon Washburn wrote:
On 5/19/05, Stefano Mazzocchi [EMAIL PROTECTED] wrote:
 

This is why I would like Harmony to have two VMs, one written in java
and one written in C-or-friends: this would give us
   

Well, I suspect if we design the interfaces correctly, we could do the
above with one JVM instead of two.   Two competing Harmony
implementations means ultimately one of them must die.
I envisage that harmony is *seeded* with two VMs.  Under the seeding 
model both seeds are destined to die (that is, their *cores* die) once 
new core/s evolve.  I view this as a good thing.

 Harmony really
needs one quickly evolvable code base.  The concept is to write the
JVM in 100% C/C++ today.   Rationale:  C/C++ is battle tested for
building OS and compiler systems.
As for a language battle tested for building compiler systems, I think 
Java has well and truly earned that badge (javac? antlr? soot? 
jikesrvm?...).  I don't know about an OS, but we're not building an OS. 
We have pushed Java hard in Jikes RVM, and as a long time C/C++ systems 
programmer who 6 or so years ago started working on a Java-in-Java VM, I 
for one can say I never want to go back! :-)  As an apache project, we 
should be looking forward, not backward.  Having said all of that, I am 
all for a development model that allows us to build the core and 
components in either language.

 Set a goal of rewriting the JIT in
Java/C# by 2007.  If IT shops are happy deploying Harmony with the JIT
written in Java, then set a goal of rewriting 90% of the VM in Java/C#
by 2009.
 

IT shops will judge Harmony by its performance, robustness and 
completeness.  I happen to believe we are going to meet those goals 
better if we use Java.  Again, I am all for a model that allows us to 
use JITs written in C, C++ or Java.

Yes!  Although it will be more challenging to create interfaces that
will work for both Java and C/C++, I suspect the end result will be
worthwhile.
 

As far as GC goes (arguably the most complex interface), we have done 
this with MMTk and Rotor.

Modularization allows specialization.  Specialization fosters faster
evolution.  Harmony is an opportunity to build an infrastructure that
can outrun the existing monolithic JVM code bases.  You don't need to
know the entire system to work on a given module
I could not agree more.
.  A short list of JVM
modules: JIT, GC, threading/sync, class loader, verifier, OS
portablility layer.  Different JITs and GCs might actually decide to
sub-modularize if they like.  For example JIT X might have a single
high-level IR module and separate low-level modules for each
instruction set architecture supported.
 

I imagine that a compiler policy will become a (thin) module which in 
turn interfaces to sub-modules which are particular compiler instances.  
For example an adaptive recompilation policy could interface to the core 
vm on one side, and multiple compiler choices on the other.
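
Sketching that (with hypothetical interfaces---none of this is an 
existing Harmony API), the policy is a thin layer between the core VM and 
however many compiler instances happen to be plugged in:

  // Hypothetical sketch of a thin compiler-policy module.
  interface Compiler {
    CompiledCode compile(MethodDescriptor method, int optLevel);
  }

  interface RecompilationPolicy {
    // The core VM feeds profile samples in; the policy may recompile.
    void methodSampled(MethodDescriptor method, double estimatedFutureTime);
  }

  final class AdaptivePolicy implements RecompilationPolicy {
    private static final double RECOMPILE_THRESHOLD = 1000.0;  // assumed units
    private final Compiler baseline;    // used for first-time compilation (not shown)
    private final Compiler optimizing;

    AdaptivePolicy(Compiler baseline, Compiler optimizing) {
      this.baseline = baseline;
      this.optimizing = optimizing;
    }

    @Override
    public void methodSampled(MethodDescriptor m, double estimatedFutureTime) {
      if (estimatedFutureTime > RECOMPILE_THRESHOLD) {
        optimizing.compile(m, 2);   // hot: hand it to the opt compiler
      }
    }
  }

  // Placeholders so the sketch stands alone.
  interface MethodDescriptor {}
  interface CompiledCode {}

Swapping in a different policy, or a different compiler behind the same 
interface, then never touches the core VM.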

Last Friday, I made the following proposal:
http://mail-archives.apache.org/mod_mbox/incubator-harmony-dev/200505.mbox/[EMAIL
 PROTECTED]
In the context of the current discussion I'd like to re-advocate that 
proposal.  It is consistent with what Stefano has suggested.

To summarize:
1. Leverage existing momentum by seeding the project with two existing VMs
2. Leverage existing work by focusing on modularity of major reusable 
components (including components from outside of the seed VMs).
3. Concurrently design new VM cores.

Modularizing the seed VMs will provide the group with a great deal of 
insight into how new VM cores should be built.  I say cores for three 
reasons: a) the cores will (by defn) be small, so with a modular 
framework, having multiple cores should be feasible, b) different cores 
can target different needs, c) we can explore different implementation 
strategies.

--Steve



Re: [arch] VM Candidate : JikesRVM http://jikesrvm.sourceforge.net/

2005-05-20 Thread Steve Blackburn
[EMAIL PROTECTED] wrote:
Part of a runtime written in Java has to be interpreted, or compiled
before executed. Throughput is sacrificed when interpreted and
interactivity is sacrificed when compiled.
 

The runtime itself can't realistically be interpreted because it would 
just be too slow.  So it is normally compiled.  However, the compilation 
of the core VM (as for the VM written in C) occurs ahead of time (using 
its own JIT and persisting the JITed image, which is the binary users 
execute).  Running an application does not require compilation of the 
VM.  So using Java need not be a cause for reduced interactivity.  Of 
course a JITed VM (written in Java or C) will always have the challenge 
of avoiding reduced interactivity due to the jitting of user code.

--Steve


Re: Developing Harmony

2005-05-20 Thread Steve Blackburn
Tom Tromey wrote:
One thing I don't know, that I would like to know, is what parts of
the java class libraries are used by the JikesRVM core.  How big is
the minimal installed image?  What runtime facilities of Java are
assumed by the core?  E.g., does the JIT itself rely crucially on
exception handling?  Garbage collection?
The reason I ask is that I'm curious about how minimal a system can be
made with JikesRVM.
 

I think this is a very good question.
The very short answer is that, as it stands, Jikes RVM is not 
lightweight.   The interesting question is how much of this is intrinsic 
and how much of this is an artifact of its explicit focus from inception 
on server applications, high performance and scalability.

I think an in-depth answer will take a bit of research.  Some quick 
observations that may help...

. MMTk has pretty minimal dependencies (it does not depend on GC, 
exception handling, or any external feature or library that could 
conceivably call new(), for example!)
. The opt compiler is a large non-trivial piece of Java code written in 
a more normal style.  It probably has a similar level of dependencies to 
other large complex optimizing compilers written in Java.   It most 
certainly depends on GC (it generates lots of small objects), and it 
certainly depends on exception handling (if it fails for whatever 
reason, it falls back to the baseline compiler).
. The opt compiler would not be appropriate for a lightweight VM in your 
cell phone or whatever.
. Jikes RVM can run without the opt compiler.

Back to the development model...
I think the above discussion illustrates very nicely why we want 
componentization.

It also highlights why I would like the seeds to motivate the development 
of new cores, written from scratch to work with existing high value 
components.  I would love to see a new core for the Jikes RVM 
components.  I would like to see new cores learn some of the lessons of 
OVM (second generation java-in-java).

I can also see that there may be an important role for gcj here
The problem with a compiler is that new backends must be written for 
every target architecture.  This is not a big deal if you're targeting 
a modest set, such as IA32, PPC, SPARC etc, but if you really want to be 
as portable to just about anything, then you will probably need to 
forego the JIT on such platforms.  This is fine too, because you 
probably don't really want a JIT on your cell phone.  So I see a role 
for a core with an interpreter, if only to support the goal of 
maximizing portability.  (As a side note, I am opposed to mixed 
interpretation/compilation within a given VM core.  Far simpler to have 
a quick and dirty compiler used alongside your opt compiler).

If you want to write an interpreter, you will want to compile it (!), 
and if your compiler is highly portable, then voila, you have a highly 
portable execution engine (a lesson kaffe, sable and others have shown 
very nicely by leveraging gcc).

Anyway, back to gcj... 

It seems to me that gcj can have a key role here.  If our interpreted 
core is written in Java and compiled with gcj then we gain that high 
degree of portability.  There are lots of other good things to say about 
gcj, but this is just one thought about how we can build a nice clean 
small-footprint core in Java and retain our portability goals.

Cheers,
--Steve


Re: [arch] VM Candidate : JikesRVM http://jikesrvm.sourceforge.net/

2005-05-20 Thread Steve Blackburn

Mark Brooks wrote:

Investigation is fine enough, but let's face facts.  This is a HUGE 
project.


We do NOT have the time or manpower to write two VMs as part of a 
project where we need a working J2SE 5 implementation, from basement 
to TV aerial so to speak, and we need it in time to be relevant.  Even 
with donated code, it won't work to do VMs in parallel.


I think it would be insane to develop multiple VMs in parallel. 


Fortunately, I don't think anyone is proposing that.

The seeding model I have pushed would only have multiple VMs on the 
table during the seeding process.  That is something I envisage would 
take three to nine months.  During that time the community will have the 
opportunity to engage with the projects and gain much needed experience 
and insight into how a real VM is built.  Concurrently new cores would 
be architected and built. The VM cores are small but they are the 
centerpiece and getting them right is critical.  Therefore, as far as 
the cores go, I am much less concerned about person power than I am 
about informed design decisions.  The bulk of the work is going to be in 
achieving completeness, improving packaging, fixing known shortcomings, 
smoothing corners etc etc.


With a strong component based model, different configurations could be 
built for different purposes.  This is not a radical idea nor is it 
rocket science---we already do it in Jikes RVM (you choose your compiler 
and your GC ahead of time and tailor the VM accordingly).  I am sure 
other VMs do this as well.  If we're going to try to cover the desktop, 
servers and embedded devices, then reuse is going to be key.


--Steve


Re: Against using Java to implement Java (Was: Java)

2005-05-18 Thread Steve Blackburn
This subject has been covered in detail at least twice already.
There is no need for any function call on the fast path of the 
allocation sequence.  In a Java in Java VM the allocation sequence is 
inlined into the user code.  This has considerable advantages over a 
few lines of assembler.  Aside from the obvious advantage of not having 
to express your allocator in assembler, using Java also compiles to 
better code since the code can be optimized in context (register 
allocation, constant folding etc), producing much better code than hand 
crafted assembler.
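
For concreteness, here is roughly what that inlined fast path looks like 
when written in Java (hypothetical names; this is a sketch of the 
bump-pointer sequence Weldon describes below, not any VM's actual 
allocator):

  // Bump-pointer allocation fast path.  Inlined into user code by the
  // JIT, with `size' usually a compile-time constant, this reduces to
  // an add, a compare, and a rarely taken branch to the slow path.
  final class BumpPointerAllocator {
    private long cursor;  // next free byte in the current region
    private long limit;   // end of the current region

    long alloc(int size) {
      long result = cursor;
      long newCursor = result + size;
      if (newCursor > limit) {
        return allocSlow(size);     // out of line: new region / trigger GC
      }
      cursor = newCursor;
      return result;                // address of the freshly allocated cell
    }

    private long allocSlow(int size) {
      // acquire a fresh region from the memory manager, possibly trigger
      // a collection, then retry; omitted in this sketch
      throw new UnsupportedOperationException("slow path omitted");
    }
  }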

However this is small fry compared to the importance of compiling write 
barriers correctly (barriers are used by most high performance GCs and 
are executed far more frequently than allocations).  The same argument 
applies though.  The barrier expressed in Java is inlined in situ, and 
the optimizing compiler is able to optimize in context.

Modularization does not imply any of this.
--Steve
Weldon Washburn wrote:
On 5/18/05, David Griffiths [EMAIL PROTECTED] wrote:
 

I think it's too slow to have the overhead of a function call for
every object allocation. This is the cost of modularization. I doubt
any of the mainstream JVMs you are competing with do this.
   

Yes.  I agree.  A clean interface would have a function call for every
object allocation.  However if the allocation code itself is only a
few lines of assembly, it can be inlined by the JIT.  Using a moving
GC, it is possible to get the allocation sequence down to a bump the
pointer and a compare to see if you have run off the end of the
allocation area.
 



Re: Stop this framework/modularity madness!

2005-05-17 Thread Steve Blackburn
Hi Rodrigo,
I believe the focus should be on deciding if Harmony will start from
another JVM or not.
I agree entirely that this is an important issue, and a lot of people 
are working hard right now to see if this can happen.  Donating an 
entire JVM to apache is not a trivial issue, so we will need to be patient.

However, it is not the either/or situation you paint above.  I think it 
may make most sense to work on a preexisting donated VM or VMs while 
*concurrently* developing a new VM core or cores from scratch.  This 
approach has a number of advantages, including maximizing our leverage 
from existing work, minimizing startup time and accelerating the process 
of building an existing VM.  It also allows us to more fruitfully 
explore some of the implementation choices (which language to use, ...).

[previously you wrote...]
Making Harmony modular enough to be kind of a JVM framework cannot be
done before having a working JVM. There is a lot of literature about
how frameworks should emerge from continuous design and development.
 

The donated VM or VMs (if they arise) may already be at this point.  As 
I have already said, this is a process already underway in the Jikes RVM 
project.  I can't speak for the other projects, but they may also be at 
such a stage or, better, have moved beyond that stage.

This must not be the focus until required, so no JIT plugable layer
until someone tries to write another JIT for Harmony (emphasis on
another).
 

I would like to leverage preexisting VMs to push the process of 
modularization.

Creating such is a big challenge, to guess what spots need to be flexible
and the others that don't. Guess what, people often make bad guesses
about these and in the end we have a very complex design with a lot of
shortcomings.
 

Right.  This is why it is essential that harmony learn from kaffe, gcj, 
jikesrvm, sable, ovm, joeq, etc.  The project does not exist in a vacuum.

--Steve


Re: Chicago

2005-05-15 Thread Steve Blackburn
Geir Magnusson Jr. wrote:
On May 10, 2005, at 11:11 PM, Steve Blackburn wrote:
I wonder how many folks will be in Chicago next month for
PLDI/VEE/MSP/LCTES?  I for one would really enjoy spending some time
discussing this project.  It seems a perfect opportunity given the
combination of conferences.
Can you give us some more info?
http://research.ihost.com/pldi2005/
PLDI: the major annual conference on Programming Language Design and 
Implementation
VEE: Virtual Execution Environments (i.e. VM design)
MSP: Memory Systems Performance (a lot of VM related work)
LCTES: Languages, Compilers, Tools for Embedded Systems

This is the cream of international research in the area of P/L and VM 
design.

--Steve


Research

2005-05-14 Thread Steve Blackburn
A lot of the existing work on open VMs has come out of the academic
and industrial research communities (Jikes RVM, SableVM, ORP, OVM,
Joeq, LLVM...).
I think it may be worthwhile considering both what the research
community's needs are and what the research community can bring to the
harmony project.
. The research community needs access to a quality VM with a liberal
  license.
- Performance is key to credibility.  A research result that shows
  a 10% speedup over something we know to perform awfully is of
  little interest to anyone.  This is a reason why Jikes RVM has
  been attractive to researchers.  Consider the number of
  institutions, names and hours represented in this list:
  http://jikesrvm.sourceforge.net/info/papers.shtml. When I was
  at U. Mass we stopped what had been a huge effort to build our
  own JVM and instead put our resources into Jikes RVM because it
  had an outstanding optimizing compiler and (at the time) was
  very competitive with the best commercial VMs.
- Software quality is very important.  The software must be
  flexible and modular to facilitate rapid realization of new
  ideas.
- Researchers don't want to be locked in by a license (some
  commercial licenses, for example).  They also want a
  transparent development model where they can see what's going
  on and what is likely to happen.  Above all, many researchers
  get a real buzz out of seeing their neat ideas being widely
  used, so they want a model that allows them to recontribute
  their work.
. The research community can bring cutting edge technology to a VM.
- The amount and variety of Java technology emerging from the
  academic research community is amazing.  The biggest challenge
  is channeling those ideas back into VM infrastructures.  This
  is partly a cultural issue and partly a software engineering
  issue.  If the software is not readily amenable to
  contributions of radical technology, such contributions are
  unlikely to happen.  If academics are not part of or don't
  understand the open community they won't be part of the culture
  of contribution.
So, the academic community represents a huge intellectual resource.
For a project with big ambitions and yet so dependent on cutting edge
technology, harnessing some of that intellectual resource is going to be
important.
--Steve


Re: Sending native code to the processor at runtime

2005-05-14 Thread Steve Blackburn

Why can a JIT not achieve 110% of native performance? (Assuming that we
strip out the compile time and compare like with like.)
The reason I say 110% is that binary code is usually compiled for the
lowest common denominator. So x86 code targets a 386, and Sparc binaries
target UltraSparc v8 or older processors. 

The advantage a JIT has is that it knows exactly where it is being
compiled, so in theory can make use of as much hardware assistance as it
can.
 

Just taking up this point, the answer is yes.   This is a very hot 
subject in the research community right now.  Dynamic optimization can 
improve micro-architectural performance in ways that are not possible 
with static compilation.  Traditional static optimization is limited by 
factors including dynamically linked libraries, lack of information 
about the runtime microarchitecture, and unpredictable or phasic 
application behavior.

This approach is not limited to JITs for languages like Java; it is also 
being applied to binaries (i.e., parsing and recompiling machine code).  
There is a whole slew of work out there on this subject.  As we push the 
architectures harder, dynamic optimization becomes more and more important.
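
As a toy illustration of the kind of decision a dynamic optimizer can make
and a statically compiled "lowest common denominator" binary cannot, here
is a hypothetical sketch of a compiler choosing a code template based on
the processor it finds itself running on.  All names here are invented for
the sketch; it is not taken from any particular system.

// Sketch only: a dynamic compiler picking an instruction-sequence template
// using facts about the host CPU discovered at runtime.
public final class CopyTemplateSelector {

    // Stand-in for whatever CPU-feature probe the VM performs at boot time.
    interface CpuFeatures {
        boolean hasWideVectorUnits();
        boolean supportsSoftwarePrefetch();
    }

    enum Template { SCALAR_LOOP, VECTOR_BLOCK_COPY, PREFETCHED_COPY }

    static Template choose(CpuFeatures cpu, int expectedBytes) {
        if (cpu.hasWideVectorUnits() && expectedBytes >= 128) {
            return Template.VECTOR_BLOCK_COPY;   // wide moves on capable hardware
        }
        if (cpu.supportsSoftwarePrefetch()) {
            return Template.PREFETCHED_COPY;     // hide memory latency
        }
        return Template.SCALAR_LOOP;             // safe fallback
    }
}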

Cheers,
--Steve
Here's a semi-random set of pointers to some recent work.  There is a 
stack more work out there; these are just a few that happen to be at my 
fingertips.  Most of it is available online; just google for it.

@inproceedings{arnold05collecting,
  author = {Matthew Arnold and David Grove},
  title = {Collecting and Exploiting High-Accuracy Call Graph Profiles in Virtual Machines},
  booktitle = {International Symposium on Code Generation and Optimization},
  pages = {51--62},
  year = {2005}
}
@MISC{ klaiber00crusoe,
 AUTHOR = {Alexander Klaiber},
 TITLE = {The Technology Behind Crusoe Processors},
 HOWPUBLISHED = {White Paper of Transmeta Corporation},
 YEAR = 2000,
 MONTH = {January},
 URL = {http://www.transmeta.com/pdfs/paper_aklaiber_19jan00.pdf},
 CATEGORY = {techdoc,dvs}
}

@inproceedings{bruening03,
author = {Derek Bruening and Timothy Garnett and Saman Amarasinghe},
title = {An infrastructure for adaptive dynamic optimization},
booktitle = {CGO '03: Proceedings of the international symposium on 
Code generation and optimization},
year = {2003},
isbn = {0-7695-1913-X},
pages = {265--275},
location = {San Francisco, California},
publisher = {IEEE Computer Society},
address = {Washington, DC, USA},
}
@article{kistler03,
author = {Thomas Kistler and Michael Franz},
title = {Continuous program optimization: A case study},
journal = {ACM Trans. Program. Lang. Syst.},
volume = {25},
number = {4},
year = {2003},
issn = {0164-0925},
pages = {500--548},
doi = {http://doi.acm.org/10.1145/778559.778562},
publisher = {ACM Press},
address = {New York, NY, USA},
}
@inproceedings{lu03,
author = {Jiwei Lu and Howard Chen and Rao Fu and Wei-Chung Hsu and 
Bobbie Othmer and Pen-Chung Yew and Dong-Yuan Chen},
title = {The Performance of Runtime Data Cache Prefetching in a Dynamic 
Optimization System},
booktitle = {MICRO 36: Proceedings of the 36th Annual IEEE/ACM 
International Symposium on Microarchitecture},
year = {2003},
isbn = {0-7695-2043-X},
pages = {180},
publisher = {IEEE Computer Society},
address = {Washington, DC, USA},
}
@inproceedings{merten00,
author = {Matthew C. Merten and Andrew R. Trick and Erik M. Nystrom and 
Ronald D. Barnes and Wen-mei W. Hwu},
title = {A hardware mechanism for dynamic extraction and relayout of 
program hot spots},
booktitle = {ISCA '00: Proceedings of the 27th annual international 
symposium on Computer architecture},
year = {2000},
isbn = {1-58113-232-8},
pages = {59--70},
location = {Vancouver, British Columbia, Canada},
doi = {http://doi.acm.org/10.1145/339647.339655},
publisher = {ACM Press},
address = {New York, NY, USA},
}
@inproceedings{rosner04,
author = {Roni Rosner and Yoav Almog and Micha Moffie and Naftali 
Schwartz and Avi Mendelson},
title = {Power Awareness through Selective Dynamically Optimized Traces},
booktitle = {ISCA '04: Proceedings of the 31st annual international 
symposium on Computer architecture},
year = {2004},
isbn = {0-7695-2143-6},
pages = {162},
location = {München, Germany},
publisher = {IEEE Computer Society},
address = {Washington, DC, USA},
}
@inproceedings{arnold00adaptive,
  author = {Matthew Arnold and Stephen J. Fink and David Grove and Michael Hind and Peter F. Sweeney},
  title = {Adaptive Optimization in the Jalape{\~n}o JVM},
  booktitle = {Conference on Object-Oriented Programming, Systems, Languages, and Applications},
  pages = {47--65},
  year = {2000},
  url = {citeseer.nj.nec.com/arnold00adaptive.html}
}



Re: I hope the JVM implements most using Java itself

2005-05-13 Thread Steve Blackburn
Hi Dmitry,
constructive_interest 
[...]
First one is of the chicken-vs-egg variety -- as the GC algorithm 
written in Java executes, won't it generate garbage of its own, and 
won't it then need to be stopped and the tiny little real garbage 
collector run to clean up after it? I can only see two alternatives -- 
either it is going to clean up its own garbage, which would be truly 
fantastical... Or it will somehow not generate any garbage, which I 
think is not realistic for a Java program...
This is a very important issue.
The short answer is as follows:
   a) Within the GC code itself we don't really use Java, we use
  a special subset of Java and a few extensions.
   b) We never call new() within the GC at runtime
   c) We try not to collect ourselves ;-)
You will find the long answer buried in the source code and a somewhat 
out of date paper:

http://cvs.sourceforge.net/viewcvs.py/jikesrvm/MMTk/
http://jikesrvm.sourceforge.net/api/org/vmmagic/unboxed/package-summary.html
http://jikesrvm.sourceforge.net/api/org/vmmagic/pragma/package-summary.html
http://cs.anu.edu.au/~Steve.Blackburn/pubs/abstracts.html#mmtk-icse-2004
I'll try to give a more succinct answer here:
As for a), we essentially apply a few design patterns and idioms for 
correctness and performance (more on performance later).  We don't use 
patterns that depend on allocating instances.  In fact the only 
instances we create are per-thread metadata instances which drive the 
GC.  These are allocated only when new threads are instantiated 
(actually these are per POSIX thread; Jikes RVM uses an N:M threading 
model).

As for b), there is not much call for dynamic memory management within a 
GC.  The exceptions are a) short-lived metadata such as work queues, and 
b) per-object metadata such as that associated with free lists and mark 
bits etc etc.  We solve this by explicitly managing these special cases 
from within our own framework.  We have a queue mechanism that works off 
raw memory and a mechanism for associating metadata with allocated 
space.  The details are beyond the scope of this email.
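
To give a feel for what "explicitly managing these special cases" means,
here is a deliberately simplified, hypothetical sketch of a GC work queue
whose storage is set aside before a collection starts, so pushing and
popping during GC never calls new().  In the real mechanism the backing
store is raw memory obtained outside the Java heap and the entries are
typed addresses; the long[] and the names below are inventions of this
sketch.

// Sketch only: a GC-internal work queue that does not allocate during GC.
public final class GcWorkQueue {
    private final long[] slots;  // stand-in for a pre-reserved raw memory block
    private int head;            // next slot to pop
    private int tail;            // next slot to push

    public GcWorkQueue(int capacity) {
        slots = new long[capacity];  // reserved up front, never during a collection
    }

    public boolean push(long objectAddress) {
        if (tail == slots.length) return false;  // full: caller drains or chains a new block
        slots[tail++] = objectAddress;
        return true;
    }

    public long pop() {
        return (head == tail) ? 0L : slots[head++];  // 0 signals "empty"
    }

    public boolean isEmpty() { return head == tail; }
}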

Actually c) is one of the hardest parts.  It is essential that heap 
objects associated with the VM and the GC are not inadvertently 
collected.  This requires some very careful thought (remembering that 
the compiler will place our *code* into the heap too!).

As to whether this is feasible, it's been done at least three times 
over: first in the original Jalapeño, then in GCTk (developed while I 
was at UMass), and now in MMTk.  Right now I am working with my students 
here to make the MMTk design even cleaner while not sacrificing 
performance---fun!

So, can it perform?  Well, it is very hard to do apples-to-apples 
comparisons, but we measure the performance of our raw mechanisms 
against C implementations as a reference point, and we do very well (by this I mean we 
can beat glibc's malloc for allocation performance, but this claim needs 
to be covered with caveats because it is very hard to make fair 
comparisons).  So the raw mechanisms perform well.  But then the 
software engineering benefits of Java come to the fore and our capacity 
to implement a toolkit and thus have a choice of many different GC 
algorithms gives us a real advantage (the GC mechanism/algorithm thing 
was the subject of a previous thread).

I've glossed over a huge amount of important stuff (like how we get raw 
memory from the OS, how we introduce type safe pointers and object 
references, etc etc).
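
To give just a flavour of the type-safe pointer trick: the idea is that
source code manipulates what looks like an ordinary final class, while the
VM's own compilers intercept the type and compile its values down to raw
machine words, so no objects are ever created.  The real types live in
org.vmmagic.unboxed (Address, ObjectReference, ...); the class and method
names in the sketch below are invented for illustration.

// Sketch only: a "magic" unboxed address type.  To javac this is a normal
// class; the VM's compilers special-case it so values of this type live in
// registers as raw words and these calls become single instructions.
public final class SketchAddress {
    private final long raw;                          // the word, in the sketch only

    private SketchAddress(long raw) { this.raw = raw; }

    public static SketchAddress fromLong(long bits)  { return new SketchAddress(bits); }
    public SketchAddress plus(int bytes)             { return new SketchAddress(raw + bytes); }
    public boolean isZero()                          { return raw == 0L; }
    public long toLong()                             { return raw; }
}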

To summarize (and to get to the question already) - the point is that 
language shapes thought. In other words, a program designed in Java 
will naturally tend to be slower than a program designed in C, simply 
because Java most naturally expresses slower designs than C does. And 
the question is - does this agree with anyone else's experiences? And 
if it does, is it a valid argument against using Java for the design 
and implementation of the VM?
OK so there is already at least one response to this, but let me add my 
experience.

I am very focused on performance.  The approach Perry Cheng and I took 
when writing the code for MMTk was very much that premature optimization 
is indeed the root of all evil.  Moreover, we placed enormous faith in 
the optimizing compiler.  The philosophy was to assume the optimizing 
compiler was smart enough to optimize around our coding abstractions, 
and then to do careful performance analysis after the fact and see where 
we were being let down.  In some cases the compiler was improved to deal 
with our approach; at other times we modified our approach.

Over time we learned certain idioms which on one hand meant we tended to 
get reasonable performance first shot, but on the other may have 
undermined the natural Java style we started with.

While I understand what you mean when you say a program designed in 
Java will naturally tend to be slower than a program designed in C, 
addressing that concern is one of the most important challenges of 
implementing the VM in Java.

Re: I hope the JVM implements most using Java itself

2005-05-12 Thread Steve Blackburn
Davanum Srinivas wrote:
Steve,
is there some writeup on your approach to making JikesRVM modular
and composable?
thanks,
dims
 

This is all very much work in progress, but since our thinking on this
task relates so closely to the Harmony goals, I thought it was worth
writing something down.  Two important caveats:
1. I am not making any presumption that this is necessarily the
 direction Jikes RVM wants to go.
2. Please note that this is *not* an official Jikes RVM
 roadmap---most of this has not been openly discussed yet, and
 won't happen until it is discussed first.
Goals:
o normalizing the Jikes RVM code base
. improved maintainability
. development with Eclipse
- reduce impedance to contributions to the project
Context: Jikes RVM has a number of idiosyncrasies in its code base
 (dependence on a pre-processor, non-standard directory
 structure, weak use of packages...)
o rationalizing the build process
. improved maintainability
. use of standard tools
. support componentization
Context: Jikes RVM has a baroque build process to support
 configurable builds (architectures, compilers, GC
 algorithms can all be selected).  The process depends on
configuration files and hefty shell scripts, rather than
 orthodox build processes.
o improved componentization
. improved maintainability
. development with Eclipse
. pluggable GCs, compilers
- strength through diversity of components
- encourage non-trivial contributions (a new compiler)
Context: Jikes RVM has a number of key components, not all of
 which are well factored.
Approach:
o identify and isolate components
a) mutually exclusive (object model, arch (IA32/PPC))
b) runtime selectable (compiler)
. structure packages so that these components map closely to the
  Eclipse notion of project (self-contained modules of code)
o compile components to standalone jars
. use stubs to isolate against specifics
  - For example with MMTk (Jikes RVM's memory manager), the
org.mmtk.vm subpackage implements a bunch of vm-specific stubs
which are fleshed out by any particular client VM (Jikes RVM
or jnode, for example).  The mmtk.jar builds against the
stubs, but does not include them.  When the VM is composed,
the appropriate org.mmtk.vm implementation is put in place (a rough
sketch of this stub pattern follows this outline).
o building/composing a system
. mostly consists of fairly simple scripts that
  - define configuration-specific constants (eg BYTES_IN_WORD)
  - stitch together some selection of the above jars
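
Here is the promised sketch of the stub pattern.  The package layout is as
described above (the real interface lives under org.mmtk.vm), but the class
and method names are invented for this illustration and are not the actual
MMTk interface.

// Sketch only.  File 1 (part of the memory manager; mmtk.jar builds against it):
public abstract class VMSupport {
    /** Ask the host VM for a block of raw memory of the given size, in bytes. */
    public abstract long acquireRawMemory(int bytes);

    /** Stop all mutator threads so a collection can proceed. */
    public abstract void stopAllMutators();
}

// Sketch only.  File 2 (supplied by the client VM -- Jikes RVM, jnode, ... --
// and dropped in when the jars are stitched together at composition time):
public final class HostVMSupport extends VMSupport {
    public long acquireRawMemory(int bytes) {
        // host-specific call down to the OS / VM memory subsystem
        return 0L;
    }

    public void stopAllMutators() {
        // host-specific thread handshake
    }
}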
Status:
We are just about at the point of completely achieving our goals with
regards to componentizing MMTk.  The final patches are being tested
right now.  Applying the above approach to the rest of the VM should
not present many technical difficulties since the memory manager has
the most complex interface, but will be a *lot* of work given our
current resources.  Of course an injection of new energy could
completely change this.


Re: Thoughts on VM

2005-05-11 Thread Steve Blackburn

 Current VMs use JIT or dynamic profiling/performance optimisation 
techniques to improve the code running on them, but this only applies 
to application code.  Would it be possible to write the VM in such a 
way that the VM itself is able to be optimised at runtime?
Yes.  This is what already happens in Jikes RVM.  The most 
performance-critical examples are the barriers and allocation sequences, 
which the optimizing compiler gets to inline and then optimize in context.  
As I said in an earlier post, these dynamic optimizations can lead to the 
Java implementation of the allocator outperforming highly tuned C 
allocators (important since Java programs tend to be allocation 
intensive).  This is an argument for implementing the VM in Java.
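
As a concrete illustration of the barrier case, here is a hypothetical
sketch (invented names, not the code of any particular VM) of a
generational write barrier written as ordinary Java.  Because it is plain
Java, the optimizing compiler can inline it at every reference store and
then optimize it together with the surrounding code, often folding the
tests away in context.

// Sketch only: a generational write barrier expressed in Java.
public final class WriteBarrier {

    // Stand-in for the remembered set the collector scans at the next GC.
    interface RememberedSet {
        void record(long slotAddress);
    }

    static void onReferenceStore(long slotAddress, boolean sourceIsOld,
                                 boolean targetIsYoung, RememberedSet remset) {
        // Only old-to-young pointers need remembering; after inlining, the
        // compiler can frequently prove one or both of these tests away.
        if (sourceIsOld && targetIsYoung) {
            remset.record(slotAddress);
        }
    }
}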

Other elements of the VM, such as the garbage collector and the 
compiler(s), may not benefit so much from adaptive compilation since they 
are monolithic and their behavior tends to be so predictable.

--Steve


Re: Thoughts on VM

2005-05-11 Thread Steve Blackburn
Bob Griswold wrote:
GC efficiency is very different from the speed at which the JVM runs its GC
code.
 

In my experience both the GC algorithm (Bob's GC efficiency) and the 
performance of the key GC code (Dmitry's point) are important.  Having 
developed a high performance GC core (Dmitry's point), we have spent 
most of our time developing better GC algorithms (Bob's point).  ;-)

However, I think Dmitry's post highlights an important trend.  Memory 
performance is very important and architectural trends point to this 
becoming more and more the case in the future (the relative cost of a 
memory access and the relative advantage of locality are increasing all 
the time).

--Steve


Re: I hope the JVM implements most using Java itself

2005-05-10 Thread Steve Blackburn
Hi Larry,
I understand your sentiment.  I am also a pragmatist.
One of my major missions over the past year or so has been cleaning up 
Jikes RVM to make it more modular and composable.  We've nearly got 
there with memory management, but still have a way to go with the other 
components.

Why is this important?
Because I want to make it easier for people to contribute to the 
project.  Not just in terms of a few lines here or there, or a bell or a 
whistle, but I want people to be able to drop in alternative compilers, 
alternative GC algorithms etc.  Unless the framework is right the 
impedance becomes too high and the rate of non-trivial contribution 
drops off to a trickle.  Getting the framework right after the fact is 
an enormous task.

The fundamental architecture of the VM is what makes or breaks it. The 
"just get it out the door" approach has its merits for some projects, 
but for something as complex as this, if you want the thing to 
last---which we do---give some thought to the architecture before you 
throw it over the fence.  Choosing to build it in Java or C/C++ is a 
relatively unimportant issue for most projects, but for a JVM it will 
have a significant impact on the architecture.

The goal posts are moving very fast, in terms of the spec, in terms of 
the competing technology, and in terms of the architectural targets. 
Thus the importance of ongoing non-trivial contributions is enormous 
with a project such as this.

This is why I brought it up now (and that is why I prefaced my original 
comments the way I did).

--Steve
Larry Meadors wrote:
Despite my earlier Mono comment, I could not possibly care less what is used 
to build the JVM. Use Ruby if it gets the job done.

My vote would be to use whatever gets it done quickly and correctly. IMO, 
focusing on performance at this point is important, but not critical. 

First priority: Get it out the door, and make sure it is easy to build so 
everyone who wants to tweak it can.

Second priority: Work with the community to make it faster and more stable 
than anything anyone has ever seen.