Re: Introduction, and a question

2005-05-17 Thread shudo
 Newbie question: What is to stop us from caching JITed code?  .NET/
 mono does this as far as I know?

We can do it even in the forthcoming Harmony runtime.

On the other hand, an apparent drawback is disk consumption:
generally, JIT-compiled native code takes three or more times as much
space as the bytecode it was compiled from.

Another drawback is that the saved JIT-compiled code still needs symbol
resolution at load time, and both the JIT compiler and the loader of
the saved native code become more complicated because of it.
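
As a rough illustration of the symbol-resolution issue, a cached method
would have to carry relocation records that the loader applies before
the native code can run.  The structures and the lookup function below
are hypothetical, not any existing VM's cache format:

    #include <stddef.h>
    #include <string.h>

    /* One fix-up the loader must apply: at 'offset' into the cached
     * code, write the current address of 'symbol' (e.g. an allocator
     * or a resolved method entry point). */
    struct relocation {
        size_t      offset;
        const char *symbol;
    };

    /* Hypothetical lookup into the running VM's symbol table. */
    void *vm_lookup_symbol(const char *name);

    /* Patch cached native code in place before reusing it. */
    static void apply_relocations(unsigned char *code,
                                  const struct relocation *relocs,
                                  size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            void *target = vm_lookup_symbol(relocs[i].symbol);
            memcpy(code + relocs[i].offset, &target, sizeof target);
        }
    }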

It may also be possible to re-optimize the reused code based on runtime
information, but then the Java runtime needs a way to judge whether a
method is currently being executed or not.

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Against using Java to implement Java

2005-05-18 Thread shudo
There is an old document describing a JIT interface, though ORP should
be more advanced, for example in having a GC interface as well.

  The JIT Compiler Interface Specification
  http://java.sun.com/docs/jit_interface.html

Sun's Classic VM, the reference VM of JDK 1.0.2 and 1.1.x, implements
this interface, and the interface was modified a bit for J2SDK 1.2.
There were actually multiple JIT compilers based on this JIT interface,
including Symantec JIT, OpenJIT, shuJIT and TYA.

This interface is not enough to support advanced optimizations such as
adaptive compilation, which today's Sun and IBM runtimes perform.
Adaptive compilation needs cooperation from an interpreter (or a
baseline compiler), and I am not sure whether that cooperation can be
factored out of the JVM core.
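
For illustration only, the cooperation can be as simple as the
interpreter maintaining per-method invocation counters and handing hot
methods to the JIT.  The names below are made up and are not part of
the JIT Compiler Interface Specification:

    #define HOT_THRESHOLD 1000

    struct method {
        int   invocation_count;
        void *compiled_entry;            /* NULL until JIT-compiled */
    };

    void *jit_compile(struct method *m); /* hypothetical JIT entry point */

    /* Called by the interpreter on every method invocation. */
    static void *maybe_compile(struct method *m)
    {
        if (m->compiled_entry == NULL &&
            ++m->invocation_count >= HOT_THRESHOLD)
            m->compiled_entry = jit_compile(m);
        return m->compiled_entry;        /* NULL means keep interpreting */
    }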


From: Tom Tromey [EMAIL PROTECTED]

 David Maybe a concrete example would help. Let's say you have a GC module
 David written in C. One of its API calls is to allocate a new object. How
 David is your JIT module going to produce code to use that API? Via a C
 David function pointer?

 Yes.

 One way is to mandate link- or compile-time pluggability only.  Then
 this can be done by name.  Your JIT just references
 'harmony_allocate_object' in its source and uses this pointer
 in the code it generates.

 The other way is to have the JIT call some central function to get a
 pointer to the allocator function (or functions, in libgcj it turned
 out to be useful to have several).  This only needs to be done once,
 at startup.


 For folks interested in pluggability, I advise downloading a copy of
 ORP and reading through it.  ORP already solved these problems in a
 fairly reasonable way.
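
A minimal C sketch of the second approach described above, with made-up
names (this is not ORP's or Harmony's actual interface):

    /* Services the VM core exposes to a pluggable JIT.  The JIT asks
     * for this table once at startup and embeds the pointers (e.g. the
     * object allocator) in the code it generates. */
    struct vm_interface {
        void *(*allocate_object)(void *clazz, unsigned size);
        void  (*resolve_method)(void *method);
        /* ... further entry points: write barriers, exception throw, etc. */
    };

    /* Hypothetical central lookup function provided by the VM core. */
    const struct vm_interface *vm_get_interface(void);

    /* JIT side: cache the pointers once, at startup. */
    static const struct vm_interface *vm;

    void jit_init(void)
    {
        vm = vm_get_interface();
        /* generated code can now call vm->allocate_object(...)
         * indirectly, or have its address burned in as an immediate
         * call target */
    }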


  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: [arch] VM Candidate : JikesRVM http://jikesrvm.sourceforge.net/

2005-05-19 Thread shudo
 The problem of Java written JVM/JIT isn't one of performance.  You can
 theoretically achieve the same performance (although I'm not 100%
 convinced, I'm partially there)

It is reasonable to consider the performance of a Java runtime in
several aspects, especially throughput and interactivity (start-up
time).  A JIT (and JVM) written in Java can achieve the same throughput
as one written in C/C++/etc., but good start-up time and interactivity
are more difficult to achieve and require careful engineering.

The parts of a runtime written in Java have to be either interpreted or
compiled before being executed: throughput is sacrificed when they are
interpreted, and interactivity is sacrificed when they are compiled.

Another possible disadvantage, which might not have been discussed yet,
is the reflective nature of a Java-written JVM.  On this list it has
been presented as one of the strong points, since it removes the
boundary between languages, but we also have to consider maintenance
and debugging of the runtime.  A Java-written JIT is eventually
compiled by itself, and in that case debugging becomes pretty hard.  Of
course, such a runtime will also have an interpreter or a baseline
compiler (written in C/C++?), so the Java-written JIT can be debugged
exhaustively, but this reflective nature certainly makes debugging
harder.

I myself do not have any experience developing a Java-written JIT, so I
am not very sure how much harder it makes maintenance and debugging.
There have been a few Java-written JITs, such as Jikes RVM and OpenJIT,
and we may be able to get input from their developers if we wish.

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Other interesting papers and research

2005-05-23 Thread shudo
From: Steve Blackburn [EMAIL PROTECTED]

  [EMAIL PROTECTED] wrote:
 
  The approach of using C Compiler generated code rather than writing a
  full compiler appeals to me:
  http://www.csc.uvic.ca/~csc586a/papers/ertlgregg04.pdf
 
  I am curious on how well the approach performs compared to existing 
  JITs.

 They automatically build themselves
 simple JIT backends (by extracting fragments produced by the ahead of
 time compiler).  This sounds like a great way to achieve portability
 while achieving better performance than a conventional interpreter.

I guess it is a bit better than, or just comparable to, a good interpreter.

In 1998, I wrote such a JIT compiler, which concatenates code fragments
generated by GCC for each JVM instruction.  Unfortunately, the JIT was
slightly slower than the interpreter in Sun's Classic VM.  That
interpreter was written in x86 assembly language and implements dynamic
stack caching with two registers and three states; it performs much
better than the previous interpreter written in C.
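
Roughly, the fragment-concatenation idea looks like the sketch below.
It is a simplification, not shuJIT's actual code, and it assumes GCC's
labels-as-values extension and that each fragment is position
independent with no jumps out of it, which a real implementation has to
verify per platform:

    #include <string.h>

    typedef unsigned char byte;

    /* Append one compiled fragment (the machine code lying between two
     * labels of the interpreter's main loop) to the output buffer. */
    static byte *copy_fragment(byte *out, const void *start, const void *end)
    {
        size_t len = (size_t)((const byte *)end - (const byte *)start);
        memcpy(out, start, len);
        return out + len;
    }

    /* Inside the interpreter each handler is bracketed by labels, e.g.
     *
     *   op_iadd:      sp[-2] += sp[-1]; sp--;
     *   op_iadd_end:  ;
     *
     * and "compiling" a method means calling
     *
     *   out = copy_fragment(out, &&op_iadd, &&op_iadd_end);
     *
     * for each bytecode, then making the buffer executable (mprotect). */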

Then I rewrote the JIT.

I am not very sure which is better for us: a portable but so-so
baseline compiler, or a good interpreter that is possibly less portable
than such a compiler.  There will be a trade-off between memory
consumption, portability and so on.

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Other interesting papers and research

2005-06-05 Thread shudo

  This means a good interpreter cannot be implemented in a completely
  portable way.

  Note that I do not know how Ertl and Gregg's implementations (1) and
  (2) use a machine register to keep the TOS (top of stack).  I guess
  they have a little assembly code.
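
For illustration only, keeping the TOS in a register can look like the
plain C sketch below, which relies on the compiler to keep 'tos' in a
register; the cited implementations may instead use explicit register
variables or a little assembly:

    /* Toy stack machine with the top-of-stack value cached in a local
     * variable; the cells below the TOS live in memory at *sp and below.
     * E.g. run((unsigned char[]){OP_ICONST_1, OP_ICONST_1, OP_IADD,
     * OP_IRETURN}, (int[8]){0}) returns 2. */
    enum { OP_ICONST_1, OP_IADD, OP_IRETURN };

    static int run(const unsigned char *pc, int *stack)
    {
        int *sp  = stack;   /* points at the in-memory slot below the TOS */
        int  tos = 0;       /* cached top of stack (dummy while empty) */

        for (;;) {
            switch (*pc++) {
            case OP_ICONST_1: *++sp = tos; tos = 1; break; /* push 1 */
            case OP_IADD:     tos += *sp--;         break; /* add top two */
            case OP_IRETURN:  return tos;
            }
        }
    }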


  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Other interesting papers and research

2005-06-06 Thread shudo
Hi Dave,

 From: David P Grove [EMAIL PROTECTED]

 [EMAIL PROTECTED] wrote on 06/05/2005 10:48:29 PM:

  - The machine code concatenating technique consumes much memory.
    In my experience, generated machine code is about 10 times larger
    than the original instructions in Java bytecode.

  In the paper, the authors have not mentioned the memory consumption of
  the technique.  We cannot tell precisely how much it is, but it could
  be a big drawback.  Yes, we can say the same about the approach of
  using a baseline compiler instead of an interpreter (like Jikes RVM).
  The memory consumption of the baseline compiler of Jikes RVM is very
  interesting.

 It's platform dependent of course, but on IA32 isn't too horrible.  For
 example, running SPECjvm98 we see a 6.23x expansion from the Jikes RVM
 baseline compiler machine code bytes over bytecode bytes.

Thanks for giving us such a useful number.
It looks reasonable.

 One thing to
 note is that a threaded interpreter would see something like a 2-4x
 expansion over normal bytecodes when it converts from bytecodes to its
 internal form (arrays of function pointers).

Direct threading interpreters like the JDK's work on plain Java
bytecode, and they do not need to expand the bytecode instructions.
Such expansion would be required if Java bytecode were not linear but
rather a tree or some other more complicated form.
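
To make the numbers concrete, here is a rough sketch of the pre-decoded,
direct-threaded internal form Dave mentions (illustrative names only,
not Jikes RVM's actual representation):

    #include <stddef.h>
    #include <stdint.h>

    /* One slot of the pre-decoded program: a handler address or an
     * operand.  A one-byte opcode thus grows to one or two machine
     * words, which is where a 2-4x expansion comes from. */
    typedef union cell {
        void (*handler)(void);  /* in a real VM: a label address */
        intptr_t operand;
    } cell;

    enum { ICONST_1 = 0x04, IADD = 0x60 };  /* two JVM opcodes */
    void do_iconst(void);                   /* hypothetical handlers */
    void do_iadd(void);

    /* Translate raw bytecode into the internal form once, at load time. */
    static cell *translate(const unsigned char *bc, size_t n, cell *out)
    {
        for (size_t i = 0; i < n; i++) {
            switch (bc[i]) {
            case ICONST_1: (out++)->handler = do_iconst;
                           (out++)->operand = 1;        break;
            case IADD:     (out++)->handler = do_iadd;  break;
            }
        }
        return out;
    }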

Then,

 So, a 6x expansion is
 probably only roughly 2x worse than some interpreted systems would
 actually see in practice.

We should just say that the baseline compiler of Jikes RVM generates
native code 6x larger than the original bytecode instructions.


For a Java-written JVM, it seems natural to have a baseline compiler
instead of an interpreter.

It looks complicated to have an interpreter in a Java-written JVM.  We
would hope that the architecture of a JVM (e.g. interpreter vs.
baseline compiler) is independent of the language used to implement
each part of the JVM, but there seems to be a dependency between the two.
Any comment?

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Other interesting papers and research

2005-06-06 Thread shudo
Hi Rob,

 From: Robert Lougher [EMAIL PROTECTED]
 Date: Mon, 6 Jun 2005 14:58:45 +0100

   One thing to
   note is that a threaded interpreter would see something like a 2-4x
   expansion over normal bytecodes when it converts from bytecodes to its
   internal form (arrays of function pointers).
 
  Direct threading interpreters like the JDK's work on plain Java
  bytecode, and they do not need to expand the bytecode instructions.
  Such expansion would be required if Java bytecode were not linear but
  rather a tree or some other more complicated form.

 According to my understanding, an indirect threaded interpreter uses
 the original bytecode stream.  It's indirect because the handler
 address must be looked up via the bytecode.

Ah, thanks for pointing that out.
My wording 'direct threading' was not correct.

The threading (interpreting) technique I referred to, as implemented in
JDKs, should be called 'token threading', neither direct nor indirect
threading, because it works directly on the bytecode instructions
without any expansion.
Note that the interpreter provides a NEXT routine for every native code
fragment corresponding to a VM instruction.
For a JVM, this 'something threading' terminology is not very
informative, because direct interpretation of portable bytecode is
naturally 'token threading'.
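
A sketch of such token threading, dispatching directly on the raw
bytecode with GCC's computed goto (illustrative; the Classic VM's
interpreter is hand-written assembly, not this):

    /* Token threading: the bytecode stream is used as-is, and every
     * handler ends with NEXT, which fetches the next opcode byte and
     * jumps to its handler.  Needs GCC's labels-as-values extension;
     * a real interpreter fills in all 256 table entries. */
    static int execute(const unsigned char *pc, int *sp)
    {
        static void *handlers[256];
        handlers[0x04] = &&do_iconst_1;  /* iconst_1 */
        handlers[0x60] = &&do_iadd;      /* iadd */
        handlers[0xac] = &&do_ireturn;   /* ireturn */

    #define NEXT goto *handlers[*pc++]   /* the per-handler NEXT routine */

        NEXT;
    do_iconst_1: *++sp = 1;              NEXT;
    do_iadd:     sp[-1] += *sp; sp--;    NEXT;
    do_ireturn:  return *sp;
    #undef NEXT
    }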

Dave's last posting was based on the direct threading technique; what
he said is correct for direct threading, but my earlier posting was not.

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: [arch] voluntary vs. preemptive suspension of Java threads

2005-09-01 Thread shudo
From: Xiao-Feng Li [EMAIL PROTECTED]

 Thread suspension happens in many situations in JVM, such as for GC,
 for java.lang.Thread.suspend(), etc. There are various techniques to
 suspend a thread. Basically we can classify them into two categories:
 preemptive and voluntary.

 The preemptive approach requires the suspender, say a GC thread,
 suspend the execution of a target thread asynchronously with IPC
 mechanism or OS APIs. If the suspended thread happened to be in a
 region of code (Java or native) that could be enumerated, the live
 references were collected. This kind of region is called safe-region,
 and the suspended point is a safe-point. If the suspended point is not
 in safe-region, the thread would be resumed and stopped again until it
 ended up in a safe-region randomly or on purpose.

Sun's HotSpot VMs patch compiled native code to stop threads at safe
points, for which a stack map is provided.  It is smart but prone to
causing subtle problems, according to an engineer working on the VM.
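
For comparison, the voluntary approach is often implemented as a poll
compiled into the code at points where a stack map exists.  A much
simplified sketch (a flag and a yield loop; real VMs typically use a
protected polling page and condition variables instead):

    #include <stdatomic.h>
    #include <sched.h>

    /* Set by the suspender (e.g. a GC thread) when it wants every Java
     * thread to stop at its next safe point; cleared to resume them. */
    static atomic_bool safepoint_requested;

    /* Compiled code and the interpreter poll this at method entries and
     * loop back-edges, i.e. at points where a stack map is available. */
    static inline void safepoint_poll(void)
    {
        while (atomic_load_explicit(&safepoint_requested,
                                    memory_order_acquire)) {
            /* a real VM would publish this thread's roots and block on
             * a condition variable; here we simply yield until resumed */
            sched_yield();
        }
    }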

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: [arch] Interpreter vs. JIT for Harmony VM

2005-11-14 Thread shudo
From: Steve Shih-wei Liao [EMAIL PROTECTED]

 - Re-entrant JIT: Many JITs are not re-entrant. When running, for instance,
 MTRT in SPECJVM, because multiple threads are running, multiple JITTing may
 happen concurrently if there is no locking. The Execution Manager can put
 locks before each JITTing in order to ensure that no multiple JITTing is
 going on concurrently.

Do you know of an actual JIT that compiles the same method simultaneously?


HotSpot VM has a thread dedicated to JIT compilation, and that
compiling thread receives compilation requests from a queue.

A JIT I have developed took another approach, in which the threads
executing the application also compile it.  The JIT allows multiple
threads to perform JIT compilation simultaneously, but a method is
never compiled multiple times, because of the locks assigned to each
stage of JIT compilation.
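
A sketch of how per-method locking can prevent duplicated compilation
when application threads do the compiling (made-up names and pthreads,
not my JIT's actual code):

    #include <pthread.h>

    /* Per-method compilation state guarded by a per-method mutex, so
     * that many threads may compile different methods concurrently
     * while no method is ever compiled twice. */
    struct method {
        pthread_mutex_t lock;
        void           *compiled_entry;   /* NULL until compiled */
    };

    void *jit_compile(struct method *m);  /* hypothetical compiler entry */

    void *compile_once(struct method *m)
    {
        void *entry = m->compiled_entry;  /* fast path: already compiled
                                             (an atomic load in modern C) */
        if (entry != NULL)
            return entry;

        pthread_mutex_lock(&m->lock);
        if (m->compiled_entry == NULL)    /* re-check under the lock */
            m->compiled_entry = jit_compile(m);
        entry = m->compiled_entry;
        pthread_mutex_unlock(&m->lock);
        return entry;
    }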

There are choices in the relationship between threads and JIT
compilation:

- Separate threads dedicated to JIT compilation.
  - The JIT may need fewer locks?
  - Compilation can take a long time when there are many active
    application threads, which leads to further starvation.
    Note that HotSpot VM has the -Xbatch option to mitigate this problem.
  - Can it exploit otherwise idle processors?

- Application threads which also perform JIT compilation.
  - Exploits multiple processors for JIT compilation naturally.


  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/