Re: [arch] Interpreter vs. JIT for Harmony VM

2005-11-14 Thread shudo
From: Steve Shih-wei Liao <[EMAIL PROTECTED]>

> - Re-entrant JIT: Many JITs are not re-entrant. When running, for instance,
> MTRT in SPECJVM, because multiple threads are running, multiple JITTing may
> happen concurrently if there is no locking. The Execution Manager can put
> locks before each JITTing in order to ensure that no multiple JITTing is
> going on concurrently.

Do you know an actual JIT compiling the same method simultaneously?


HotSpot VM has a thread dedicated to JIT compilation, and the compiling
thread receives compilation requests from a queue.

A JIT I have developed took another approach, in which a thread executing
an application also compiles it. The JIT allows multiple threads
to do JIT compilation simultaneously, but a method is not compiled
multiple times because appropriate locks are assigned to each stage
of JIT compilation.
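
A minimal sketch of the idea, not the actual code of that JIT: it uses
one lock per method rather than one per compilation stage, which is a
simplifying assumption, and Method, compile_method and ensure_compiled
are hypothetical names. Different methods can be compiled concurrently,
but each method is compiled only once.

```python
import threading


class Method:
    """Hypothetical method record; all names here are illustrative."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()  # one lock per method, no global JIT lock
        self.native_code = None       # set once compilation has finished
        self.compile_count = 0        # kept only for the demonstration


def compile_method(method):
    # Stand-in for the real compiler.
    method.compile_count += 1
    return "native:" + method.name


def ensure_compiled(method):
    # Fast path: already compiled, no locking needed.
    if method.native_code is not None:
        return method.native_code
    with method.lock:
        # Re-check under the lock: another thread may have finished meanwhile.
        if method.native_code is None:
            method.native_code = compile_method(method)
    return method.native_code


def run_demo(num_threads=8):
    methods = [Method("foo"), Method("bar")]

    def worker():
        # Each application thread compiles whatever it is about to run.
        for m in methods:
            ensure_compiled(m)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [m.compile_count for m in methods]
```

Calling run_demo() starts eight threads that all request the same two
methods; the returned compile counts show whether any method was
compiled more than once.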

There are choices on the relationship between threads and JIT
compilation:

- Separate threads dedicated to JIT compilation.
  - The JIT may need fewer locks?
  - Compilation takes a long time when there are many active
    application threads, which leads to further starvation.
    Note that HotSpot VM has the -Xbatch option to mitigate this problem.
  - Exploits otherwise idle processors?

- Application threads which also perform JIT compilation.
  - Exploits multiple processors for JIT compilation naturally.
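
The first option above, a dedicated compiler thread fed by a queue, can
be sketched as follows. This is only an illustration of the structure,
not HotSpot's implementation; start_compiler_thread, the None sentinel
and the string standing in for native code are all assumptions.

```python
import queue
import threading


def start_compiler_thread(requests, code_cache):
    """Start one thread that serves all compilation requests in order."""

    def loop():
        while True:
            name = requests.get()
            if name is None:  # sentinel: shut the compiler thread down
                break
            # Only this thread writes the cache, so no per-method lock is used.
            if name not in code_cache:
                code_cache[name] = "native:" + name  # stand-in for compilation

    t = threading.Thread(target=loop)
    t.start()
    return t


def run_demo():
    requests = queue.Queue()
    code_cache = {}
    compiler = start_compiler_thread(requests, code_cache)

    def app_thread():
        # Application threads only enqueue requests; they never compile.
        for name in ("foo", "bar", "foo"):
            requests.put(name)

    apps = [threading.Thread(target=app_thread) for _ in range(4)]
    for t in apps:
        t.start()
    for t in apps:
        t.join()
    requests.put(None)
    compiler.join()
    return code_cache
```

Because a single thread owns the code cache, duplicate requests are
filtered without extra locking, but every request waits behind the ones
queued before it.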


  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/



Re: [arch] voluntary vs. preemptive suspension of Java threads

2005-09-01 Thread shudo
From: Xiao-Feng Li <[EMAIL PROTECTED]>

> Thread suspension happens in many situations in JVM, such as for GC,
> for java.lang.Thread.suspend(), etc. There are various techniques to
> suspend a thread. Basically we can classify them into two categories:
> preemptive and voluntary.

> The preemptive approach requires the suspender, say a GC thread,
> suspend the execution of a target thread asynchronously with IPC
> mechanism or OS APIs. If the suspended thread happened to be in a
> region of code (Java or native) that could be enumerated, the live
> references were collected. This kind of region is called safe-region,
> and the suspended point is a safe-point. If the suspended point is not
> in safe-region, the thread would be resumed and stopped again until it
> ended up in a safe-region randomly or on purpose.

Sun's HotSpot VMs patch compiled native code to stop threads at
safe points, for which stack maps are provided. It's smart but prone to
causing subtle problems, an engineer working on the VM said.
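
A toy model of the patching idea, under heavy assumptions: compiled code
is a plain list of instruction names, a "trap" entry stands in for
whatever instruction a real VM writes, and the stack map is a dict from
safe-point position to live references. None of this reflects HotSpot's
real data structures.

```python
TRAP = "trap"  # hypothetical marker for a patched instruction


def patch_safepoints(code, safepoints):
    """Overwrite each safe point with a trap; return the saved originals."""
    saved = {}
    for pc in safepoints:
        saved[pc] = code[pc]
        code[pc] = TRAP
    return saved


def unpatch(code, saved):
    """Restore the original instructions once the thread has stopped."""
    for pc, insn in saved.items():
        code[pc] = insn


def run(code, pc):
    """Execute from pc; return the pc of the trap hit, or None at the end."""
    while pc < len(code):
        if code[pc] == TRAP:
            return pc
        pc += 1  # a real thread would execute the instruction here
    return None


def suspend_at_safepoint(code, stack_maps, pc):
    """Stop a thread at the next safe point and report its live references."""
    saved = patch_safepoints(code, stack_maps.keys())
    stopped_at = run(code, pc)
    unpatch(code, saved)
    return stopped_at, stack_maps.get(stopped_at, [])
```

For example, with code ["load", "add", "call", "store", "ret"] and stack
maps for positions 2 and 4, a thread running from position 0 stops at
position 2, where the collector can read the live references.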

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: [modules] classloader/jit interface

2005-06-27 Thread shudo
From: [EMAIL PROTECTED]

> I guess that the reason why Jikes RVM does not have an interpreter is
> mainly its implementation language Java, not for performance:

> Imagine us implementing an interpreter in Java language.  We need
> another Java runtime to execute the interpreter, otherwise the
> interpreter has been compiled into native code. A Java runtime should
> be self-containing (except bootstrapping) and then the interpreter has
> to be compiled. We have to implement a compiler in advance of an
> interpreter.

Can we use a pre-existing AOT compiler like GCJ to compile our
interpreter? It will be difficult because the compiled code has to
comply with the internal structure (e.g. execution engine
interface) of the target Java runtime, and other AOT compilers do not
comply with it.

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: [modules] classloader/jit interface

2005-06-27 Thread shudo
From: Rafal Lewczuk <[EMAIL PROTECTED]>

> Things like ORP (and probably Jikes - not
> sure) are intentionally omitting interpreter, relying on a fast,
> non-optimizing JIT instead. I don't know but it may affect some design
> decisions. Assuming that framework = set of interfaces (for example
> GC) + set of conventions (for example, stack layout), discarding
> interpreter part in a server VM may lead to more opportunities in
> optimizing VM.

I guess that the reason why Jikes RVM does not have an interpreter is
mainly its implementation language Java, not for performance:

| Subject: Re: Other interesting papers and research
| Date: Mon, 06 Jun 2005 21:34:44 +0900 (JST)

| For Java-written JVM, it seems to be natural to have a baseline
| compiler instead of an interpreter.
|
| It looks complicated to have an interpreter for a Java-written JVM. We
| hope that the architecture of a JVM (e.g. interpreter or baseline
| compiler) is independent of the language for implementing a certain
| part of JVM. But there seems to be an implication between them.
| Any comment?

Imagine implementing an interpreter in the Java language.  We need
another Java runtime to execute the interpreter, or else the
interpreter has to be compiled into native code. A Java runtime should
be self-contained (except for bootstrapping), so the interpreter has
to be compiled. We have to implement a compiler before an
interpreter.

We will first implement a baseline compiler to compile an interpreter.
Do we implement an interpreter after having a baseline compiler?
We would need a very strong reason to do so, and in many cases we will
not have one.

This is what happened in the Jikes RVM (Jalapeno) team, I guess.

If this is correct, the interpreter was dropped not for optimization
but rather because of the implementation language.


  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Other interesting papers and research

2005-06-06 Thread shudo
Hi Rob,

> From: Robert Lougher <[EMAIL PROTECTED]>
> Date: Mon, 6 Jun 2005 14:58:45 +0100

> > > One thing to
> > > note is that a threaded interpreter would see something like a 2-4x
> > > expansion over "normal" bytecodes when it converts from bytecodes to its
> > > internal form (arrays of function pointers).
> >
> > Direct threading interpreters like JDK's one work on plain Java
> > bytecode and they do not need to expand normal bytecode instructions.
> > Such expansion may have been required if Java bytecode is not linear
> > and rather a tree or other complicated form.
>
> According to my understanding, an indirect threaded interpreter uses
> the original bytecode stream.  It's indirect because the handler
> address must be looked up via the bytecode.

Ah, thanks for pointing that out.
My wording 'direct threading' was not correct.

The threading (interpreting) technique I referred to as implemented in JDKs
should be called 'token threading', neither direct nor indirect threading,
because it works directly on bytecode instructions without any expansion.
Note that the interpreter provides NEXT routines for all native
code fragments corresponding to VM instructions.
For a JVM, terms like 'something threading' are not very informative
because direct interpretation of portable bytecode is naturally
'token threading'.
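
To illustrate the difference in code representation only (Python cannot
express a real NEXT routine or computed jumps), here is a small sketch.
The three opcodes, the handler table and the 4-byte pointer size are
assumptions made for the example.

```python
def op_push1(stack):
    stack.append(1)


def op_add(stack):
    stack.append(stack.pop() + stack.pop())


def op_dup(stack):
    stack.append(stack[-1])


# Hypothetical one-byte opcodes; these are not real JVM opcode numbers.
HANDLERS = {0: op_push1, 1: op_add, 2: op_dup}


def run_token_threaded(bytecode):
    """Interpret the bytecode as-is: each byte is a token for a handler."""
    stack = []
    for token in bytecode:
        HANDLERS[token](stack)  # table lookup at every dispatch
    return stack


def translate_direct(bytecode):
    """Expand bytecode into a list of handler references up front."""
    return [HANDLERS[token] for token in bytecode]


def run_direct_threaded(threaded):
    """Dispatch through the pre-translated handler references."""
    stack = []
    for handler in threaded:
        handler(stack)
    return stack


def expansion(bytecode, pointer_size=4):
    """Size of the direct-threaded form relative to the bytecode."""
    return len(translate_direct(bytecode)) * pointer_size / len(bytecode)
```

With operand-free one-byte instructions and 4-byte pointers the
direct-threaded form is four times as large as the bytecode;
instructions that carry operands would lower that ratio.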

Dave's last posting was based on the direct threading technique, and what
he said was correct about direct threading, but my earlier posting was
incorrect.

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Other interesting papers and research

2005-06-06 Thread shudo
Hi Dave,

> From: David P Grove <[EMAIL PROTECTED]>

> [EMAIL PROTECTED] wrote on 06/05/2005 10:48:29 PM:
>
> > - The machine code concatenating technique consumes much memory.
> >   In my experience, generated machine code is about 10 times larger
> >   than the original instructions in Java bytecode.
> >
> > In the paper, the authors have not mentioned memory consumption of the
> > technique.  We cannot guess how much it is precisely, but it is
> > possible to be a big drawback.  Yes, we can say the same for the
> > approach taking a baseline compiler instead of an interpreter (like
> > Jikes RVM).  Memory consumption of the baseline compiler of Jikes RVM
> > is very interesting.
>
> It's platform dependent of course, but on IA32 isn't too horrible.  For
> example, running SPECjvm98 we see a 6.23x expansion from the Jikes RVM
> baseline compiler machine code bytes over bytecode bytes.

Thanks for giving us such a useful number.
It looks reasonable.

> One thing to
> note is that a threaded interpreter would see something like a 2-4x
> expansion over "normal" bytecodes when it converts from bytecodes to its
> internal form (arrays of function pointers).

Direct threading interpreters like the JDK's work on plain Java
bytecode and do not need to expand normal bytecode instructions.
Such expansion might be required if Java bytecode were not linear
but rather a tree or some other complicated form.

Then,

> So, a 6x expansion is
> probably only roughly 2x worse than some interpreted systems would
> actually see in practice.

All we can say is that the baseline compiler of Jikes RVM generates 6x
larger native code than the original bytecode instructions.


For Java-written JVM, it seems to be natural to have a baseline
compiler instead of an interpreter.

It looks complicated to have an interpreter for a Java-written JVM. We
hope that the architecture of a JVM (e.g. interpreter or baseline
compiler) is independent of the language for implementing a certain
part of JVM. But there seems to be an implication between them.
Any comment?

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Other interesting papers and research

2005-06-05 Thread shudo
 that an interpreter is often regarded as just slow
  but a faster one and a slower one are very different in performance.

- Register utilization is still important even for an interpreter.
  Interpreters (1) and (4), and lightweight JIT compilers (2) and (6)
  utilize one or two machine registers to cache values around TOS.
  It is also important to map VM registers (e.g. PC) onto machine registers.

  This means a good interpreter cannot be implemented in a completely
  portable way.

  Note that I do not know how Ertl&Gregg's implementations (1) and (2)
  use a machine register to keep TOS.  I guess they have a little
  assembly code.
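
A rough sketch of what caching TOS buys, with a Python local variable
standing in for a machine register (which it is not) and a mem_ops
counter standing in for memory traffic on the operand stack. The
two-instruction program format is hypothetical.

```python
def run_plain(program):
    """Every operand goes through the in-memory stack."""
    stack, mem_ops = [], 0
    for op, arg in program:
        if op == "push":
            stack.append(arg)
            mem_ops += 1
        elif op == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
            mem_ops += 3  # two loads and one store
    return stack[-1], mem_ops


def run_tos_cached(program):
    """Keep the top-of-stack value in a local ('register') variable."""
    stack, mem_ops = [], 0
    tos, have_tos = None, False
    for op, arg in program:
        if op == "push":
            if have_tos:
                stack.append(tos)  # spill the old TOS to memory
                mem_ops += 1
            tos, have_tos = arg, True
        elif op == "add":
            tos = stack.pop() + tos  # one load; the result stays cached
            mem_ops += 1
    return tos, mem_ops
```

For the program push 1, push 2, add, push 3, add, both versions compute
6, but the plain version touches the memory stack 9 times and the cached
version 4 times.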


  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Other interesting papers and research

2005-05-23 Thread shudo
From: Steve Blackburn <[EMAIL PROTECTED]>

> > [EMAIL PROTECTED] wrote:
> >
> >> The approach of using C Compiler generated code rather than writing a
> >> full compiler appeals to me:
> >> http://www.csc.uvic.ca/~csc586a/papers/ertlgregg04.pdf
> >>
> >> I am curious on how well the approach performs compared to existing 
> >> JITs.

> They automatically build themselves
> simple JIT backends (by extracting fragments produced by the ahead of
> time compiler).  This sounds like a great way to achieve portability
> while achieving better performance than a conventional interpreter.

I guess it's a bit better than, or just comparable to, a good interpreter.

In 1998, I wrote such a JIT compiler, which concatenated code fragments
generated by GCC for each JVM instruction. Unfortunately, the JIT was
slightly slower than an interpreter in Sun's Classic VM. The
interpreter was written in x86 assembly language and implemented
dynamic stack caching with 2 registers and 3 states. It performed much
better than the previous interpreter written in C.
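
The concatenation approach itself can be sketched as follows. The
fragment table is entirely made up: the byte strings are placeholders,
not real x86 code, and their lengths are arbitrary. A real
implementation would also have to patch operands and branch targets
after copying the fragments.

```python
# Opcode -> pre-compiled fragment. The opcode numbers are the real JVM ones
# (iconst_1, iadd, ireturn); the fragment bytes and sizes are placeholders.
FRAGMENTS = {
    0x04: b"\x90" * 6,   # iconst_1
    0x60: b"\x90" * 12,  # iadd
    0xAC: b"\x90" * 10,  # ireturn
}


def compile_by_concatenation(bytecode):
    """Emit native code by copying one fragment per bytecode instruction."""
    return b"".join(FRAGMENTS[op] for op in bytecode)


def expansion_ratio(bytecode):
    """Generated code size divided by bytecode size."""
    return len(compile_by_concatenation(bytecode)) / len(bytecode)
```

Since every instruction is replaced by a whole fragment, the output is
several times larger than the bytecode; the actual factor depends
entirely on the real fragment sizes.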

Then I rewrote the JIT.

I am not very sure which is better for us: a portable but so-so
baseline compiler, or a good interpreter which is possibly less
portable than the compiler. There will be a trade-off between memory
consumption, portability and so on.

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: [arch] VM Candidate : JikesRVM http://jikesrvm.sourceforge.net/

2005-05-19 Thread shudo
> The problem of Java written JVM/JIT isn't one of performance.  You can
> theoretically achieve the same performance (although I'm not 100%
> convinced, I'm partially there)

It is reasonable to model the performance of a Java runtime in several
aspects, especially throughput and interactivity (start-up time).
A JIT (and JVM) written in Java can achieve the same throughput as one
written in C/C++/etc. But good start-up time / interactivity is more
difficult to achieve and needs careful work.

Part of a runtime written in Java has to be interpreted, or compiled
before being executed. Throughput is sacrificed when it is interpreted
and interactivity is sacrificed when it is compiled.

Another possible disadvantage, which might not have been discussed, is
the reflective nature of a Java-written JVM. This has been presented as
one of the strong points on this list so far, as it removes the boundary
between languages. But we have to consider maintenance and debugging of
the runtime. A Java-written JIT is eventually compiled by itself. In that
case, debugging will become pretty hard. Of course, such a runtime will have
another interpreter or a baseline compiler (written in C/C++?) and
the Java-written JIT can be debugged exhaustively. But such a reflective
nature certainly makes debugging harder.

I myself do not have any experience developing a Java-written JIT,
so I am not very sure how much harder it makes maintenance and debugging.
There have been a few Java-written JITs, such as Jikes RVM and OpenJIT,
and we may be able to get input from their developers if we wish.

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: [arch] VM Candidate : JNode

2005-05-18 Thread shudo
> I would like to mention the JNode VM.

> The performance figures show that the JIT gives performance higher than 
> 50% of the 1.4.2 hotspot.
>
> http://jnode.sourceforge.net/portal/node/51

Those scores seem to have been produced by the HotSpot Client VM.
If the HotSpot Server VM had been used, the score of Sun J2SDK 1.4.2
would be higher than 7500.
JNode is a great effort, I respect it, and performance will not be
the primary target of Harmony. But Sun's runtime performs better
than the page shows.

Sorry for being a little off-topic.

P.S.
I found that one of the benchmark programs, Sieve.java in
jnodesources-0.2.0.tar.gz, looks the same as Sieve.java distributed
with TYA (*1).
Sieve.java has been distributed with TYA since 1999 at least,
but the one with JNode 0.2.0 has a line "Copyright (C) 2005 JNode.org".
Is Sieve.java in the public domain?

(*1) TYA: a JIT compiler, http://sax.sax.de/~adlibit/

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Against using Java to implement Java

2005-05-18 Thread shudo
There is an old document describing a JIT interface, though ORP should
be more advanced, for example in having a GC interface.

  The JIT Compiler Interface Specification
  http://java.sun.com/docs/jit_interface.html

Sun's Classic VM, which was the reference VM of JDK 1.0.2 and 1.1.X,
implements this interface, and it was modified a bit for J2SDK 1.2.
There were actually multiple JIT compilers based on this JIT interface,
including Symantec JIT, OpenJIT, shuJIT and TYA.

This interface is not enough to support advanced optimizations
including adaptive compilation, which today's Sun and IBM runtimes
perform.  Adaptive compilation needs cooperation from an interpreter (or a
baseline compiler) and I am not sure whether it can be factored out
from the JVM core.
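
A minimal sketch of why the two are coupled, assuming a simple
invocation-count policy (real runtimes use more elaborate profiling).
AdaptiveRuntime, its threshold and the callables passed in are all
hypothetical: the counting happens on the interpreter's call path, so
the JIT cannot decide what is hot without the interpreter's help.

```python
class AdaptiveRuntime:
    """Interpreter-side bookkeeping that decides when to call the JIT."""

    def __init__(self, jit_compile, threshold=3):
        self.jit_compile = jit_compile  # hypothetical hook into the JIT
        self.threshold = threshold
        self.counts = {}
        self.compiled = {}

    def invoke(self, name, interpret):
        # Compiled code takes over once it exists.
        if name in self.compiled:
            return self.compiled[name]()
        # Otherwise the interpreter counts the call and may request compilation.
        self.counts[name] = self.counts.get(name, 0) + 1
        if self.counts[name] >= self.threshold:
            self.compiled[name] = self.jit_compile(name)
        return interpret()


def run_demo():
    runtime = AdaptiveRuntime(jit_compile=lambda name: (lambda: "compiled"))
    return [runtime.invoke("foo", lambda: "interpreted") for _ in range(4)]
```

In run_demo() the first three calls are interpreted, the third one
reaches the threshold and triggers compilation, and the fourth call runs
the compiled version.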


From: Tom Tromey <[EMAIL PROTECTED]>

> David> Maybe a concrete example would help. Let's say you have a GC module
> David> written in C. One of it's API calls is to allocate a new object. How
> David> is your JIT module going to produce code to use that API? Via a C
> David> function pointer?
>
> Yes.
>
> One way is to mandate link- or compile-time pluggability only.  Then
> this can be done by name.  Your JIT just references
> '&harmony_allocate_object' in its source and uses this pointer
> in the code it generates.
>
> The other way is to have the JIT call some central function to get a
> pointer to the allocator function (or functions, in libgcj it turned
> out to be useful to have several).  This only needs to be done once,
> at startup.
>
>
> For folks interested in pluggability, I advise downloading a copy of
> ORP and reading through it.  ORP already solved these problems in a
> fairly reasonable way.
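
The second way described above, a central lookup done once at startup,
might look roughly like this. The registry, the service name
"allocate_object" and the Jit class are invented for illustration and
are not ORP's or libgcj's actual interfaces.

```python
_SERVICES = {}  # hypothetical central registry owned by the VM core


def register_service(name, func):
    _SERVICES[name] = func


def lookup_service(name):
    return _SERVICES[name]


def gc_allocate_object(size):
    # Stand-in for the GC module's allocator.
    return bytearray(size)


class Jit:
    def __init__(self):
        # Looked up once at startup, then reused by all generated code.
        self.allocate = lookup_service("allocate_object")

    def generate_new(self, size):
        """Return 'generated code' that calls the allocator directly."""
        allocate = self.allocate
        return lambda: allocate(size)
```

The GC module registers its allocator before the JIT starts; after that
the JIT never needs to know which GC implementation is plugged in.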


  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Testing - TCK, mauve, harmony's own test suite?

2005-05-17 Thread shudo
From: Ricky Clarkson <[EMAIL PROTECTED]>

> From informal chat in IRC, Davanum Srinivas (dims) said that each
> committer (not contributor) will sign an NDA (Non-Disclosure
> Agreement) with Sun to be able to use Sun's TCK (Technology
> Compatibility Kit), which is required for Harmony to be certified as
> Java.

> Perhaps it would be better if at least one Harmony committer didn't
> sign the Sun NDA, then they wouldn't have anything to disclose.

That is much better than no one being able to access the TCK, because we
would know whether the runtime has passed the TCK or not.

I have been hoping to test a JIT compiler but have not been able to access
the TCK so far. Once, though, a licensee ran the TCK against the JIT
compiler and reported the problems it found to me. I could guess the causes
of the problems and fixed them.

I do not know how we can notify others of problems found by the TCK
without violating the NDA.


Rob Gingell told me in 2002 that it would be easy to make the J2SE TCK
available and that they were preparing to do so, but we have not seen it.
There may be a way to ask or motivate Sun to make the TCK available under a
reasonably relaxed license.


Note that I remember Kaffe and GCJ had test suites good even for JVMs,
while Mauve is targeted at class libraries.


  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/


Re: Introduction, and a question

2005-05-17 Thread shudo
> Newbie question: What is to stop us from caching JITed code?  .NET/
> mono does this as far as I know?

We can do it even in the forthcoming Harmony runtime.

On the other hand, an apparent drawback is disk
consumption. Generally, JITted native code takes three or more times as
much space as the bytecode.

Another drawback is that the saved JITted code still needs symbol
resolution, and the JIT compilers and the loader of the saved native code
become more complicated accordingly.
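
A simplified sketch of that extra work, assuming saved code carries a
list of (offset, symbol) relocations that the loader fills in with
absolute 4-byte little-endian addresses. The format, the function names
and the example bytes are all hypothetical.

```python
def save_code(code, relocations):
    """Store code with its unresolved references as (offset, symbol) pairs."""
    return {"code": bytes(code), "relocations": list(relocations)}


def load_code(saved, symbol_table, pointer_size=4):
    """Copy the saved code and patch in this run's symbol addresses."""
    code = bytearray(saved["code"])
    for offset, symbol in saved["relocations"]:
        address = symbol_table[symbol]  # addresses differ from run to run
        code[offset:offset + pointer_size] = address.to_bytes(
            pointer_size, "little"
        )
    return bytes(code)
```

The JIT has to record every such reference when it emits code, and the
loader has to resolve them again on each start, which is the added
complexity mentioned above.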

It may be possible for the reused code to be optimized further using runtime
information. But the Java runtime then needs a feature to
judge whether a method is being executed or not.

  Kazuyuki Shudo[EMAIL PROTECTED]   http://www.shudo.net/