I've forwarded this to hotspot-compiler-dev.

I know Doug introduced this for final fields because at the time the compiler was not optimizing their use, but I had thought that issue was long since resolved at least in C2. If C1 is lagging then we need to see that it catches up.

There should not be a need to code this way at the Java level. (Note, as Martin says, sometimes you must copy a field to a local for correctness - the field might change value but the current code must not see that - but that's not the case we're concerned with.)
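To illustrate that correctness case, a minimal sketch (hypothetical names, not actual JDK code):

  class Holder {
      volatile String name;            // may be updated by another thread

      int nameLength() {
          String n = name;             // read the field exactly once
          if (n == null)
              return 0;
          return n.length();           // safe: re-reading 'name' here could NPE
      }
  }

The local pins the value down for the duration of the method, which is what the algorithm requires; that is a different matter from copying fields to locals purely as a performance idiom.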

Cheers,
David Holmes

Osvaldo Doederlein said the following on 05/04/10 06:13:
2010/5/3 Martin Buchholz <marti...@google.com>

    It's a coding style made popular by Doug Lea.
    It's an extreme optimization that probably isn't necessary;
    you can expect the JIT to make the same optimizations.


It certainly is necessary - unfortunately. Testing my particle/octree-based 3D renderer without this manual optimization (dumping FPS every 100 frames, starting at the 10th score after startup):

JDK 6u21-b03, Hotspot Client:
159.4896331738437fps
161.29032258064515fps
158.73015873015873fps
160.0fps
159.23566878980893fps

JDK 6u21-b03, Hotspot Server:
197.23865877712032fps
204.91803278688525fps
196.07843137254903fps
200.40080160320642fps
198.01980198019803fps

Now let's cache 8 instance variables into local variables (most final, a couple non-final ones too):

JDK 6u21-b03, Hotspot Client:
169.4915254237288fps
172.1170395869191fps
168.63406408094434fps
168.0672268907563fps
170.64846416382252fps

JDK 6u21-b03, Hotspot Server:
197.62845849802372fps
200.40080160320642fps
196.8503937007874fps
199.6007984031936fps
203.2520325203252fps

So, the manual optimization makes no difference for HotSpot Server; but hell, it does for Client: 6% better performance in this test. And the test isn't only the complex, deeply nested rendering loops that use those cacheable variables to read the input data and update the output pixel and Z buffers; there is also other code that burns significant CPU and doesn't touch these variables, notably the buffer-filling and copying steps. So the speedup in the optimized code itself should be much higher than 6%; I only measured (and cared about) the application's overall performance.
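For concreteness, the manual caching is just the usual hoist-fields-into-locals idiom applied before the inner loops. A hypothetical sketch (placeholder names and helpers, not my actual renderer code):

  class Renderer {
      private final int[] zBuffer;
      private final int[] pixelBuffer;
      private final int width, height;

      Renderer(int width, int height) {
          this.width = width;
          this.height = height;
          this.zBuffer = new int[width * height];
          this.pixelBuffer = new int[width * height];
          java.util.Arrays.fill(zBuffer, Integer.MAX_VALUE);   // "far" depth
      }

      void renderFrame() {
          // Hoist the hot fields into locals once, so the inner loops
          // don't re-load them (aload_0 + getfield) on every pixel.
          final int[] zbuf   = zBuffer;
          final int[] pixels = pixelBuffer;
          final int   w      = width;
          final int   h      = height;
          for (int y = 0; y < h; y++) {
              for (int x = 0; x < w; x++) {
                  int i = y * w + x;
                  int depth = computeDepth(x, y);      // hypothetical helper
                  if (depth < zbuf[i]) {
                      zbuf[i]   = depth;
                      pixels[i] = computeColor(x, y);  // hypothetical helper
                  }
              }
          }
      }

      private int computeDepth(int x, int y) { return x + y; }  // placeholder
      private int computeColor(int x, int y) { return x ^ y; }  // placeholder
  }

Server apparently does this hoisting on its own; under Client the difference shows up in the numbers above.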

We'll need to deal with HotSpot Client for years to come, not to mention smaller platforms (JavaME, JavaFX Mobile&TV) whose JIT compilers are even weaker than Java SE's C1. Tuned bytecode is also faster to interpret, which benefits warm-up time too. Please keep your dirty purist hands off the API code that Doug and others have micro-optimized; it is necessary. :)
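To make the bytecode point concrete: each instance field access compiles to aload_0 + getfield (a 3-byte instruction with a constant-pool lookup), while a cached local is a single 1-byte load. A toy example (a sketch; exact javac output may vary):

  class C {
      int x;

      int twiceField() { return x + x; }             // two getfields
      int twiceLocal() { int t = x; return t + t; }  // one getfield, two iloads
  }

  // twiceField:            twiceLocal:
  //   aload_0                aload_0
  //   getfield x             getfield x
  //   aload_0                istore_1
  //   getfield x             iload_1
  //   iadd                   iload_1
  //   ireturn                iadd
  //                          ireturn

With several uses per field (eight variables in my case) the cached form is both smaller and cheaper for the interpreter.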

And my +1 to adding the same optimizations to other performance-critical APIs. This is even more important for java.nio, since under C1 it doesn't currently benefit from intrinsic compilation of the critical DirectBuffer methods.
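For example, the same idiom applied to a nextPutIndex-like method would look something like this (a hypothetical sketch, not the actual java.nio.Buffer source):

  import java.nio.BufferOverflowException;

  class SimpleBuffer {
      private int position;
      private int limit;

      int nextPutIndex(int nb) {
          int p = position;            // read the field once into a local
          if (limit - p < nb)
              throw new BufferOverflowException();
          position = p + nb;           // single write back
          return p;
      }
  }

The position field is loaded once instead of being re-read for the check, the returned value, and the increment - the kind of redundancy C1 apparently doesn't eliminate, going by the numbers above.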

A+
Osvaldo

    (you can try to check the machine code yourself!)
    Nevertheless, copying to locals produces the smallest
    bytecode, and for low-level code it's nice to write code
    that's a little closer to the machine.

    Also, optimizations of finals (can cache even across volatile
    reads) could be better.  John Rose is working on that.

    For some algorithms in j.u.c,
    copying to a local is necessary for correctness.

    Martin

    On Mon, May 3, 2010 at 04:40, Ulf Zibis <ulf.zi...@gmx.de> wrote:
     > Hi,
     >
     > in class String I often see member variables copied to local
     > variables. In java.nio.Buffer I don't see that (e.g. for
     > "position" in nextPutIndex(int nb)). Now I'm wondering.
     >
     > From the JMM (Java Memory Model) I learned that the JVM can hold
     > non-volatile variables in a per-thread cache, e.g. even in a CPU
     > register for a few of them. Knowing this, I don't understand why
     > the local caching is done manually in String (and many other
     > classes) instead of trusting the JVM.
     >
     > Can anybody help me understand this?
     >
     > -Ulf

