As Eamonn writes it, it will never cache miss but may frequently branch mispredict (possibly multiple times per call). If you do a shift + mask + index into a small table, it will cache miss most of the time but never branch mispredict. (In a real program it will cache miss frequently, since thread state calls are infrequent and the lookup table will fall out of cache; in a microbenchmark it will almost never cache miss, as the lookup table will be hot.)

On 12/4/2010 7:22 AM, Eamonn McManus wrote:
Hi Mandy,

This test:

         if ((threadStatus & JVMTI_THREAD_STATE_RUNNABLE) == 1) {

is always false, since JVMTI_THREAD_STATE_RUNNABLE is 4. (NetBeans 7.0
helpfully flags this; I'm not sure if earlier versions do.)
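
To illustrate why the comparison can never succeed: masking with a single-bit constant yields either 0 or that constant, so with JVMTI_THREAD_STATE_RUNNABLE equal to 4 the result is 0 or 4, never 1. A minimal sketch follows; the class name MaskCheck, the method names, and the sample status value are made up for illustration and are not part of the patch under review.

```java
final class MaskCheck {
    // Value stated in the message above.
    static final int JVMTI_THREAD_STATE_RUNNABLE = 0x0004;

    // The test as originally written: compares the masked value with 1,
    // but the masked value can only be 0 or 4.
    static boolean buggyIsRunnable(int threadStatus) {
        return (threadStatus & JVMTI_THREAD_STATE_RUNNABLE) == 1;
    }

    // Corrected test: any non-zero result means the bit is set.
    static boolean isRunnable(int threadStatus) {
        return (threadStatus & JVMTI_THREAD_STATE_RUNNABLE) != 0;
    }

    public static void main(String[] args) {
        // Hypothetical status word: the RUNNABLE bit plus bit 0
        // (assumed here to be the "alive" bit).
        int status = 0x0005;
        System.out.println("buggy:     " + buggyIsRunnable(status));
        System.out.println("corrected: " + isRunnable(status));
    }
}
```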

But, once corrected, I think you could use this idea further to write a much
simpler and faster method, on these lines:

     public static Thread.State toThreadState(int threadStatus) {
         if ((threadStatus & JVMTI_THREAD_STATE_RUNNABLE) != 0) {
             return RUNNABLE;
         } else if ((threadStatus & JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER) != 0) {
             return BLOCKED;
         } else if ((threadStatus & JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT) != 0) {
             return TIMED_WAITING;
         } else if ((threadStatus & JVMTI_THREAD_STATE_WAITING_INDEFINITELY) != 0) {
             return WAITING;
         } else if ((threadStatus & JVMTI_THREAD_STATE_TERMINATED) != 0) {
             return TERMINATED;
         } else {
             return NEW;
         }
     }

You could tweak the order of the tests based on what might be the relative
frequency of the different states but it probably isn't worth it.

Regards,

Éamonn


On 3/12/10 11:52 PM, Mandy Chung wrote:
Fix for 6977034: Thread.getState() very slow

Webrev at:
http://cr.openjdk.java.net/~mchung/6977034/webrev.00/

This is an improvement to map a Thread's threadStatus field to Thread.State.
The VM updates the Thread.threadStatus field directly at each state transition
with the value as defined in JVM TI [1]. The java.lang.Thread.getState()
implementation can access the threadStatus value directly and do a direct
lookup from an array of Thread.State. The threadStatus value is a bit vector,
and we would have to create an array of at least 1061 (0x425) elements
to do a direct mapping. Instead, I use the position of the highest-order bit
set to 1 in the masked threadStatus value as the index of the Thread.State
element, and cache only 32 elements (it could be fewer). I wrote a
micro-benchmark measuring Thread.getState for threads in different states;
it shows a 1.7X to 6X speedup (see below). There may be some issue with
my micro-benchmark, since I didn't observe the 14X speedup that Doug did in
his experiment. However, I'd like to get this reviewed and pushed to the
repository so that anyone can experiment further with the performance
measurement.
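
The highest-bit indexing idea can be sketched roughly as follows. This is not the code in the webrev: the class name ThreadStateTable, the helper bitIndex, and the choice of MASK are assumptions made for illustration, and the bit values are the ones listed for GetThreadState in the JVM TI documentation [1].

```java
final class ThreadStateTable {
    // Bit values as listed in the JVM TI GetThreadState documentation.
    static final int JVMTI_THREAD_STATE_TERMINATED = 0x0002;
    static final int JVMTI_THREAD_STATE_RUNNABLE = 0x0004;
    static final int JVMTI_THREAD_STATE_WAITING_INDEFINITELY = 0x0010;
    static final int JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT = 0x0020;
    static final int JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER = 0x0400;

    // Assumed mask: keep only the bits that decide the Thread.State.
    static final int MASK = JVMTI_THREAD_STATE_TERMINATED
            | JVMTI_THREAD_STATE_RUNNABLE
            | JVMTI_THREAD_STATE_WAITING_INDEFINITELY
            | JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT
            | JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER;

    // One slot per bit position of an int. Only slots reachable through
    // MASK are filled; slot 0 stands for "no masked bit set".
    private static final Thread.State[] TABLE = new Thread.State[32];
    static {
        TABLE[0] = Thread.State.NEW;
        TABLE[bitIndex(JVMTI_THREAD_STATE_TERMINATED)] = Thread.State.TERMINATED;
        TABLE[bitIndex(JVMTI_THREAD_STATE_RUNNABLE)] = Thread.State.RUNNABLE;
        TABLE[bitIndex(JVMTI_THREAD_STATE_WAITING_INDEFINITELY)] = Thread.State.WAITING;
        TABLE[bitIndex(JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT)] = Thread.State.TIMED_WAITING;
        TABLE[bitIndex(JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER)] = Thread.State.BLOCKED;
    }

    // Position of the highest set bit. OR-ing in bit 0 (which is not part
    // of MASK) maps a zero input to index 0 without a branch.
    static int bitIndex(int value) {
        return 31 - Integer.numberOfLeadingZeros(value | 1);
    }

    static Thread.State toThreadState(int threadStatus) {
        return TABLE[bitIndex(threadStatus & MASK)];
    }
}
```

Under these assumptions, a status word of 0x401 (alive plus blocked on monitor enter) masks down to 0x400, whose highest set bit is bit 10, so the lookup returns BLOCKED. A status of 0 has no masked bit set and maps to NEW.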

Thanks
Mandy
P.S. The discussion on this thread can be found at [2] [3].

[1] http://download.java.net/jdk7/docs/platform/jvmti/jvmti.html#GetThreadState
[2] http://mail.openjdk.java.net/pipermail/core-libs-dev/2010-July/004567.html
[3] http://mail.openjdk.java.net/pipermail/core-libs-dev/2010-August/004721.html


                JDK 7 b120 (in ms)      With fix (in ms)        Speedup
main            46465                   22772                   2.04
NEW             50676                   29921                   1.69
RUNNABLE        42202                   14690                   2.87
BLOCKED         72773                   12296                   5.92
WAITING         48811                   13041                   3.74
TIMED_WAITING   45737                   12849                   3.56
TERMINATED      40314                   16376                   2.46
