Hi Dan,

On 11/6/17 12:23 PM, Daniel D. Daugherty wrote:
On 11/3/17 8:25 PM, Chris Plummer wrote:
Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8059334

Wow! This bug was quite the adventure...
I know how to pick 'em, huh? :)


http://cr.openjdk.java.net/~cjplummer/8059334/webrev.00/webrev.open/

src/hotspot/share/interpreter/interpreterRuntime.cpp
    L924:   if (thread->is_interp_only_mode()) {
        Perhaps: if (nm != NULL && thread->is_interp_only_mode()) {
Ok, I'll add the NULL check.

    I kept trying to decide whether your new check only needs to
    be inside the existing block:

        L913:   if (branch_bcp != NULL && nm != NULL) {
        :
        L923   }

    but I finally convinced myself that you want to check a non-NULL
    nm value in either branch_bcp code branch (NULL or non-NULL) so
    where you put the fix is just fine.

    Of course, the usual question we have to ask in these kinds of
    races is what prevents the racy condition from asserting itself
    right after the fixed code location. Thanks for including your
    last sentence:

    If we are not in "interp only" mode at this point (and start executing     the compiled method) it should not be possible to enter "interp only"     mode until we reach a safepoint at some later time, and at that point     the method will be properly deopt so it can execute interpreted.

Nicely done bug hunt!

Thumbs up!
Thanks!

Chris

Dan



The CR is closed, so I'll try to explain the issue here. The very short explanation is that the JVMTI test was enabling SINGLE STEP and doing a PopFrame, but the test method managed to get compiled and started executing compiled after the thread was put in "interp only" mode (which should never happen) and before the PopFrame was processed. The cause is a lack of a check for "interp only" mode in the OSR related compilation policy code.

Details:

The test is testing JVMTI PopFrame support. The test thread has a small method that sits in a tight loop. It will never exit. The main thread enables SINGLE STEP on the test thread, and then does a PopFrame on the test thread to force it out of the looping method. When the test failed due to a time out, I noticed it was still stuck in the small method, even though a PopFrame had been requested. Further, I noticed the method was compiled, so there was no chance the method would ever detect that it should do a PopFrame. Since "interp only" mode for SINGLE STEP had been enabled, the method should not be running compiled, so clearly something went wrong that allowed it to compile and execute.

When SINGLE STEP is requested, JVMTI will deopt the topmost method (actually the top 2), put the thread in "interp only" mode, and then has checks to make sure the thread continues to execute interpreted. To avoid compilation when a back branch tries to trigger one, there is a check for "interp only" mode in SimpleThresholdPolicy::event(). If the thread is in "interp only" mode, it will prevent compilation. SimpleThresholdPolicy::event() is called (indirectly) by InterpreterRuntime::frequency_counter_overflow(), which is called from the interpreter when the back branch threshold is reached.

After some debugging I noticed when the test timeout happens, "interp only" mode is not yet enabled when InterpreterRuntime::frequency_counter_overflow() is called, but is enabled by the time InterpreterRuntime::frequency_counter_overflow() has done the lookup of the nm. So there is a race here allowing the thread to begin execution in a compiled method even though "interp only" mode is enabled. I think the reason is because we safepoint during the compilation, and this allows a SINGLE STEP request to be processed, which enables "interp only" mode.

I should add that initially I only saw this bug with -Xcomp, but eventually realized it was caused by disabling BackgroundCompilation. That makes it much more likely that a SINGLE STEP request will come in and be processed during the call to InterpreterRuntime::frequency_counter_overflow() (because it will block until the compilation completes).

I believe for the fix it is enough just to add an "interp only" mode check in InterpreterRuntime::frequency_counter_overflow() after the nm lookup, and set it nm to NULL if we are now in "interp only" mode. If we are not in "interp only" mode at this point (and start executing the compiled method) it should not be possible to enter "interp only" mode until we reach a safepoint at some later time, and at that point the method will be properly deopt so it can execute interpreted.

thanks,

Chris




Reply via email to