Re: RFR(M): 8244383: jhsdb/HeapDumpTestWithActiveProcess.java fails with "AssertionFailure: illegal bci"

Chris Plummer Tue, 23 Jun 2020 11:31:32 -0700

Ping!

If this fix is too complicated, there is a simplification I can make, but at the cost of abandoning some attempts to determine the current frame when this error condition pops up. At the start of validateInterpreterFrame() it attempts to verify that the frame is valid by verifying that frame->method and frame->bcp are valid. This part is pretty simple. The complicated part is everything that follows if the verification fails. It attempts to error correct the situation by looking at various register contents and stack contents. I could just abandon this complicated code and return false if frame->method and frame->bcp don't check out. Upon return, the caller's code would be simplified to:


            if (validateInterpreterFrame(sp, fp, pc)) {
              return true; // We're done. setValues() has been called for valid interpreter frame.
            } else {
              return checkLastJavaSP();
            }

So there's still a chance we can determine a valid current frame if "last java frame" has been set up. However, if it has not been set up, we would not be able to. This is where the complicated code in validateInterpreterFrame() is useful because it can usually determine the current frame, even if "last java frame" is not set up, but it's rare enough that we run into this situation that I think failing to get the current frame is ok.

So if I can get a couple promises for reviews if I make this change, I'll go ahead and do it and send out a new RFR.


thanks,

Chris

On 6/18/20 5:54 PM, Chris Plummer wrote:

[I've added runtime-dev to this SA review since understanding interpreter invokes (code generated by TemplateInterpreterGenerator::generate_normal_entry()) and stack walking is probably more important than understanding SA.]
Hello,

Please help review the following:

https://bugs.openjdk.java.net/browse/JDK-8244383
http://cr.openjdk.java.net/~cjplummer/8244383/webrev.00/index.html
The crux of the bug is that when doing stack walking the topmost frame is in an inconsistent state because we are in the middle of pushing a new interpreter frame. Basically we are executing code generated by TemplateInterpreterGenerator::generate_normal_entry(). Since the PC register is in this code, SA assumes the topmost frame is an interpreter frame.
The first issue with this interpreter frame assumption is that if we haven't actually pushed the frame yet, then the current frame is the caller's frame, and could be compiled. But since SA thinks it's interpreted, later on it tries to convert the frame->bcp to a BCI, but frame->bcp is only valid for interpreter frames. Thus the "illegal BCI" failures. If the previous frame happened to be interpreted, then the existing SA code works fine.
The other state of frame pushing that was problematic was when the new frame had been pushed, but frame->method and frame->bcp were not set up yet. This also would lead to "illegal BCI" later on because garbage would be stored in these locations.
Fixing the above problems requires trying to determine the state of the frame push through a series of checks, and then adapting what is considered to be the current frame based on the outcome of the checks. The first thing checked is that frame->method is valid (we can successfully instantiate a wrapper for the Method* without failure) and that frame->bcp is within the method. If both of these pass then we can use the frame as-is.
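
The first check described above can be illustrated roughly as follows. This is only a minimal sketch, not the code in the webrev: MethodSketch, MethodResolver, and looksInitialized() are hypothetical names invented for illustration, and the real SA classes and their signatures differ. It assumes a method's bytecodes occupy a contiguous address range and that a failed Method* wrapper instantiation surfaces as a runtime exception.

```java
/** Hypothetical stand-in for SA's Method* wrapper; not the real SA class. */
final class MethodSketch {
    final long codeStart; // assumed address of the first bytecode
    final int codeSize;   // assumed number of bytecode bytes

    MethodSketch(long codeStart, int codeSize) {
        this.codeStart = codeStart;
        this.codeSize = codeSize;
    }
}

final class FrameValiditySketch {
    /** Hypothetical resolver: throws or returns null if the address is not a valid Method*. */
    interface MethodResolver {
        MethodSketch resolve(long methodAddr);
    }

    /**
     * Returns true only if frame->method can be wrapped without failure
     * and frame->bcp lies inside that method's bytecodes.
     */
    static boolean looksInitialized(long methodAddr, long bcp, MethodResolver resolver) {
        MethodSketch m;
        try {
            m = resolver.resolve(methodAddr);
        } catch (RuntimeException e) {
            return false; // garbage in the frame->method slot
        }
        if (m == null) {
            return false;
        }
        // An unsigned compare of the offset also rejects bcp < codeStart.
        long offset = bcp - m.codeStart;
        return Long.compareUnsigned(offset, (long) m.codeSize) < 0;
    }

    public static void main(String[] args) {
        MethodResolver r = addr -> {
            if (addr != 0x1000L) throw new IllegalArgumentException("not a Method*");
            return new MethodSketch(0x2000L, 16);
        };
        System.out.println(looksInitialized(0x1000L, 0x2005L, r));
        System.out.println(looksInitialized(0xdeadL, 0x2005L, r));
    }
}
```

If such a check passes, the frame is used as-is; only when it fails does the more involved recovery logic come into play.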
If the above checks fail, then we try to determine whether the issue is that the frame is not yet pushed and the current frame is actually compiled, or the frame has been pushed but not yet initialized. This is done by first getting the return address from the stack or RAX (its location depends on how far along we are in the entry code) and comparing this to what is stored in frame->return_addr. If they are the same, then we have pushed the frame but not yet initialized it. In this case we use the previous frame (senderSP() and senderFP()) as the current frame since the current frame is not yet initialized. If the return address check fails, then we assume the new frame is not yet pushed, and treat the current frame as compiled, even though PC points into the interpreter (we replace PC with RAX in this case).
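
The decision logic in the two paragraphs above can be summarized as a small three-way classification. Again this is only a sketch under stated assumptions: FramePushStateSketch, State, and classify() are hypothetical names, the inputs are assumed to have been read from the target process already, and the actual code in AMD64CurrentFrameGuess.java is structured differently.

```java
final class FramePushStateSketch {
    /** Hypothetical labels for the three outcomes described in the text. */
    enum State {
        USE_FRAME_AS_IS,        // frame->method and frame->bcp check out
        PUSHED_NOT_INITIALIZED, // use the previous frame (senderSP()/senderFP()) as current
        NOT_YET_PUSHED          // treat the current frame as compiled; replace PC with RAX
    }

    /**
     * @param methodAndBcpValid       result of the frame->method / frame->bcp checks
     * @param returnAddrFromStackOrRax return address taken from the stack or RAX,
     *                                 depending on how far along the entry code is
     * @param frameReturnAddr          what is stored in frame->return_addr
     */
    static State classify(boolean methodAndBcpValid,
                          long returnAddrFromStackOrRax,
                          long frameReturnAddr) {
        if (methodAndBcpValid) {
            return State.USE_FRAME_AS_IS;
        }
        if (returnAddrFromStackOrRax == frameReturnAddr) {
            // The frame was pushed, but method and bcp are not filled in yet.
            return State.PUSHED_NOT_INITIALIZED;
        }
        // Return addresses differ: assume the new frame has not been pushed.
        return State.NOT_YET_PUSHED;
    }

    public static void main(String[] args) {
        System.out.println(classify(true, 0x10L, 0x20L));
        System.out.println(classify(false, 0x10L, 0x10L));
        System.out.println(classify(false, 0x10L, 0x20L));
    }
}
```

Keeping the outcome as an explicit enum makes it easy to see that every failure of the first check falls into exactly one of the two recovery cases.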
Comments in the code pretty well explain all the above, so it is probably easier to follow the logic in the code along with the comments rather than apply my above description to the code.
I should add that it's very rare that we ever get into this special error handling code. This bug was very hard to reproduce initially. I was only able to make progress with reproducing and debugging by inserting delay loops in various spots in the code generated by TemplateInterpreterGenerator::generate_normal_entry(). By doing this I was able to reproduce the issue quite easily and hit all the logic in the new code I've added.
The fix is basically entirely contained within AMD64CurrentFrameGuess.java. The rest of the changes are minor:
src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/amd64/AMD64CurrentFrameGuess.java
- Main fix for CR
src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/x86/X86Frame.java
- Added getInterpreterFrameBCP(), which is now needed by AMD64CurrentFrameGuess.java
- I also simplified some code by using the existing getInterpreterFrameMethod() rather than replicating inline what it does.
src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/amd64/BsdAMD64CFrame.java
- I noticed the windows version of this code had some extra checks that were missing from the bsd version. I then looked at the linux version, but it had been heavily modified a short while back to leverage DWARF info to determine frames. So I looked at the previous rev and it too had these extra checks. I decided to add them to the BSD port. I'm not sure if it helps at all, but it certainly doesn't seem to do any harm.

thanks,

Chris
