I think your fix is a reasonable way to resolve the NPE. We certainly shouldn't try to create a ScopeDesc from an invalid location. Maybe getPCDescNearDbg should try to do a better job finding a PcDesc with a valid scope_decode_offset instead? Something like this maybe?

diff -r c414c554b38b src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/code/NMethod.java
--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/code/NMethod.java
+++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/code/NMethod.java
@@ -306,12 +306,16 @@
   /** This is only for use by the debugging system, and is only
       intended for use in the topmost frame, where we are not
       guaranteed to be at a PC for which we have a PCDesc. It finds
-      the PCDesc with realPC closest to the current PC. */
+ the PCDesc with realPC closest to the current PC that has a valid scopeDecodeOffset. */
   public PCDesc getPCDescNearDbg(Address pc) {
     PCDesc bestGuessPCDesc = null;
     long bestDistance = 0;
for (Address p = scopesPCsBegin(); p.lessThan(scopesPCsEnd()); p = p.addOffsetTo(pcDescSize)) {
       PCDesc pcDesc = new PCDesc(p);
+ if (pd.getScopeDecodeOffset() == DebugInformationRecorder.SERIALIZED_NULL) { + // We've observed a serialized null decode offset so ignore this PcDesc
+        continue;
+      }
       // In case pc is null
       long distance = -pcDesc.getRealPC(this).minus(pc);
       if ((bestGuessPCDesc == null) ||

As far as I can see getPCDescNearDbg is only used by getScopeDescNearDbg so the semantic change seems reasonable to me.

tom

Chris Plummer wrote on 10/3/19 9:19 AM:
Hi Severin,

I've cc'd Tom Rodriquez who said he would have a look.

thanks,

Chris

On 10/1/19 12:51 PM, Chris Plummer wrote:
Hi Severin,

Sorry, this is not an area that I have any expertise in. However, I did confirm that it fixes the NPE I was seeing with JShellHeapDumpTest.java, which brings up a question. You said this happens with -Xcomp, but I was never using -Xcomp. Might it also be triggered without -Xcomp?

thanks,

Chris

On 10/1/19 1:58 AM, Severin Gehwolf wrote:
Anyone? Chris maybe?

On Fri, 2019-09-27 at 14:12 +0200, Severin Gehwolf wrote:
Hi,

Could I please get reviews for this SA fix? The issue only happens
intermittently and with -Xcomp. The new regression test reproduces the
issue somewhat reliably. I got 10/10 fails for unpatched, but I've seen
it pass as well.

When the issue happens, PCDesc's getScopeDecodeOffset() returs 0
(DebugInformationRecorder.SERIALIZED_NULL). The current SA code doesn't
handle this case and goes on and tries to read ScopeDesc from the
DebugInfoReadStream at the bogus offset. From then on, bad things
happen. A NPE in StackTrace could be one symptom.

The same code in hotspot deals with serialized null differently. It
doesn't read from the debug info stream, and manually sets up a
reasonable frame. Note decode_body is called from ScopeDesc's
constructor where decode_offset might have been set to 0:

void ScopeDesc::decode_body() {
   if (decode_offset() == DebugInformationRecorder::serialized_null) {
     // This is a sentinel record, which is only relevant to
     // approximate queries.  Decode a reasonable frame.
     _sender_decode_offset = DebugInformationRecorder::serialized_null;
     _method = _code->method();
     _bci = InvocationEntryBci;
     _locals_decode_offset = DebugInformationRecorder::serialized_null;
     _expressions_decode_offset = DebugInformationRecorder::serialized_null;      _monitors_decode_offset = DebugInformationRecorder::serialized_null;
   } else {
     // decode header
     DebugInfoReadStream* stream  = stream_at(decode_offset());

     _sender_decode_offset = stream->read_int();
     _method = stream->read_method();
     _bci    = stream->read_bci();

     // decode offsets for body and sender
     _locals_decode_offset      = stream->read_int();
     _expressions_decode_offset = stream->read_int();
     _monitors_decode_offset    = stream->read_int();
   }
}

The proposed patch handles serialized null scopes similar to the
hotspot side of things, by returning a null scope. CompiledVFrame
already deals with null scopes when in debugging mode.

Bug: https://bugs.openjdk.java.net/browse/JDK-8196969
webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8196969/03/webrev/

Testing: tier 1 tests on Linux x86_64 (release/fastdebug). jdk-submit
and ran various reproducer tests including 1000 interations of the
added regression test. All pass.

Thoughts?

Thanks,
Severin



Reply via email to