Hi all,

On Mon, 2018-07-30 at 21:23 -0700, Sharath Ballal wrote:
> > Ok. It looks like we don't even have a "jstack --mixed" test. Could
> > you add one? It would be even better if the test included a JNI lib
> > that wasn't compiled with -fno-omit-frame-pointer so you don't need
> > to rely on glibc to reproduce this issue (or is glibc  pretty much
> > always compiled without -fno-omit-frame-pointer)? Or if Sharath
> > agrees, file a bug to have a test added.
> 
> That’s a good suggestion.  Severin you can either write a test or
> open a bug for it.

Latest webrev with a test:
http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.02/

The test fails prior the test on affected systems and passes after.
There are still issues with getting the test's JNI properly compiled
the way it's supposed to. I've asked for help on build-dev[1]. Example
runs:

Before patch:
http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/before_patch.txt

After patch:
http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/after_patch.txt

Thanks,
Severin

[1] http://mail.openjdk.java.net/pipermail/build-dev/2018-August/022819.html

> -----Original Message-----
> From: Chris Plummer 
> Sent: Monday, July 30, 2018 10:03 PM
> To: Severin Gehwolf; Sharath Ballal; serviceability-dev
> Subject: Re: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
> UnmappedAddressException on i686
> 
> Hi Severin,
> 
> On 7/30/18 1:28 AM, Severin Gehwolf wrote:
> > Hi Chris,
> > 
> > On Thu, 2018-07-26 at 14:07 -0700, Chris Plummer wrote:
> > > I had looked at this review when it came out, but was hesitant to
> > > ok 
> > > it because I really don't know this code at all. If you can get 
> > > another reviewer who does know the code, then I'll approve it.
> > 
> > Sharath Ballal reviewed it, but he's not a Reviewer as per the
> > OpenJDK 
> > census. As to whether he knows the code, I don't know. He's on CC.
> 
> Yes, but I was asking for a second reviewer (not counting me).
> > 
> > > This only impacts 32-bit, right? If so, keep in mind that it
> > > won't 
> > > get tested by Oracle testing, including the submit repo, so make
> > > sure you do thorough testing.
> > 
> > It only impacts 32-bit, yes. I understand that Oracle isn't
> > testing 
> > 32- bit x86 any more. The change itself should be fairly low risk 
> > since it's changing only a 32-bit-x86-linux-only file and the
> > native 
> > bits don't seem to match what the Java code does[1].
> > REG_INDEX(reg) 
> > being defined as:
> > 
> > #define REG_INDEX(reg) 
> > sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg
> > 
> > and being used as:
> > 
> > REG_INDEX(SP)
> > 
> > Thus, using
> > 
> > sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP
> > 
> > The Java code uses:
> > 
> > sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP
> > 
> > > Also, why is there any code being executed that was not compiled
> > > with 
> > > -fno-omit-frame-pointer? The description in the CR just shows a 
> > > simple java program reproducing this, so all the mixed stack
> > > traces 
> > > belong to the JVM and libs, and I thought we made sure to compile
> > > all 
> > > of them with -fno-omit-frame-pointer.
> > 
> > The JVM uses glibc and that simple program is enough to see some 
> > thread's stack currently being in a glibc function when getting a 
> > mixed stack trace. We've originally seen this in JDK 8 with jstack
> > -m 
> > and was reported in [2]. That comment has more details. The
> > problem 
> > here isn't that it's a JDK lib which gets compiled without 
> > -fno-omit-frame- pointer. It's glibc not being compiled with that
> > option.
> > 
> > An example stack trace for a system where this happens looks like
> > this:
> > 
> > Thread 7 (Thread 0xa3863b40 (LWP 834)):
> > #0  0xf771f430 in __kernel_vsyscall ()
> > #1  0xf7703acc in futex_abstimed_wait (cancel=true,
> > private=<optimized 
> > out>, abstime=0x0, expected=1, futex=0xf770f000) at 
> > ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
> > #2  do_futex_wait (sem=0xf770f000, sem@entry=0xf70ea854 <sig_sem>, 
> > abstime=0x0) at
> > ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226
> > #3  0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 <sig_sem>, 
> > abstime=0x0) at
> > ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407
> > #4  0xf6cc18d4 in check_pending_signals (wait=true) at 
> > /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-
> > 8.b10.el7_5.i386/openjdk/h
> > otspot/src/os/linux/vm/os_linux.cpp:2522
> > #5  0xf6cbc632 in signal_thread_entry (thread=0xa37a4800, 
> > __the_thread__=0xa37a4800) at 
> > /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-
> > 8.b10.el7_5.i386/openjdk/h
> > otspot/src/share/vm/runtime/os.cpp:250
> > 
> > That is, frames 0-3 are JDK foreign. This bug will happen on all 
> > systems which use any native library which isn't compiled with
> > -fno- 
> > omit-frame-pointer. Be it glibc or some other library.
> 
> Ok. It looks like we don't even have a "jstack --mixed" test. Could
> you add one? It would be even better if the test included a JNI lib
> that wasn't compiled with -fno-omit-frame-pointer so you don't need
> to rely on glibc to reproduce this issue (or is glibc pretty much
> always compiled without -fno-omit-frame-pointer)? Or if Sharath
> agrees, file a bug to have a test added.
> 
> thanks,
> 
> Chris
> > 
> > Thanks,
> > Severin
> > 
> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c9
> > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c4
> > 
> > > thanks,
> > > 
> > > Chris
> > > 
> > > On 7/26/18 10:11 AM, Severin Gehwolf wrote:
> > > > On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote:
> > > > > Changes looks good Severin.
> > > > 
> > > > Thanks for the review, Sharath!
> > > > 
> > > > > I am not a reviewer though, so you still need a Reviewer to
> > > > > review.
> > > > 
> > > > Anyone?
> > > > 
> > > > Thanks,
> > > > Severin
> > > > 
> > > > > -----Original Message-----
> > > > > From: Severin Gehwolf [mailto:sgehw...@redhat.com]
> > > > > Sent: Thursday, July 26, 2018 1:04 PM
> > > > > To: serviceability-dev
> > > > > Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed
> > > > > throws 
> > > > > UnmappedAddressException on i686
> > > > > 
> > > > > On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > Could I please get a review of this one-liner change
> > > > > > related to 
> > > > > > jhsdb --mixed when attaching to a running Java process? The
> > > > > > issue 
> > > > > > arises when threads are in native code and that native code
> > > > > > has 
> > > > > > frame pointers not properly preserved. In such a case the
> > > > > > SA 
> > > > > > performs a simple frame pointer valididy check: ebp >= esp
> > > > > > 
> > > > > > However, the code of retrieving the value for esp is
> > > > > > incorrect in 
> > > > > > as much as it's not in sync with native code in regards to
> > > > > > the 
> > > > > > register
> > > > > > index:
> > > > > > 
> > > > > > native code => X86ThreadContext.SP
> > > > > > Java code   => X86ThreadContext.ESP
> > > > > > 
> > > > > > X86ThreadContext.ESP is never being set by the native code.
> > > > > > Since
> > > > > > X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP)
> > > > > > then 
> > > > > > returns null, ebp.lessThan(esp) wrongly returns false
> > > > > > causing the 
> > > > > > issue. This webrev fixes it by using SP as index on the
> > > > > > Java side.
> > > > > > Thoughts?
> > > > > > 
> > > > > > webrev:
> > > > > > 
http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01
> > > > > > /
> > > > > > bug: https://bugs.openjdk.java.net/browse/JDK-8208091
> > > > > 
> > > > > Anyone willing to review this one-liner?
> > > > > 
> > > > > Thanks,
> > > > > Severin
> > > > > 
> > > > > > Thanks,
> > > > > > Severin
> 
> 

Reply via email to