Hello,
Can I get a second reviewer, please? As noted below, it's easier to look
at webrev.00 first to just see the coding changes. webrev.01 just adds
some updated comments.
thanks,
Chris
On 7/8/20 2:07 PM, Kevin Walls wrote:
Thanks Chris, it's a bit of clutter, but truthful clutter. 8-)
On 08/07/2020 20:26, Chris Plummer wrote:
Webrev has been updated with the suggested comment changes. Note to
new reviewers, look in webrev.00 first since it doesn't have the
clutter of the comment changes, making it easier to see which lines
actually have code changes.
http://cr.openjdk.java.net/~cjplummer/8247272/webrev.01/index.html
thanks,
Chris
On 7/8/20 11:04 AM, Chris Plummer wrote:
Hi Kevin,
Thanks for the review. I'll add the additional Elf64_Addr and
Elf64_Off comments. Probably the others should be updated too.
Although they are the same size, they do have different names. For
example:
/* Type for a 16-bit quantity. */
typedef uint16_t Elf32_Half;
typedef uint16_t Elf64_Half;
thanks,
Chris
On 7/8/20 3:47 AM, Kevin Walls wrote:
Hi Chris --
This is a great story/history lesson.
You could, if you like, edit those comments in ElfFileParser.java so
they don't say just "Elf32_Addr", as they will contain either
"Elf64_Addr or Elf32_Addr"; similarly for Elf64_Off. The other Elf64
fields are the same as the 32-bit ones.
Yes, the symbol fields are ordered differently.
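For readers who have not looked at the two symbol table layouts, here is a rough Java sketch of the ordering difference. It follows the Elf32_Sym and Elf64_Sym definitions in the ELF specification; the class name ElfSymbolSketch and its read() method are made up for illustration and are not taken from ElfFileParser.java.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of reading one symbol table entry; not the SA's code. */
final class ElfSymbolSketch {
    final int name;    // st_name: index into the string table
    final long value;  // st_value: Elf32_Addr or Elf64_Addr
    final long size;   // st_size: 32-bit or 64-bit
    final int info;    // st_info
    final int other;   // st_other
    final int shndx;   // st_shndx

    private ElfSymbolSketch(int name, long value, long size, int info, int other, int shndx) {
        this.name = name;
        this.value = value;
        this.size = size;
        this.info = info;
        this.other = other;
        this.shndx = shndx;
    }

    /** Reads one entry at the buffer's position: 16 bytes for ELF32, 24 bytes for ELF64. */
    static ElfSymbolSketch read(ByteBuffer buf, boolean is64Bit) {
        int name = buf.getInt();
        if (is64Bit) {
            // Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size
            int info = buf.get() & 0xff;
            int other = buf.get() & 0xff;
            int shndx = buf.getShort() & 0xffff;
            long value = buf.getLong();
            long size = buf.getLong();
            return new ElfSymbolSketch(name, value, size, info, other, shndx);
        }
        // Elf32_Sym: st_name, st_value, st_size, st_info, st_other, st_shndx
        long value = Integer.toUnsignedLong(buf.getInt());
        long size = Integer.toUnsignedLong(buf.getInt());
        int info = buf.get() & 0xff;
        int other = buf.get() & 0xff;
        int shndx = buf.getShort() & 0xffff;
        return new ElfSymbolSketch(name, value, size, info, other, shndx);
    }
}
```

The caller is assumed to have already set the buffer's byte order to match the file.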
So all looks good to me!
Thanks
Kevin
On 08/07/2020 07:20, Chris Plummer wrote:
Hello,
Please help review the following:
http://cr.openjdk.java.net/~cjplummer/8247272/webrev.00/index.html
https://bugs.openjdk.java.net/browse/JDK-8247272
The short story is that SA address to native symbol name
mapping/lookup has never worked on 64-bit, and this is due to the
java level ELF file support only supporting 32-bit. This CR fixes
that, and I believe also maintains 32-bit compatibility, although
I have no way of testing that.
There is more to the story however on how we got here. Before
going into the gory detail below, I just want to point out that
currently nothing is using this support, and therefore it is
technically not fixing anything, although I did verify that the
fixes work (see details below). Also, I intend to remove all the
java level ELF file support as part of JDK-8247516 [1]. The only
reason I want to push these changes first is because I already did
the work to get it working with 64-bit, and would like to get it
archived before removing it in case for some reason it is revived
in the future.
Now for the ugly details on how we got here (and you really don't
need to read this unless you have any concerns with what I stated
above). It starts with the clhsdb "whatis" command, which was the
only (indirect) user of this java level ELF file support. Its
implementation is in javascript, so we have not had access to it
ever since JDK9 module support broke the SA javascript support
(and javascript support is now removed). I started the process of
converting "whatis" to java. It is basically the same as the
clhsdb "findpc" command, except it also checks for native symbols,
which it does with the following code:
var dso = loadObjectContainingPC(addr);
var sym = dso.closestSymbolToPC(addr);
return sym.name + '+' + sym.offset;
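As a rough illustration of what a closest-symbol lookup like the one above amounts to, here is a minimal Java sketch. It is not the PointerFinder or DSO code; the class ClosestSymbolSketch, its add() method, and the hex offset format are assumptions made for the example. It keeps symbols sorted by start address and picks the nearest one at or below the pc.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of an address-to-symbol lookup; not the SA's DSO code. */
final class ClosestSymbolSketch {
    // Symbol start address -> symbol name, ordered as unsigned addresses.
    private final TreeMap<Long, String> symbols = new TreeMap<>(Long::compareUnsigned);

    void add(long address, String name) {
        symbols.put(address, name);
    }

    /** Returns "name+0xoffset" for the nearest symbol at or below pc, or null if there is none. */
    String closestSymbolToPC(long pc) {
        Map.Entry<Long, String> entry = symbols.floorEntry(pc);
        if (entry == null) {
            return null;
        }
        return entry.getValue() + "+0x" + Long.toHexString(pc - entry.getKey());
    }

    public static void main(String[] args) {
        ClosestSymbolSketch dso = new ClosestSymbolSketch();
        dso.add(0x1000L, "foo");
        dso.add(0x1040L, "bar");
        System.out.println(dso.closestSymbolToPC(0x1044L)); // prints bar+0x4
    }
}
```

A real lookup would also want to check the symbol's size so that an address past the end of the last symbol is not attributed to it.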
Converting this to java was trivial. I just stuck support for it
in the PointerFinder class, which is what findpc relies on.
However, it always failed to look up a symbol. I found
that DSO.closestSymbolToPC() called into the java level ELF
support, and that was failing badly. After some debugging I
noticed that the values read in for various ELF headers were
mostly garbage. It then occurred to me that it was reading in
32-bit values that probably needed to be 64-bit. Sure enough, this
code was never converted to 64-bit support. I then went and tried
"whatis" on JDK8, the last version where it was available, and it
failed there also with 64-bit binaries. So this is why I initially
fixed it to work with 64-bit, and also how I tested it (using the
modified findpc on a native symbol). But the story continues...
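To make the 32-bit versus 64-bit problem concrete, here is a small Java sketch of reading the first few ELF header fields with the width chosen from e_ident[EI_CLASS]. The field order and constants come from the ELF specification; the class ElfHeaderSketch is invented for this example and is not the code in the webrev.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch of width-aware ELF header parsing; not the SA's ElfFileParser. */
final class ElfHeaderSketch {
    static final int EI_CLASS = 4;     // e_ident index of the file class
    static final int EI_DATA = 5;      // e_ident index of the data encoding
    static final int ELFCLASS32 = 1;
    static final int ELFCLASS64 = 2;
    static final int ELFDATA2MSB = 2;  // big-endian encoding

    final boolean is64Bit;
    final long entry;  // e_entry: Elf32_Addr or Elf64_Addr
    final long phoff;  // e_phoff: Elf32_Off or Elf64_Off
    final long shoff;  // e_shoff: Elf32_Off or Elf64_Off

    ElfHeaderSketch(byte[] header) {
        if (header.length < 16 || header[0] != 0x7f || header[1] != 'E'
                || header[2] != 'L' || header[3] != 'F') {
            throw new IllegalArgumentException("not an ELF file");
        }
        int fileClass = header[EI_CLASS];
        if (fileClass != ELFCLASS32 && fileClass != ELFCLASS64) {
            throw new IllegalArgumentException("unknown ELF class " + fileClass);
        }
        is64Bit = fileClass == ELFCLASS64;
        ByteBuffer buf = ByteBuffer.wrap(header).order(
                header[EI_DATA] == ELFDATA2MSB ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        // Skip e_ident (16 bytes), e_type (2), e_machine (2) and e_version (4).
        buf.position(16 + 2 + 2 + 4);
        entry = readAddrOrOff(buf);
        phoff = readAddrOrOff(buf);
        shoff = readAddrOrOff(buf);
    }

    // Always reading 4 bytes here is what turns 64-bit headers into garbage:
    // every later field is then read from the wrong offset.
    private long readAddrOrOff(ByteBuffer buf) {
        return is64Bit ? buf.getLong() : Integer.toUnsignedLong(buf.getInt());
    }
}
```

The remaining header fields (e_flags through e_shstrndx) have the same sizes in both classes, so only the address and offset fields need this treatment.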
DSO.java, and as a consequence the java ELF file support, is used
by all our posix ports to do address to symbol lookups. So I
figured that after fixing the java level ELF file support for
64-bit, my improved findpc would start working on OSX also. No
such luck, and for obvious reasons. OSX uses Mach-O files. This
ELF code should never have been used for it, and of course has
never worked.
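For illustration only, the format mismatch can be detected from the first four bytes of the file. The sketch below uses the ELF magic and the thin Mach-O magic numbers (MH_MAGIC, MH_MAGIC_64 and their byte-swapped forms); the class ObjectFormatSketch is hypothetical, nothing like it exists in the SA, and fat/universal binaries are deliberately not handled.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of telling ELF from Mach-O by magic number; not SA code. */
final class ObjectFormatSketch {
    enum Format { ELF, MACH_O, UNKNOWN }

    /** Classifies a file from its first four bytes, read as a big-endian int. */
    static Format detect(byte[] firstBytes) {
        if (firstBytes.length < 4) {
            return Format.UNKNOWN;
        }
        int magic = ByteBuffer.wrap(firstBytes).getInt(); // ByteBuffer is big-endian by default
        switch (magic) {
            case 0x7f454c46: // 0x7f 'E' 'L' 'F'
                return Format.ELF;
            case 0xfeedface: // MH_MAGIC
            case 0xfeedfacf: // MH_MAGIC_64
            case 0xcefaedfe: // MH_CIGAM (byte-swapped MH_MAGIC)
            case 0xcffaedfe: // MH_CIGAM_64 (byte-swapped MH_MAGIC_64)
                return Format.MACH_O;
            default:
                return Format.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        byte[] elf = {0x7f, 'E', 'L', 'F'};
        System.out.println(detect(elf)); // prints ELF
    }
}
```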
So I was left trying to figure out how to do OSX address to native
symbol lookups. I then recalled that there was a
CFrame.closestSymbolToPC() API that did address to native symbol
lookups for native stack traces, and wondered how it was ever
working (even on linux with the broken ELF 64-bit support). It
turns out this takes a very different path to do the lookups,
ending up in native code in libsaproc, where we also have ELF file
support. I then converted DSO.closestSymbolToPC(addr) to use this
libsaproc code instead, and it worked fine. So now there was no
need for the java level ELF file support since its only user was
DSO.closestSymbolToPC(addr). I should also add that this is the
approach that has always been used on windows, with both
CFrame.closestSymbolToPC() and DSO.closestSymbolToPC(addr) using
the same libsaproc support.
There is still a bit more to the story. After diverting
DSO.closestSymbolToPC(addr) to the libsaproc lookup code, it still
didn't work for OSX. I thought it would just work since the native
BsdDebuggerLocal.lookupByName0() is implemented, and it seems to
trickle down to the proper lower level APIs to find the symbol,
but there were two issues. The first is that for processes there
is no support for looking up all the libraries and populating the
list of ps_prochandle structures that are used to do the symbol
lookups. This was just never implemented (which is also why PMap does
not work for OSX processes). For core files the ps_prochandle
structs are there, but the lookup code was badly broken. That has
now been fixed by JDK-8247515 [2], currently out for review. So
the end result is we'll have address to native symbol lookup for
everything but OSX processes.
If you're still here, thanks for listening!
Chris
[1] https://bugs.openjdk.java.net/browse/JDK-8247516
[2] https://bugs.openjdk.java.net/browse/JDK-8247515