> When printing the native stack trace on Linux (mostly done for hs_err files),
> it only prints the method with its parameters and a relative offset in the
> method:
>
>     Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k
>     Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>     V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64
>     V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec
>     V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899
>     V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df
>     V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69
>     V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d
>     V [libjvm.so+0x12091c9] JavaThread::run()+0x167
>     V [libjvm.so+0x1206ada] Thread::call_run()+0x180
>     V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f
>
> This sometimes makes it difficult to see exactly where the methods were
> called from, and it becomes almost impossible when the same method is
> invoked several times within one method.
>
> This patch improves this by adding source information (filename + line
> number) to the native stack traces on Linux, similar to what's already done on
> Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)):
>
>     Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k
>     Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>     V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607)
>     V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250)
>     V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291)
>     V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966)
>     V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59)
>     V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297)
>     V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280)
>     V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358)
>     V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705)
>
> For Linux, we need to parse the debug symbols which are generated by GCC in
> DWARF - a standardized debugging format. This patch adds support for DWARF 4,
> the default of GCC 10.x, for 32-bit and 64-bit architectures (tested with
> x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still
> experimental and not generated for HotSpot. However, newer GCC versions may
> soon generate DWARF 5 by default, in which case this parser either needs to be
> extended or the HotSpot build needs to be configured to emit only DWARF 4.
>
> The code follows the parsing steps described in the official DWARF 4 spec:
> https://dwarfstd.org/doc/DWARF4.pdf
> I added references to the corresponding sections throughout the code.
> However, I tried to explain the steps from the DWARF spec directly in the
> code (method names, comments etc.). This makes it possible to follow the code
> without having to dive deep into the spec.
>
> The comments at the `Dwarf` class in the `elf.hpp` file explain in more
> detail how a DWARF file is structured and how the parsing algorithm works to
> get to the filename and line number information. There are more class
> comments throughout the `elf.hpp` file about how different DWARF sections are
> structured and how the parsing algorithm needs to fetch the required
> information. Therefore, I will not repeat the exact workings of the algorithm
> here but refer to the code comments. I've tried to add as much information as
> possible to improve the readability.
>
> Generally, I've tried to stay away from adding any assertions as this code is
> almost always executed when already processing a VM error. Instead, the DWARF
> parser aims to exit gracefully and possibly omit source information for a
> stack frame rather than risk aborting the hs_err file because an assertion
> failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or
> `trace`, which provides logging messages throughout parsing.
>
> **Testing:**
> Apart from manual testing, I've added two kinds of tests:
> - A JTreg test: Spawns new VMs to let them crash in various ways. The test
>   reads the created hs_err files to check if the DWARF parsing could correctly
>   find the filename and line number. For normal HotSpot files, I could not
>   check against hardcoded filenames and line numbers as they are subject to
>   change (line numbers in particular can change quickly). I therefore just
>   added some sanity checks in the form of "found a non-empty file" and "found
>   a non-zero line number". On top of that, I added tests that let the VM
>   crash in custom C files (which will not change).
>   This enables an additional verification against hardcoded filenames and
>   line numbers.
> - Gtests: Directly calling the `get_source()` method which initiates DWARF
>   parsing. Tested some special cases, for example, having a buffer that is
>   not big enough to store the filename.
>
> On top of that, there are also existing JTreg tests that call
> `-XX:NativeMemoryTracking=detail` which will print a native stack trace with
> the new source information. These tests were also run as part of the standard
> tier testing and can be considered sanity tests for this implementation.
>
> To make tests work in our infrastructure, or for other setups that want to
> have debug symbols at different locations, I've added support for an
> additional `_JVM_DWARF_PATH` environment variable. This variable can specify
> a path from which the DWARF symbol file should be read by the parser if the
> default locations do not contain debug symbols (required some `make`
> changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`.
> The JTreg test, however, also works if there are no symbols available. In
> that case, the test just skips all the assertion checks for the filename and
> line number.
>
> I haven't run any specific performance testing as this new code is mainly
> executed when an error is about to exit the VM and only if symbol files are
> available (which is normally not the case when using Java release builds as a
> user).
>
> Special thanks to @tschatzl for giving me some pointers to start based on his
> knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing
> approaches on how to retrieve the source information, and to @erikj79 for
> providing help with the changes required for `make`!
>
> Thanks,
> Christian
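
For readers who want a concrete picture of the `_JVM_DWARF_PATH` fallback described in the quoted text, the lookup order could be sketched roughly as below. This is only an illustration under assumptions, not the code from the patch: the function name `find_debuginfo`, its parameters, the file name `libjvm.debuginfo` used in the usage note, and the single-directory lookup are all hypothetical, and the actual search order and subdirectories in HotSpot may differ.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper (not HotSpot code): returns the first debug symbol file
// that exists, or an empty string if none is found.
// - default_candidates: full paths tried first, in order.
// - debuginfo_name:     file name looked up in the fallback directory (assumed).
// - dwarf_path_env:     value of _JVM_DWARF_PATH, or nullptr if it is unset.
// - exists:             predicate telling whether a path is a readable file.
std::string find_debuginfo(
    const std::vector<std::string>& default_candidates,
    const std::string& debuginfo_name,
    const char* dwarf_path_env,
    const std::function<bool(const std::string&)>& exists) {
  // The default locations always win over the environment variable.
  for (const std::string& candidate : default_candidates) {
    if (exists(candidate)) {
      return candidate;
    }
  }
  // Fallback: consult the directory from the environment variable, if set.
  if (dwarf_path_env != nullptr && dwarf_path_env[0] != '\0') {
    const std::string fallback =
        std::string(dwarf_path_env) + "/" + debuginfo_name;
    if (exists(fallback)) {
      return fallback;
    }
  }
  // No symbols available: report "not found" instead of asserting, so the
  // caller can simply omit the source information for the frame.
  return std::string();
}
```

A caller would pass `std::getenv("_JVM_DWARF_PATH")` (from `<cstdlib>`) as `dwarf_path_env` and a real file check as `exists`. Passing the predicate in keeps the sketch free of file system access, and returning an empty string mirrors the "exit gracefully" approach described in the quoted text.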

Christian Hagedorn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 68 commits:

 - Exclude TestDwarf.java when run with product because TraceDwarf is a develop flag
 - Merge branch 'master' into JDK-8242181
 - Merge branch 'master' into JDK-8242181
 - Fix TestDwarf for older GCC versions
 - Change logging from UL to tty based with new TraceDwarfLevel develop flag
 - Add support to parse the .debug_line section in DWARF 2 as emitted by GCC 8, add some comments
 - Merge branch 'master' into JDK-8242181
 - Merge branch 'master' into JDK-8242181
 - Merge branch 'master' into JDK-8242181
 - Fix build, add GCC flag gdwarf-4 to exclude DWARF 5, add assertions
 - ... and 58 more: https://git.openjdk.org/jdk/compare/c2ccf4ca...cb030f10

-------------

Changes: https://git.openjdk.org/jdk/pull/7126/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7126&range=11
  Stats: 2781 lines in 18 files changed: 2684 ins; 41 del; 56 mod
  Patch: https://git.openjdk.org/jdk/pull/7126.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/7126/head:pull/7126

PR: https://git.openjdk.org/jdk/pull/7126