On Thu, 27 Jan 2022 09:26:50 GMT, Thomas Stuefe <stu...@openjdk.org> wrote:
> Hi Christian, this is very nice and useful!

Thanks Thomas!

> Two general remarks. One concern I have is that the new functionality should
> be super stable, since nothing is more annoying than to crash during stack
> dumping in hs-err file; I much rather have a call stack without bells and
> whistles than an abridged one. Maybe we could, in hs-err printing, if we got
> secondary crashes during callstack dumping, repeat the step with all optional
> features (also name demangling) disabled? This could also be done in a
> separate RFE. We'll know when this happens, we can react then.

I absolutely agree - stability should be the primary concern. An incomplete hs-err file should be avoided at any cost. Doing an additional "catch and repeat without optional features" sounds interesting to get more safety. Would such a thing be easy to add? Yes, it might be better to do that in a separate RFE.

> Another small concern, we parse the Elf file while dumping the stack, right?
> I remember having a lot of problems on Solaris when dumping callstacks,
> because there parsing the elf file was really slow. And that delayed call
> stack printing by a lot, so much that the ErrorCrashTimeout often kicked in
> and spoiled the crash logs for us.

Yes, a pc for a frame is directly parsed when printing the corresponding frame. It takes some more time to do the additional parsing but not that much.
These are the timestamps from a quick `-XX:CICrashAt=1` run with `-Xlog:dwarf=info` on my local machine on `Ubuntu 20.04` with a `fastdebug` build:

[1.862s][info][dwarf] Open DWARF file: /home/christian/Downloads/test/jdk-19/fastdebug/lib/server/libjvm.debuginfo
[1.867s][info][dwarf] pc: 0x00007ffa35c8a9cf, offset: 0x007749cf, filename: c1_Compiler.cpp, line: 250
[1.871s][info][dwarf] pc: 0x00007ffa35fbfb28, offset: 0x00aa9b28, filename: compileBroker.cpp, line: 2291
[1.876s][info][dwarf] pc: 0x00007ffa35fc08e8, offset: 0x00aaa8e8, filename: compileBroker.cpp, line: 1966
[1.881s][info][dwarf] pc: 0x00007ffa36e50cca, offset: 0x0193acca, filename: thread.cpp, line: 1297
[1.890s][info][dwarf] pc: 0x00007ffa36e59010, offset: 0x01943010, filename: thread.cpp, line: 358
[1.897s][info][dwarf] pc: 0x00007ffa36b3c524, offset: 0x01626524, filename: os_linux.cpp, line: 705

Parsing a single pc takes a little less than 0.01s here: the gaps between consecutive log lines are 4 to 9 ms. Of course, this is not a great way to measure performance. The result also depends heavily on the source files themselves, the machine setup etc., so it cannot be considered a valid performance test. Still, I think these numbers give us some indication of the order of magnitude. Compared to the current `ErrorLogTimeout` default value of 2min this looks promising.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7126
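For anyone who wants to redo the arithmetic behind the per-pc estimate, here is a minimal sketch in Python. It assumes that the gap between two consecutive `-Xlog:dwarf=info` lines approximates the time spent parsing one pc, which ignores logging overhead and any other work done between frames. The helper names `uptime_ms` and `deltas_ms` are made up for this illustration and are not part of the JDK.

```python
import re

# Log lines copied from the -Xlog:dwarf=info output quoted in this mail.
LOG_LINES = [
    "[1.862s][info][dwarf] Open DWARF file: /home/christian/Downloads/test/jdk-19/fastdebug/lib/server/libjvm.debuginfo",
    "[1.867s][info][dwarf] pc: 0x00007ffa35c8a9cf, offset: 0x007749cf, filename: c1_Compiler.cpp, line: 250",
    "[1.871s][info][dwarf] pc: 0x00007ffa35fbfb28, offset: 0x00aa9b28, filename: compileBroker.cpp, line: 2291",
    "[1.876s][info][dwarf] pc: 0x00007ffa35fc08e8, offset: 0x00aaa8e8, filename: compileBroker.cpp, line: 1966",
    "[1.881s][info][dwarf] pc: 0x00007ffa36e50cca, offset: 0x0193acca, filename: thread.cpp, line: 1297",
    "[1.890s][info][dwarf] pc: 0x00007ffa36e59010, offset: 0x01943010, filename: thread.cpp, line: 358",
    "[1.897s][info][dwarf] pc: 0x00007ffa36b3c524, offset: 0x01626524, filename: os_linux.cpp, line: 705",
]

# Matches the leading uptime decoration, e.g. "[1.862s]".
UPTIME = re.compile(r"^\[(\d+\.\d+)s\]")


def uptime_ms(line):
    """Return the uptime decoration of one log line in whole milliseconds."""
    match = UPTIME.match(line)
    if match is None:
        raise ValueError(f"no uptime decoration in: {line!r}")
    return round(float(match.group(1)) * 1000)


def deltas_ms(lines):
    """Return the gaps in ms between consecutive log lines."""
    stamps = [uptime_ms(line) for line in lines]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    gaps = deltas_ms(LOG_LINES)
    print("per-line gaps (ms):", gaps)
    print("total (ms):", sum(gaps))
```

The first gap also covers whatever happens between opening the debuginfo file and logging the first resolved pc, so it is only a rough stand-in for a single lookup.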