On Wed, 13 Apr 2022 15:29:00 GMT, Thomas Stuefe <stu...@openjdk.org> wrote:
>> Move the AsyncGetCallTrace method implementation into a separate method and >> wrap its call in non-assert compilation mode in `os::ThreadCrashProtection` >> like it is done in >> [JFR](https://github.com/openjdk/jdk/blob/965ea8d9cd29aee41ba2b1b0b0c67bb67eca22dd/src/hotspot/share/jfr/periodic/sampling/jfrThreadSampler.cpp#L165). >> This prevents AsyncGetCallTrace from crashing on segmentation faults (but >> not on `guarantee`s). >> >> If a crash is observed, then the `num_frames` field of the trace is set to >> `ticks_unknown_state` (-7) to signal a state that cannot be properly >> handled. `ticks_unknown_state` is currently also used for signaling unknown >> thread states but this should not be a problem, as the semantic is the same. >> If `num_frames` already has an error code then this error code is not >> changed. This helps to distinguish between errors in walking threads in Java >> and non-Java mode, as `num_frames` is set there before the walking to the >> appropriate error code. >> >> _Thanks for @tstuefe for suggesting this._ > > I'm actually ambivalent about the "production only" thing. > > I dislike having different behavior between debug and release, I prefer to > test what later runs in the field. Also, if we prevent crashes because we > want to ignore them, we should ignore them in debug too, otherwise, we burn > error analysis cycles needlessly. > > Optionally, I'd make the behavior of ThreadCrashProtection switchable at > runtime with a diagnostic XX switch. That switch would control the jumping > back in the signal handler. That way, if you encounter strange bugs while > using async profiler, one can disable the crash guard and enable asserts and > crashes. If you decide to do that, we should also do tests for it, so maybe > leave it for a future RFE. What @tstuefe said about debug vs. production made me wonder whether the assert in `ThreadCrashProtection` is still useful at all. I mean, we are deliberately using it from threads other than the JFR sampler thread and if that usage is not breaking stuff left and right, probably we could remove it? (/cc @mgronlun) ------------- PR: https://git.openjdk.java.net/jdk/pull/8225