On Wed, 30 Mar 2022 06:35:27 GMT, Christian Hagedorn <chaged...@openjdk.org> wrote:
>> Hi Christian, >> >> this is impressive work. It's a big change, and I had a look at part of it. >> I'll continue tomorrow. >> >> In general, I'm concerned with the use of both UL and ResourceArea in this >> code. I know the use of UL has been discussed, but still. Use of RA will >> prevent us from getting useful callstacks if we crash and Thread::current is >> NULL or invalid. I'd feel better if we were to consistently rely on an >> outside scratch buffer (like we usually do in error reporting). Even raw >> ::malloc would be better IMHO. >> >> Another concern was safety, since this is a potential attack vector with >> manipulated Dwarf files, if someone manages to provove a crash. Maybe far >> fetched, but still. Would be good to get SonarCloud readings for this code >> e.g. >> >> More remarks inline. >> >> Cheers, Thomas > >> this is impressive work. It's a big change, and I had a look at part of it. >> I'll continue tomorrow. > > Thanks a lot Thomas for your careful review! I'm in the process of working > through your comments and will come back with an update today or later this > week. > >> In general, I'm concerned with the use of both UL and ResourceArea in this >> code. I know the use of UL has been discussed, but still. > > I agree that it is problematic but I think it would be good to keep some > logging around when later coming back to the parser code (and that's the only > reason I think that you ever want to turn these logs on). I can currently > think of two options: > > - Leave UL in and just guard it with an additional new develop flag to > exclude the logs from unfiltered UL logging. This would allow us to kinda > accept the risks for debugging purposes. That's not really a good design > though but we could keep the log levels with their time stamps. > - Replace all UL calls with `tty` and also guard them with a new develop flag > and play around with `Verbose` and `WizardMode` to keep the different log > levels. That's not great either but I think it's safer to use and we only > want the logs on rare occasions anyways - so it might be acceptable to use > these verbose flags even though we should generally get away from them. > >> Use of RA will prevent us from getting useful callstacks if we crash and >> Thread::current is NULL or invalid. I'd feel better if we were to >> consistently rely on an outside scratch buffer (like we usually do in error >> reporting). Even raw ::malloc would be better IMHO. > > The idea of a scratch buffer sounds good. I'll check if I can replace all the > `NEW_RESOURCE_ARRAY` usages with it. > >> Another concern was safety, since this is a potential attack vector with >> manipulated Dwarf files, if someone manages to provove a crash. Maybe far >> fetched, but still. Would be good to get SonarCloud readings for this code >> e.g. > > I was also concerned about that and I'm very thankful that you've spotted > some issues already! I think minimizing the risk of a potential attack should > be a top priority. We should definitely add some more checks. What do you > think about the usage of `_JVM_DWARF_PATH` to load a DWARF file? I'm not sure > how safe it is. I originally had it enabled for debug builds only. > >> We see test errors on Linux ppcle and x64 in gtests: > > Could you try running it with `-Xlog:dwarf=info/debug` in order to find out > why it failed? It might not have found the symbols. Is the JTreg test > `TestDwarf.java` working? But there is now another problem that since using > GCC 11.2 (change done for Oracle builds with > [JDK-8283057](https://bugs.openjdk.java.net/browse/JDK-8283057)), it emits > unsupported DWARF 5 for some DWARF sections, at least on my machine, which is > unfortunate. Maybe that's also the reason you see the failures if you use GCC > 11.2. Maybe we can mitigate this problem by forcing GCC to use DWARF 4 for > now. Could that be done by using the `-gdwarf` GCC flag? @erikj79 > >> We also see Problems in runtime/ErrorHandling and in >> jfr/jvm/TestDumpOnCrash. Mostly, these tests now have much longer runtimes >> (about factor 2). With TestDumpOnCrash, both the error file writer and the >> test itself timeouted on some of our slower machines. > > Are these timeouts on ppcle and x64? We could also try to add > `-Xlog:dwarf=info/debug` to the runs to get some rough idea of the time > required to parse DWARF. I'll have a look at the these tests. > > Thanks, > Christian > @chhagedorn > > > There is still some code that could be shared though like opening a DWARF > > file with its checks or reading an LEB 128 etc. Might be worth to > > investigate further if the two implementations can be merged/reused to some > > extent. But I propose to file a separate RFE for that. What do you think? > > Yeah, let's investigate about it in another RFE. > > IMHO we can share some codes about DWARF between HotSpot and SA, and also we > might need DWARF-based call frame parser in HotSpot because some 3rd-party > native libraries don't use base pointer (RBP) to store SP due to > optimization. In SA side, it would be useful if we can check native source > file and line number in mixed jstack with your change. I see, then it makes sense to unify these parsers later. > > So I want to unify DWARF parser (processor) between HotSpot and SA, but it > might be long journey... thus I agree with you to file it as another RFE. Sounds good, I'll file an RFE and link it to this RFE. ------------- PR: https://git.openjdk.java.net/jdk/pull/7126