https://bugs.kde.org/show_bug.cgi?id=402369
Bug ID: 402369
Summary: Overhaul DHAT
Product: valgrind
Version: unspecified
Platform: Other
OS: Linux
Status: REPORTED
Severity: normal
Priority: NOR
Component: callgrind
Assignee: josef.weidendor...@gmx.de
Reporter: n.netherc...@gmail.com
Target Milestone: ---

Created attachment 117020
  --> https://bugs.kde.org/attachment.cgi?id=117020&action=edit
patch

I have totally overhauled DHAT. The current version is useful, but has
some annoyances and deficiencies. The new version is much better. I have
used it extensively on the Rust compiler, on more than 20 PRs.

The only incomplete part is that I haven't updated the documentation. I
plan to do that just before landing, so that if I get feedback about the
UI that causes me to change the output I don't need to update the docs
again.

What follows is the commit message, which describes the changes.

Overhaul DHAT.

This commit thoroughly overhauls DHAT, moving it out of the
"experimental" ghetto. It makes moderate changes to DHAT itself,
including dumping profiling data to a JSON format output file. It also
implements a new data viewer (as a web app, in dhat/dh_view.html).

The main benefits over the old DHAT are as follows.

- The separation of data collection and presentation means you can run a
  program once under DHAT and then sort the data in various ways. Also,
  full data is in the output file, and the viewer chooses what to omit.
- The data can be sorted in more ways than previously. Some of these
  sorts involve useful filters such as "short-lived" and "zero reads or
  zero writes".
- The tree structure view avoids the need to choose stack trace depth.
  This avoids both the problem of not enough depth (when records that
  should be distinct are combined, and may not contain enough
  information to be actionable) and the problem of too much depth (when
  records that should be combined are separated, making them seem less
  important than they really are).
- Byte and block measures are shown with a percentage relative to the
  global count, which helps gauge relative significance of different
  parts of the profile.
- Byte and block measures are also shown with an allocation rate (bytes
  and blocks per million instructions), which enables comparisons across
  multiple profiles, even if those profiles represent different
  workloads.
- Both global and per-node measurements are taken at the global heap
  peak ("At t-gmax"), which gives Massif-like insight into the point of
  peak memory use.
- The final/lifetimes stats are a bit more useful than the old deaths
  stats. (E.g. the old deaths stats didn't take into account lifetimes
  of unfreed blocks.)
- The handling of realloc() has changed. The sequence
  `p = malloc(100); realloc(p, 200);` now increases the total block
  count by 2 and the total byte count by 300. Previously it increased
  them by 1 and 200. The new handling is a more operational view that
  better reflects the effect of allocations on performance. It makes a
  significant difference in the results, giving paths involving
  reallocation (e.g. repeated pushing to a growing vector) more
  prominence.

Other things of note:

- There is now testing, both regression tests that run within the
  standard test suite, and viewer-specific tests that cannot run within
  the standard test suite. The latter are run by loading
  dh_view.html?test=1 in a web browser.
- The commit puts all tool lists in Makefiles (and similar files) in a
  consistent order: memcheck, cachegrind, callgrind, helgrind, drd,
  massif, dhat, lackey, none; exp-sgcheck, exp-bbv.
- A lot of fields in dh_main.c have been given more descriptive names.
  Those names now match those used in dh_view.js.
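
To make the realloc() accounting change described above concrete, here is a minimal sketch in C that replays `p = malloc(100); realloc(p, 200);` under both schemes. It is not taken from the patch: the `Totals` struct and the `count_*` and `replay_example` functions are hypothetical names invented for illustration, and the old scheme is reconstructed only from the "1 block, 200 bytes" figures quoted in the commit message.

```c
#include <stddef.h>

/* Hypothetical running totals; the names are illustrative, not DHAT's own. */
typedef struct {
    unsigned long long total_blocks;
    unsigned long long total_bytes;
} Totals;

/* A malloc counts as one new block of `size` bytes under both schemes. */
void count_malloc(Totals *t, size_t size) {
    t->total_blocks += 1;
    t->total_bytes += size;
}

/* New scheme: a realloc is counted like a fresh allocation of new_size. */
void count_realloc_new(Totals *t, size_t old_size, size_t new_size) {
    (void)old_size; /* the old size does not affect the new totals */
    t->total_blocks += 1;
    t->total_bytes += new_size;
}

/* Old scheme, as implied by the 1 block / 200 bytes figures: a growing
 * realloc adds only the size difference and no extra block. How the old
 * code treated shrinking is not stated in the message, so this sketch
 * leaves the totals unchanged in that case (an assumption). */
void count_realloc_old(Totals *t, size_t old_size, size_t new_size) {
    if (new_size > old_size) {
        t->total_bytes += new_size - old_size;
    }
}

/* Replays `p = malloc(100); realloc(p, 200);` under the chosen scheme. */
Totals replay_example(int use_new_scheme) {
    Totals t = {0, 0};
    count_malloc(&t, 100);
    if (use_new_scheme) {
        count_realloc_new(&t, 100, 200);
    } else {
        count_realloc_old(&t, 100, 200);
    }
    return t;
}
```

Calling `replay_example(1)` yields 2 blocks and 300 bytes, while `replay_example(0)` yields 1 block and 200 bytes, matching the figures in the commit message. The gap widens with every further reallocation, which is why growth-heavy paths become more prominent under the new scheme.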