On Thu, 29 Oct 2020 08:56:13 GMT, Lin Zang <lz...@openjdk.org> wrote:
>> The static `ThreadHeapSampler::_log_table` is currently initialized on JVM
>> bootstrap to an overhead of ~67k instructions (linux-x64). By turning the
>> initialization into a constexpr, we can precalculate the helper table at
>> compile time, which trades a runtime overhead for a small, 8kb, static
>> footprint increase.
>>
>> I compared `fast_log2` with the `log2` builtin with a naive benchmarking
>> experiment[1] (not included in this PR) and show that the `fast_log2` is
>> ~2.5x faster than `log2` on my system. And that without the lookup table
>> we'd be much worse. So I think it makes sense to preserve this optimization,
>> but get rid of the startup overhead:
>>
>> [5.428s][debug][heapsampling] log2, 0.0751173 secs
>> [5.457s][debug][heapsampling] fast_log2, 0.0298244 secs
>> [5.622s][debug][heapsampling] fast_log2_uncached, 0.1645569 secs
>>
>> I've verified that this refactoring does not affect performance in this
>> naive setup.
>>
>> [1] https://github.com/openjdk/jdk/compare/master...cl4es:log2_micro?expand=1
>
> Dear @cl4es,
> I am not a reviewer, just have 1 comment that maybe you need to update the
> Year info in the headers of touched files.
>
> Thanks.
> Lin

Unfortunately there's currently no portable way to use `std::log` (or any
of the other `std` math functions) in a constexpr, so I had to resort to a
code generator approach instead. It's either that or withdrawing this PR.

Using UL and a debug-only block to implement an adhoc code generator
(`-Xlog:heapsampling+generate::none`) might be a bit unorthodox, but I
think it turned out OK.

-------------

PR: https://git.openjdk.java.net/jdk/pull/880
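
For readers of the archive, the generator idea discussed above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in the PR: the names `kLogBits`, `kLogCount`, `table_entry`, `build_table` and `generate_table_source` are hypothetical, and the 10-bit table size is an assumption chosen only because 1024 doubles matches the 8kb footprint mentioned in the description. The table values are computed once with `std::log2` and emitted as source text, so the build no longer needs a constexpr `log`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Assumption: 10 mantissa bits -> 1024 doubles = 8 KB of static data.
constexpr int kLogBits = 10;
constexpr int kLogCount = 1 << kLogBits;

// Value for slot i: log2 of the midpoint of the i-th mantissa bucket in [1, 2).
double table_entry(int i) {
  return std::log2(1.0 + (i + 0.5) / kLogCount);
}

// Runtime-built copy of the table, used here only to exercise the lookup.
std::vector<double> build_table() {
  std::vector<double> table(kLogCount);
  for (int i = 0; i < kLogCount; i++) {
    table[i] = table_entry(i);
  }
  return table;
}

// Generator step: emit the table as C++ source text that could be pasted
// into a source file as a static initializer.
std::string generate_table_source() {
  std::string out = "static const double log_table[" +
                    std::to_string(kLogCount) + "] = {\n";
  char buf[64];
  for (int i = 0; i < kLogCount; i++) {
    std::snprintf(buf, sizeof(buf), "  %.17g,\n", table_entry(i));
    out += buf;
  }
  out += "};\n";
  return out;
}

// Lookup step: split an IEEE-754 double into exponent and top mantissa bits,
// then add the cached log2 of the mantissa bucket to the exponent.
// Only meaningful for positive, normal, finite doubles.
double fast_log2(double d, const std::vector<double>& table) {
  assert(d > 0.0);
  std::uint64_t bits;
  std::memcpy(&bits, &d, sizeof(bits));  // well-defined bit copy
  const std::uint32_t high = static_cast<std::uint32_t>(bits >> 32);
  // The top 20 mantissa bits live in the low 20 bits of `high`.
  const std::uint32_t index = (high >> (20 - kLogBits)) & (kLogCount - 1);
  const int exponent = static_cast<int>((high >> 20) & 0x7FF) - 1023;
  return exponent + table[index];
}
```

With a table from `build_table()`, `fast_log2(8.0, table)` lands within about 1e-3 of 3.0; the error comes from using the bucket midpoint instead of the exact mantissa, which is the precision-for-speed trade-off the lookup table makes.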