On Thu, 29 Oct 2020 08:56:13 GMT, Lin Zang <lz...@openjdk.org> wrote:

>> The static `ThreadHeapSampler::_log_table` is currently initialized during
>> JVM bootstrap, at a cost of ~67k instructions (linux-x64). By turning the
>> initialization into a constexpr, we can precalculate the helper table at
>> compile time, which trades the runtime overhead for a small (8 KB) static
>> footprint increase.
>> 
>> I compared `fast_log2` with the `log2` builtin in a naive benchmarking
>> experiment[1] (not included in this PR), which shows that `fast_log2` is
>> ~2.5x faster than `log2` on my system, and that without the lookup table
>> it would be much slower. So I think it makes sense to preserve this
>> optimization, but get rid of the startup overhead:
>> 
>> [5.428s][debug][heapsampling] log2, 0.0751173 secs
>> [5.457s][debug][heapsampling] fast_log2, 0.0298244 secs
>> [5.622s][debug][heapsampling] fast_log2_uncached, 0.1645569 secs
>> 
>> I've verified that this refactoring does not affect performance in this 
>> naive setup.
>> 
>> [1] https://github.com/openjdk/jdk/compare/master...cl4es:log2_micro?expand=1
>
> Dear @cl4es,
> I am not a reviewer, I just have one comment: you may need to update the
> year in the headers of the touched files.
> 
> Thanks.
> Lin

Unfortunately there's currently no portable way to use `std::log` (or any of 
the other `std` math functions) in a constexpr, so I had to resort to a code 
generator approach instead. It's either that or withdrawing this PR.

Using Unified Logging (UL) and a debug-only block to implement an ad hoc code
generator (`-Xlog:heapsampling+generate::none`) might be a bit unorthodox, but
I think it turned out OK.
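
For readers who have not seen the generator approach before, here is a minimal standalone sketch of the idea, not the code in this PR. It assumes a 10-bit table index (1024 doubles, which matches the 8 KB figure) and entries taken at the midpoint of each mantissa bucket; the names `kTableBits`, `table_value`, `generate_table_source` and `fast_log2_sketch` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Assumed for this sketch: 10 index bits, i.e. 1024 doubles (8 KB).
constexpr int kTableBits = 10;
constexpr int kTableSize = 1 << kTableBits;

// Entry i holds log2 of the midpoint of the i-th mantissa bucket in [1, 2).
inline double table_value(int i) {
  return std::log2(1.0 + (i + 0.5) / kTableSize);
}

// Emit the table as C++ source text that can be pasted into a source file,
// so the values are computed once, offline, instead of at startup.
std::string generate_table_source() {
  std::string out =
      "static const double log_table[" + std::to_string(kTableSize) + "] = {\n";
  char buf[64];
  for (int i = 0; i < kTableSize; i++) {
    // %.17g prints enough digits to round-trip a double.
    std::snprintf(buf, sizeof(buf), "  %.17g,\n", table_value(i));
    out += buf;
  }
  out += "};\n";
  return out;
}

// Lookup side: split a positive, normal double into its exponent and the
// top kTableBits mantissa bits, then add the tabulated fractional part.
double fast_log2_sketch(double d, const double* table) {
  assert(d > 0);
  uint64_t bits;
  std::memcpy(&bits, &d, sizeof(bits));
  const uint32_t high = static_cast<uint32_t>(bits >> 32);
  const uint32_t index = (high >> (20 - kTableBits)) & (kTableSize - 1);
  const int32_t exponent = static_cast<int32_t>((high >> 20) & 0x7FF) - 1023;
  return exponent + table[index];
}
```

Printing `generate_table_source()` once and checking the result in gives the same lookup behaviour as filling the table at startup, without needing `std::log2` to be usable in a constexpr context.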

-------------

PR: https://git.openjdk.java.net/jdk/pull/880
