On Tue, 10 Jun 2025 15:15:34 GMT, Radim Vansa <rva...@openjdk.org> wrote:
>> I think you misunderstand me, as what I'm proposing wouldn't requre creating >> an array in `create_search_table`. >> >> I'm saying that you do this: >> >> ```c++ >> // Note: we require the supplier to provide the elements in the final order >> as we can't easily sort >> // within this method - qsort() accepts only pure function as comparator. >> void PackedTableBuilder::fill(u1* table, size_t table_length, Supplier >> &supplier) const { >> Pair<uint32_t, uint32_t> kvs[4]; >> >> size_t offset = 0; >> size_t len_read = 0; >> while (len = supplier.next(kvs, 4)) { >> // len tells you how many of the full capacity of kvs was used. >> } >> } >> >> >> Now each call of `Supplier::next` gives you up to 4 elements, quartering >> the amount of virtual calls necessary. >> >>>Alternatively, we could replace virtual calls with templated fill method. In >>>that case the Comparator should be a template method as well, since that one >>>is even more perf-critical? >> >> Sure, you can try that out. >> >> To be clear, I am only asking: Are virtual calls expensive enough here that >> not having them, or amortizing their cost, something that makes the >> performance better? > > The earlier version of this PR was not abstracting out PackedTable* so there > were no virtual calls. In both the CCC.java and undisclosed customer > reproducer I don't see a significant difference in performance - these > involve whole JVM startup, so the efficiency of code with linear complexity > is probably under the radar. If we want to optimize the hack out of it - yes, > there would be space for that, maybe at the cost of maintainability and/or > reusability. > TLDR: I don't have a (micro)benchmark that would prove one thing is better > than the other. Alright, if you can't observe a difference then don't change a thing :-). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2138248072