On Wed, 16 Mar 2022 17:16:06 GMT, liach <d...@openjdk.java.net> wrote:

>> liach has updated the pull request incrementally with two additional commits 
>> since the last revision:
>> 
>>  - merge branch 'identityhashmap-bench' of 
>> https://github.com/liachmodded/jdk into identityhashmap-default
>>  - fix whitespace
>
> @stuart-marks Could you help me with this?
> 
> In my JMH benchmark runs, I often find that ambient differences (from the jdk 
> processes, the random test data generated, etc.) hide the actual performance 
> difference caused by the patches.
> 
> Hence, I need help in these two areas:
> 1. I probably need to make a baseline benchmark that can discount the 
> fundamental differences between jdk processes. What should it be?
> 2. How do I generate a consistent data set for all test runs to avoid 
> contributing to errors?

@liach I don't have much time to spend on this. Several comments.

JMH benchmarking is more than a bit of an art. There's a lot of information in 
the JMH samples, so I'd recommend going through them in detail if you haven't 
already. There are a couple of obvious things to look at, such as making sure 
to return values produced by the computation (or consuming them with black 
holes); forking multiple JVMs during the benchmark run; and reducing or 
eliminating garbage generation during the test, or accounting for it if it 
can't be eliminated.
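On the consistent-data question: deriving the test data from a fixed random seed (typically in a `@Setup` method) makes every fork of every run see identical input, which removes one source of run-to-run variance. A minimal non-JMH sketch of the idea; the class and method names here (`FixedSeedData`, `makeKeys`) are just illustrative:

```java
import java.util.Arrays;
import java.util.Random;

public class FixedSeedData {
    // Generating test keys from a fixed seed yields the same sequence on
    // every JVM invocation, so all benchmark forks work on identical data.
    static String[] makeKeys(long seed, int n) {
        Random r = new Random(seed);            // fixed seed, not the default
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            keys[i] = Integer.toString(r.nextInt(1_000_000));
        }
        return keys;
    }

    public static void main(String[] args) {
        // Same seed, same data, in this process or any other.
        System.out.println(Arrays.equals(makeKeys(42L, 5), makeKeys(42L, 5)));
    }
}
```

In a JMH benchmark the equivalent would be building this array once in a `@Setup(Level.Trial)` method on the `@State` object.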

In this particular case it's not clear to me how much performance there is to 
be gained from overriding the default methods. Suppose an entry exists and 
`compute(k, bifunc)` is called. The default method calls `get` to obtain the 
value, calls the bifunction, then calls `put(k, newVal)`. An optimized 
implementation would remember the location of the mapping so that the new value 
could be stored without probing again. But probing in an IdentityHashMap is 
likely pretty inexpensive: the identity hashcode is cached, finding the table 
index is integer arithmetic, and searching for the right mapping is reference 
comparisons on table entries that are probably already cached. It doesn't seem 
likely to me that there is much to be gained here in the first place.
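To make the two-probe pattern concrete, here is a sketch of roughly what the inherited default does for `compute` (the helper name `defaultishCompute` is mine, not JDK code): one probe in `get`, the remapping call, then a second probe in `put` or `remove`. The hypothetical optimized override would keep the table index from the first probe and write the new value directly.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.BiFunction;

public class ComputeSketch {
    // Approximation of the default Map.compute: probe once to read, apply
    // the remapping function, then probe again to write (or remove).
    static <K, V> V defaultishCompute(Map<K, V> map, K key,
            BiFunction<? super K, ? super V, ? extends V> f) {
        V oldVal = map.get(key);               // probe #1
        V newVal = f.apply(key, oldVal);
        if (newVal != null) {
            map.put(key, newVal);              // probe #2 (store)
        } else if (oldVal != null || map.containsKey(key)) {
            map.remove(key);                   // probe #2 (delete)
        }
        return newVal;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new IdentityHashMap<>();
        String k = "key";
        m.put(k, 1);
        System.out.println(defaultishCompute(m, k, (key, v) -> v + 1)); // 2
    }
}
```

The second probe is exactly the work an override could skip, and as noted above it is cheap in `IdentityHashMap`: cached identity hash, integer arithmetic for the index, and reference comparisons over likely-cached table slots.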

Then again, one's intuition about performance, including mine, is easily wrong! 
But: if you're having trouble writing a benchmark that demonstrates a 
performance improvement, maybe that means there isn't much performance to be 
gained.

As a general comment I'd suggest pursuing performance improvements only when 
there's a demonstrated performance issue. This kind of looks to me like a case 
of starting with "I bet I can speed this up by changing this code" and then 
trying to justify that with benchmarks. If so, that's kind of backwards, and it 
can easily lead to a lot of time being wasted.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6532
