On 09/25/2014 09:40 AM, Aleksey Shipilev wrote:
Hi Peter,

On 09/25/2014 02:46 AM, Peter Levart wrote:
http://cr.openjdk.java.net/~plevart/misc/StringHash/HashBench.java
Interesting.

I have to say it once again:
  a) It is *an error* to use static finals as the benchmark input. They
are perfectly constant-foldable in way too many cases. Break this habit,
please.

Hi Aleksey,

The "constant" in this example is only a reference to the char[] array. It's content is not. In String, this is a final instance field, which behaves similarly inside an instance method (usually it is read only once per method invocation).

  b) Explicit Blackholes are not needed, and just returning "int" from
@Benchmark method helps readability a lot. Please break this habit as
well. Having readable and maintainable benchmarks is a key.

Ok, here's a modified benchmark:

http://cr.openjdk.java.net/~plevart/misc/StringHash/HashBench2.java

Which behaves similarly.
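The structural change is essentially the JMH idiom you suggest, returning int from the @Benchmark methods instead of sinking the result into a Blackhole. A sketch of the difference (the class and method names here are just for illustration, not the actual HashBench2 code):

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.infra.Blackhole;

    public class ReturnVsBlackhole {

        static final char[] VALUE = "some reasonably long test string".toCharArray();

        static int classicHash(char[] a) {
            int h = 0;
            for (char c : a) h = 31 * h + c;
            return h;
        }

        // old style: sink the result into an explicit Blackhole
        @Benchmark
        public void withBlackhole(Blackhole bh) {
            bh.consume(classicHash(VALUE));
        }

        // new style: just return the value and let JMH consume it
        @Benchmark
        public int returningInt() {
            return classicHash(VALUE);
        }
    }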

Here are its results:

 * Results on JDK9, Linux, i7-2600K CPU, JMH args: -f 1 -wi 5 -i 8 -gc true
 *
 * Benchmark                   Mode  Samples         Score   Score error  Units
 * j.t.HashBench2._hashCode   thrpt        8   8308858.217    353019.084  ops/s
 * j.t.HashBench2.hashCode0   thrpt        8   8207337.729    217048.634  ops/s
 * j.t.HashBench2.hashCode1   thrpt        8  13359572.359    345736.675  ops/s
 * j.t.HashBench2.hashCode2   thrpt        8  15310621.202    238369.017  ops/s
 * j.t.HashBench2.hashCode3   thrpt        8  17637944.829    232155.847  ops/s
 * j.t.HashBench2.hashCode3i  thrpt        8  17724181.444    509913.288  ops/s
 * j.t.HashBench2.hashCode3x  thrpt        8   8344128.432    159508.813  ops/s
 * j.t.HashBench2.hashCode4   thrpt        8  16526850.489    969549.448  ops/s
 * j.t.HashBench2.hashCode5   thrpt        8  17567765.554    917934.885  ops/s
 * j.t.HashBench2.hashCode6   thrpt        8  17705074.332    419405.652  ops/s
 * j.t.HashBench2.hashCode7   thrpt        8  18805633.563    209181.299  ops/s
 * j.t.HashBench2.hashCode8   thrpt        8  18300123.201    376681.550  ops/s

It would be interesting to see how it behaves on different CPUs.


This is really great!

Couldn't this be a tweak in HotSpot, instead of uglifying and bloating
the Java and hence the bytecode?
+1

This is for the HotSpot compiler guys to answer. Theoretically I think it is
possible, but it would have to be tailored to a very specific use case,
and I don't know whether such a specific transformation would have wide
enough applicability. If it only sped up String.hashCode and very
similar loops, it would be less trouble to do that by hand in one or a few
places...
I would think this happens in user-specified hashCode() over arrays.
IDEs routinely generate a loop like that or delegate to
Arrays.hashCode, which does the same loop.

Arrays.hashCode() can be "improved" this way too.
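For example, Arrays.hashCode(int[]) is roughly:

    public static int hashCode(int[] a) {
        if (a == null)
            return 0;
        int result = 1;
        for (int element : a)
            result = 31 * result + element;
        return result;
    }

and the same kind of split applies to it. A minimal 2-way variant (just a sketch; the split width is arbitrary here) that computes the identical result:

    static int hashCodeSplit2(int[] a) {
        if (a == null) return 0;
        int result = 1;
        int i = 0;
        // two elements per iteration; the multiplications by 31*31 and by 31
        // do not depend on each other, so the CPU can overlap them
        for (int n = a.length & ~1; i < n; i += 2) {
            result = 31 * 31 * result + 31 * a[i] + a[i + 1];
        }
        if (i < a.length) {            // odd-length tail
            result = 31 * result + a[i];
        }
        return result;
    }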


In other words, I would like to see this fixed on the compiler side. This
seems to be strength reduction playing tricks with loop unrolling;
I'll submit a bug shortly.

As I said, I don't think it has anything to do with loop unrolling. The transformation I applied in hashCode1, 2, 3, ... 8 produces code that executes 2, 3, 4, ... 9 independent multiplications in each chunk, which allows the CPU's pipeline to execute them in parallel. I had to manually unroll the loop a bit just to achieve this transformation, but the manual unrolling does not bring the speed-up per se; the parallelization of the multiplications does. This can be seen from the score of the hashCode3x benchmark, which has the same loop structure as hashCode3 but performs the multiplications in a way where each one depends on the result of the previous one, which prevents the CPU from parallelizing them.
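To make the contrast concrete, here is roughly what the two shapes look like (a sketch; the actual hashCode3/hashCode3x in HashBench2.java may differ in details):

    // reassociated: the multiplications by 31^4, 31^3, 31^2 and 31 (javac
    // folds the repeated 31s into constants) are independent of each other,
    // so the CPU can overlap them
    static int hashParallel(char[] a) {
        int h = 0;
        int i = 0;
        for (int n = a.length & ~3; i < n; i += 4) {
            h = 31 * 31 * 31 * 31 * h
              + 31 * 31 * 31 * a[i]
              + 31 * 31 * a[i + 1]
              + 31 * a[i + 2]
              + a[i + 3];
        }
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    // same loop structure, but every multiplication feeds the next one,
    // so they form a serial dependency chain
    static int hashSerial(char[] a) {
        int h = 0;
        int i = 0;
        for (int n = a.length & ~3; i < n; i += 4) {
            h = 31 * (31 * (31 * (31 * h + a[i]) + a[i + 1]) + a[i + 2]) + a[i + 3];
        }
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

Both compute exactly the same value as the original loop; only the dependency structure of the multiplications differs.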

This is not to say that such a transformation couldn't be done on the JIT side. I just have a feeling that it wouldn't be widely applicable, because it is very specific. It can only be used with integer arithmetic of homogeneous width (since it changes the order in which operations are applied, and the final result depends on which width is used when overflow happens). Floating-point arithmetic is equally unsuitable for transformations that change the order of operations. It only helps when a sequence of operations that depend on one another can be changed into a sequence of independent operations, and those operations are heavy enough to matter (such as multiplication).
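A tiny self-contained demo of that last point (not from the benchmark, just illustrating why reassociation is exact for int but not for double):

    public class ReassocDemo {
        public static void main(String[] args) {
            // 32-bit int arithmetic is exact modulo 2^32, so reassociating the
            // polynomial is safe even when intermediate values overflow:
            int h = Integer.MAX_VALUE, a = 12345, b = 67890;
            System.out.println(31 * (31 * h + a) + b == 961 * h + 31 * a + b); // true

            // floating-point addition is not associative, so the same kind of
            // reordering can change the result:
            double x = 1e16, y = -1e16, z = 1.0;
            System.out.println((x + y) + z == x + (y + z)); // false
        }
    }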

Regards, Peter


-Aleksey.


