Hi Ben,

Thanks for looking into this.

> On Feb 5, 2018, at 8:52 AM, Ben Walsh <ben_wa...@uk.ibm.com> wrote:
>
> Running with the following test under the JMH benchmark framework (never
> used this before, so please do point out any issues with the test) …
>

Your benchmark looks good. I would return the byte buffer, thereby avoiding any risk of dead code elimination (generally a best practice even if not absolutely required).

As a follow-on you might wanna try measuring a loop, e.g.:

  ByteBuffer bb = ...
  for (int i = 0; i < L; i++) {
    bb.put((byte)i);
  }
  return bb;

where L is parameterized (see JMH’s @Param annotation), so you can easily vary it from 1 upwards. Performance might be affected in loops if unrolling and/or vectorization is perturbed by the fence (I doubt the compiler will hoist the fence out of the loop given the JIT is forced to *not* inline it). To test for vectorization you could write a test method that gets int values from the buffer and sums ‘em up, returning the sum.

Alas, a 7% hit on simple access is not good :-( We really need the JIT to track the argument passed to the method.

Thanks,
Paul.

>
> -------------------------------------------------------------------------------------------------------------------
> Result "org.sample.ByteBufferBenchmark.benchmark_byte_buffer_put":
>   33100911.857 ±(99.9%) 747461.951 ops/s [Average]
>   (min, avg, max) = (25373082.559, 33100911.857, 38885170.177), stdev = 3164800.705
>   CI (99.9%): [32353449.906, 33848373.808] (assumes normal distribution)
>
>
> # Run complete. Total time: 00:27:27
>
> Benchmark                                       Mode  Cnt         Score        Error  Units
> ByteBufferBenchmark.benchmark_byte_buffer_put  thrpt  200  33100911.857 ± 747461.951  ops/s
>
>
>
> Result "org.sample.ByteBufferBenchmark.benchmark_byte_buffer_put":
>   35604933.518 ±(99.9%) 654975.515 ops/s [Average]
>   (min, avg, max) = (25558172.378, 35604933.518, 39524804.534), stdev = 2773207.341
>   CI (99.9%): [34949958.003, 36259909.033] (assumes normal distribution)
>
>
> # Run complete. Total time: 00:27:51
>
> Benchmark                                       Mode  Cnt         Score        Error  Units
> ByteBufferBenchmark.benchmark_byte_buffer_put  thrpt  200  35604933.518 ± 654975.515  ops/s
>
>
> ... So a performance degradation of roughly 7%.
>
>
> Regards,
> Ben Walsh
>
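
For reference, the two follow-up measurements suggested in the reply (a put loop of parameterized length, and a read loop that sums int values) might look roughly like the sketch below. This is plain Java without JMH on the classpath; the class name ByteBufferLoops and the method names putLoop and sumInts are made up for illustration and do not come from Ben's benchmark. Under JMH each method would presumably become a @Benchmark method, with the loop length supplied by a @Param field.

```java
import java.nio.ByteBuffer;

// Hypothetical helper class: the names here are illustrative only.
final class ByteBufferLoops {

    // Write `length` bytes into the buffer and return it, so the caller
    // (or a benchmark harness) consumes the result and the loop cannot be
    // treated as dead code. Assumes length <= bb.capacity(); otherwise
    // put() throws BufferOverflowException.
    static ByteBuffer putLoop(ByteBuffer bb, int length) {
        bb.clear(); // reset position to 0 and limit to capacity
        for (int i = 0; i < length; i++) {
            bb.put((byte) i);
        }
        return bb;
    }

    // Read the buffer as consecutive ints using absolute gets and return
    // their sum. A simple reduction like this is a candidate for
    // vectorization, which is what the reply proposes to check.
    static int sumInts(ByteBuffer bb) {
        int sum = 0;
        for (int i = 0; i + Integer.BYTES <= bb.limit(); i += Integer.BYTES) {
            sum += bb.getInt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        // A direct buffer is used here as an assumption about the case
        // under discussion; heap buffers work with the same code.
        ByteBuffer bb = putLoop(ByteBuffer.allocateDirect(64), 64);
        System.out.println("position after put loop: " + bb.position());
        System.out.println("sum of ints: " + sumInts(bb));
    }
}
```

Returning the buffer from putLoop and the sum from sumInts follows the advice above about avoiding dead code elimination; varying the length from 1 upwards would show whether any slowdown grows once the loop is eligible for unrolling.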