Re: Unsafe.{get,put}-X-Unaligned performance

John Rose Sun, 15 Mar 2015 23:13:57 -0700

On Mar 12, 2015, at 11:37 AM, Andrew Haley <a...@redhat.com> wrote:
> 
> On 03/12/2015 05:15 PM, Peter Levart wrote:
>> ...or are JIT+CPU smart enough and there would be no difference?
> 
> C2 always orders things based on profile counts, so there is no
> difference.  Your suggestion would be better for interpreted code
> and I guess C1 also, so I agree it is worthwhile.


Profile counts can partially reorganize decision trees,
provided the counts are unambiguous.  The best effect from
profiling is to prune untaken branches completely (leaving a deopt).

The main caveat here is that this breaks down when the
profile is ambiguous, which can happen when multiple
users of a library routine "pollute" the profile with
divergent behaviors.  See (e.g.) slides 17-19 of:
  http://cr.openjdk.java.net/~jrose/pres/201502-JVMChallenges.pdf
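
As an illustration of that pollution, here is a minimal sketch. All
names are hypothetical and this is not JDK code; it assumes, as I
understand HotSpot, that branch profiles are recorded per method
rather than per call site. One shared routine branches on alignment,
and two callers feed it opposite behaviors:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ProfilePollutionSketch {
    // Shared "library" routine (hypothetical). Its alignment branch has
    // a single profile, fed by every caller of the method.
    static int getIntLE(byte[] a, int i) {
        if ((i & 3) == 0) {
            // Aligned path: one 4-byte access.
            return ByteBuffer.wrap(a).order(ByteOrder.LITTLE_ENDIAN).getInt(i);
        } else {
            // Unaligned path: assemble the value from four bytes.
            return (a[i] & 0xFF)
                 | (a[i + 1] & 0xFF) << 8
                 | (a[i + 2] & 0xFF) << 16
                 | (a[i + 3] & 0xFF) << 24;
        }
    }

    // Caller A only ever passes 4-byte-aligned indices.
    static long sumAligned(byte[] a) {
        long s = 0;
        for (int i = 0; i + 4 <= a.length; i += 4) {
            s += getIntLE(a, i);
        }
        return s;
    }

    // Caller B only ever passes odd indices.
    static long sumOdd(byte[] a) {
        long s = 0;
        for (int i = 1; i + 4 <= a.length; i += 4) {
            s += getIntLE(a, i);
        }
        return s;
    }
}
```

If both callers run hot, the branch in getIntLE sees a mixed profile,
so neither arm can be pruned on profile data alone, even though each
call site taken separately is perfectly predictable.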

The JVM currently addresses this mainly by combining local
profile data with type inference that crosses inline boundaries.
The present case can perhaps be improved by type inference
or non-local profiling on bitfields, which is partially discussed in:
  https://bugs.openjdk.java.net/browse/JDK-8001436

BTW, I like Peter's suggestion to perform localized merging of
bytes to shorts (etc.) based on exact alignment.  But I'd rather
see it done further down the pipeline, after vectorization.
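
For readers new to the thread, the kind of alignment-keyed decision
tree under discussion can be sketched roughly as follows.  This is one
reading of the idea, not Peter's patch and not the Unsafe
implementation: the class name, the byte[] backing store, and the
fixed little-endian order are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class UnalignedSketch {
    // Reads a little-endian long at index i, using the widest accesses
    // that the low bits of i allow (hypothetical helper, not JDK code).
    static long getLongUnaligned(byte[] a, int i) {
        ByteBuffer b = ByteBuffer.wrap(a).order(ByteOrder.LITTLE_ENDIAN);
        if ((i & 7) == 0) {
            // 8-byte aligned: a single long access.
            return b.getLong(i);
        } else if ((i & 3) == 0) {
            // 4-byte aligned: merge two ints.
            return (b.getInt(i) & 0xFFFFFFFFL)
                 | ((long) b.getInt(i + 4) << 32);
        } else if ((i & 1) == 0) {
            // 2-byte aligned: merge four shorts.
            return (b.getShort(i) & 0xFFFFL)
                 | ((b.getShort(i + 2) & 0xFFFFL) << 16)
                 | ((b.getShort(i + 4) & 0xFFFFL) << 32)
                 | ((long) b.getShort(i + 6) << 48);
        } else {
            // Odd index: merge eight bytes.
            long r = 0;
            for (int k = 0; k < 8; k++) {
                r |= (b.get(i + k) & 0xFFL) << (8 * k);
            }
            return r;
        }
    }
}
```

Every arm returns the same value for the same bytes; only the width of
the individual accesses differs, which is why the order and pruning of
the tests is purely a performance question.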

— John
