On 02/18/2015 09:15 AM, Andrew Haley wrote:
> On 18/02/15 09:14, Florian Weimer wrote:
>> Wow, looks nice. What OpenJDK build did you use? I want to see if this
>> happens on x86_64, too.
>
> I'm working on JDK9. You don't have this code yet. I'll do an x86
> build.
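For readers without the test source to hand: the listing that follows is for bytebuffertests.ByteBufferTests3::getLong, which the inlining comments show calling ByteBuffer.wrap and then HeapByteBuffer.getLong. The sketch below is only a guess at that shape, not the actual test code. The class name GetLongSketch, the method signatures, and the helper getLongManual are all hypothetical; the helper just spells out, in plain Java, the two bounds checks and the big-endian load that the compiled code appears to perform.

```java
import java.nio.ByteBuffer;

// Hypothetical reconstruction; the real ByteBufferTests3 source is not shown in this thread.
class GetLongSketch {

    // Assumed shape of the method under test: wrap a byte[] and read a
    // big-endian long at an absolute index.
    static long getLong(byte[] bytes, int i) {
        return ByteBuffer.wrap(bytes).getLong(i);
    }

    // Hand-written equivalent, for illustration only.
    static long getLongManual(byte[] bytes, int i) {
        // Corresponds to the "test %ecx,%ecx; jl" check: index must be non-negative.
        if (i < 0) {
            throw new IndexOutOfBoundsException();
        }
        // Corresponds to the "sub; cmp $0x8; jl" check: at least 8 bytes must remain.
        if (bytes.length - i < 8) {
            throw new IndexOutOfBoundsException();
        }
        // Corresponds to the 8-byte load plus bswap: assemble the bytes in
        // big-endian order (ByteBuffer's default byte order).
        long v = 0;
        for (int k = 0; k < 8; k++) {
            v = (v << 8) | (bytes[i + k] & 0xFFL);
        }
        return v;
    }
}
```

On a little-endian machine such as x86_64, a JIT can reduce the loop in getLongManual's last step to a single 64-bit load followed by a byte swap, which matches the mov and bswap pair at the end of the listing.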

  0x00007f2948acbf8c: mov    0xc(%rdx),%r10d    ;*synchronization entry
                                                ; - java.nio.HeapByteBuffer::<init>@-1 (line 84)
                                                ; - java.nio.ByteBuffer::wrap@7 (line 373)
                                                ; - java.nio.ByteBuffer::wrap@4 (line 396)
                                                ; - bytebuffertests.ByteBufferTests3::getLong@1 (line 23)
                                                ; implicit exception: dispatches to 0x00007f2948acbff5
  ;; B2: # B5 B3 <- B1 Freq: 0.999999

  ;; MEMBAR-release ! (empty encoding)

  0x00007f2948acbf90: test   %ecx,%ecx
  0x00007f2948acbf92: jl     0x00007f2948acbfb5  ;*iflt
                                                ; - java.nio.Buffer::checkIndex@1 (line 545)
                                                ; - java.nio.HeapByteBuffer::getLong@18 (line 465)
                                                ; - bytebuffertests.ByteBufferTests3::getLong@5 (line 23)
  ;; B3: # B6 B4 <- B2 Freq: 0.999999

  0x00007f2948acbf94: mov    %r10d,%ebp
  0x00007f2948acbf97: sub    %ecx,%ebp          ;*isub
                                                ; - java.nio.Buffer::checkIndex@10 (line 545)
                                                ; - java.nio.HeapByteBuffer::getLong@18 (line 465)
                                                ; - bytebuffertests.ByteBufferTests3::getLong@5 (line 23)

  0x00007f2948acbf99: cmp    $0x8,%ebp
  0x00007f2948acbf9c: jl     0x00007f2948acbfd5  ;*if_icmple
                                                ; - java.nio.Buffer::checkIndex@11 (line 545)
                                                ; - java.nio.HeapByteBuffer::getLong@18 (line 465)
                                                ; - bytebuffertests.ByteBufferTests3::getLong@5 (line 23)
  ;; B4: # N95 <- B3 Freq: 0.999998

  0x00007f2948acbf9e: movslq %ecx,%r10
  0x00007f2948acbfa1: mov    0x10(%rdx,%r10,1),%rax
  0x00007f2948acbfa6: bswap  %rax               ;*invokestatic reverseBytes
                                                ; - java.nio.Bits::swap@1 (line 61)
                                                ; - java.nio.HeapByteBuffer::getLong@41 (line 466)
                                                ; - bytebuffertests.ByteBufferTests3::getLong@5 (line 23)

So, just the same except that there is no explicit fence instruction
to remove. It's a shame for AArch64 because that fence really kills
performance but it's bad for x86 too. Even on machines that don't
emit fence instructions the fence still acts as a compiler barrier.

Andrew.