Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-16 Thread Andrew Haley
On 16/03/15 06:12, John Rose wrote:
> BTW, I like Peter's suggestion to perform localized merging of
> bytes to shorts (etc.) based on exact alignment. But, I'd rather
> see it done further down the pipeline, after vectorization.

That makes sense. Thanks,
Andrew.

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-15 Thread John Rose
On Mar 12, 2015, at 11:37 AM, Andrew Haley wrote:
> On 03/12/2015 05:15 PM, Peter Levart wrote:
>> ...or are JIT+CPU smart enough and there would be no difference?
>
> C2 always orders things based on profile counts, so there is no
> difference. Your suggestion would be better for interpreted

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-15 Thread John Rose
On Mar 12, 2015, at 2:05 PM, Andrew Haley wrote: > > On 03/12/2015 07:29 PM, Peter Levart wrote: >> What about the following variant (or similar with ifs in case switch is >> sub-optimal): >> >> public final long getLongUnaligned(Object o, long offset) { >> switch ((int) offset & 7)

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-13 Thread Andrew Haley
On 12/03/15 22:02, Vitaly Davidovich wrote:
> Is vectorization coming soon? AFAIK, only memory copies are vectorized
> currently but not any arithmetic or the like.

That's true, but the idea is that this is future-proof and will work well when we get vectorized scatter/gather memory accesses. The

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-13 Thread Andrew Haley
On 12/03/15 22:15, Vitaly Davidovich wrote:
> Switches currently don't profile well (if at all) - John can shed more
> light on that as this came up on the compiler list a few weeks ago.

My first version used switches and the generated code was horrible. The version I submitted generates optimal (

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Vitaly Davidovich
Switches currently don't profile well (if at all) - John can shed more light on that as this came up on the compiler list a few weeks ago.

On Mar 12, 2015 6:06 PM, "Peter Levart" wrote:
> On 03/12/2015 10:04 PM, Peter Levart wrote:
> > ... putLongUnaligned in the style of

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Peter Levart
On 03/12/2015 10:04 PM, Peter Levart wrote: ... putLongUnaligned in the style of above getLongUnaligned is more tricky with current code structure. But there may be a middle ground (or a sweet spot): public final void putLongUnaligned(Object o, long offset, long x) { if (((int)

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Vitaly Davidovich
Is vectorization coming soon? AFAIK, only memory copies are vectorized currently but not any arithmetic or the like.

On Mar 12, 2015 5:06 PM, "Andrew Haley" wrote:
> On 03/12/2015 07:29 PM, Peter Levart wrote:
> > What about the following variant (or similar with ifs in case s

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Peter Levart
On 03/12/2015 10:05 PM, Andrew Haley wrote: On 03/12/2015 07:29 PM, Peter Levart wrote: What about the following variant (or similar with ifs in case switch is sub-optimal): public final long getLongUnaligned(Object o, long offset) { switch ((int) offset & 7) ... I tried tha

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Peter Levart
On 03/12/2015 08:29 PM, Peter Levart wrote: On 03/12/2015 07:37 PM, Andrew Haley wrote: On 03/12/2015 05:15 PM, Peter Levart wrote: ...or are JIT+CPU smart enough and there would be no difference? C2 always orders things based on profile counts, so there is no difference. Your suggestion

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Andrew Haley
On 03/12/2015 07:29 PM, Peter Levart wrote:
> What about the following variant (or similar with ifs in case switch is
> sub-optimal):
>
> public final long getLongUnaligned(Object o, long offset) {
>     switch ((int) offset & 7) ...

I tried that already, and it wasn't really any faste

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Peter Levart
On 03/12/2015 07:37 PM, Andrew Haley wrote: On 03/12/2015 05:15 PM, Peter Levart wrote: ...or are JIT+CPU smart enough and there would be no difference? C2 always orders things based on profile counts, so there is no difference. Your suggestion would be better for interpreted code and I gues

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Peter Levart
On 03/12/2015 07:16 PM, Vitaly Davidovich wrote: Right, ok -- just wanted to make sure I wasn't missing something. For platforms that don't support unaligned access, is it expected that callers will be reading/writing addresses that are unaligned to the size of the type they're reading? My h

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Andrew Haley
On 03/12/2015 05:15 PM, Peter Levart wrote:
> ...or are JIT+CPU smart enough and there would be no difference?

C2 always orders things based on profile counts, so there is no difference. Your suggestion would be better for interpreted code and I guess C1 also, so I agree it is worthwhile. Thanks

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Andrew Haley
On 03/12/2015 04:52 PM, Peter Levart wrote:
> ...getFloat() is calling getFloat(int) which is a virtual method with 2
> implementations. I think it would be better to in-line the call and
> eliminate the need to execute checkIndex()...

Okay; I guess it is more symmetrical that way. I did ha

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Vitaly Davidovich
Right, ok -- just wanted to make sure I wasn't missing something. For platforms that don't support unaligned access, is it expected that callers will be reading/writing addresses that are unaligned to the size of the type they're reading? My hunch is that on such platforms folks would tend to alig

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Peter Levart
On 03/12/2015 06:30 PM, Vitaly Davidovich wrote: Isn't the C2 intrinsic just reading the value starting at the specified offset directly (when unaligned access is supported) and not doing the branching? It is. This code is for those platforms not supporting unaligned accesses. Peter On T

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Vitaly Davidovich
Isn't the C2 intrinsic just reading the value starting at the specified offset directly (when unaligned access is supported) and not doing the branching? On Thu, Mar 12, 2015 at 1:15 PM, Peter Levart wrote: > > > On 03/10/2015 08:02 PM, Andrew Haley wrote: > > The new algorithm does an N-way bra

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Peter Levart
On 03/10/2015 08:02 PM, Andrew Haley wrote: The new algorithm does an N-way branch, always loading and storing subwords according to their natural alignment. So, if the address is random and the size is long it will access 8 bytes 50% of the time, 4 shorts 25% of the time, 2 ints 12.5% of the
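The N-way branch described above can be sketched roughly as follows. This is an illustration only, not the code from the webrev: the class name `UnalignedSketch` is hypothetical, a little-endian `ByteBuffer` stands in for raw memory, and offsets are taken relative to the start of that buffer. For a uniformly random offset, the four odd residues mod 8 take the byte path (50%), residues 2 and 6 take the short path (25%), residue 4 takes the int path (12.5%), and residue 0 is a single aligned long access (12.5%).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class UnalignedSketch {
    // Stand-in for raw memory (assumption: little-endian layout).
    static final ByteBuffer MEM =
            ByteBuffer.allocate(64).order(ByteOrder.LITTLE_ENDIAN);

    // Reads a long at any offset using only naturally aligned subword loads.
    static long getLongUnaligned(int offset) {
        if ((offset & 7) == 0) {
            // 8-byte aligned: one long access.
            return MEM.getLong(offset);
        } else if ((offset & 3) == 0) {
            // 4-byte aligned: two int accesses.
            return (MEM.getInt(offset) & 0xFFFFFFFFL)
                    | ((long) MEM.getInt(offset + 4) << 32);
        } else if ((offset & 1) == 0) {
            // 2-byte aligned: four short accesses.
            long r = 0;
            for (int i = 0; i < 4; i++) {
                r |= (MEM.getShort(offset + 2 * i) & 0xFFFFL) << (16 * i);
            }
            return r;
        } else {
            // Odd offset: eight byte accesses.
            long r = 0;
            for (int i = 0; i < 8; i++) {
                r |= (MEM.get(offset + i) & 0xFFL) << (8 * i);
            }
            return r;
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 64; i++) {
            MEM.put(i, (byte) i);
        }
        for (int off = 0; off < 8; off++) {
            System.out.printf("offset %d -> %016x%n", off, getLongUnaligned(off));
        }
    }
}
```

Every branch assembles the same little-endian value; only the width of the individual accesses differs, which is why the branch taken depends on the low bits of the offset.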

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Peter Levart
On 03/11/2015 06:27 PM, Andrew Haley wrote: On 03/11/2015 07:10 AM, John Rose wrote: John: I'm waiting for an answer to my question here before I submit a webrev for approval. http://mail.openjdk.java.net/pipermail/panama-dev/2015-March/99.html (Answered.) http://cr.openjdk.java.net/~ap

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Andrew Haley
On 03/12/2015 11:00 AM, Paul Sandoz wrote:
> We can re-use this one:
>
> https://bugs.openjdk.java.net/browse/JDK-8026049

Will do, thx.
Andrew.

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Paul Sandoz
On Mar 11, 2015, at 6:27 PM, Andrew Haley wrote: > On 03/11/2015 07:10 AM, John Rose wrote: >>> >>> John: I'm waiting for an answer to my question here before I submit >>> a webrev for approval. >>> >>> http://mail.openjdk.java.net/pipermail/panama-dev/2015-March/99.html >> >> (Answered.)

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-12 Thread Paul Sandoz
On Mar 11, 2015, at 7:23 PM, Andrew Haley wrote:
> On 03/11/2015 06:00 PM, Paul Sandoz wrote:
>> We need to include some unit tests before we can push.
>
> I have a test which I've been using. It could be converted into
> a unit test.

Ok. There are Unsafe tests in: hotspot/test/runtime/

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-11 Thread Andrew Haley
On 03/11/2015 06:00 PM, Paul Sandoz wrote:
> We need to include some unit tests before we can push.

I have a test which I've been using. It could be converted into a unit test.
Andrew.

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-11 Thread Paul Sandoz
Hi Andrew, On Mar 11, 2015, at 6:27 PM, Andrew Haley wrote: > On 03/11/2015 07:10 AM, John Rose wrote: >>> >>> John: I'm waiting for an answer to my question here before I submit >>> a webrev for approval. >>> >>> http://mail.openjdk.java.net/pipermail/panama-dev/2015-March/99.html >> >>

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-11 Thread Andrew Haley
On 03/11/2015 05:41 PM, Vitaly Davidovich wrote:
> I don't think we need this unalignedKnown dance anymore -- just return
> unsafe.unalignedAccess() there?

Yup.
Andrew.

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-11 Thread Vitaly Davidovich
Also,

    static boolean unaligned() {
        if (unalignedKnown)
            return unaligned;
        unaligned = unsafe.unalignedAccess();
        unalignedKnown = true;
        return unaligned;
    }

I don't think we need this unalignedKnown dance anymore -- just
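To make the suggestion concrete, here is a small sketch contrasting the cached form quoted above with the direct form proposed in the thread. It is not the actual patch: the class `UnalignedQuerySketch` is hypothetical, and a `BooleanSupplier` stands in for `unsafe.unalignedAccess()`, which is not callable from ordinary application code. The simplification assumes the underlying query is cheap and returns the same answer on every call.

```java
import java.util.function.BooleanSupplier;

final class UnalignedQuerySketch {
    // Hypothetical stand-in for unsafe.unalignedAccess().
    private final BooleanSupplier platformQuery;

    // State needed only by the cached variant.
    private boolean unaligned;
    private boolean unalignedKnown;

    UnalignedQuerySketch(BooleanSupplier platformQuery) {
        this.platformQuery = platformQuery;
    }

    // Before: query once, then remember the answer.
    boolean unalignedCached() {
        if (unalignedKnown) {
            return unaligned;
        }
        unaligned = platformQuery.getAsBoolean();
        unalignedKnown = true;
        return unaligned;
    }

    // After: simply delegate on every call.
    boolean unalignedDirect() {
        return platformQuery.getAsBoolean();
    }
}
```

Both methods return the same value as long as the platform answer never changes; the direct form just drops the two extra fields.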

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-11 Thread Andrew Haley
On 03/11/2015 05:38 PM, Vitaly Davidovich wrote:
> private static final ByteOrder byteOrder
>     = unsafe.isBigEndian() ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
>
> static ByteOrder byteOrder() {
>     if (byteOrder == null)
>         throw new

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-11 Thread Vitaly Davidovich
    private static final ByteOrder byteOrder
        = unsafe.isBigEndian() ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;

    static ByteOrder byteOrder() {
        if (byteOrder == null)
            throw new Error("Unknown byte order");
        return byteOrder;

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-11 Thread Andrew Haley
On 03/11/2015 07:10 AM, John Rose wrote: >> >> John: I'm waiting for an answer to my question here before I submit >> a webrev for approval. >> >> http://mail.openjdk.java.net/pipermail/panama-dev/2015-March/99.html > > (Answered.) http://cr.openjdk.java.net/~aph/unaligned.jdk.5/ http://cr.op

Re: Unsafe.{get,put}-X-Unaligned performance

2015-03-11 Thread John Rose
On Mar 10, 2015, at 12:02 PM, Andrew Haley wrote:
> The new algorithm is slightly slower because of branch misprediction.
>
> old: 2.17 IPC, 0.08% branch-misses, 91,965,281,215 cycles
> new: 1.23 IPC, 6.11% branch-misses, 99,925,255,682 cycles
>
> ...but it executes fewer instructions so we
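As a rough check on the quoted figures, the instruction count can be estimated as IPC times cycles. The sketch below (class name `IpcEstimate` is hypothetical) does that arithmetic; since the IPC values are rounded to two decimals, the results are approximate. It gives about 2.0e11 instructions for the old code and about 1.2e11 for the new code, i.e. the new code retires roughly 38% fewer instructions while taking roughly 9% more cycles.

```java
final class IpcEstimate {
    // instructions ~= instructions-per-cycle * cycles
    static double instructions(double ipc, long cycles) {
        return ipc * cycles;
    }

    public static void main(String[] args) {
        long oldCycles = 91_965_281_215L;
        long newCycles = 99_925_255_682L;
        double oldInsns = instructions(2.17, oldCycles);
        double newInsns = instructions(1.23, newCycles);

        System.out.printf("old: ~%.3e instructions%n", oldInsns);
        System.out.printf("new: ~%.3e instructions%n", newInsns);
        System.out.printf("instruction ratio new/old: %.3f%n", newInsns / oldInsns);
        System.out.printf("cycle ratio new/old: %.3f%n", (double) newCycles / oldCycles);
    }
}
```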