On 16/03/15 06:12, John Rose wrote:
> BTW, I like Peter's suggestion to perform localized merging of
> bytes to shorts (etc.) based on exact alignment. But, I'd rather
> see it done further down the pipeline, after vectorization.
That makes sense.
Thanks,
Andrew.
On Mar 12, 2015, at 11:37 AM, Andrew Haley wrote:
>
> On 03/12/2015 05:15 PM, Peter Levart wrote:
>> ...or are JIT+CPU smart enough and there would be no difference?
>
> C2 always orders things based on profile counts, so there is no
> difference. Your suggestion would be better for interpreted
On Mar 12, 2015, at 2:05 PM, Andrew Haley wrote:
>
> On 03/12/2015 07:29 PM, Peter Levart wrote:
>> What about the following variant (or similar with ifs in case switch is
>> sub-optimal):
>>
>> public final long getLongUnaligned(Object o, long offset) {
>> switch ((int) offset & 7)
On 12/03/15 22:02, Vitaly Davidovich wrote:
> Is vectorization coming soon? AFAIK, only memory copies are vectorized
> currently but not any arithmetic or the like.
That's true, but the idea is that this is future-proof and will work
well when we get vectorized scatter/gather memory accesses.
The
On 12/03/15 22:15, Vitaly Davidovich wrote:
> Switches currently don't profile well (if at all) - John can shed more
> light on that as this came up on the compiler list a few weeks ago.
My first version used switches and the generated code was horrible.
The version I submitted generates optimal (
Switches currently don't profile well (if at all) - John can shed more
light on that as this came up on the compiler list a few weeks ago.
On Mar 12, 2015 6:06 PM, "Peter Levart" wrote:
>
>
> On 03/12/2015 10:04 PM, Peter Levart wrote:
>
> ... putLongUnaligned in the style of
On 03/12/2015 10:04 PM, Peter Levart wrote:
... putLongUnaligned in the style of above getLongUnaligned is more
tricky with current code structure. But there may be a middle ground
(or a sweet spot):
public final void putLongUnaligned(Object o, long offset, long x) {
if (((int)
Is vectorization coming soon? AFAIK, only memory copies are vectorized
currently but not any arithmetic or the like.
On Mar 12, 2015 5:06 PM, "Andrew Haley" wrote:
> On 03/12/2015 07:29 PM, Peter Levart wrote:
> > What about the following variant (or similar with ifs in case s
On 03/12/2015 10:05 PM, Andrew Haley wrote:
On 03/12/2015 07:29 PM, Peter Levart wrote:
What about the following variant (or similar with ifs in case switch is
sub-optimal):
public final long getLongUnaligned(Object o, long offset) {
switch ((int) offset & 7)
...
I tried tha
On 03/12/2015 08:29 PM, Peter Levart wrote:
On 03/12/2015 07:37 PM, Andrew Haley wrote:
On 03/12/2015 05:15 PM, Peter Levart wrote:
...or are JIT+CPU smart enough and there would be no difference?
C2 always orders things based on profile counts, so there is no
difference. Your suggestion
On 03/12/2015 07:29 PM, Peter Levart wrote:
> What about the following variant (or similar with ifs in case switch is
> sub-optimal):
>
> public final long getLongUnaligned(Object o, long offset) {
> switch ((int) offset & 7)
...
I tried that already, and it wasn't really any faste
On 03/12/2015 07:37 PM, Andrew Haley wrote:
On 03/12/2015 05:15 PM, Peter Levart wrote:
...or are JIT+CPU smart enough and there would be no difference?
C2 always orders things based on profile counts, so there is no
difference. Your suggestion would be better for interpreted code
and I gues
On 03/12/2015 07:16 PM, Vitaly Davidovich wrote:
Right, ok -- just wanted to make sure I wasn't missing something. For
platforms that don't support unaligned access, is it expected that
callers will be reading/writing addresses that are unaligned to the
size of the type they're reading? My h
On 03/12/2015 05:15 PM, Peter Levart wrote:
> ...or are JIT+CPU smart enough and there would be no difference?
C2 always orders things based on profile counts, so there is no
difference. Your suggestion would be better for interpreted code
and I guess C1 also, so I agree it is worthwhile.
Thanks
On 03/12/2015 04:52 PM, Peter Levart wrote:
> ...getFloat() is calling getFloat(int) which is a virtual method with 2
> implementations. I think it would be better to in-line the call and
> eliminate the need to execute checkIndex()...
Okay; I guess it is more symmetrical that way. I did ha
Right, ok -- just wanted to make sure I wasn't missing something. For
platforms that don't support unaligned access, is it expected that callers
will be reading/writing addresses that are unaligned to the size of the
type they're reading? My hunch is that on such platforms folks would tend
to alig
On 03/12/2015 06:30 PM, Vitaly Davidovich wrote:
Isn't the C2 intrinsic just reading the value starting at the
specified offset directly (when unaligned access is supported) and not
doing the branching?
It is. This code is for those platforms not supporting unaligned accesses.
Peter
On T
Isn't the C2 intrinsic just reading the value starting at the specified
offset directly (when unaligned access is supported) and not doing the
branching?
On Thu, Mar 12, 2015 at 1:15 PM, Peter Levart
wrote:
>
>
> On 03/10/2015 08:02 PM, Andrew Haley wrote:
>
> The new algorithm does an N-way bra
On 03/10/2015 08:02 PM, Andrew Haley wrote:
The new algorithm does an N-way branch, always loading and storing
subwords according to their natural alignment. So, if the address is
random and the size is long it will access 8 bytes 50% of the time, 4
shorts 25% of the time, 2 ints 12.5% of the
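
As a rough illustration of the N-way branch described above (not the code from the webrev), the sketch below dispatches on the low bits of the offset and only ever issues naturally aligned sub-word loads. Everything in it is an assumption made for the example: the class name UnalignedSketch, the helper names, the use of a ByteBuffer to stand in for raw memory, and the little-endian byte order. ByteBuffer itself tolerates unaligned reads, so the check() helper is there to emulate a platform that does not.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: simulates "memory" with a little-endian ByteBuffer.
final class UnalignedSketch {
    private final ByteBuffer mem;

    UnalignedSketch(byte[] backing) {
        this.mem = ByteBuffer.wrap(backing).order(ByteOrder.LITTLE_ENDIAN);
    }

    // Emulates a strict-alignment platform: reject any misaligned load.
    private static void check(int off, int size) {
        if ((off & (size - 1)) != 0)
            throw new IllegalStateException("misaligned " + size + "-byte load at " + off);
    }

    // Each helper returns the loaded value zero-extended to a long.
    private long alignedLong(int off)  { check(off, 8); return mem.getLong(off); }
    private long alignedInt(int off)   { check(off, 4); return mem.getInt(off) & 0xFFFFFFFFL; }
    private long alignedShort(int off) { check(off, 2); return mem.getShort(off) & 0xFFFFL; }
    private long oneByte(int off)      { return mem.get(off) & 0xFFL; }

    // N-way branch: use the widest load that the offset's alignment permits.
    long getLongUnaligned(int offset) {
        if ((offset & 7) == 0) {
            return alignedLong(offset);                      // 1 long
        } else if ((offset & 3) == 0) {
            return alignedInt(offset)                        // 2 ints
                 | (alignedInt(offset + 4) << 32);
        } else if ((offset & 1) == 0) {
            return alignedShort(offset)                      // 4 shorts
                 | (alignedShort(offset + 2) << 16)
                 | (alignedShort(offset + 4) << 32)
                 | (alignedShort(offset + 6) << 48);
        } else {
            long v = 0;                                      // 8 bytes
            for (int i = 0; i < 8; i++)
                v |= oneByte(offset + i) << (8 * i);
            return v;
        }
    }
}
```

For any offset, the result should match a plain little-endian 8-byte read at that offset; only the number and width of the individual loads changes with the alignment.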
On 03/11/2015 06:27 PM, Andrew Haley wrote:
On 03/11/2015 07:10 AM, John Rose wrote:
John: I'm waiting for an answer to my question here before I submit
a webrev for approval.
http://mail.openjdk.java.net/pipermail/panama-dev/2015-March/99.html
(Answered.)
http://cr.openjdk.java.net/~ap
On 03/12/2015 11:00 AM, Paul Sandoz wrote:
> We can re-use this one:
>
> https://bugs.openjdk.java.net/browse/JDK-8026049
Will do, thx.
Andrew.
On Mar 11, 2015, at 6:27 PM, Andrew Haley wrote:
> On 03/11/2015 07:10 AM, John Rose wrote:
>>>
>>> John: I'm waiting for an answer to my question here before I submit
>>> a webrev for approval.
>>>
>>> http://mail.openjdk.java.net/pipermail/panama-dev/2015-March/99.html
>>
>> (Answered.)
On Mar 11, 2015, at 7:23 PM, Andrew Haley wrote:
> On 03/11/2015 06:00 PM, Paul Sandoz wrote:
>> We need to include some unit tests before we can push.
>
> I have a test which I've been using. It could be converted into
> a unit test.
>
Ok. There are Unsafe tests in:
hotspot/test/runtime/
On 03/11/2015 06:00 PM, Paul Sandoz wrote:
> We need to include some unit tests before we can push.
I have a test which I've been using. It could be converted into
a unit test.
Andrew.
Hi Andrew,
On Mar 11, 2015, at 6:27 PM, Andrew Haley wrote:
> On 03/11/2015 07:10 AM, John Rose wrote:
>>>
>>> John: I'm waiting for an answer to my question here before I submit
>>> a webrev for approval.
>>>
>>> http://mail.openjdk.java.net/pipermail/panama-dev/2015-March/99.html
>>
>>
On 03/11/2015 05:41 PM, Vitaly Davidovich wrote:
> I don't think we need this unalignedKnown dance anymore -- just return
> unsafe.unalignedAccess() there?
Yup.
Andrew.
Also,
static boolean unaligned() {
    if (unalignedKnown)
        return unaligned;
    unaligned = unsafe.unalignedAccess();
    unalignedKnown = true;
    return unaligned;
}
I don't think we need this unalignedKnown dance anymore -- just
On 03/11/2015 05:38 PM, Vitaly Davidovich wrote:
> private static final ByteOrder byteOrder
>     = unsafe.isBigEndian() ? ByteOrder.BIG_ENDIAN :
>       ByteOrder.LITTLE_ENDIAN;
>
> static ByteOrder byteOrder() {
>     if (byteOrder == null)
>         throw new
private static final ByteOrder byteOrder =
    unsafe.isBigEndian() ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;

static ByteOrder byteOrder() {
    if (byteOrder == null)
        throw new Error("Unknown byte order");
    return byteOrder;
On 03/11/2015 07:10 AM, John Rose wrote:
>>
>> John: I'm waiting for an answer to my question here before I submit
>> a webrev for approval.
>>
>> http://mail.openjdk.java.net/pipermail/panama-dev/2015-March/99.html
>
> (Answered.)
http://cr.openjdk.java.net/~aph/unaligned.jdk.5/
http://cr.op
On Mar 10, 2015, at 12:02 PM, Andrew Haley wrote:
>
> The new algorithm is slightly slower because of branch misprediction.
>
> old: 2.17 IPC, 0.08% branch-misses, 91,965,281,215 cycles
> new: 1.23 IPC, 6.11% branch-misses, 99,925,255,682 cycles
>
> ...but it executes fewer instructions so we
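
The "fewer instructions" remark can be checked against the quoted counters, since instructions retired are approximately IPC times cycles. The snippet below is only a back-of-the-envelope sketch: the class and method names are hypothetical, and because the IPC figures are rounded to two decimals the results are estimates, not measurements.

```java
// Hypothetical helper: estimates instructions retired from IPC and cycle count.
final class IpcEstimate {
    static double instructions(double ipc, long cycles) {
        return ipc * cycles;
    }

    public static void main(String[] args) {
        // Figures quoted in the message above.
        double oldInsns = instructions(2.17, 91_965_281_215L);
        double newInsns = instructions(1.23, 99_925_255_682L);

        System.out.printf("old: ~%.1f billion instructions%n", oldInsns / 1e9);
        System.out.printf("new: ~%.1f billion instructions%n", newInsns / 1e9);
        System.out.printf("instruction ratio new/old: %.3f%n", newInsns / oldInsns);
        System.out.printf("cycle ratio new/old: %.3f%n",
                99_925_255_682.0 / 91_965_281_215.0);
    }
}
```

On these numbers the new algorithm retires roughly 123 billion instructions against roughly 200 billion for the old one, while taking somewhat more cycles overall.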