Re: Array equality, comparison and mismatch

2015-10-15 Thread Paul Sandoz

> On 13 Oct 2015, at 12:03, Andrew Haley  wrote:
> 
> On 13/10/15 10:22, Paul Sandoz wrote:
>> Analysis so far indicate big gains are to be had on larger arrays with 
>> better or no impact on small arrays if i do the following instead:
>> 
>>  if (Double.doubleToRawLongBits(a[i]) !=
>>  Double.doubleToRawLongBits(b[i])) {
>>  int c = Double.compare(a[i], b[i]);
>>  if (c != 0) return c;
>>  }
> 
> I was about to make a similar comment.  My experiment was with
> 
>if (Double.doubleToRawLongBits(a[i]) != 
> Double.doubleToRawLongBits(b[i])
>&& (Double.doubleToLongBits(a[i]) !=
>Double.doubleToLongBits(b[i])))
>return Double.compare(a[i], b[i]);
>}
> 
> which is about twice as fast as the original version, as is yours.
> But yours is more elegant.  :-)
> 

Thanks :-) updated:

  
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/webrev/


> It's a shame that HotSpot doesn't see through the load of a
> double and then the conversion through doubleToRawLongBits:
> it could just load directly into the integer registers.
> 

The following webrev explicitly does what you mention above and is faster for 
mismatching double arrays (with no differing NaN values):

  
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8136924-arrays-mismatch-vectorized-unsafe/webrev/

I would be very interested in your opinion on being able to make intrinsic on 
ARM the method ArraysSupport.vectorizedMismatch [1].

Paul.

[1] 
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8136924-arrays-mismatch-vectorized-unsafe/webrev/src/java.base/share/classes/java/util/ArraysSupport.java.html


Re: Array equality, comparison and mismatch

2015-10-15 Thread Andrew Haley
On 10/15/2015 11:52 AM, Paul Sandoz wrote:
> I would be very interested in your opinion on being able to make
> intrinsic on ARM the method ArraysSupport.vectorizedMismatch [1].

Pretty easy, I'm sure.  We already have MacroAssembler::string_compare
which does essentially the same thing but with char arrays.

Andrew.


Re: Array equality, comparison and mismatch

2015-10-13 Thread Paul Sandoz

> On 13 Oct 2015, at 05:46, Mike Duigou  wrote:
> 
>>> - I apologize if this was discussed earlier in the thread but why is the 
>>> comparison of floats and doubles done by first == operator of the int bits 
>>> and only then the compare method ?
>> I was being consistent with the test used for the existing equals
>> methods of float[] and double[]. Note that the Float/Double.*to*Bits
>> methods are intrinsic.
> 
> I guess my worry is that the == floatToIntBits would be redundant as the 
> implementation of compare() might have exactly the same test as it's first 
> step--it would be a reasonable optimization since it would have the benefit 
> of loading the values into registers before doing the more expensive 
> relational comparison.
> 

I did not concentrate on this area too much since this code is likely to 
change, but i just looked a little more closely at some benchmark results and 
generated code.

Analysis so far indicate big gains are to be had on larger arrays with better 
or no impact on small arrays if i do the following instead:

  if (Double.doubleToRawLongBits(a[i]) !=
  Double.doubleToRawLongBits(b[i])) {
  int c = Double.compare(a[i], b[i]);
  if (c != 0) return c;
  }

(If C2 inlining occurs the registers correspond to the two array elements, e.g. 
xmm*, should be reused).

That gets the NaN checking off the critical path. I think that is reasonable to 
do given the assumption that varying forms of NaN’s are likely to be rare. That 
same assumption also applies to Unsafe vectorization which first compares raw 
bits.

Similar modifications can be made to the equals methods.

Even though these methods are likely to change i will probably modify the 
current float/double methods. Better to come out with good performance now if 
for some reason the unsafe vectorization does not make it in.

Thanks,
Paul.


Re: Array equality, comparison and mismatch

2015-10-13 Thread Andrew Haley
On 13/10/15 10:22, Paul Sandoz wrote:
> Analysis so far indicate big gains are to be had on larger arrays with better 
> or no impact on small arrays if i do the following instead:
> 
>   if (Double.doubleToRawLongBits(a[i]) !=
>   Double.doubleToRawLongBits(b[i])) {
>   int c = Double.compare(a[i], b[i]);
>   if (c != 0) return c;
>   }

I was about to make a similar comment.  My experiment was with

if (Double.doubleToRawLongBits(a[i]) != Double.doubleToRawLongBits(b[i])
&& (Double.doubleToLongBits(a[i]) !=
Double.doubleToLongBits(b[i])))
return Double.compare(a[i], b[i]);
}

which is about twice as fast as the original version, as is yours.
But yours is more elegant.  :-)

It's a shame that HotSpot doesn't see through the load of a
double and then the conversion through doubleToRawLongBits:
it could just load directly into the integer registers.

Andrew.



Re: Array equality, comparison and mismatch

2015-10-12 Thread Paul Sandoz

> On 12 Oct 2015, at 09:47, Andrej Golovnin  wrote:
> 
> Hi Paul,
> 
> wouldn't it be better to use Objects#equals(Object, Object) in the
> line 3293 in the Arrays class instead of the ternary operator (the
> same applies to the line 3238)?
> 

Thanks updated both the Arrays.equals and Arrays.mismatch methods (and also 
fixed a bug in the JavaDoc for the common prefix definition of those methods).

Paul.


Re: Array equality, comparison and mismatch

2015-10-12 Thread Paul Sandoz

> On 22 Sep 2015, at 18:30, Paul Sandoz  wrote:
> 
> Hi,
> 
> Please review the following which adds methods to Arrays for performing 
> equality, comparison and mismatch:
> 
>  https://bugs.openjdk.java.net/browse/JDK-8033148
>  
> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/webrev/
>  
> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/specdiff/overview-summary.html
> 

Updated the above with JavaDoc for all methods.

Paul.


Re: Array equality, comparison and mismatch

2015-10-12 Thread Andrej Golovnin
Hi Paul,

wouldn't it be better to use Objects#equals(Object, Object) in the
line 3293 in the Arrays class instead of the ternary operator (the
same applies to the line 3238)?

Best regards,
Andrej Golovnin

On Mon, Oct 12, 2015 at 9:31 AM, Paul Sandoz  wrote:
>
>> On 22 Sep 2015, at 18:30, Paul Sandoz  wrote:
>>
>> Hi,
>>
>> Please review the following which adds methods to Arrays for performing 
>> equality, comparison and mismatch:
>>
>>  https://bugs.openjdk.java.net/browse/JDK-8033148
>>  
>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/webrev/
>>  
>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/specdiff/overview-summary.html
>>
>
> Updated the above with JavaDoc for all methods.
>
> Paul.


Re: Array equality, comparison and mismatch

2015-10-12 Thread Mike Duigou



On 22 Sep 2015, at 18:30, Paul Sandoz  wrote:

Hi,

Please review the following which adds methods to Arrays for 
performing equality, comparison and mismatch:


 https://bugs.openjdk.java.net/browse/JDK-8033148
 
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/webrev/
 
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/specdiff/overview-summary.html




Updated the above with JavaDoc for all methods.

Paul.


There is a Kilimanjaro of tedium in building changes like these that are 
implemented across all the basic types. Thank you for taking this on so 
thoroughly.


A few comments.

- Inconsistent @since declarations. Both "9" and "1.9" are used. I know 
Henry Jen was working on normalizing this for the -platform javac work, 
but am uncertain what was chosen. It is perhaps time to drop the "1."


- Have you done side by side textual comparisons of the docs and 
implementations to make sure there are only expected differences of 
types and semantics (== vs equals vs compareUnsigned)? It's easy for an 
error to creep in as you go through many iterations by forgetting to 
make a fix in one implementation.


- I apologize if this was discussed earlier in the thread but why is the 
comparison of floats and doubles done by first == operator of the int 
bits and only then the compare method ? It would seem that the compare 
method could use this same technique if it wanted. Why not do the same 
for unsigned comparisons since == would also work for those?


I am *REALLY* looking forward to using these!

Mike


Re: Array equality, comparison and mismatch

2015-10-12 Thread Paul Sandoz

> On 12 Oct 2015, at 19:50, Mike Duigou  wrote:
> 
> 
>>> On 22 Sep 2015, at 18:30, Paul Sandoz  wrote:
>>> Hi,
>>> Please review the following which adds methods to Arrays for performing 
>>> equality, comparison and mismatch:
>>> https://bugs.openjdk.java.net/browse/JDK-8033148
>>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/webrev/
>>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/specdiff/overview-summary.html
>> Updated the above with JavaDoc for all methods.
>> Paul.
> 
> There is a Kilimanjaro of tedium in building changes like these that are 
> implemented across all the basic types. Thank you for taking this on so 
> thoroughly.

Tell me about it :-)


> 
> A few comments.
> 
> - Inconsistent @since declarations. Both "9" and "1.9" are used. I know Henry 
> Jen was working on normalizing this for the -platform javac work, but am 
> uncertain what was chosen. It is perhaps time to drop the "1.”
> 

I updated Byte/Short methods. At some point there is likely to be a global 
s/@since 1.9/@since 9/ but i have been trying to use 9 where possible.


> - Have you done side by side textual comparisons of the docs and 
> implementations to make sure there are only expected differences of types and 
> semantics (== vs equals vs compareUnsigned)? It's easy for an error to creep 
> in as you go through many iterations by forgetting to make a fix in one 
> implementation.
> 

Not specifically but kind of when i copied the template from the byte[] 
receiving methods. I have eyeballed them enough times in different contexts: 
code, javadoc, specdiff. Doing that caught some mistakes. There might be some 
mistakes still lurking...


> - I apologize if this was discussed earlier in the thread but why is the 
> comparison of floats and doubles done by first == operator of the int bits 
> and only then the compare method ?

I was being consistent with the test used for the existing equals methods of 
float[] and double[]. Note that the Float/Double.*to*Bits methods are intrinsic.


> It would seem that the compare method could use this same technique if it 
> wanted. Why not do the same for unsigned comparisons since == would also work 
> for those?
> 

I am not following, sorry. Are you suggesting say:

  for (int i = 0; i < length; i++) {
int c = Double.compare(a[i], b[i])
if (c != 0) return c;
  }

?

Note that this area will change when the unsafe vectored mismatch goes in.


> I am *REALLY* looking forward to using these!
> 

Thanks. You might like them even more when they get faster :-)

Paul.


Re: Array equality, comparison and mismatch

2015-10-12 Thread Mike Duigou


- I apologize if this was discussed earlier in the thread but why is 
the comparison of floats and doubles done by first == operator of the 
int bits and only then the compare method ?


I was being consistent with the test used for the existing equals
methods of float[] and double[]. Note that the Float/Double.*to*Bits
methods are intrinsic.


I guess my worry is that the == floatToIntBits would be redundant as the 
implementation of compare() might have exactly the same test as it's 
first step--it would be a reasonable optimization since it would have 
the benefit of loading the values into registers before doing the more 
expensive relational comparison.


Mike


Re: Array equality, comparison and mismatch

2015-10-07 Thread Paul Sandoz

> On 6 Oct 2015, at 10:58, Chris Hegarty  wrote:
>>> It was not immediately obvious to me that the common prefix can be 0. 
>>> Should this be called out specifically?
>>> 
>> 
>> When reading the documentation of compare or mismatch or both?
> 
> mismatch.  But maybe this is just me.
> 

I could add something following the paragraph defining the common prefix:

  Note that a common prefix length of {@code 0} indicates that the first 
elements from each array mismatch.

Paul.


Re: Array equality, comparison and mismatch

2015-10-07 Thread Chris Hegarty
On 7 Oct 2015, at 16:55, Paul Sandoz  wrote:

> 
>> On 6 Oct 2015, at 10:58, Chris Hegarty  wrote:
 It was not immediately obvious to me that the common prefix can be 0. 
 Should this be called out specifically?
 
>>> 
>>> When reading the documentation of compare or mismatch or both?
>> 
>> mismatch.  But maybe this is just me.
>> 
> 
> I could add something following the paragraph defining the common prefix:
> 
>  Note that a common prefix length of {@code 0} indicates that the first 
> elements from each array mismatch.

Thanks, that makes it explicit.

-Chris.

> Paul.



Re: Array equality, comparison and mismatch

2015-10-06 Thread Paul Sandoz

> On 5 Oct 2015, at 17:35, Chris Hegarty  wrote:
> 
> Paul,
> 
> On 22/09/15 17:30, Paul Sandoz wrote:
>> Hi,
>> 
>> Please review the following which adds methods to Arrays for performing 
>> equality, comparison and mismatch:
>> 
>>   https://bugs.openjdk.java.net/browse/JDK-8033148
>>   
>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/webrev/
>>   
>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/specdiff/overview-summary.html
> 
> This looks very good.
> 
> I know that there has been some discussion already about the behavior when 
> passed null, but it seems a little unfortunate that the range accepting 
> 'equals' methods don't behave in a similar manner to that of the non-range 
> 'equals' methods. But I do accept that it makes little sense, where would the 
> from/to indices come from. So I think NPE makes sense for these.
> 

Note that this consistent with other range accepting methods, such as on Arrays 
or Spliterators (which also usually throw null on the non-range methods too, i 
wish we could be consistent in that aspect).


> It was not immediately obvious to me that the common prefix can be 0. Should 
> this be called out specifically?
> 

When reading the documentation of compare or mismatch or both?


> Minor typo(s):
> 
>  "...the length of the smaller array and it follows THAT the
>   index is only valid ..."
> 
>  "If one array is a proper prefix of the other, over the specified
>   ranges, then the returned relative index is the length of the
>   smaller range and it follows THAT the relative index is only valid
>   for the array with the larger range.”
> 

Fixed.

Thanks,
Paul.


Re: Array equality, comparison and mismatch

2015-10-06 Thread Chris Hegarty

On 6 Oct 2015, at 09:50, Paul Sandoz  wrote:

>> 
>> On 5 Oct 2015, at 17:35, Chris Hegarty  wrote:
>> 
>> Paul,
>> 
>> On 22/09/15 17:30, Paul Sandoz wrote:
>>> Hi,
>>> 
>>> Please review the following which adds methods to Arrays for performing 
>>> equality, comparison and mismatch:
>>> 
>>>  https://bugs.openjdk.java.net/browse/JDK-8033148
>>>  
>>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/webrev/
>>>  
>>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/specdiff/overview-summary.html
>> 
>> This looks very good.
>> 
>> I know that there has been some discussion already about the behavior when 
>> passed null, but it seems a little unfortunate that the range accepting 
>> 'equals' methods don't behave in a similar manner to that of the non-range 
>> 'equals' methods. But I do accept that it makes little sense, where would 
>> the from/to indices come from. So I think NPE makes sense for these.
>> 
> 
> Note that this consistent with other range accepting methods, such as on 
> Arrays or Spliterators (which also usually throw null on the non-range 
> methods too, i wish we could be consistent in that aspect).

Right. 

>> It was not immediately obvious to me that the common prefix can be 0. Should 
>> this be called out specifically?
>> 
> 
> When reading the documentation of compare or mismatch or both?

mismatch.  But maybe this is just me.

-Chris.



Re: Array equality, comparison and mismatch

2015-10-05 Thread Chris Hegarty

Paul,

On 22/09/15 17:30, Paul Sandoz wrote:

Hi,

Please review the following which adds methods to Arrays for performing 
equality, comparison and mismatch:

   https://bugs.openjdk.java.net/browse/JDK-8033148
   
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/webrev/
   
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/specdiff/overview-summary.html


This looks very good.

I know that there has been some discussion already about the behavior 
when passed null, but it seems a little unfortunate that the range 
accepting 'equals' methods don't behave in a similar manner to that of 
the non-range 'equals' methods. But I do accept that it makes little 
sense, where would the from/to indices come from. So I think NPE makes 
sense for these.


It was not immediately obvious to me that the common prefix can be 0. 
Should this be called out specifically?


Minor typo(s):

  "...the length of the smaller array and it follows THAT the
   index is only valid ..."

  "If one array is a proper prefix of the other, over the specified
   ranges, then the returned relative index is the length of the
   smaller range and it follows THAT the relative index is only valid
   for the array with the larger range."


The motivation behind this is the use of Unsafe in popular libraries and 
frameworks to speed up the lexicographical comparison of byte arrays.

This issue focuses on the API and functional implementations. A follow up issue 
[1] tracks updating the implementations to use a common method that leverages 
Unsafe to improve performance. A further issue [2] tracks the intrinsification of 
that common method to support operating on > 64 bits in width and further 
improve performance. A further issue, yet to be created, will follow up on 
updating existing JDK code to use the public and/or internal methods where 
appropriate. Example candidates include String (compareTo, perhaps add a mismatch 
method and possibly reviewing existing intrinsics, including those for compact 
Strings), and managed and direct Buffers.

So far i have only documented the new methods operating on byte[], as that will 
act as the template for the other methods.

Some points:

- Methods operating on Object[] will compare Object elements using 
Object.equals or associated comparators (as is the case for the existing equals 
method operating on Object[]).

- Methods operating on float[] or double[] will check such array elements for 
equality using the IEEE bit layout  (as is the case for the existing equals 
method operating on float[] or double[]).

- Primitive array element comparison will be performed as if by the boxed 
primitive type’s compare or compareUnsigned method.

- Range-accepting methods do not support null array values.

- Non-range and range-accepting mismatch methods do not support null array 
values. (What value should be returned when a mismatch is performed on a null 
array and an empty array)?

- Speculation: it may be possible that Project Valhalla will enable us to 
“compress” down methods of certain groups to one “template” method. Even if 
that is not possible i am not overly concerned about the number of methods 
added, which represents the maximum set. We could reduce them without a 
fundamental loss of functionality, but that might have a “semantic” loss 
requiring developers to roll their own wrappers.


-Chris.


Thanks,
Paul.

[1]  https://bugs.openjdk.java.net/browse/JDK-8136924
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8136924-arrays-mismatch-vectorized-unsafe/webrev/

[2] https://bugs.openjdk.java.net/browse/JDK-8044082



Re: Array equality, comparison and mismatch

2015-09-23 Thread Tagir F. Valeev
Hello!

Quite interesting feature. Isn't it a typo here?
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8136924-arrays-mismatch-vectorized-unsafe/webrev/src/java.base/share/classes/java/util/ArraysSupport.java.html

419if (!Float.isNaN(a[aFromIndex + i]) || !Float.isNaN(b[aFromIndex + i]))
487if (!Double.isNaN(a[aFromIndex + i]) || !Double.isNaN(b[aFromIndex + i]))

Seems that it should be b[bFromIndex + i], not b[aFromIndex + i] in
both cases.

As for methods like compare(byte[] a, int aFromIndex, int aToIndex,
byte[] b, int bFromIndex, int bToIndex)

Is it better to introduce two to-indices instead of single length
argument (like in System.arraycopy)? Six method arguments looks
a little bit crowded.

As for method count, it would be really nice and user-friendly were it
be possible to add methods to the array types themselves, so users can
write:

byte[] b1;
byte[] b2;

int mismatchPos = b1.mismatch(b2);
Comparator cmp = byte[]::compareTo;

This somehow works with clone() method (I guess, there is some very
special code in javac for this). There could be some special abstract
classes like java.lang.ByteArray with private constructor (using
this[idx] to access elements) and all byte[] arrays could extend them,
so JDK developers could add new methods to arrays without modifying
the compiler. Well, that's just my fantasies about the better
world...

With best regards,
Tagir Valeev.

PS> Hi,

PS> Please review the following which adds methods to Arrays for
PS> performing equality, comparison and mismatch:

PS>   https://bugs.openjdk.java.net/browse/JDK-8033148
PS>  
PS> 
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/webrev/
PS>  
PS> 
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/specdiff/overview-summary.html

PS> The motivation behind this is the use of Unsafe in popular
PS> libraries and frameworks to speed up the lexicographical comparison of byte 
arrays.

PS> This issue focuses on the API and functional implementations. A
PS> follow up issue [1] tracks updating the implementations to use a
PS> common method that leverages Unsafe to improve performance. A
PS> further issue [2] tracks the intrinsification of that common
PS> method to support operating on > 64 bits in width and further
PS> improve performance. A further issue, yet to be created, will
PS> follow up on updating existing JDK code to use the public and/or
PS> internal methods where appropriate. Example candidates include
PS> String (compareTo, perhaps add a mismatch method and possibly
PS> reviewing existing intrinsics, including those for compact
PS> Strings), and managed and direct Buffers.

PS> So far i have only documented the new methods operating on
PS> byte[], as that will act as the template for the other methods.

PS> Some points:

PS> - Methods operating on Object[] will compare Object elements
PS> using Object.equals or associated comparators (as is the case for
PS> the existing equals method operating on Object[]).

PS> - Methods operating on float[] or double[] will check such array
PS> elements for equality using the IEEE bit layout  (as is the case
PS> for the existing equals method operating on float[] or double[]).

PS> - Primitive array element comparison will be performed as if by
PS> the boxed primitive type’s compare or compareUnsigned method.

PS> - Range-accepting methods do not support null array values.

PS> - Non-range and range-accepting mismatch methods do not support
PS> null array values. (What value should be returned when a mismatch
PS> is performed on a null array and an empty array)?

PS> - Speculation: it may be possible that Project Valhalla will
PS> enable us to “compress” down methods of certain groups to one
PS> “template” method. Even if that is not possible i am not overly
PS> concerned about the number of methods added, which represents the
PS> maximum set. We could reduce them without a fundamental loss of
PS> functionality, but that might have a “semantic” loss requiring
PS> developers to roll their own wrappers.

PS> Thanks,
PS> Paul.

PS> [1]  https://bugs.openjdk.java.net/browse/JDK-8136924
PS> 
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8136924-arrays-mismatch-vectorized-unsafe/webrev/

PS> [2] https://bugs.openjdk.java.net/browse/JDK-8044082



Re: Array equality, comparison and mismatch

2015-09-23 Thread Paul Sandoz
On 23 Sep 2015, at 08:55, Tagir F. Valeev  wrote:
> Hello!
> 
> Quite interesting feature. Isn't it a typo here?
> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8136924-arrays-mismatch-vectorized-unsafe/webrev/src/java.base/share/classes/java/util/ArraysSupport.java.html
> 
> 419if (!Float.isNaN(a[aFromIndex + i]) || !Float.isNaN(b[aFromIndex + i]))
> 487if (!Double.isNaN(a[aFromIndex + i]) || !Double.isNaN(b[aFromIndex + 
> i]))
> 
> Seems that it should be b[bFromIndex + i], not b[aFromIndex + i] in
> both cases.
> 

Oops, well spotted, thanks. Testing for NaNs can be tricky, now i need to work 
out why the tests did not capture that!


> As for methods like compare(byte[] a, int aFromIndex, int aToIndex,
>byte[] b, int bFromIndex, int bToIndex)
> 
> Is it better to introduce two to-indices instead of single length
> argument (like in System.arraycopy)?

Because that’s the style of usage in Arrays, and AFAICT the common style used 
by apps/frameworks that have rolled their own array comparators.


> Six method arguments looks
> a little bit crowded.
> 
> As for method count, it would be really nice and user-friendly were it
> be possible to add methods to the array types themselves, so users can
> write:
> 
> byte[] b1;
> byte[] b2;
> 
> int mismatchPos = b1.mismatch(b2);
> Comparator cmp = byte[]::compareTo;
> 
> This somehow works with clone() method (I guess, there is some very
> special code in javac for this). There could be some special abstract
> classes like java.lang.ByteArray with private constructor (using
> this[idx] to access elements) and all byte[] arrays could extend them,
> so JDK developers could add new methods to arrays without modifying
> the compiler. Well, that's just my fantasies about the better
> world…
> 


Perhaps it’s closer than you think, although further away than you might like 
:-) There are some experiments in Project Valhalla [1]. We would like all 
arrays to extend from a common array super type, from which we can then hang 
off further methods. That’s part of the Arrays 2.0 story.

You correctly guessed there is currently something special going in javac and 
the VM. I think it might be quite simple to extend javac. The tricky work is 
more likely in the VM to hook in the array super type and it's methods (perhaps 
patching into the underlying array class's v/itables).

Paul.

[1] http://hg.openjdk.java.net/valhalla/valhalla/jdk/rev/73ae1e2cbb20



Re: Array equality, comparison and mismatch

2015-09-23 Thread Peter Levart

Hi Paul,

Just some thoughts about nulls...

Simple compare and compareUnsigned methods (without ranges) accept null 
arrays. They specify that indirectly by stating that they are consistent 
with equals methods that do the same. The equals methods specify that 
two null array references are equal and by equal being an equivalence 
relation it follows that a null array reference is not equal to non-null 
reference (unless all arrays were equal), but compare[Unsigned] methods 
do not specify the ordering of null to non-null array reference. The 
implementation does order null array reference before any non-null 
reference.


With compare methods taking Object[] arrays there is another level of 
nullness to consider - the elements. The Arrays boolean equals(Object[] 
a, Object[] b) method uses the semantics of Objects.equals for comparing 
elements, therefore it allows null elements. So does Arrays Comparable> int compare(T[] a, T[] b), which considers null 
element as the lowest value. This seems ok although in TreeMap, for 
example, null keys are not allowed if Comparator is not specified, but 
for consistency with Arrays.equals this is a desired property. But 
Arrays  int compare(T[] a, T[] b, Comparator cmp) does the 
same - it treats null elements as the lowest value. This is not 
consistent with TreeMap, for example, where all decisions on ordering 
are delegated to Comparator which can order null elements (or reject 
them) as it pleases.


Regards, Peter


On 09/22/2015 06:30 PM, Paul Sandoz wrote:

Hi,

Please review the following which adds methods to Arrays for performing 
equality, comparison and mismatch:

   https://bugs.openjdk.java.net/browse/JDK-8033148
   
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/webrev/
   
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8033148-Arrays-lexico-compare/specdiff/overview-summary.html

The motivation behind this is the use of Unsafe in popular libraries and 
frameworks to speed up the lexicographical comparison of byte arrays.

This issue focuses on the API and functional implementations. A follow up issue 
[1] tracks updating the implementations to use a common method that leverages 
Unsafe to improve performance. A further issue [2] tracks the intrinsification of 
that common method to support operating on > 64 bits in width and further 
improve performance. A further issue, yet to be created, will follow up on 
updating existing JDK code to use the public and/or internal methods where 
appropriate. Example candidates include String (compareTo, perhaps add a mismatch 
method and possibly reviewing existing intrinsics, including those for compact 
Strings), and managed and direct Buffers.

So far i have only documented the new methods operating on byte[], as that will 
act as the template for the other methods.

Some points:

- Methods operating on Object[] will compare Object elements using 
Object.equals or associated comparators (as is the case for the existing equals 
method operating on Object[]).

- Methods operating on float[] or double[] will check such array elements for 
equality using the IEEE bit layout  (as is the case for the existing equals 
method operating on float[] or double[]).

- Primitive array element comparison will be performed as if by the boxed 
primitive type’s compare or compareUnsigned method.

- Range-accepting methods do not support null array values.

- Non-range and range-accepting mismatch methods do not support null array 
values. (What value should be returned when a mismatch is performed on a null 
array and an empty array)?

- Speculation: it may be possible that Project Valhalla will enable us to 
“compress” down methods of certain groups to one “template” method. Even if 
that is not possible i am not overly concerned about the number of methods 
added, which represents the maximum set. We could reduce them without a 
fundamental loss of functionality, but that might have a “semantic” loss 
requiring developers to roll their own wrappers.

Thanks,
Paul.

[1]  https://bugs.openjdk.java.net/browse/JDK-8136924
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8136924-arrays-mismatch-vectorized-unsafe/webrev/

[2] https://bugs.openjdk.java.net/browse/JDK-8044082





Re: Array equality, comparison and mismatch

2015-09-23 Thread Paul Sandoz
Hi Peter,

On 23 Sep 2015, at 10:09, Peter Levart  wrote:

> Hi Paul,
> 
> Just some thoughts about nulls...
> 

Thanks, this is helpful.


> Simple compare and compareUnsigned methods (without ranges) accept null 
> arrays.

Yes.


> They specify that indirectly by stating that they are consistent with equals 
> methods that do the same. The equals methods specify that two null array 
> references are equal and by equal being an equivalence relation it follows 
> that a null array reference is not equal to non-null reference (unless all 
> arrays were equal),

Right, i had the indirect specification on both methods, but of course that 
makes no sense for unsigned comparison, so it was removed.


> but compare[Unsigned] methods do not specify the ordering of null to non-null 
> array reference. The implementation does order null array reference before 
> any non-null reference.

I have clarified the doc the all the byte[] compare/Unsigned methods:

* A {@code null} array reference is considered lexicographically less
* than a non-{@code null} array reference.  Two {@code null} array
* references are considered equal.


> 
> With compare methods taking Object[] arrays there is another level of 
> nullness to consider - the elements.

Yes.


> The Arrays boolean equals(Object[] a, Object[] b) method uses the semantics 
> of Objects.equals for comparing elements, therefore it allows null elements. 
> So does Arrays > int compare(T[] a, T[] b), 
> which considers null element as the lowest value. This seems ok although in 
> TreeMap, for example, null keys are not allowed if Comparator is not 
> specified, but for consistency with Arrays.equals this is a desired property. 
> But Arrays  int compare(T[] a, T[] b, Comparator cmp) does the 
> same - it treats null elements as the lowest value. This is not consistent 
> with TreeMap, for example, where all decisions on ordering are delegated to 
> Comparator which can order null elements (or reject them) as it pleases.
> 

A good point. Same applies to mismatch as well. Updated.

There is another inconsistency. Array methods sort and binarySearch do not 
explicitly declare Comparable and instead this is implied if a null Comparator 
argument is given. I prefer separate methods for compare, making Comparable 
explicit, and throwing NPE for a null Comparator. the semantics are slightly 
different for mismatch where constraining elements to be of of Comparable is 
not necessary). However, for consistent i might be pushed towards 
sort/binarySearch pattern.

Thanks,
Paul.