[ 
https://issues.apache.org/jira/browse/NUMBERS-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17329409#comment-17329409
 ] 

Alex Herbert commented on NUMBERS-77:
-------------------------------------

{quote}Let's refer to what's in "master" (there were fixes).
{quote}
Master was updated as I was writing the post. The javadoc looks fine.

Here is some more thoughts on compare and the Comparator interface.

I think that we can handle NaN by specifying that the compare returns [-1, 0, 
1] for an ordering of non Nan numbers with equality (a zero result) defined by 
some measure of closeness. NaN inputs are handled using Double.compare. So two 
input NaNs will return 0:
{code:java}
public static int compareTo(double x, double y, double eps) {
    if (equals(x, y, eps)) {
        return 0;
    } else if (x < y) {
        return -1;
    } else if (x > y) {
        return 1;
    }
    // NaN input
    return Double.compare(x, y);
}
{code}
Using this creates a function that handles non-finite numbers as per 
Double.compare and can sort an array but not necessarily in its strict natural 
order. The sort function in Arrays is described as stable, meaning that two 
elements that have a compare result of 0 will not be reordered. Thus if you 
resort the array you get the same result.

Sorting Matt's example using Precision as is:
{code:java}
@Test
void testUnstableSort() {
    Double[] array = {0.02, 0.01, 1.0, 2.0, Double.NaN, Double.NaN};
    for (int i = 0; i < 10; i++) {
        Collections.shuffle(Arrays.asList(array));
        Arrays.sort(array, (a, b) -> Precision.compareTo(a, b, 0.1));
        System.out.println(Arrays.toString(array));
    }
}
{code}
{noformat}
[NaN, 0.02, 0.01, 1.0, 2.0, NaN]
[0.01, 2.0, NaN, 0.02, 1.0, NaN]
[0.02, NaN, 0.01, 1.0, 2.0, NaN]
[NaN, NaN, 0.01, 0.02, 1.0, 2.0]
[NaN, 0.02, 0.01, 1.0, 2.0, NaN]
[0.02, 0.01, 1.0, 2.0, NaN, NaN]
[NaN, 0.02, 0.01, 1.0, NaN, 2.0]
[1.0, NaN, 0.01, 0.02, 2.0, NaN]
[0.01, 0.02, 1.0, 2.0, NaN, NaN]
[1.0, NaN, NaN, 0.02, 0.01, 2.0]
{noformat}
With the change to the above code:
{noformat}
[0.01, 0.02, 1.0, 2.0, NaN, NaN]
[0.01, 0.02, 1.0, 2.0, NaN, NaN]
[0.01, 0.02, 1.0, 2.0, NaN, NaN]
[0.02, 0.01, 1.0, 2.0, NaN, NaN]
[0.02, 0.01, 1.0, 2.0, NaN, NaN]
[0.02, 0.01, 1.0, 2.0, NaN, NaN]
[0.02, 0.01, 1.0, 2.0, NaN, NaN]
[0.01, 0.02, 1.0, 2.0, NaN, NaN]
[0.02, 0.01, 1.0, 2.0, NaN, NaN]
[0.02, 0.01, 1.0, 2.0, NaN, NaN]
{noformat}
Looking at the javadoc for Comparator:
 # sgn(compare(x, y)) == -sgn(compare(y, x)) for all x and y
 # relation is transitive: (compare(x, y) > 0) & (compare(y, z) > 0))</tt> 
implies compare(x, z) > 0
 # compare(x, y)==0 implies that sgn(compare(x, z))==sgn(compare(y, z)) for all 
z
 # not strictly required that (compare(x, y)==0) == (x.equals( y )). Generally 
speaking, any comparator that violates this condition should clearly indicate 
this fact

I think that point 1 and 2 are covered. Point 4 is not but that is not 
required. 

But point 3 is not. Consider the points x > y > z:
{noformat}
Epsilon: |-----|

|-----|-----|
x     y     z


compare(x, y) == 0
compare(x, z) == 1
compare(y, z) == 0   (violation, it should be 1){noformat}
Here x is equal to y but above z. So y should be above z but it is not.

Thus we cannot implement the contract of a comparator. This means we do not 
have to change anything and can handle NaN in whatever way we like during 
comparison.

Some options are:
 # Leave the method and document it as unsuitable for use as part of the 
implementation for Comparator.
 # Update it to handle NaN as per Double.compare. It is still not strictly 
suitable as part of the implementation for Comparator due to violation of rule 
3. It works better for sorting but would return 0 when Precision.equals(NaN, 
NaN) returns false.
 # Remove the compare method from Precision, i.e. what is the use case if not 
for ordering.
 # Others...

 

> Move utilities from "Commons Geometry"
> --------------------------------------
>
>                 Key: NUMBERS-77
>                 URL: https://issues.apache.org/jira/browse/NUMBERS-77
>             Project: Commons Numbers
>          Issue Type: Task
>            Reporter: Gilles Sadowski
>            Priority: Major
>             Fix For: 1.1
>
>         Attachments: NUMBERS-77.diff
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> "Commons Geometry" defines utilities that would be a better fit in this 
> component.
> Duplication of general-purpose codes should be avoided, in order to benefit 
> from consolidated usage (bug reporting, performance enhancements, ...).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to