Re: [Numbers] Arrays of "Complex" objects and RAM

2019-11-09 Thread Gilles Sadowski
Le dim. 10 nov. 2019 à 00:33, Alex Herbert  a écrit :
>
>
>
> > On 9 Nov 2019, at 18:10, Alex Herbert  wrote:
> >
> >
> >
> >> On 9 Nov 2019, at 12:23, Gilles Sadowski  >> > wrote:
> >>
> >>> Snip
> >>
> >> I think that I tried
> >>  $ mvn -Dcheckstyle.failOnViolation=false test
> >>
> >> And still it wouldn't run the test (because I had introduced the
> >> "System.out" forbidden construct).
> >
> > Checkstyle is configured to run in the validate phase (right at the start) 
> > [1]. The default is normally verify. This was copied from commons RNG and 
> > predates my joining the project.
> >
> > So this is why you cannot run ‘mvn test’ as it runs checkstyle:check and 
> > fails you before doing the test.
> >
> > Now fail on violation is ‘true’ perhaps we should move the check back to 
> > validate. So you cannot do a 'mvn install' with any checkstyle errors. You 
> > can also run mvn checkstyle:check to run the goal manually.
> >
> > [1] http://maven.apache.org/ref/3.6.2/maven-core/lifecycles.html 
> > 
> Short answer:
>
> *** Checkstyle ignores command-line user properties in favour of the POM 
> configuration. ***
>
> (Examples are shown below)
>
> If you want to run tests and use System.out.print to debug stuff (which is a 
> very reasonable dev action) then:
>
> 1. We move checkstyle to run after the test phase (i.e. in verify)
> 2. You have to put '// CHECKSTYLE: stop all’ or e.g. ‘// CHECKSTYLE: stop 
> RegexpCheck’ (for System.out violations) in the file you are working on
> 3. You have to use false in the POM 
> (temporarily)
> 4. You can use an *unset* property from the POM: 
> -Dcheckstyle.maxAllowedViolations=10
>
> I favour 1 or 4.
>
> Option 1 would have less typing and less to remember.

+1 :-)

Gilles

> —
>
> Examples:
>
> Hypothesis: Checkstyle is not using the properties set using -D.
>
>
> User set properties should override the POM. This is how PMD works. If you 
> make a violation for PMD and do this:
>
> mvn pmd:check -Dpmd.failOnViolation=true -X
>
> You can see:
>
> [DEBUG]   (f) failOnViolation = true
>
> And it fails.
>
>
> mvn pmd:check -Dpmd.failOnViolation=false -X
>
> You can see:
>
> [DEBUG]   (f) failOnViolation = false
>
> And it is allowed.
>
>
> Trying the same with checkstyle:
>
> mvn checkstyle:check  -Dcheckstyle.failOnViolation=true/false 
> -Dcheckstyle.failsOnError=true/false -X
>
> All variants of true/false show the config from the POM:
>
> [DEBUG]   (f) failOnViolation = true
> [DEBUG]   (f) failsOnError = false
>
>
> Other properties work as documented:
>
> mvn checkstyle:check  -Dcheckstyle.maxAllowedViolations=10
>
> This shows as:
>
> [DEBUG]   (f) maxAllowedViolations = 10
>
> And the build passes.
>
>
> If I remove:
>
>   true
>
> From the checkstyle build plugin configuration in the POM then the command 
> line user properties work, i.e.
>
> mvn checkstyle:check  -Dcheckstyle.failOnViolation=false
>
> [DEBUG]   (f) failOnViolation = false
> -> PASS
>
> mvn checkstyle:check  -Dcheckstyle.failOnViolation=true
>
> [DEBUG]   (f) failOnViolation = true
> -> FAIL
>
> mvn checkstyle:check
>
> [DEBUG]   (f) failOnViolation = true
> -> FAIL
>
> The last case uses the default value of true. It is still logged to the 
> console. The 'effective POM' does not have the  property 
> anywhere so checkstyle is logging this even though it has not been configured.
>
> I tried checkstyle plugin 3.0.0 and 3.1.0 (the latest). Same result.
>
>
> As a control I put 10 in the 
> checkstyle POM configuration. Checkstyle then ignore the same command line 
> switch that worked before.
>
> *** So it seems checkstyle ignores command-line user properties in favour of 
> the POM configuration. ***
>
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [Numbers] Arrays of "Complex" objects and RAM

2019-11-09 Thread Alex Herbert


> On 9 Nov 2019, at 18:10, Alex Herbert  wrote:
> 
> 
> 
>> On 9 Nov 2019, at 12:23, Gilles Sadowski > > wrote:
>> 
>>> Snip
>> 
>> I think that I tried
>>  $ mvn -Dcheckstyle.failOnViolation=false test
>> 
>> And still it wouldn't run the test (because I had introduced the
>> "System.out" forbidden construct).
> 
> Checkstyle is configured to run in the validate phase (right at the start) 
> [1]. The default is normally verify. This was copied from commons RNG and 
> predates my joining the project.
> 
> So this is why you cannot run ‘mvn test’ as it runs checkstyle:check and 
> fails you before doing the test.
> 
> Now fail on violation is ‘true’ perhaps we should move the check back to 
> validate. So you cannot do a 'mvn install' with any checkstyle errors. You 
> can also run mvn checkstyle:check to run the goal manually.
> 
> [1] http://maven.apache.org/ref/3.6.2/maven-core/lifecycles.html 
> 
Short answer:

*** Checkstyle ignores command-line user properties in favour of the POM 
configuration. ***

(Examples are shown below)

If you want to run tests and use System.out.print to debug stuff (which is a 
very reasonable dev action) then:

1. We move checkstyle to run after the test phase (i.e. in verify)
2. You have to put '// CHECKSTYLE: stop all’ or e.g. ‘// CHECKSTYLE: stop 
RegexpCheck’ (for System.out violations) in the file you are working on
3. You have to use false in the POM 
(temporarily)
4. You can use an *unset* property from the POM: 
-Dcheckstyle.maxAllowedViolations=10

I favour 1 or 4.

Option 1 would have less typing and less to remember.

—

Examples:

Hypothesis: Checkstyle is not using the properties set using -D.


User set properties should override the POM. This is how PMD works. If you make 
a violation for PMD and do this:

mvn pmd:check -Dpmd.failOnViolation=true -X

You can see:

[DEBUG]   (f) failOnViolation = true

And it fails.


mvn pmd:check -Dpmd.failOnViolation=false -X

You can see:

[DEBUG]   (f) failOnViolation = false

And it is allowed.


Trying the same with checkstyle:

mvn checkstyle:check  -Dcheckstyle.failOnViolation=true/false 
-Dcheckstyle.failsOnError=true/false -X

All variants of true/false show the config from the POM:

[DEBUG]   (f) failOnViolation = true
[DEBUG]   (f) failsOnError = false


Other properties work as documented:

mvn checkstyle:check  -Dcheckstyle.maxAllowedViolations=10

This shows as:

[DEBUG]   (f) maxAllowedViolations = 10

And the build passes.


If I remove:

  true

From the checkstyle build plugin configuration in the POM then the command line 
user properties work, i.e.

mvn checkstyle:check  -Dcheckstyle.failOnViolation=false

[DEBUG]   (f) failOnViolation = false
-> PASS

mvn checkstyle:check  -Dcheckstyle.failOnViolation=true

[DEBUG]   (f) failOnViolation = true
-> FAIL

mvn checkstyle:check

[DEBUG]   (f) failOnViolation = true
-> FAIL

The last case uses the default value of true. It is still logged to the 
console. The 'effective POM' does not have the  property 
anywhere so checkstyle is logging this even though it has not been configured.

I tried checkstyle plugin 3.0.0 and 3.1.0 (the latest). Same result.


As a control I put 10 in the 
checkstyle POM configuration. Checkstyle then ignore the same command line 
switch that worked before.

*** So it seems checkstyle ignores command-line user properties in favour of 
the POM configuration. ***





Re: [Numbers] Arrays of "Complex" objects and RAM

2019-11-09 Thread Alex Herbert


> On 9 Nov 2019, at 12:23, Gilles Sadowski  wrote:
> 
>> Snip
> 
> I think that I tried
>  $ mvn -Dcheckstyle.failOnViolation=false test
> 
> And still it wouldn't run the test (because I had introduced the
> "System.out" forbidden construct).

Checkstyle is configured to run in the validate phase (right at the start) [1]. 
The default is normally verify. This was copied from commons RNG and predates 
my joining the project.

So this is why you cannot run ‘mvn test’ as it runs checkstyle:check and fails 
you before doing the test.

Now fail on violation is ‘true’ perhaps we should move the check back to 
validate. So you cannot do a 'mvn install' with any checkstyle errors. You can 
also run mvn checkstyle:check to run the goal manually.

[1] http://maven.apache.org/ref/3.6.2/maven-core/lifecycles.html 


Re: [Numbers] Arrays of "Complex" objects and RAM

2019-11-09 Thread Alex Herbert


> On 9 Nov 2019, at 12:23, Gilles Sadowski  wrote:
> 
> Hello Alex.
> 
 [...]
 I think at least the minimum implementation of the abstraction will be
 fairly easy. It can then be performance checked with a few variants.
 
 There currently is not a JMH module for numbers. Should I create one
 under a module included with an optional profile?
>>> 
>>> What is going to be benchmarked?
>> 
>> The operations in Complex applied via a ComplexList or a Complex[], or a
>> specialisation that works direct on the numbers and does not use Complex. If
>> one of the purposes of the ComplexList is efficient computation on large
>> amounts of data then it would be a start to show the efficiency. I can do
>> this on a branch and see if it is useful.
> 
> No need for a branch, as all expressed opinions agree on the usefulness
> of "ComplexList", at least to save RAM.
> Indeed, a JMH module would then hopefully show that no performance is
> lost by the one additional creation of a "Complex" instance when the data
> the abstraction is backed by "double[]" (wrt "Complex[]”).

Since this is the crux of my use case it will be interesting to see if using 
numbers can be as performant as what I am currently doing.

> 
 
 I have spent a fair bit of time trying to fix coverage of Complex. It is
 now at 100% with 95% branch coverage. It currently tests all the edge
 cases for the C standard. However it doesn’t really test actual
 computations.
>>> 
>>> Actual computations?
>>> Coveralls seems OK:
>>> 
>>> https://coveralls.io/builds/26869201/source?filename=commons-numbers-complex/src/main/java/org/apache/commons/numbers/complex/Complex.java
>>> 
>> 
>> The point is that you can get 100% coverage by calling all the methods. You
>> don’t have to do assertions on the results. So for example sin() has no test
>> on data that is not an edge case, for example:
>> 
>> complex(3, 4).sin() = ???
>> 
>> The only test is:
>> 
>> assertComplex(z.sin(), negI.multiply(iz.sinh()));
>> 
>> And the tests for sinh() tests the edge cases.
>> 
>> This is why the following PR [1] exists that swaps the sign for the
>> imaginary component in the sin() computation. There are no tests to show the
>> functions working on “normal” computations. Essentially the most basic tests
>> that compute the functions on data that should go down the common path are
>> all missing, Only the computation edge cases are asserted. So I’ll create
>> some using GNU g++ to create expected results.
>> 
>> [1] https://github.com/apache/commons-numbers/pull/31/files
>> 
> 
> Got it.  Thanks.
> 
>>> 
 I am going to continue on this to add more tests that the computations
 output the same values as a C reference implementation, and hit all the
 branches.
 
 When working on the class it raised a few questions:
 
 1. HashCode
 
 This is current the same for any number with a NaN component. However the
 C standard states that NaN components are allowed in infinites. So
 perhaps the hashCode should be updated to at least distinguish infinites
 from NaN.
>>> 
>>> What would be the purpose?
>> 
>> For storing in a HashSet. I noted this when first working on the class but
>> on reflection I admit that it won’t matter. The current equals method will
>> distinguish any binary difference so these are not equal:
>> 
>> Complex(inf, NaN)
>> Complex(-inf, NaN)
>> Complex(NaN, inf)
>> Complex(NaN, -inf)
>> 
>> They are all infinite but are different. They will be stored under the same
>> hash along with any other Complex with NaN. It may be of use for searching
>> to have any infinite stored under the same key but one that is different
>> from a NaN.
> 
> Finally, I think that the safer option is to follow the JDK convention
> and ensure that "hashCode is consistent with equals".
> 
> How about using "java.util.Arrays" to define both?

Arrays:
public static int hashCode(double a[]) {
if (a == null)
return 0;

int result = 1;
for (double element : a) {
long bits = Double.doubleToLongBits(element);
result = 31 * result + (int)(bits ^ (bits >>> 32));
}
return result;
}

Double
public static int hashCode(double value) {
long bits = doubleToLongBits(value);
return (int)(bits ^ (bits >>> 32));
}


So the same result as Arrays.hashCode should be this:

return 31 * (31 + Double.hashCode(real)) + Double.hashCode(imaginary);

It can be checked it is the same as Arrays.hashCode(new double[]{real, 
imaginary}).

The current hash is:

return 37 * (17 * Double.hashCode(imaginary) + Double.hashCode(real));

So not much difference except the factors, order of arguments and that the 
inner operation on imaginary is 

Re: [Numbers] Arrays of "Complex" objects and RAM

2019-11-09 Thread Gilles Sadowski
Hello Alex.

>>> [...]
>>> I think at least the minimum implementation of the abstraction will be
>>> fairly easy. It can then be performance checked with a few variants.
>>>
>>> There currently is not a JMH module for numbers. Should I create one
>>> under a module included with an optional profile?
>>
>> What is going to be benchmarked?
>
> The operations in Complex applied via a ComplexList or a Complex[], or a
> specialisation that works direct on the numbers and does not use Complex. If
> one of the purposes of the ComplexList is efficient computation on large
> amounts of data then it would be a start to show the efficiency. I can do
> this on a branch and see if it is useful.

No need for a branch, as all expressed opinions agree on the usefulness
of "ComplexList", at least to save RAM.
Indeed, a JMH module would then hopefully show that no performance is
lost by the one additional creation of a "Complex" instance when the data
the abstraction is backed by "double[]" (wrt "Complex[]").

>>>
>>> I have spent a fair bit of time trying to fix coverage of Complex. It is
>>> now at 100% with 95% branch coverage. It currently tests all the edge
>>> cases for the C standard. However it doesn’t really test actual
>>> computations.
>>
>> Actual computations?
>> Coveralls seems OK:
>>
>> https://coveralls.io/builds/26869201/source?filename=commons-numbers-complex/src/main/java/org/apache/commons/numbers/complex/Complex.java
>> 
>
> The point is that you can get 100% coverage by calling all the methods. You
> don’t have to do assertions on the results. So for example sin() has no test
> on data that is not an edge case, for example:
>
> complex(3, 4).sin() = ???
>
> The only test is:
>
> assertComplex(z.sin(), negI.multiply(iz.sinh()));
>
> And the tests for sinh() tests the edge cases.
>
> This is why the following PR [1] exists that swaps the sign for the
> imaginary component in the sin() computation. There are no tests to show the
> functions working on “normal” computations. Essentially the most basic tests
> that compute the functions on data that should go down the common path are
> all missing, Only the computation edge cases are asserted. So I’ll create
> some using GNU g++ to create expected results.
>
> [1] https://github.com/apache/commons-numbers/pull/31/files
> 

Got it.  Thanks.

>>
>>> I am going to continue on this to add more tests that the computations
>>> output the same values as a C reference implementation, and hit all the
>>> branches.
>>>
>>> When working on the class it raised a few questions:
>>>
>>> 1. HashCode
>>>
>>> This is current the same for any number with a NaN component. However the
>>> C standard states that NaN components are allowed in infinites. So
>>> perhaps the hashCode should be updated to at least distinguish infinites
>>> from NaN.
>>
>> What would be the purpose?
>
> For storing in a HashSet. I noted this when first working on the class but
> on reflection I admit that it won’t matter. The current equals method will
> distinguish any binary difference so these are not equal:
>
> Complex(inf, NaN)
> Complex(-inf, NaN)
> Complex(NaN, inf)
> Complex(NaN, -inf)
>
> They are all infinite but are different. They will be stored under the same
> hash along with any other Complex with NaN. It may be of use for searching
> to have any infinite stored under the same key but one that is different
> from a NaN.

Finally, I think that the safer option is to follow the JDK convention
and ensure that "hashCode is consistent with equals".

How about using "java.util.Arrays" to define both?

>>>
>>> 2. NaN definition
>>>
>>> What is not clear is if there exists a definition of NaN for a complex
>>> number. The C library has no way to test it with a single function call.
>>> You have to test both the real and imaginary components. On top of this
>>> you can do computation with NaNs. The GNU g++ compiler computes:
>>>
>>>(1e300 + i 1e300) * (1e30 + i NAN) = inf + i inf
>>>
>>> Thus this is allowing some computations with NaN to produce results. I
>>> don’t know if this is a bug in GNU g++ or intentional. The C standard
>>> states:
>>>
>>> “If both operands of the * operator are complex or if the second operand
>>> of the / operator
>>> is complex, the operator raises floating-point exceptions if appropriate
>>> for the calculation
>>> of the parts of the result, and may raise spurious floating-point
>>> exceptions.”
>>>
>>> That is pretty vague. It says that you can do what is “appropriate”.
>>>
>>> So it seems that perhaps we should only call a complex a NaN is both
>>> fields are NaN. If only one is NaN then the complex is semi-NaN and
>>> results of computations may or may not make sense.
>>
>> Do you have an example of a NaN that makes sense?
>> What is semi-NaN?
>
> semiNaN = 

Re: [Numbers] Arrays of "Complex" objects and RAM

2019-11-09 Thread Alex Herbert


> On 9 Nov 2019, at 02:42, Gilles Sadowski  wrote:
> 
> Hi.
> 
> Le sam. 9 nov. 2019 à 01:48, Alex Herbert  > a écrit :
>> 
>> 
>> 
>>> On 7 Nov 2019, at 23:29, Eric Barnhill  wrote:
>>> 
>>> On Thu, Nov 7, 2019 at 3:25 PM Gilles Sadowski  wrote:
>>> 
 Le jeu. 7 nov. 2019 à 18:36, Eric Barnhill  a
 écrit :
> 
> I should also add on this note, my use case for developing ComplexUtils
 in
> the first place was compatibility with JTransforms and JOCL. In both
 cases
> I wanted to convert Complex[] arrays into interleaved double[] arrays to
> feed into algorithms using those libraries.
 
 Implicit in my remark below is the question: Where does the "Complex[]"
 come from?  If it is never a good idea to create such an array, why provide
 code to convert from it?  Do we agree that we should rather create the
 "ComplexList" abstraction, including accessors that shape the data for
 use with e.g. JTransforms?
 
 
>>> I completely agree this is a superior solution and look forward to its
>>> development.
>> 
>> I think at least the minimum implementation of the abstraction will be 
>> fairly easy. It can then be performance checked with a few variants.
>> 
>> There currently is not a JMH module for numbers. Should I create one under a 
>> module included with an optional profile?
> 
> What is going to be benchmarked?

The operations in Complex applied via a ComplexList or a Complex[], or a 
specialisation that works direct on the numbers and does not use Complex. If 
one of the purposes of the ComplexList is efficient computation on large 
amounts of data then it would be a start to show the efficiency. I can do this 
on a branch and see if it is useful.

> 
>> 
>> I have spent a fair bit of time trying to fix coverage of Complex. It is now 
>> at 100% with 95% branch coverage. It currently tests all the edge cases for 
>> the C standard. However it doesn’t really test actual computations.
> 
> Actual computations?
> Coveralls seems OK:
>
> https://coveralls.io/builds/26869201/source?filename=commons-numbers-complex/src/main/java/org/apache/commons/numbers/complex/Complex.java
>  
> 

The point is that you can get 100% coverage by calling all the methods. You 
don’t have to do assertions on the results. So for example sin() has no test on 
data that is not an edge case, for example:

complex(3, 4).sin() = ???

The only test is:

assertComplex(z.sin(), negI.multiply(iz.sinh()));

And the tests for sinh() tests the edge cases.

This is why the following PR [1] exists that swaps the sign for the imaginary 
component in the sin() computation. There are no tests to show the functions 
working on “normal” computations. Essentially the most basic tests that compute 
the functions on data that should go down the common path are all missing, Only 
the computation edge cases are asserted. So I’ll create some using GNU g++ to 
create expected results.

[1] https://github.com/apache/commons-numbers/pull/31/files 



> 
>> I am going to continue on this to add more tests that the computations 
>> output the same values as a C reference implementation, and hit all the 
>> branches.
>> 
>> When working on the class it raised a few questions:
>> 
>> 1. HashCode
>> 
>> This is current the same for any number with a NaN component. However the C 
>> standard states that NaN components are allowed in infinites. So perhaps the 
>> hashCode should be updated to at least distinguish infinites from NaN.
> 
> What would be the purpose?

For storing in a HashSet. I noted this when first working on the class but on 
reflection I admit that it won’t matter. The current equals method will 
distinguish any binary difference so these are not equal:

Complex(inf, NaN)
Complex(-inf, NaN)
Complex(NaN, inf)
Complex(NaN, -inf)

They are all infinite but are different. They will be stored under the same 
hash along with any other Complex with NaN. It may be of use for searching to 
have any infinite stored under the same key but one that is different from a 
NaN.

> 
>> 
>> 2. NaN definition
>> 
>> What is not clear is if there exists a definition of NaN for a complex 
>> number. The C library has no way to test it with a single function call. You 
>> have to test both the real and imaginary components. On top of this you can 
>> do computation with NaNs. The GNU g++ compiler computes:
>> 
>>(1e300 + i 1e300) * (1e30 + i NAN) = inf + i inf
>> 
>> Thus this is allowing some computations with NaN to produce results. I don’t 
>> know if this is a bug in GNU g++ or intentional. The C standard states:
>> 
>> “If both operands of the * operator are complex or if the second operand of 
>> the / operator
>> is complex, the operator raises floating-point