Re: [Numbers] Arrays of "Complex" objects and RAM

Gilles Sadowski Fri, 08 Nov 2019 18:43:14 -0800

Hi.

Le sam. 9 nov. 2019 à 01:48, Alex Herbert <alex.d.herb...@gmail.com> a écrit :
>
>
>
> > On 7 Nov 2019, at 23:29, Eric Barnhill <ericbarnh...@gmail.com> wrote:
> >
> > On Thu, Nov 7, 2019 at 3:25 PM Gilles Sadowski <gillese...@gmail.com> wrote:
> >
> >> Le jeu. 7 nov. 2019 à 18:36, Eric Barnhill <ericbarnh...@gmail.com> a
> >> écrit :
> >>>
> >>> I should also add on this note, my use case for developing ComplexUtils
> >> in
> >>> the first place was compatibility with JTransforms and JOCL. In both
> >> cases
> >>> I wanted to convert Complex[] arrays into interleaved double[] arrays to
> >>> feed into algorithms using those libraries.
> >>
> >> Implicit in my remark below is the question: Where does the "Complex[]"
> >> come from?  If it is never a good idea to create such an array, why provide
> >> code to convert from it?  Do we agree that we should rather create the
> >> "ComplexList" abstraction, including accessors that shape the data for
> >> use with e.g. JTransforms?
> >>
> >>
> > I completely agree this is a superior solution and look forward to its
> > development.
>
> I think at least the minimum implementation of the abstraction will be fairly 
> easy. It can then be performance checked with a few variants.
>
> There currently is not a JMH module for numbers. Should I create one under a 
> module included with an optional profile?


What is going to be benchmarked?

>
> I have spent a fair bit of time trying to fix coverage of Complex. It is now 
> at 100% with 95% branch coverage. It currently tests all the edge cases for 
> the C standard. However it doesn’t really test actual computations.

Actual computations?
Coveralls seems OK:
    
https://coveralls.io/builds/26869201/source?filename=commons-numbers-complex/src/main/java/org/apache/commons/numbers/complex/Complex.java

> I am going to continue on this to add more tests that the computations output 
> the same values as a C reference implementation, and hit all the branches.
>
> When working on the class it raised a few questions:
>
> 1. HashCode
>
> This is current the same for any number with a NaN component. However the C 
> standard states that NaN components are allowed in infinites. So perhaps the 
> hashCode should be updated to at least distinguish infinites from NaN.

What would be the purpose?

>
> 2. NaN definition
>
> What is not clear is if there exists a definition of NaN for a complex 
> number. The C library has no way to test it with a single function call. You 
> have to test both the real and imaginary components. On top of this you can 
> do computation with NaNs. The GNU g++ compiler computes:
>
>         (1e300 + i 1e300) * (1e30 + i NAN) = inf + i inf
>
> Thus this is allowing some computations with NaN to produce results. I don’t 
> know if this is a bug in GNU g++ or intentional. The C standard states:
>
> “If both operands of the * operator are complex or if the second operand of 
> the / operator
> is complex, the operator raises floating-point exceptions if appropriate for 
> the calculation
> of the parts of the result, and may raise spurious floating-point exceptions.”
>
> That is pretty vague. It says that you can do what is “appropriate”.
>
> So it seems that perhaps we should only call a complex a NaN is both fields 
> are NaN. If only one is NaN then the complex is semi-NaN and results of 
> computations may or may not make sense.

Do you have an example of a NaN that makes sense?
What is semi-NaN?

>
> 3. Multiplication
>
> I applied a change to the algorithm provided in the C standard when 
> recovering (NaN,NaN) results of multiplying infinities. If you multiply 
> infinity by zero this should be NaN.
>
> “if one operand is an infinity and the other operand is a nonzero finite 
> number or an
> infinity, then the result of the * operator is an infinity;"
>
> Note the “non-zero” condition.
>
> But the algorithm provided does not do this and you can get an infinite 
> result back.
>
> This is what GNU g++ does:
>
> multiply(0 + i 0, 0 + i inf) = -nan + i -nan
> multiply(-0 + i 0, 0 + i inf) = -nan + i -nan
> multiply(0 + i 0, -0 + i inf) = -nan + i -nan
> multiply(-0 + i 0, -0 + i inf) = -nan + i -nan
> multiply(0 + i 0, 1 + i -inf) = -nan + i -nan
> multiply(-0 + i 0, 1 + i -inf) = -nan + i -nan
>
> So they have implemented the multiply differently because here multiply 
> infinity by zero is NaN. Note how printf distinguishes the sign of infinite 
> values. I don’t think Java does this. It collates NaN into one 
> representation. I will have to convert to raw long bits to get the sign for a 
> cross reference to a C version.

I don't follow.  Why would it be useful to distinguish -NaN from NaN or i NaN?

>
> 4. Divide
>
> I have applied a change to this algorithm to check for overflow in the 
> division. The reference text states:
>
> “if the first operand is a finite number and the second operand is an 
> infinity, then the
> result of the / operator is a zero;”
>
> But for example GNU g++ does this:
>
> divide(1e+308 + i 1e+308, inf + i inf) = -nan + i 0
>
> This is what the reference algorithm does if it is not changed. Here overflow 
> cause problems in the result (which should be zero). So I think my patch is 
> correct and the result should be (0 + i 0).

Not sure; I'd think that division by an imaginary infinity should yield NaN.

> The current unit tests are enforcing this requirement.
>
> 5. Edge cases
>
> Some of the methods have a large amount of if statements to test edge cases. 
> These are very verbose and hard to follow. They also leave the common path of 
> a non infinite, non nan, non zero value at the end. So a rearrangement may 
> allow better branch prediction for the common use case. I’ll carry on working 
> on this as I add more tests.
>
> 6. tanh
>
> I have put a TODO: note in the code about this. It tests edge case for 
> infinites. Then doubles the real and imaginary components. These could then 
> overflow to infinite and so trigger the edge case that has just been skipped. 
> I am going to compare the result to the GNU C version to decide how to handle 
> overflow.

Thanks for the review!

>
> Numbers overall
>
> I’ve now updated the configs for checkstyle, PMD and spotbugs. 
> Checkstyle:check

How to disable "failOnViolation" from the command line?

> and spotbugs:check are enabled. Pmd:check is not. There are some loose ends 
> to fix for that. Unfortunately the syntax for exclusions in the PMD config 
> file are poor. I’ve added some but I’ve not looked at everything yet. A few 
> decisions might need to be taken on whether to drops rules rather than fix 
> them to at least get pmd:check active. It spots a lot of things so is useful 
> for PRs.

Maybe.  But how does one know what to fix when the build fails because
of a CheckStyle
or PMD violation?

> I’ve also not looked at SonarCloud [1]. There are a few things on there to 
> fix too.

Most (all?) are false positives.

> [1] https://sonarcloud.io/dashboard?id=commons-numbers 
> <https://sonarcloud.io/dashboard?id=commons-numbers>


Best,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Re: [Numbers] Arrays of "Complex" objects and RAM

Reply via email to