Hi,
thanks for pointing me to the compareMatricesBitAvgDistance function!
I tried it out with different settings.
2^14 for the maxUnitsOfLeastPrecision parameter was a nice guess, but
it is probably too optimistic.
Taking the log2 of some of the distances in the error output sadly
reveals values beyond 30.
I will play around with it and hopefully find adequate settings.
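For reference, this is roughly how I looked at those log2 distances
(just a quick standalone check, not the TestUtils code itself; it
assumes both values are positive doubles):

public class BitDistanceCheck {
    // Distance in units of least precision between two same-sign doubles.
    static long bitDistance(double expected, double actual) {
        return Math.abs(Double.doubleToLongBits(expected)
                      - Double.doubleToLongBits(actual));
    }

    public static void main(String[] args) {
        double expected = 1.2390121975770675E14; // R result
        double actual   = 1.2390101941279517E14; // SystemDS result
        long dist = bitDistance(expected, actual);
        // floor(log2(dist)), only meaningful for dist > 0
        int log2 = 63 - Long.numberOfLeadingZeros(dist);
        System.out.println("bit distance: " + dist + ", log2: " + log2);
    }
}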
Thanks again!
Best,
Kevin
On 13.01.21 13:13, Baunsgaard, Sebastian wrote:
Hi Kevin,
Great question, and thanks for posting it to the mailing list!
When comparing the floating-point values I would suggest using our
"distance in bits" comparison for matrices containing double values.
It lets you specify a relative difference between the values, rather
than the typical double comparison against an exact epsilon.
You can find the method to compare in:
File: src/test/java/org/apache/sysds/test/TestUtils.java
Method: compareMatricesBitAvgDistance.
Note that the bit distance is a long that specifies how many of the
trailing bits of the double values are allowed to differ. It can range
over the entire positive long value space, from Long.MAX_VALUE, meaning
completely different values are accepted, down to 0, meaning exactly
the same encoded double value. I would suggest starting with 2^14.
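To give a feeling for the scale, here is a small standalone example
(not part of TestUtils) of what a budget of 2^14 trailing bits
corresponds to in absolute terms at different magnitudes:

public class UlpBudget {
    public static void main(String[] args) {
        long budget = 1L << 14; // 2^14 units of least precision
        double[] magnitudes = {1.0, 1e14, 1e38, 1e85};
        for (double v : magnitudes) {
            // Math.ulp(v) is the spacing between adjacent doubles around v,
            // so budget * ulp(v) is roughly the allowed absolute deviation.
            System.out.printf("magnitude %.1e -> allowed deviation ~%.3e%n",
                v, budget * Math.ulp(v));
        }
    }
}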
It is normal for values to be off by 2.0E80 if the values in question
are of that order of magnitude, so it is okay for those tests to use an
epsilon like that. Furthermore, in SystemDS we use Kahan correction for
our double values, which lets us correct rounding errors at a finer
granularity than the plain 64-bit double representation allows. This
correction can make the values deviate after a number of operations,
so that the difference becomes more exaggerated.
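For illustration, the idea behind such compensated summation looks
roughly like this (a simplified sketch, not the actual SystemDS
implementation):

public class KahanSketch {
    // Kahan summation: carries a correction term that captures the
    // low-order bits lost when adding values of different magnitude.
    static double kahanSum(double[] values) {
        double sum = 0.0, correction = 0.0;
        for (double v : values) {
            double y = v - correction;  // apply the pending correction
            double t = sum + y;         // low-order bits of y may be lost
            correction = (t - sum) - y; // recover the lost bits
            sum = t;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Many values below the ulp of the running sum: the naive sum
        // never moves, while the compensated sum picks them up.
        double[] vals = new double[10001];
        vals[0] = 1.0;
        double naive = 1.0;
        for (int i = 1; i < vals.length; i++) {
            vals[i] = 1e-16;
            naive += 1e-16;
        }
        System.out.println("naive: " + naive + ", kahan: " + kahanSum(vals));
    }
}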
Best regards
Sebastian Baunsgaard
------------------------------------------------------------------------
*From:* Kevin Pretterhofer <[email protected]>
*Sent:* Wednesday, January 13, 2021 12:52:47 PM
*To:* [email protected]
*Subject:* [Question] Regarding test cases and floating point errors
Hi all,
I hope this is the right place to ask questions. If not, I am sorry;
it would be nice if you could point me to the right place.
My question is about the unit tests. I am currently implementing a
simple Gaussian classifier. Besides the class prior probabilities, this
implementation also outputs the respective mean values, determinants,
and covariance matrices, as well as their inverses.
Now I face the problem that the values of my SystemDS implementation
and my R implementation differ considerably for randomly generated test
matrices. I assume that this is due to floating-point errors /
floating-point precision. At first glance they look quite similar, but
since the output is in scientific notation, one can see that the
absolute differences are quite large. E.g., for my determinant
comparison I got the following:
(1,1): 1.2390121975770675E14 <--> 1.2390101941279517E14
(3,1): 1.510440018532407E85 <--> 1.5104388050968705E85
(2,1): 1.6420264128994816E38 <--> 1.6420263615987703E38
(5,1): 8.881025000211518E70 <--> 8.881037540234089E70
(4,1): 1.7888589555748764E22 <--> 1.78885700537877E22
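For reference, a quick throwaway snippet to put those pairs into
relative terms (not part of my actual test code):

public class RelDiff {
    public static void main(String[] args) {
        double[][] pairs = {
            {1.2390121975770675E14, 1.2390101941279517E14},
            {1.510440018532407E85,  1.5104388050968705E85},
            {1.6420264128994816E38, 1.6420263615987703E38},
            {8.881025000211518E70,  8.881037540234089E70},
            {1.7888589555748764E22, 1.78885700537877E22},
        };
        for (double[] p : pairs) {
            // relative difference, independent of the order of magnitude
            double rel = Math.abs(p[0] - p[1])
                / Math.max(Math.abs(p[0]), Math.abs(p[1]));
            System.out.printf("%e <--> %e : relative diff %.2e%n",
                p[0], p[1], rel);
        }
    }
}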
I face similar issues with the inverses of my covariance matrices.
Since I use the eigenvalues and eigenvectors to calculate the
determinant and the inverse in SystemDS, I already compared them to the
eigenvalues and eigenvectors that R computes, and already there,
differences (due to floating-point effects) are observable.
My question now is: how should one test, or rather compare, such
matrices and vectors?
It seems a bit odd to me to set the tolerance to something like
"2.0E80".
Would be great if someone could help me out!
Best,
Kevin