Hi,
thanks for pointing me to the compareMatricesBitAvgDistance function!
I tried it out with different settings.
2^14 for the maxUnitsOfLeastPrecision parameter was a nice guess, but
it is probably too optimistic.
Taking the log2 of some of the distances in the error output sadly
reveals values beyond 30.
I will play around with it and hopefully find adequate settings.
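For reference, this is roughly how I looked at those log2 distances
(just a quick standalone check, not the TestUtils code itself; it
assumes both values are positive doubles):

public class BitDistanceCheck {
    // Distance in units of least precision between two same-sign doubles.
    static long bitDistance(double expected, double actual) {
        return Math.abs(Double.doubleToLongBits(expected)
                      - Double.doubleToLongBits(actual));
    }

    public static void main(String[] args) {
        double expected = 1.2390121975770675E14; // R result
        double actual   = 1.2390101941279517E14; // SystemDS result
        long dist = bitDistance(expected, actual);
        // floor(log2(dist)), only meaningful for dist > 0
        int log2 = 63 - Long.numberOfLeadingZeros(dist);
        System.out.println("bit distance: " + dist + ", log2: " + log2);
    }
}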
Thanks again!
Best,
Kevin
On 13.01.21 13:13, Baunsgaard, Sebastian wrote:
Hi Kevin,
Great question, and thanks for posting it to the mailing list!
When comparing the floating-point values I would suggest using our
"distance in bits" comparison for matrices containing double values.
It lets you specify a relative difference between the values, rather
than the typical double comparison against an exact epsilon.
You can find the method to compare in:
File: src/test/java/org/apache/sysds/test/TestUtils.java
Method: compareMatricesBitAvgDistance.
Note that the bit distance is a long that specifies how many of the
trailing bits of the double values are allowed to differ. It can range
over the entire positive long value space, from Long.MAX_VALUE, meaning
completely different values are accepted, down to 0, meaning exactly
the same encoded double value. I would suggest starting with 2^14.
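To give a feeling for the scale, here is a small standalone example
(not part of TestUtils) of what a budget of 2^14 trailing bits
corresponds to in absolute terms at different magnitudes:

public class UlpBudget {
    public static void main(String[] args) {
        long budget = 1L << 14; // 2^14 units of least precision
        double[] magnitudes = {1.0, 1e14, 1e38, 1e85};
        for (double v : magnitudes) {
            // Math.ulp(v) is the spacing between adjacent doubles around v,
            // so budget * ulp(v) is roughly the allowed absolute deviation.
            System.out.printf("magnitude %.1e -> allowed deviation ~%.3e%n",
                v, budget * Math.ulp(v));
        }
    }
}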
It is normal for values to be off by 2.0E80 if the values in question
are of that order of magnitude, so it is okay for those tests to use an
epsilon like that. Furthermore, in SystemDS we use Kahan correction for
our double values, which lets us correct rounding errors at a finer
granularity than the plain 64-bit double representation allows. This
correction can make the values deviate after a number of operations,
so that the difference becomes more exaggerated.
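For illustration, the idea behind such compensated summation looks
roughly like this (a simplified sketch, not the actual SystemDS
implementation):

public class KahanSketch {
    // Kahan summation: carries a correction term that captures the
    // low-order bits lost when adding values of different magnitude.
    static double kahanSum(double[] values) {
        double sum = 0.0, correction = 0.0;
        for (double v : values) {
            double y = v - correction;  // apply the pending correction
            double t = sum + y;         // low-order bits of y may be lost
            correction = (t - sum) - y; // recover the lost bits
            sum = t;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Many values below the ulp of the running sum: the naive sum
        // never moves, while the compensated sum picks them up.
        double[] vals = new double[10001];
        vals[0] = 1.0;
        double naive = 1.0;
        for (int i = 1; i < vals.length; i++) {
            vals[i] = 1e-16;
            naive += 1e-16;
        }
        System.out.println("naive: " + naive + ", kahan: " + kahanSum(vals));
    }
}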
Best regards
Sebastian Baunsgaard
------------------------------------------------------------------------
*From:* Kevin Pretterhofer <[email protected]>
*Sent:* Wednesday, January 13, 2021 12:52:47 PM
*To:* [email protected]
*Subject:* [Question] Regarding test cases and floating point errors
Hi all,
I hope this is the right place to ask questions. If not, I am sorry;
it would be nice if you could point me to the right place.
My question is about the unit tests. I am currently implementing a
simple Gaussian classifier. Besides the class prior probabilities, this
implementation also outputs the respective mean values, determinants,
and covariance matrices, as well as their inverses.
Now I face the problem that the values of my SystemDS implementation
and my R implementation differ considerably for randomly generated test
matrices. I assume that this is due to floating-point errors /
floating-point precision. At first glance they look quite similar, but
since the output is in scientific notation, one can see that the
absolute differences are quite large. E.g., for my determinant
comparison I got the following:
(1,1): 1.2390121975770675E14 <--> 1.2390101941279517E14
(3,1): 1.510440018532407E85 <--> 1.5104388050968705E85
(2,1): 1.6420264128994816E38 <--> 1.6420263615987703E38
(5,1): 8.881025000211518E70 <--> 8.881037540234089E70
(4,1): 1.7888589555748764E22 <--> 1.78885700537877E22
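For reference, a quick throwaway snippet to put those pairs into
relative terms (not part of my actual test code):

public class RelDiff {
    public static void main(String[] args) {
        double[][] pairs = {
            {1.2390121975770675E14, 1.2390101941279517E14},
            {1.510440018532407E85,  1.5104388050968705E85},
            {1.6420264128994816E38, 1.6420263615987703E38},
            {8.881025000211518E70,  8.881037540234089E70},
            {1.7888589555748764E22, 1.78885700537877E22},
        };
        for (double[] p : pairs) {
            // relative difference, independent of the order of magnitude
            double rel = Math.abs(p[0] - p[1])
                / Math.max(Math.abs(p[0]), Math.abs(p[1]));
            System.out.printf("%e <--> %e : relative diff %.2e%n",
                p[0], p[1], rel);
        }
    }
}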
I face similar issues with the inverses of my covariance matrices.
Since I use the eigenvalues and eigenvectors to calculate the
determinant and the inverse in SystemDS, I already compared them to the
eigenvalues and eigenvectors that R computes, and already there,
differences (due to floating-point effects) are observable.
My question now is: how should one test, or rather compare, such
matrices and vectors?
It seems a bit odd to me to set the tolerance to something like
"2.0E80".
Would be great if someone could help me out!
Best,
Kevin