On 18/08/2020 22:17, Dr. Chavdar Ivanov wrote:
Andy, Richard,
Thank you for the feedback.
In the graph I have the 2 values as xsd:float, so this is how the data comes in.
In the SPARQL query I tried to cast the float to decimal by using
FILTER (xsd:decimal(?value1)!=xsd:decimal(?value1)).
I am not sure if this is the correct way, but I am now seeing a difference in the
comparison result
0.1001244561 Is different from 0.1001234590 which is OK
^^ typo?
But these are reported as same 100123456.1 and 100123459.0
100123456.1 is not a floating point number. It has more precision than
xsd:float can represent.
It's "1.00123456E8"^^xsd:float
(Please copy and paste expressions into email.)
xsd:decimal(?value1)
is:
evaluate ?value1 to get an xsd:float.
which is
"0.1001244561"^^xsd:float
Using Jena's expression evaluator:
qexpr '"0.1001244561"^^xsd:float+0'
==>
"0.100124456"^^xsd:float
See? Already lost precision.
Then turn it into a decimal.
It is different from:
xsd:decimal(str(?value1))
which takes the lexical form, not the floating point value, of ?value1.
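
The difference between the two casts can be illustrated outside SPARQL. This is a minimal Python sketch, not Jena code: the helper to_float32 is a hypothetical name, and rounding through struct is assumed to stand in for the xsd:float value space (IEEE 754 single precision). Jena's own conversion path may differ in detail.

```python
import struct
from decimal import Decimal


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


lexical = "0.1001244561"

# Analogue of xsd:decimal(?value1): the float *value* is converted,
# so the digits beyond single precision are already gone.
from_value = Decimal(to_float32(float(lexical)))

# Analogue of xsd:decimal(str(?value1)): the lexical form is parsed directly.
from_lexical = Decimal(lexical)

print(from_value)                  # exact expansion of the nearest float32
print(from_lexical)                # 0.1001244561
print(from_value == from_lexical)  # False
```

The first decimal is exact, but it is an exact copy of an already rounded number.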
If I get the value before the comparison is executed, the xsd:decimal of the two
values appears to be the same, 100123456.0, so this is why != does not report
the difference.
Here the decimal does not seem to help,
Because precision was lost making the decimal. Start with a decimal.
xsd:decimal("0.1001244561")
or "0.1001244561"^^xsd:decimal
or 0.1001244561 (in Turtle and SPARQL).
but I guess this falls in the same category that large absolute values are less
precise. So same effect as for xsd:float.
Best regards
Chavdar
-----Original Message-----
From: Andy Seaborne <[email protected]>
Sent: Tuesday, 18 August, 2020 19:07
To: [email protected]
Subject: Re: Float comparison
On 18/08/2020 10:31, Richard Cyganiak wrote:
The xsd:float datatype represents IEEE 754 single-precision floating point
numbers.
As with any floating-point datatype, the precision depends on the size of the
number. Numbers close to zero are very precise. Numbers with a large absolute
value (large positive or large negative) are less precise. For the gory details
see for example here:
https://en.wikipedia.org/wiki/Single-precision_floating-point_format#Precision_limitations_on_decimal_values_in_[1,_16777216]
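
To make the size dependence concrete, here is a small Python sketch that estimates the gap between neighbouring single-precision values at a few magnitudes. The function name float32_spacing is made up for illustration, and the formula assumes normal (not subnormal) values with a 24-bit significand.

```python
import math


def float32_spacing(x: float) -> float:
    """Gap between adjacent single-precision values near x (normal range only)."""
    _, exponent = math.frexp(abs(x))  # abs(x) == m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 24)     # 24 significand bits, implicit bit included


for magnitude in (0.1, 1.0, 30.0, 100123456.0):
    print(magnitude, float32_spacing(magnitude))
```

Near 100123456 the spacing is 8, so 100123456.1 and 100123459 both land on the same representable value, 100123456.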
There is rarely a good reason to use xsd:float in RDF. xsd:double is much more
precise at a small increase of storage cost (4 more bytes, which is negligible
given the total size of an RDF triple). xsd:decimal provides arbitrary
precision (in theory), but is more expensive in storage and computation.
My general view is that if storage size and performance of mathematical
computations are a major concern for the application, RDF is probably not the
best choice—RDF optimises for other concerns. Therefore the best choice for
representing non-integer numbers in RDF is usually xsd:decimal—more expensive,
but no issues with precision.
Richard
xsd:decimal can record any decimal precision, but division may lose precision;
otherwise "1/3" would need infinite storage.
Jena uses 24 digit precision for division for inexact results like 1/3.
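
The effect of a fixed digit budget on division can be sketched with Python's decimal module. This is only an analogue, not Jena's implementation; the 24-digit setting is taken from the figure above, and the rounding mode is Python's default, which is an assumption here.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 24  # assumed to mirror the 24-digit figure mentioned above
    third = Decimal(1) / Decimal(3)

print(third)           # 0.333333333333333333333333
print(third * 3 == 1)  # False: the quotient had to be cut off somewhere
```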
On 18 Aug 2020, at 05:48, Dr. Chavdar Ivanov <[email protected]> wrote:
Hello
I posted the message below to the TopBraid users mailing list and
already clarified that as sh:equals is based on RDF node equality,
values such as "1.0"^^xsd:float and "1"^^xsd:float count as distinct.
So I am keeping this for the interest of others in the list
SPARQL has both comparisons.
The "sameTerm()" operator for RDF term equality, and SPARQL "=" for value
comparison (by op:numeric-equal):
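
The two kinds of comparison can be modelled with a toy example in Python. The Literal class and both functions are hypothetical names for illustration, not a Jena or SPARQL API; value_equal ignores datatype checks and single-precision rounding for brevity.

```python
from typing import NamedTuple

XSD_FLOAT = "http://www.w3.org/2001/XMLSchema#float"


class Literal(NamedTuple):
    lexical: str
    datatype: str


def same_term(a: Literal, b: Literal) -> bool:
    """Term equality: lexical form and datatype must match exactly."""
    return a == b


def value_equal(a: Literal, b: Literal) -> bool:
    """Value equality for numeric literals: compare the parsed numbers."""
    return float(a.lexical) == float(b.lexical)


one_point_zero = Literal("1.0", XSD_FLOAT)
one = Literal("1", XSD_FLOAT)

print(same_term(one_point_zero, one))    # False
print(value_equal(one_point_zero, one))  # True
```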
Andy
But on SPARQL float comparison I was advised to ask on this mailing list
for other opinions.
I understand that SPARQL comparison is mathematically based so 1.0 should be
equal to 1. However below in item 2 you will see the numbers I compared and I
am getting confused. Take into account that in the data graph the 2 compared
properties are typed literals with datatype float.
I wanted to know what is the precision when float is compared. So I
have 2 questions
* What is the precision? Is it the 6th decimal place, and is it OK to compare
different forms of float, e.g. when one is in scientific form?
* Why am I getting a wrong comparison result for bigger values such as
100123456.1 and 100123459, which are found to be the same?
Best regards
Chavdar
========
Dear all,
I have a very basic question...
I need to compare literals that are floats and tried to use two ways.
1) using sh:equals to compare 2 properties and 2) using SPARQL where
I filter != different values
For the filter I tried using
FILTER (xsd:float(?value1)!=xsd:float(?value1)).
or
FILTER (?value1!=?value1).
Both give the same outcome.
Below I listed a summary of the tests I did
I think sh:equals treats the literals as strings even though they are floats.
It also gives 2 results. I think this is in line with the SHACL spec,
although I didn't check whether sh:equals ignores the datatype.
However, in some cases the result from SPARQL is rather strange. It looks
like the precision is 10^-6, but for big numbers, and when the scientific form
of a float is used, we get something different.
What rule is followed to decide whether two values differ?
If I use google calculator
100123456.1-100.123459E+06=-2.90000000596
Normally it should be OK to compare different forms of float.
1) using sh:equals in the property shape
Value 1 ; Value 2 ; comparison result
1.123456 ; 1.123456 ; same
1.1234560 ; 1.1234561 ; different (sh:equals reports it twice)
31.1234560 ; 31.1234561 ; different (sh:equals reports it twice)
30 ; 30.0000001 ; different (sh:equals reports it twice)
30 ; 30.000001 ; different (sh:equals reports it twice)
100123456.0 ; 100123456.1 ; different (sh:equals reports it twice)
100123456.0 ; 100123456.0 ; same
100123456 ; 100.123456E6 ; different (sh:equals reports it twice)
100123456 ; 100.123456E+06 ; different (sh:equals reports it twice)
-0.123456789 ; -123.456789E-3 ; different (sh:equals reports it twice)
-0.123456789 ; -123.456789E-03 ; different (sh:equals reports it twice)
100123456.1 ; 100.123456E+06 ; different (sh:equals reports it twice)
100123456.1 ; 100.123459E+06 ; different (sh:equals reports it twice)
100123456.1 ; 100123459 ; different (sh:equals reports it twice)
100123456.1 ; 100123459.0 ; different (sh:equals reports it twice)
2) using SPARQL (in the property shape)
1.123456 ; 1.123456 ; same
1.1234560 ; 1.1234561 ; different
31.1234560 ; 31.1234561 ; different
30 ; 30.0000001 ; same
30 ; 30.000001 ; different
100123456.0 ; 100123456.1 ; same
100123456.0 ; 100123456.0 ; same
100123456 ; 100.123456E6 ; same
100123456 ; 100.123456E+06 ; same
-0.123456789 ; -123.456789E-3 ; same
-0.123456789 ; -123.456789E-03 ; same
100123456.1 ; 100.123456E+06 ; same
100123456.1 ; 100.123459E+06 ; same
100123456.1 ; 100123459 ; same
100123456.1 ; 100123459.0 ; same
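
The SPARQL results in table 2 are what single-precision rounding would produce. The following Python sketch is an approximation under stated assumptions: to_float32 and float_value_equal are hypothetical helpers, the lexical form is parsed as a double and then rounded to single precision, and this is taken to behave like value comparison on xsd:float. It is not how Jena is implemented.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float_value_equal(a: str, b: str) -> bool:
    """Compare two lexical forms by their single-precision values."""
    return to_float32(float(a)) == to_float32(float(b))


pairs = [
    ("1.1234560", "1.1234561"),
    ("30", "30.0000001"),
    ("30", "30.000001"),
    ("100123456.0", "100123456.1"),
    ("100123456", "100.123456E6"),
    ("-0.123456789", "-123.456789E-3"),
    ("100123456.1", "100123459.0"),
]

for a, b in pairs:
    verdict = "same" if float_value_equal(a, b) else "different"
    print(a, ";", b, ";", verdict)
```

Under these assumptions the pairs come out as in the SPARQL column above: the outcome depends on whether both values round to the same representable float, not on a fixed number of decimal places.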
Best regards
Chavdar