RE: Float comparison

Dr. Chavdar Ivanov Tue, 18 Aug 2020 14:18:00 -0700

Andy, Richard,
Thank you for the feedback.

In the graph, both values are typed as xsd:float, so that is how the data 
arrives.


In the SPARQL query I tried casting the floats to decimal with 
FILTER (xsd:decimal(?value1) != xsd:decimal(?value2)).

I am not sure this is the correct way, but I now see a difference in the 
comparison result:

0.1001244561 is reported as different from 0.1001234590, which is OK.
But 100123456.1 and 100123459.0 are still reported as the same.
If I output the values before the comparison is executed, the xsd:decimal of 
both values is the same, 100123456.0, which is why != does not report a 
difference.
So the decimal cast does not seem to help here. I guess this falls into the 
same category, that large absolute values are less precise, so the effect is 
the same as for xsd:float.
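
To illustrate the rounding outside of SPARQL, here is a minimal Python sketch. 
It is not what the SPARQL engine runs; it only assumes that xsd:float values 
behave like IEEE 754 single precision. The helper name to_float32 is made up 
for this example.

```python
import struct
from decimal import Decimal


def to_float32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# The two large values from the test data collapse to one binary32 value.
a = to_float32(100123456.1)
b = to_float32(100123459.0)
print(a, b, a == b)

# Converting to decimal afterwards cannot bring the lost digits back.
print(Decimal(a) == Decimal(b))

# The two small values stay distinct, because binary32 is finer near 0.1.
c = to_float32(0.1001244561)
d = to_float32(0.1001234590)
print(c != d)
```

Near 1e8 adjacent single-precision values are 8 apart, so both inputs land on 
100123456.0 before any cast to decimal is applied.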

Best regards
Chavdar 



-----Original Message-----
From: Andy Seaborne <[email protected]> 
Sent: Tuesday, 18 August, 2020 19:07
To: [email protected]
Subject: Re: Float comparison



On 18/08/2020 10:31, Richard Cyganiak wrote:
> The xsd:float datatype represents IEEE 754 single-precision floating point 
> numbers.
> 
> As with any floating-point datatype, the precision depends on the size of the 
> number. Numbers close to zero are very precise. Numbers with a large absolute 
> value (large positive or large negative) are less precise. For the gory 
> details see for example here:
> 
> https://en.wikipedia.org/wiki/Single-precision_floating-point_format#Precision_limitations_on_decimal_values_in_[1,_16777216]
> 
> There is rarely a good reason to use xsd:float in RDF. xsd:double is much 
> more precise at a small increase of storage cost (4 more bytes, which is 
> negligible given the total size of an RDF triple). xsd:decimal provides 
> arbitrary precision (in theory), but is more expensive in storage and 
> computation.
> 
> My general view is that if storage size and performance of mathematical 
> computations are a major concern for the application, RDF is probably not the 
> best choice—RDF optimises for other concerns. Therefore the best choice for 
> representing non-integer numbers in RDF is usually xsd:decimal—more 
> expensive, but no issues with precision.
> 
> Richard

xsd:decimal can record any decimal precision, but division may lose 
precision; otherwise 1/3 would need infinite storage.

Jena uses 24 digits of precision for division when the result is inexact, 
such as 1/3.
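
As a rough illustration of a decimal division that has to be cut off, here is 
a Python sketch using the standard decimal module. It is not Jena code. The 
precision of 24 is only borrowed from the figure above, and Python counts 
significant digits, which may not be how Jena counts them.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 24  # assumption: borrowed from the 24-digit figure above
    third = Decimal(1) / Decimal(3)  # inexact, so it is rounded to 24 digits
    back = third * 3                 # does not recover exactly 1

print(third)
print(back == Decimal(1))
```

The quotient is stored with 24 threes after the decimal point, and 
multiplying it back by 3 gives a value just below 1.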

> 
> 
>> On 18 Aug 2020, at 05:48, Dr. Chavdar Ivanov <[email protected]> wrote:
>>
>> Hello
>>
>>
>>
>> I posted the message below to the TopBraid users mailing list and 
>> already clarified that as sh:equals is based on RDF node equality, 
>> values such as "1.0"^^xsd:float and "1"^^xsd:float count as distinct. 
>> So I am keeping this for the interest of others in the list

SPARQL has both comparisons.

The "sameTerm()" operator is for RDF term equality, and SPARQL "=" is for 
value comparison (by op:numeric-equal).
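
The distinction can be sketched in Python as follows. This is only an analogy 
and not how a SPARQL engine is implemented: Literal, same_term and value_equal 
are hypothetical names, and value_equal parses with Python's double precision 
rather than single precision.

```python
from typing import NamedTuple

XSD_FLOAT = "http://www.w3.org/2001/XMLSchema#float"


class Literal(NamedTuple):
    """A typed literal: lexical form plus datatype IRI (hypothetical model)."""
    lexical: str
    datatype: str


def same_term(a: Literal, b: Literal) -> bool:
    # Term equality: lexical form and datatype must match exactly.
    return a == b


def value_equal(a: Literal, b: Literal) -> bool:
    # Value equality: compare the numbers the lexical forms denote.
    return float(a.lexical) == float(b.lexical)


one_a = Literal("1.0", XSD_FLOAT)
one_b = Literal("1", XSD_FLOAT)
print(same_term(one_a, one_b))    # False: different lexical forms
print(value_equal(one_a, one_b))  # True: same numeric value

plain = Literal("100123456", XSD_FLOAT)
sci = Literal("100.123456E6", XSD_FLOAT)
print(same_term(plain, sci), value_equal(plain, sci))
```

This matches the pattern in the tests: sh:equals behaves like same_term, 
while the SPARQL filter behaves like value_equal.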

     Andy

>>
>>
>>
>> But on SPARQL float comparison I was advised to check on this mailing list 
>> for other opinions.
>>
>> I understand that SPARQL comparison is mathematically based so 1.0 should be 
>> equal to 1. However below in item 2 you will see the numbers I compared and 
>> I am getting confused. Take into account that in the data graph the 2 
>> compared properties are typed literals with datatype float.
>>
>> I wanted to know what the precision is when floats are compared. So I 
>> have 2 questions:
>>
>> *       What is the precision? Is it the 6th decimal, and is it OK to compare 
>> different forms of float, i.e. when one is in scientific notation?
>> *       Why am I getting a wrong comparison result for bigger values such as 
>> 100123456.1 and 100123459, which are reported as the same?
>>
>>
>>
>> Best regards
>>
>> Chavdar
>>
>>
>>
>>
>>
>> ========
>>
>>
>>
>>
>>
>> Dear all,
>>
>>
>>
>> I have a very basic question...
>>
>> I need to compare literals that are floats and tried to use two ways. 
>> 1) using sh:equals to compare 2 properties and 2) using SPARQL where 
>> I filter != different values
>>
>>
>>
>> For the filter I tried using
>>
>> FILTER (xsd:float(?value1) != xsd:float(?value2)).
>>
>> or
>>
>> FILTER (?value1 != ?value2).
>>
>> Both give the same outcome.
>>
>>
>>
>> Below I listed a summary of the tests I did
>>
>>
>>
>> I think sh:equals treats the literals as strings even though they are 
>> floats. It also gives 2 results. I think this is in line with the SHACL 
>> spec, although I did not check whether sh:equals ignores the datatype.
>>
>>
>>
>> However, in some cases the result from SPARQL is kind of strange. It 
>> looks like the precision is 10^-6, but for big numbers, and when the 
>> scientific form of a float is used, we get something different.
>>
>>
>>
>> What rule is followed to decide that two values are different?
>>
>> If I use google calculator
>>
>> 100123456.1-100.123459E+06=-2.90000000596
>>
>>
>>
>> Normally it should be OK to compare different forms of float.
>>
>>
>>
>>
>>
>> 1) using sh:equals in the property shape
>>
>> Value 1 ; Value 2 ; comparison result
>>
>> 1.123456 ; 1.123456 ; same
>>
>> 1.1234560 ; 1.1234561 ; different (sh:equals reports it twice)
>>
>> 31.1234560 ; 31.1234561 ;different (sh:equals reports it twice)
>>
>> 30    ;      30.0000001 ; different (sh:equals reports it twice)
>>
>> 30     ;      30.000001 ; different (sh:equals reports it twice)
>>
>> 100123456.0  ; 100123456.1 ; different (sh:equals reports it twice)
>>
>> 100123456.0  ; 100123456.0 ; same
>>
>> 100123456    ;  100.123456E6 ; different (sh:equals reports it twice)
>>
>> 100123456    ;  100.123456E+06 ; different (sh:equals reports it twice)
>>
>> -0.123456789  ;  -123.456789E-3 ; different (sh:equals reports it 
>> twice)
>>
>> -0.123456789  ;  -123.456789E-03 ; different (sh:equals reports it 
>> twice)
>>
>> 100123456.1    ;  100.123456E+06  ; different (sh:equals reports it twice)
>>
>> 100123456.1     ;   100.123459E+06 ; different (sh:equals reports it twice)
>>
>> 100123456.1     ;  100123459      ; different (sh:equals reports it twice)
>>
>> 100123456.1     ;  100123459.0    ; different (sh:equals reports it twice)
>>
>>
>>
>> 2) using SPARQL (in the property shape)
>>
>> 1.123456 ; 1.123456 ; same
>>
>> 1.1234560 ; 1.1234561 ; different
>>
>> 31.1234560 ; 31.1234561 ;different
>>
>> 30    ;      30.0000001 ; same
>>
>> 30     ;      30.000001 ; different
>>
>> 100123456.0  ; 100123456.1 ; same
>>
>> 100123456.0  ; 100123456.0 ; same
>>
>> 100123456    ;  100.123456E6 ; same
>>
>> 100123456    ;  100.123456E+06 ; same
>>
>> -0.123456789  ;  -123.456789E-3 ; same
>>
>> -0.123456789  ;  -123.456789E-03 ; same
>>
>> 100123456.1    ;  100.123456E+06  ; same
>>
>> 100123456.1     ;   100.123459E+06 ; same
>>
>> 100123456.1     ;  100123459      ; same
>>
>> 100123456.1     ;  100123459.0    ; same
>>
>>
>>
>> Best regards
>>
>> Chavdar
>>
>>
>>
>
