Andy, yes, I agree that xsd:float can lead to some funky behavior here due
to precision. While you are at it, this could also explain why ?y is bound
to the value of ?x in the example below on Blazegraph but is still
"correctly" mapped in Jena. It is simply a bug in Wikidata/Blazegraph that
doesn't throw an error and is not caught on the server side.

PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>

SELECT ?x ?y ?z WHERE{
values ?x { "100123456.01"^^xsd:float }
values ?y { "100123459.01"^^xsd:float }
values ?z { "100123451.01"^^xsd:float}
}

Blazegraph

https://query.wikidata.org/#PREFIX%20xsd%3A%3Chttp%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23%3E%0A%0ASELECT%20%3Fx%20%3Fy%20%3Fz%20WHERE%7B%0Avalues%20%3Fx%20%7B%20%22100123456.01%22%5E%5Exsd%3Afloat%20%7D%0Avalues%20%3Fy%20%7B%20%22100123459.01%22%5E%5Exsd%3Afloat%20%7D%0Avalues%20%3Fz%20%7B%20%22100123451.01%22%5E%5Exsd%3Afloat%7D%0A%7D


x                      y                      z
100123456.01 100123456.01 100123451.01

Jena 3.15

http://www.lotico.com:3030/lotico/sparql?query=PREFIX+xsd%3A%3Chttp%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23%3E%0D%0A%0D%0ASELECT+%3Fx+%3Fy+%3Fz+WHERE%7B%0D%0Avalues+%3Fx+%7B+%22100123456.01%22%5E%5Exsd%3Afloat+%7D%0D%0Avalues+%3Fy+%7B+%22100123459.01%22%5E%5Exsd%3Afloat+%7D%0D%0Avalues+%3Fz+%7B+%22100123451.01%22%5E%5Exsd%3Afloat%7D%0D%0A%7D&output=text


x             y             z
100123456.01  100123459.01  100123451.01
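
For what it's worth, the Blazegraph output is what you would expect if the three literals were kept by their single-precision value rather than by their lexical form. Here is a minimal Python sketch of that rounding (plain Python, not Blazegraph or Jena code; the helper name to_float32 is made up for illustration):

```python
import struct

def to_float32(lexical: str) -> float:
    """Round a decimal lexical form to the nearest IEEE 754 single-precision value."""
    # Parse as a Python float (a double), then squeeze it through 4 bytes.
    return struct.unpack("f", struct.pack("f", float(lexical)))[0]

x = to_float32("100123456.01")
y = to_float32("100123459.01")
z = to_float32("100123451.01")

# Near 1e8 neighbouring single-precision values are 8 apart,
# so ?x and ?y collapse to one value while ?z lands on the next one down.
print(x, y, z)   # 100123456.0 100123456.0 100123448.0
print(x == y)    # True
print(x == z)    # False
```

So ?x and ?y have the same float value but are different RDF terms. Jena keeps the two lexical forms apart; Blazegraph apparently does not.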


On Wed, Aug 19, 2020 at 10:13 AM Andy Seaborne <a...@apache.org> wrote:

>
>
> On 18/08/2020 22:17, Dr. Chavdar Ivanov wrote:
> > Andy, Richard,
> > Thank you for the feedback.
> >
> > In the graph I have the 2 values as xsd:float, so this is how the data
> > arrives.
> >
> > In the SPARQL query I tried to cast the float to decimal by using
> > FILTER (xsd:decimal(?value1)!=xsd:decimal(?value1)).
> >
> > I am not sure if this is the correct way, but I am now seeing a difference
> > in the comparison result
> >
> > 0.1001244561 Is different from 0.1001234590 which is OK
>           ^^ typo?
>
>
> > But these are reported as same 100123456.1     and  100123459.0
>
>
> 100123456.1 is not a floating point number. It has more precision than
> xsd:float can represent.
>
> It's "1.00123456E8"^^xsd:float
>
>
> (Please copy and paste expressions into email.)
>
> xsd:decimal(?value1)
>
> is:
> evaluate ?value1 to get an xsd:float.
>
> which is
>
> "0.1001244561"^^xsd:float
>
> Using Jena's expression evaluator:
>
> qexpr '"0.1001244561"^^xsd:float+0'
>   ==>
> "0.100124456"^^xsd:float
>
> See? Already lost precision.
>
> Then turn it into a decimal.
>
> This is different from:
> xsd:decimal(str(?value1))
>
> which takes the lexical form, not the floating point value, of ?value1.
>
> > If I get the value before the comparison is executed, the xsd:decimal of
> > the two values appears to be the same, 100123456.0, so this is why != does
> > not report the difference.
> > Here the decimal does not seem to help,
>
> Because precision was lost making the decimal.  Start with a decimal.
>
> xsd:decimal("0.1001244561")
>   or "0.1001244561"^^xsd:decimal
>   or 0.1001244561   (in Turtle and SPARQL).
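
To illustrate the point above, here is a rough Python sketch comparing a decimal built from the float value with one built from the lexical form. It is an analogy using Python's decimal module, not Jena's actual cast code, and to_float32 is a made-up helper:

```python
import struct
from decimal import Decimal

def to_float32(lexical: str) -> float:
    """Round a decimal lexical form to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", float(lexical)))[0]

# Like xsd:decimal(?value): the float *value* is converted,
# so the precision is already lost before the decimal exists.
via_value_1 = Decimal(to_float32("100123456.1"))
via_value_2 = Decimal(to_float32("100123459.0"))

# Like xsd:decimal(str(?value)): the *lexical form* is converted,
# so every digit that was written down is kept.
via_lexical_1 = Decimal("100123456.1")
via_lexical_2 = Decimal("100123459.0")

print(via_value_1 == via_value_2)      # True
print(via_lexical_1 == via_lexical_2)  # False
```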
>
> > but I guess this falls in the same category that large absolute values
> > are less precise. So same effect as for xsd:float.
> >
> > Best regards
> > Chavdar
> >
> >
> >
> > -----Original Message-----
> > From: Andy Seaborne <a...@apache.org>
> > Sent: Tuesday, 18 August, 2020 19:07
> > To: users@jena.apache.org
> > Subject: Re: Float comparison
> >
> >
> >
> > On 18/08/2020 10:31, Richard Cyganiak wrote:
> >> The xsd:float datatype represents IEEE 754 single-precision floating
> >> point numbers.
> >>
> >> As with any floating-point datatype, the precision depends on the size
> >> of the number. Numbers close to zero are very precise. Numbers with a large
> >> absolute value (large positive or large negative) are less precise. For the
> >> gory details see for example here:
> >>
> >> https://en.wikipedia.org/wiki/Single-precision_floating-point_format#Precision_limitations_on_decimal_values_in_[1,_16777216]
> >>
> >> There is rarely a good reason to use xsd:float in RDF. xsd:double is
> >> much more precise at a small increase of storage cost (4 more bytes, which
> >> is negligible given the total size of an RDF triple). xsd:decimal provides
> >> arbitrary precision (in theory), but is more expensive in storage and
> >> computation.
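
To put numbers on "less precise", here is a small Python sketch that estimates the gap between neighbouring values for single and double precision. The function name spacing is made up, and the formula assumes normal (not subnormal) binary floating-point numbers:

```python
import math

def spacing(x: float, significand_bits: int) -> float:
    """Gap between neighbouring binary floats around x (normal numbers only)."""
    _, exponent = math.frexp(abs(x))   # abs(x) = m * 2**exponent with 0.5 <= m < 1
    return math.ldexp(1.0, exponent - significand_bits)

FLOAT_BITS = 24    # significand size of xsd:float (IEEE 754 single precision)
DOUBLE_BITS = 53   # significand size of xsd:double (IEEE 754 double precision)

for value in (0.1, 1.0, 30.0, 100123456.0):
    print(value, spacing(value, FLOAT_BITS), spacing(value, DOUBLE_BITS))
```

Around 100123456 the gap is 8 for xsd:float, which is why values that differ by 3 can come out equal, while for xsd:double the gap there is about 1.5e-8.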
> >>
> >> My general view is that if storage size and performance of mathematical
> >> computations are a major concern for the application, RDF is probably not
> >> the best choice; RDF optimises for other concerns. Therefore the best choice
> >> for representing non-integer numbers in RDF is usually xsd:decimal: more
> >> expensive, but no issues with precision.
> >>
> >> Richard
> >
> > xsd:decimal can record any decimal precision but division may lose
> > precision - otherwise "1/3" would need infinite storage.
> >
> > Jena uses 24 digit precision for division for inexact results like 1/3.
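
As an analogy for the division limit, here is a small Python sketch with the decimal module, assuming a limit of 24 significant digits. This is Python's decimal arithmetic, not Jena's, and the rounding rules may differ; the function name divide is made up:

```python
from decimal import Decimal, localcontext

def divide(a: str, b: str, digits: int = 24) -> Decimal:
    """Divide two decimals, keeping at most `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        return Decimal(a) / Decimal(b)

third = divide("1", "3")
print(third)             # 0.333333333333333333333333
print(third * 3 == 1)    # False: the digits that were cut off are gone
print(divide("1", "4"))  # 0.25, exact results are unaffected
```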
> >
> >>
> >>
> >>> On 18 Aug 2020, at 05:48, Dr. Chavdar Ivanov <ch.iva...@outlook.hu> wrote:
> >>>
> >>> Hello
> >>>
> >>>
> >>>
> >>> I posted the message below to the TopBraid users mailing list and
> >>> already clarified that as sh:equals is based on RDF node equality,
> >>> values such as "1.0"^^xsd:float and "1"^^xsd:float count as distinct.
> >>> So I am keeping this for the interest of others in the list
> >
> > SPARQL has both comparisons.
> >
> > The "sameTerm()" operator for RDF term equality, and SPARQL "=" for value
> > comparison (by op:numeric-equal).
> >
> >       Andy
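
The two kinds of comparison can be sketched in Python as follows. This is only a toy model limited to xsd:float; Literal, same_term and value_equal are made-up names, not an RDF library API:

```python
import struct
from typing import NamedTuple

class Literal(NamedTuple):
    """Toy RDF literal: a lexical form plus a datatype name."""
    lexical: str
    datatype: str

def to_float32(lexical: str) -> float:
    """Round a decimal lexical form to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", float(lexical)))[0]

def same_term(a: Literal, b: Literal) -> bool:
    # Term equality: lexical form and datatype must both match exactly.
    return a == b

def value_equal(a: Literal, b: Literal) -> bool:
    # Value equality, sketched for xsd:float only: compare the numbers.
    if a.datatype != "xsd:float" or b.datatype != "xsd:float":
        raise ValueError("this sketch only handles xsd:float")
    return to_float32(a.lexical) == to_float32(b.lexical)

one_a = Literal("1.0", "xsd:float")
one_b = Literal("1", "xsd:float")
print(same_term(one_a, one_b))    # False: different terms
print(value_equal(one_a, one_b))  # True: same value
```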
> >
> >>>
> >>>
> >>>
> >>> But on SPARQL float comparison I was advised to check on this
> >>> mailing list for other opinions.
> >>>
> >>> I understand that SPARQL comparison is mathematically based, so 1.0
> >>> should be equal to 1. However, below in item 2 you will see the numbers I
> >>> compared, and I am getting confused. Take into account that in the data
> >>> graph the 2 compared properties are typed literals with datatype float.
> >>>
> >>> I wanted to know what the precision is when floats are compared. So I
> >>> have 2 questions
> >>>
> >>> *       What is the precision? Is it the 6th decimal, and is it OK to
> >>>         compare different forms of float, i.e. when one is in scientific form?
> >>> *       Why am I getting a wrong comparison result for bigger values
> >>>         such as 100123456.1 and 100123459, which are found to be the same?
> >>>
> >>>
> >>>
> >>> Best regards
> >>>
> >>> Chavdar
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> ========
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Dear all,
> >>>
> >>>
> >>>
> >>> I have a very basic question...
> >>>
> >>> I need to compare literals that are floats and tried to use two ways.
> >>> 1) using sh:equals to compare 2 properties and 2) using SPARQL where
> >>> I filter != different values
> >>>
> >>>
> >>>
> >>> For the filter I tried using
> >>>
> >>> FILTER (xsd:float(?value1)!=xsd:float(?value1)).
> >>>
> >>> or
> >>>
> >>> FILTER (?value1!=?value1).
> >>>
> >>> Both give the same outcome.
> >>>
> >>>
> >>>
> >>> Below I listed a summary of the tests I did
> >>>
> >>>
> >>>
> >>> I think sh:equals treats the literals as strings even though they are
> >>> floats. It also gives 2 results. I think this is in line with the
> >>> SHACL spec, although I didn't check whether sh:equals ignores the datatype.
> >>>
> >>>
> >>>
> >>> However, in some cases the result from SPARQL is kind of strange.
> >>> It looks like the precision is 10^-6, but for the big numbers, and when the
> >>> scientific form of a float number is used, we get something different.
> >>>
> >>>
> >>>
> >>> What rule is followed to decide that two values are different?
> >>>
> >>> If I use google calculator
> >>>
> >>> 100123456.1-100.123459E+06=-2.90000000596
> >>>
> >>>
> >>>
> >>> Normally it should be OK to compare different forms of float.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> 1) using sh:equals in the property shape
> >>>
> >>> Value 1 ; value 2 ; comparison result
> >>>
> >>> 1.123456 ; 1.123456 ; same
> >>>
> >>> 1.1234560 ; 1.1234561 ; different (sh:equals reports it twice)
> >>>
> >>> 31.1234560 ; 31.1234561 ;different (sh:equals reports it twice)
> >>>
> >>> 30    ;      30.0000001 ; different (sh:equals reports it twice)
> >>>
> >>> 30     ;      30.000001 ; different (sh:equals reports it twice)
> >>>
> >>> 100123456.0  ; 100123456.1 ; different (sh:equals reports it twice)
> >>>
> >>> 100123456.0  ; 100123456.0 ; same
> >>>
> >>> 100123456    ;  100.123456E6 ; different (sh:equals reports it twice)
> >>>
> >>> 100123456    ;  100.123456E+06 ; different (sh:equals reports it twice)
> >>>
> >>> -0.123456789  ;  -123.456789E-3 ; different (sh:equals reports it twice)
> >>>
> >>> -0.123456789  ;  -123.456789E-03 ; different (sh:equals reports it twice)
> >>>
> >>> 100123456.1    ;  100.123456E+06  ; different (sh:equals reports it twice)
> >>>
> >>> 100123456.1     ;   100.123459E+06 ; different (sh:equals reports it twice)
> >>>
> >>> 100123456.1     ;  100123459      ; different (sh:equals reports it twice)
> >>>
> >>> 100123456.1     ;  100123459.0    ; different (sh:equals reports it twice)
> >>>
> >>>
> >>>
> >>> 2) using SPARQL (in the property shape)
> >>>
> >>> 1.123456 ; 1.123456 ; same
> >>>
> >>> 1.1234560 ; 1.1234561 ; different
> >>>
> >>> 31.1234560 ; 31.1234561 ;different
> >>>
> >>> 30    ;      30.0000001 ; same
> >>>
> >>> 30     ;      30.000001 ; different
> >>>
> >>> 100123456.0  ; 100123456.1 ; same
> >>>
> >>> 100123456.0  ; 100123456.0 ; same
> >>>
> >>> 100123456    ;  100.123456E6 ; same
> >>>
> >>> 100123456    ;  100.123456E+06 ; same
> >>>
> >>> -0.123456789  ;  -123.456789E-3 ; same
> >>>
> >>> -0.123456789  ;  -123.456789E-03 ; same
> >>>
> >>> 100123456.1    ;  100.123456E+06  ; same
> >>>
> >>> 100123456.1     ;   100.123459E+06 ; same
> >>>
> >>> 100123456.1     ;  100123459      ; same
> >>>
> >>> 100123456.1     ;  100123459.0    ; same
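
The SPARQL results in this second list match what single-precision rounding alone would predict. Here is a minimal Python sketch that replays a few of the pairs, assuming each lexical form is rounded to the nearest IEEE 754 single-precision value before "=" is applied (plain Python, not Jena code; to_float32 is a made-up helper):

```python
import struct

def to_float32(lexical: str) -> float:
    """Round a decimal lexical form to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", float(lexical)))[0]

pairs = [
    ("1.1234560", "1.1234561"),
    ("30", "30.0000001"),
    ("30", "30.000001"),
    ("100123456.0", "100123456.1"),
    ("100123456", "100.123456E6"),
    ("-0.123456789", "-123.456789E-3"),
    ("100123456.1", "100.123459E+06"),
    ("100123456.1", "100123459.0"),
]

# Compare by float value, the way SPARQL "=" compares numeric literals.
results = [
    "same" if to_float32(a) == to_float32(b) else "different"
    for a, b in pairs
]

for (a, b), outcome in zip(pairs, results):
    print(f"{a} ; {b} ; {outcome}")
```

Only the first and third pairs come out as different, which agrees with the list above: the scientific notation makes no difference, and the large values collapse because neighbouring floats near 1e8 are 8 apart.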
> >>>
> >>>
> >>>
> >>> Best regards
> >>>
> >>> Chavdar
> >>>
> >>>
> >>>
> >>
>


-- 


---
Marco Neumann
KONA
