[
https://issues.apache.org/jira/browse/MARMOTTA-603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dietmar Glachs reassigned MARMOTTA-603:
---------------------------------------
Assignee: (was: Dietmar Glachs)
> SPARQL OPTIONAL issues
> ----------------------
>
> Key: MARMOTTA-603
> URL: https://issues.apache.org/jira/browse/MARMOTTA-603
> Project: Marmotta
> Issue Type: Bug
> Components: KiWi Triple Store
> Affects Versions: 3.3.0
> Reporter: Rupert Westenthaler
> Priority: Critical
>
> The SPARQL implementation of the KiWi triple store seems to have issues with
> the evaluation of OPTIONAL segments of SPARQL queries. Test data and test
> queries are provided below.
> h2. Data
> {code}
> <urn:test.org:place.1> rdf:type schema:Place ;
>     schema:geo <urn:test.org:geo.1> ;
>     schema:name "Place 1" .
> <urn:test.org:geo.1> rdf:type schema:GeoCoordinates ;
>     schema:latitude "16"^^xsd:double ;
>     schema:longitude "17"^^xsd:double ;
>     schema:elevation "123"^^xsd:int .
> <urn:test.org:place.2> rdf:type schema:Place ;
>     schema:geo <urn:test.org:geo.2> ;
>     schema:name "Place 2" .
> <urn:test.org:geo.2> rdf:type schema:GeoCoordinates ;
>     schema:latitude "15"^^xsd:double ;
>     schema:longitude "16"^^xsd:double ;
>     schema:elevation "99"^^xsd:int .
> <urn:test.org:place.3> rdf:type schema:Place ;
>     schema:geo <urn:test.org:geo.3> ;
>     schema:name "Place 3" .
> <urn:test.org:geo.3> rdf:type schema:GeoCoordinates ;
>     schema:latitude "15"^^xsd:double ;
>     schema:longitude "17"^^xsd:double .
> <urn:test.org:place.4> rdf:type schema:Place ;
>     schema:geo <urn:test.org:geo.4> ;
>     schema:name "Place 4" .
> <urn:test.org:geo.4> rdf:type schema:GeoCoordinates ;
>     schema:longitude "17"^^xsd:double ;
>     schema:elevation "123"^^xsd:int .
> {code}
> Note that `geo.1` and `geo.2` define latitude, longitude and elevation.
> `geo.3` has no elevation, and `geo.4` is missing the latitude to simulate
> invalid geo coordinate data.
> h2. Test Case 1
> The following query uses an OPTIONAL graph pattern containing
> `schema:latitude` and `schema:longitude`. This assumes a user wants
> lat/long values only for locations that define both.
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
>   ?entity schema:geo ?location
>   OPTIONAL {
>     ?location schema:latitude ?lat .
>     ?location schema:longitude ?long .
>   }
> }
> {code}
> This translates to the following algebra
> {code}
> (base <http://example/base/>
>   (prefix ((schema: <http://schema.org/>)
>            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>     (leftjoin
>       (bgp (triple ?entity schema:geo ?location))
>       (bgp
>         (triple ?location schema:latitude ?lat)
>         (triple ?location schema:longitude ?long)
>       ))))
> {code}
> The expected results are
> {code}
> entity,location,lat,long
> urn:test.org:place.1,urn:test.org:geo.1,16,17
> urn:test.org:place.2,urn:test.org:geo.2,15,16
> urn:test.org:place.3,urn:test.org:geo.3,15,17
> urn:test.org:place.4,urn:test.org:geo.4,,
> {code}
> All four locations are expected in the result set, as the `OPTIONAL` graph
> pattern is translated to a `leftjoin` with `triple ?entity schema:geo
> ?location` on its left-hand side.
> However, for `geo.4` no value is expected for either `?lat` or `?long`, as
> this resource only defines a longitude and therefore does not match
> {code}
> (bgp
>   (triple ?location schema:latitude ?lat)
>   (triple ?location schema:longitude ?long)
> )
> {code}
> Marmotta responds with
> {code}
> entity,location,lat,long
> urn:test.org:place.1,urn:test.org:geo.1,16,17
> urn:test.org:place.2,urn:test.org:geo.2,15,16
> urn:test.org:place.3,urn:test.org:geo.3,15,17
> urn:test.org:place.4,urn:test.org:geo.4,,17
> {code}
> Note that the longitude is returned for the resource `geo.4`, although
> `?long` should be unbound there.
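The left-join evaluation described for Test Case 1 can be sketched in plain Python. This is only an illustration of the algebra quoted above, not Marmotta or KiWi code: `TRIPLES` is the issue's test data with shortened names, and `match_bgp`, `compatible`, `left_join` and `test_case_1` are hypothetical helpers introduced for this sketch.

```python
# Test data from the issue with shortened names (assumption: rdf:type and
# schema:name triples are omitted because the query never touches them).
TRIPLES = [
    ("place.1", "geo", "geo.1"),
    ("geo.1", "latitude", 16.0),
    ("geo.1", "longitude", 17.0),
    ("geo.1", "elevation", 123),
    ("place.2", "geo", "geo.2"),
    ("geo.2", "latitude", 15.0),
    ("geo.2", "longitude", 16.0),
    ("geo.2", "elevation", 99),
    ("place.3", "geo", "geo.3"),
    ("geo.3", "latitude", 15.0),
    ("geo.3", "longitude", 17.0),
    ("place.4", "geo", "geo.4"),
    ("geo.4", "longitude", 17.0),
    ("geo.4", "elevation", 123),
]


def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match_bgp(patterns, triples):
    """Return every solution (dict) that matches ALL triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            for triple in triples:
                candidate = dict(solution)
                for term, value in zip(pattern, triple):
                    if is_var(term):
                        # Bind the variable, or reject a conflicting binding.
                        if candidate.setdefault(term, value) != value:
                            break
                    elif term != value:
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return solutions


def compatible(a, b):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())


def left_join(left, right):
    """Keep every left solution; extend it only where the right side matches."""
    result = []
    for l in left:
        merged = [{**l, **r} for r in right if compatible(l, r)]
        result.extend(merged or [l])
    return result


def test_case_1():
    left = match_bgp([("?entity", "geo", "?location")], TRIPLES)
    right = match_bgp(
        [("?location", "latitude", "?lat"),
         ("?location", "longitude", "?long")],
        TRIPLES,
    )
    return [
        (s["?entity"], s["?location"], s.get("?lat"), s.get("?long"))
        for s in left_join(left, right)
    ]


if __name__ == "__main__":
    for row in test_case_1():
        print(row)
```

In this sketch the row for `place.4` carries `None` for both `?lat` and `?long`, because the two-pattern group is matched as a whole before the left join is applied.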
> h2. Test Case 2
> As a variation we now also include the `schema:elevation` in the OPTIONAL
> graph pattern.
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
>   ?entity schema:geo ?location
>   OPTIONAL {
>     ?location schema:latitude ?lat .
>     ?location schema:longitude ?long .
>     ?location schema:elevation ?alt .
>   }
> }
> {code}
> This query translates to the following algebra
> {code}
> (base <http://example/base/>
>   (prefix ((schema: <http://schema.org/>)
>            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>     (leftjoin
>       (bgp (triple ?entity schema:geo ?location))
>       (bgp
>         (triple ?location schema:latitude ?lat)
>         (triple ?location schema:longitude ?long)
>         (triple ?location schema:elevation ?alt)
>       ))))
> {code}
> The expected result would have 4 result rows where `lat`, `long` and `alt`
> values are only provided for `geo.1` and `geo.2`.
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,16,17,123
> urn:test.org:place.2,urn:test.org:geo.2,15,16,99
> urn:test.org:place.3,urn:test.org:geo.3,,,
> urn:test.org:place.4,urn:test.org:geo.4,,,
> {code}
> With this query Marmotta behaves very strangely, as the results depend on the
> ordering of the triple patterns in the `OPTIONAL` graph pattern. I will not
> include all variations but just provide two examples:
> {code}
> OPTIONAL {
>   ?location schema:latitude ?lat .
>   ?location schema:longitude ?long .
>   ?location schema:elevation ?alt .
> }
> {code}
> gives
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,1.6E1,1.7E1,123
> urn:test.org:place.2,urn:test.org:geo.2,1.5E1,1.6E1,99
> urn:test.org:place.4,urn:test.org:geo.4,,1.7E1,123
> {code}
> while
> {code}
> OPTIONAL {
>   ?location schema:longitude ?long .
>   ?location schema:latitude ?lat .
>   ?location schema:elevation ?alt .
> }
> {code}
> gives
> {code}
> entity,location,long,lat,alt
> urn:test.org:place.1,urn:test.org:geo.1,1.7E1,1.6E1,123
> urn:test.org:place.2,urn:test.org:geo.2,1.6E1,1.5E1,99
> {code}
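> This behavior further indicates that `OPTIONAL` patterns are processed
> incorrectly: the order of triple patterns inside a basic graph pattern
> should not change the result set.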
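The order independence expected in Test Case 2 can be checked with a small Python sketch that evaluates the `leftjoin` for every permutation of the three optional triple patterns. It is an illustration under stated assumptions, not Marmotta code: `TRIPLES` is the issue's data with shortened names, and `match_bgp`, `left_join`, `solutions_for` and `all_orderings_agree` are hypothetical helpers.

```python
from itertools import permutations

# Test data from the issue with shortened names (assumption: only the
# triples used by the query are listed).
TRIPLES = [
    ("place.1", "geo", "geo.1"), ("geo.1", "latitude", 16.0),
    ("geo.1", "longitude", 17.0), ("geo.1", "elevation", 123),
    ("place.2", "geo", "geo.2"), ("geo.2", "latitude", 15.0),
    ("geo.2", "longitude", 16.0), ("geo.2", "elevation", 99),
    ("place.3", "geo", "geo.3"), ("geo.3", "latitude", 15.0),
    ("geo.3", "longitude", 17.0),
    ("place.4", "geo", "geo.4"), ("geo.4", "longitude", 17.0),
    ("geo.4", "elevation", 123),
]

OPTIONAL_PATTERNS = [
    ("?location", "latitude", "?lat"),
    ("?location", "longitude", "?long"),
    ("?location", "elevation", "?alt"),
]


def match_bgp(patterns, triples):
    """Return every solution (dict) that matches ALL triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            for triple in triples:
                candidate = dict(solution)
                for term, value in zip(pattern, triple):
                    if isinstance(term, str) and term.startswith("?"):
                        if candidate.setdefault(term, value) != value:
                            break
                    elif term != value:
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return solutions


def left_join(left, right):
    """Keep every left solution; extend it where a right solution agrees."""
    result = []
    for l in left:
        merged = [
            {**l, **r} for r in right
            if all(l[k] == r[k] for k in l.keys() & r.keys())
        ]
        result.extend(merged or [l])
    return result


def solutions_for(order):
    """Evaluate the query with the optional patterns in the given order."""
    left = match_bgp([("?entity", "geo", "?location")], TRIPLES)
    right = match_bgp(list(order), TRIPLES)
    # A set of frozensets ignores row order and variable order.
    return {frozenset(s.items()) for s in left_join(left, right)}


def all_orderings_agree():
    results = [solutions_for(o) for o in permutations(OPTIONAL_PATTERNS)]
    return all(r == results[0] for r in results), results[0]


if __name__ == "__main__":
    agree, solutions = all_orderings_agree()
    print("all orderings agree:", agree)
    print("rows:", len(solutions))
```

Under these assumptions all six orderings produce the same four rows, with `?lat`, `?long` and `?alt` bound only for `place.1` and `place.2`.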
> h2. Test Case 3
> Modifying the query to
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
>   ?entity schema:geo ?location
>   OPTIONAL {
>     ?location schema:latitude ?lat .
>     ?location schema:longitude ?long .
>   }
>   OPTIONAL {
>     ?location schema:elevation ?alt .
>   }
> }
> {code}
> produces a result similar to _Test Case 1_: there are 4 results, but for
> `geo.4` we again get the unexpected value for `?long`.
> h2. Test Case 4
> This test case assumes that the user requires `lat` and `long` and optionally
> wants the `alt`, but only for resources that have a valid location.
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
>   ?entity schema:geo ?location
>   OPTIONAL {
>     ?location schema:latitude ?lat .
>     ?location schema:longitude ?long .
>     OPTIONAL {
>       ?location schema:elevation ?alt .
>     }
>   }
> }
> {code}
> This translates to the following algebra
> {code}
> (base <http://example/base/>
>   (prefix ((schema: <http://schema.org/>)
>            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>     (leftjoin
>       (bgp (triple ?entity schema:geo ?location))
>       (leftjoin
>         (bgp
>           (triple ?location schema:latitude ?lat)
>           (triple ?location schema:longitude ?long)
>         )
>         (bgp (triple ?location schema:elevation ?alt))))))
> {code}
> So the `lat` and `long` values are combined with the `alt` in an inner
> `leftjoin`. The result then forms the right-hand side of another `leftjoin`
> with the results of `?entity schema:geo ?location`.
> The expected results are therefore as follows
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,16,17,123
> urn:test.org:place.2,urn:test.org:geo.2,15,16,99
> urn:test.org:place.3,urn:test.org:geo.3,15,17,
> urn:test.org:place.4,urn:test.org:geo.4,,,
> {code}
> Marmotta however returns
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,16,17,123
> urn:test.org:place.2,urn:test.org:geo.2,15,16,99
> urn:test.org:place.3,urn:test.org:geo.3,15,17,
> urn:test.org:place.4,urn:test.org:geo.4,,17,123
> {code}
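The nested `leftjoin` of Test Case 4 can be sketched in plain Python as well. This is an illustration of the quoted algebra, not Marmotta code: `TRIPLES` is the issue's data with shortened names, and `match_bgp`, `left_join` and `test_case_4` are hypothetical helpers. Note that in this sketch `place.3` keeps its `lat` and `long` values, since only the elevation is optional inside the inner group, while `place.4` stays completely unbound.

```python
# Test data from the issue with shortened names (assumption: only the
# triples used by the query are listed).
TRIPLES = [
    ("place.1", "geo", "geo.1"), ("geo.1", "latitude", 16.0),
    ("geo.1", "longitude", 17.0), ("geo.1", "elevation", 123),
    ("place.2", "geo", "geo.2"), ("geo.2", "latitude", 15.0),
    ("geo.2", "longitude", 16.0), ("geo.2", "elevation", 99),
    ("place.3", "geo", "geo.3"), ("geo.3", "latitude", 15.0),
    ("geo.3", "longitude", 17.0),
    ("place.4", "geo", "geo.4"), ("geo.4", "longitude", 17.0),
    ("geo.4", "elevation", 123),
]


def match_bgp(patterns, triples):
    """Return every solution (dict) that matches ALL triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            for triple in triples:
                candidate = dict(solution)
                for term, value in zip(pattern, triple):
                    if isinstance(term, str) and term.startswith("?"):
                        if candidate.setdefault(term, value) != value:
                            break
                    elif term != value:
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return solutions


def left_join(left, right):
    """Keep every left solution; extend it where a right solution agrees."""
    result = []
    for l in left:
        merged = [
            {**l, **r} for r in right
            if all(l[k] == r[k] for k in l.keys() & r.keys())
        ]
        result.extend(merged or [l])
    return result


def test_case_4():
    # Inner leftjoin: lat/long are required together, alt is optional.
    inner = left_join(
        match_bgp(
            [("?location", "latitude", "?lat"),
             ("?location", "longitude", "?long")],
            TRIPLES,
        ),
        match_bgp([("?location", "elevation", "?alt")], TRIPLES),
    )
    # Outer leftjoin: every entity is kept, with or without the inner group.
    outer = left_join(
        match_bgp([("?entity", "geo", "?location")], TRIPLES), inner
    )
    return [
        (s["?entity"], s.get("?lat"), s.get("?long"), s.get("?alt"))
        for s in outer
    ]


if __name__ == "__main__":
    for row in test_case_4():
        print(row)
```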
> All test cases show that OPTIONAL query segments are not correctly evaluated
> by the SPARQL implementation of the KiWi triple store.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)