[
https://issues.apache.org/jira/browse/MARMOTTA-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052943#comment-16052943
]
Xavier Sumba commented on MARMOTTA-603:
---------------------------------------
A possibly related issue with a subselect inside OPTIONAL.
h1. Data
h2. Data for test case 1
{code:sparql}
<urn:s1> a <urn:C> .
<urn:s2> a <urn:C> .
<urn:s3> a <urn:C> .
<urn:s4> a <urn:C> .
<urn:s5> a <urn:C> .
<urn:s6> a <urn:C> .
<urn:s7> a <urn:C> .
<urn:s8> a <urn:C> .
<urn:s9> a <urn:C> .
<urn:s10> a <urn:C> .
<urn:s11> a <urn:C> .
<urn:s12> a <urn:C> .
<urn:s1> <urn:p> "01" .
<urn:s2> <urn:p> "02" .
<urn:s3> <urn:p> "03" .
<urn:s4> <urn:p> "04" .
<urn:s5> <urn:p> "05" .
<urn:s6> <urn:p> "06" .
<urn:s7> <urn:p> "07" .
<urn:s8> <urn:p> "08" .
<urn:s9> <urn:p> "09" .
<urn:s10> <urn:p> "10" .
<urn:s11> <urn:p> "11" .
<urn:s12> <urn:p> "12" .
{code}
h2. Data for test case 2:
{code:sparql}
<u:1> <u:r> <u:subject> .
<u:1> <u:v> 1 .
<u:1> <u:x> <u:x1> .
<u:2> <u:r> <u:subject> .
<u:2> <u:v> 2 .
<u:2> <u:x> <u:x2> .
<u:3> <u:r> <u:subject> .
<u:3> <u:v> 3 .
<u:3> <u:x> <u:x3> .
<u:4> <u:r> <u:subject> .
<u:4> <u:v> 4 .
<u:4> <u:x> <u:x4> .
<u:5> <u:r> <u:subject> .
<u:5> <u:v> 5 .
<u:5> <u:x> <u:x5> .
{code}
h1. Tests
h2. Test case 1:
The subselect should return only the labels "01" and "02", but the query
returns unexpected results.
{code:sparql}
SELECT ?s ?label
WHERE {
?s a <urn:C> .
OPTIONAL { {SELECT ?label WHERE {
?s <urn:p> ?label .
} ORDER BY ?label LIMIT 2
}
}
}
ORDER BY ?s
LIMIT 10
{code}
Query translated to SQL
{code:sql}
SELECT S2.V2 AS V2, P1.subject AS V1
FROM triples P1
INNER JOIN nodes AS P1_subject_V1 ON P1.subject = P1_subject_V1.id
LEFT JOIN
(SELECT P1.object AS V2, P1.subject AS V1
FROM triples P1
INNER JOIN nodes AS P1_object_V2 ON P1.object = P1_object_V2.id
WHERE P1.deleted = false
AND P1.predicate = 876129878216290304
ORDER BY P1_object_V2.svalue ASC
) AS S2 ON (P1.subject = S2.V1)
WHERE P1.deleted = false
AND P1.predicate = 876129878635720704
AND P1.object = 876129878069489664
ORDER BY P1_subject_V1.svalue ASC
LIMIT 10
{code}
Expected results
{code:sparql}
[s=urn:s1;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s1;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s10;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s10;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s11;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s11;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s12;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s12;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s2;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s2;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
{code}
Results obtained:
{code:sparql}
[s=urn:s1;label="01"^^xsd:string]
[s=urn:s10;label="10"^^xsd:string]
[s=urn:s11;label="11"^^xsd:string]
[s=urn:s12;label="12"^^xsd:string]
[s=urn:s2;label="02"^^xsd:string]
[s=urn:s3;label="03"^^xsd:string]
[s=urn:s4;label="04"^^xsd:string]
[s=urn:s5;label="05"^^xsd:string]
[s=urn:s6;label="06"^^xsd:string]
[s=urn:s7;label="07"^^xsd:string]
{code}
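The expected results above follow from evaluating the subselect on its own and then left-joining it with the outer pattern. A minimal Python sketch of that reading, not Marmotta code: the helper names are hypothetical, and it assumes that ?s does not escape the subselect (only ?label is projected) and that ORDER BY compares the IRIs as plain strings.

```python
# Sketch of bottom-up evaluation for test case 1 (illustrative only).
subjects = [f"urn:s{i}" for i in range(1, 13)]
# (subject, label) pairs for <urn:p>, mirroring the test data.
labels = [(f"urn:s{i}", f"{i:02d}") for i in range(1, 13)]


def subselect():
    """SELECT ?label WHERE { ?s <urn:p> ?label } ORDER BY ?label LIMIT 2."""
    ordered = sorted(label for _, label in labels)
    return [{"label": label} for label in ordered[:2]]


def left_join(left, right):
    """Keep every left row; extend it with each compatible right row."""
    out = []
    for l in left:
        matches = [
            {**l, **r}
            for r in right
            if all(l[k] == r[k] for k in l.keys() & r.keys())
        ]
        out.extend(matches if matches else [dict(l)])
    return out


outer = [{"s": s} for s in subjects]      # ?s a <urn:C>
rows = left_join(outer, subselect())      # no shared variables: every pair joins
rows.sort(key=lambda r: r["s"])           # ORDER BY ?s (stable sort)
result = rows[:10]                        # LIMIT 10

for r in result:
    print(r["s"], r["label"])
```

Under these assumptions every subject is paired with both "01" and "02", which is the expected listing rather than the one-label-per-subject output obtained.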
h2. Test Case 2:
Error in the translation of the query.
{code:sparql}
select ?x {
{ select ?v { ?v <u:r> <u:subject> filter (?v = <u:1>) } }.
optional { select ?val { ?v <u:v> ?val .} }
?v <u:x> ?x
}
{code}
Query translated to SQL
{code:sql}
SELECT NULL AS V1, NULL AS V3, P2.object AS V2
FROM triples P2
CROSS JOIN
(SELECT P1.subject AS V1
FROM triples P1
INNER JOIN nodes AS P1_subject_V1 ON P1.subject = P1_subject_V1.id
WHERE P1.deleted = false
AND P1.predicate = 876144626856243200
AND P1.object = 876144626914963456
AND P1.subject = 876144626751385600
) AS S1
LEFT JOIN
(SELECT NULL AS V1, P1.object AS V2
FROM triples P1
WHERE P1.deleted = false
AND P1.predicate = 876144627028209664
) AS S3
WHERE P2.deleted = false
AND P2.predicate = 876144627326005248
AND P2.subject = S1.V1
{code}
Expected result:
{code:sparql}
[x=u:x1]
[x=u:x1]
[x=u:x1]
[x=u:x1]
[x=u:x1]
{code}
Results obtained
{code:java}
ERROR: syntax error at or near "WHERE"
LINE 20: WHERE P2.deleted = false
{code}
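For reference, the expected result of test case 2 can be reproduced with a small Python sketch of the algebra (a join of a left join with the last triple pattern). This is an illustration under the assumption that each subselect only exposes its projected variable; the function names are hypothetical and unrelated to the KiWi SQL builder.

```python
# (subject, predicate, object) triples from "Data for test case 2".
triples = []
for i in range(1, 6):
    triples += [
        (f"u:{i}", "u:r", "u:subject"),
        (f"u:{i}", "u:v", i),
        (f"u:{i}", "u:x", f"u:x{i}"),
    ]


def compatible(a, b):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())


def join(left, right):
    return [{**l, **r} for l in left for r in right if compatible(l, r)]


def left_join(left, right):
    out = []
    for l in left:
        matches = [{**l, **r} for r in right if compatible(l, r)]
        out.extend(matches if matches else [dict(l)])
    return out


# { select ?v { ?v <u:r> <u:subject> filter (?v = <u:1>) } }
s1 = [{"v": s} for s, p, o in triples
      if p == "u:r" and o == "u:subject" and s == "u:1"]
# optional { select ?val { ?v <u:v> ?val } }  (only ?val is projected)
s3 = [{"val": o} for s, p, o in triples if p == "u:v"]
# ?v <u:x> ?x
bgp = [{"v": s, "x": o} for s, p, o in triples if p == "u:x"]

result = [row["x"] for row in join(left_join(s1, s3), bgp)]
print(result)
```

Since the optional subselect shares no variable with the first one, the left join yields five rows for u:1, and each of them joins with the single u:x1 triple.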
This was found in a new test case while migrating Marmotta to Sesame 2.8.11
[1]. The error persists in the master and develop branches. For now, the test
cases are ignored in MARMOTTA-659 [2].
[1] https://issues.apache.org/jira/browse/MARMOTTA-659
[2] https://github.com/gmora1223/marmotta/commit/e0c9879c93471a475fd0366386e31667be148e6b
> SPARQL OPTIONAL issues
> ----------------------
>
> Key: MARMOTTA-603
> URL: https://issues.apache.org/jira/browse/MARMOTTA-603
> Project: Marmotta
> Issue Type: Bug
> Components: KiWi Triple Store
> Affects Versions: 3.3.0
> Reporter: Rupert Westenthaler
> Priority: Critical
>
> The SPARQL implementation of the KiWi triple store seems to have issues with
> the evaluation of OPTIONAL segments of SPARQL queries. Test data and test
> queries are provided below.
> h2. Data
> {code}
> <urn:test.org:place.1> rdf:type schema:Palce ;
> schema:geo <urn:test.org:geo.1> ;
> schema:name "Place 1" .
> <urn:test.org:geo.1> rdf:type schema:GeoCoordinates ;
> schema:latitude "16"^^xsd:double ;
> schema:longitude "17"^^xsd:double ;
> schema:elevation "123"^^xsd:int .
> <urn:test.org:place.2> rdf:type schema:Palce ;
> schema:geo <urn:test.org:geo.2> ;
> schema:name "Place 2" .
> <urn:test.org:geo.2> rdf:type schema:GeoCoordinates ;
> schema:latitude "15"^^xsd:double ;
> schema:longitude "16"^^xsd:double ;
> schema:elevation "99"^^xsd:int .
> <urn:test.org:place.3> rdf:type schema:Palce ;
> schema:geo <urn:test.org:geo.3> ;
> schema:name "Place 3" .
> <urn:test.org:geo.3> rdf:type schema:GeoCoordinates ;
> schema:latitude "15"^^xsd:double ;
> schema:longitude "17"^^xsd:double .
> <urn:test.org:place.4> rdf:type schema:Palce ;
> schema:geo <urn:test.org:geo.4> ;
> schema:name "Place 4" .
> <urn:test.org:geo.4> rdf:type schema:GeoCoordinates ;
> schema:longitude "17"^^xsd:double ;
> schema:elevation "123"^^xsd:int .
> {code}
> Note that `geo.1` and `geo.2` define all of latitude, longitude and
> elevation. `geo.3` has no elevation, and `geo.4` is missing the latitude to
> simulate invalid geo coordinate data.
> h2. Test Case 1
> The following query uses an OPTIONAL graph pattern that includes
> `schema:latitude` and `schema:longitude`. This assumes a user just wants
> lat/long values of locations that define both.
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
> ?entity schema:geo ?location
> OPTIONAL {
> ?location schema:latitude ?lat .
> ?location schema:longitude ?long .
> }
> }
> {code}
> translates to the algebra
> {code}
> (base <http://example/base/>
> (prefix ((schema: <http://schema.org/>)
> (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
> (leftjoin
> (bgp (triple ?entity schema:geo ?location))
> (bgp
> (triple ?location schema:latitude ?lat)
> (triple ?location schema:longitude ?long)
> ))))
> {code}
> The expected results are
> {code}
> entity,location,lat,long
> urn:test.org:place.1,urn:test.org:geo.1,16,17
> urn:test.org:place.2,urn:test.org:geo.2,15,16
> urn:test.org:place.3,urn:test.org:geo.3,15,17
> urn:test.org:place.4,urn:test.org:geo.4,,
> {code}
> All four locations are expected in the result set as the `OPTIONAL` graph
> pattern is translated to a `leftjoin` with `triple ?entity schema:geo
> ?location`.
> However, for `geo.4` no value is expected for `?lat` and `?long`, as this
> resource only defines a longitude and therefore does not match
> {code}
> (bgp
> (triple ?location schema:latitude ?lat)
> (triple ?location schema:longitude ?long)
> )
> {code}
> Marmotta responds with
> {code}
> entity,location,lat,long
> urn:test.org:place.1,urn:test.org:geo.1,16,17
> urn:test.org:place.2,urn:test.org:geo.2,15,16
> urn:test.org:place.3,urn:test.org:geo.3,15,17
> urn:test.org:place.4,urn:test.org:geo.4,,17
> {code}
> Note that the longitude is returned for the resource `geo.4`, although the
> OPTIONAL pattern as a whole does not match it.
> h2. Test Case 2
> As a variation we now also include the `schema:elevation` in the OPTIONAL
> graph pattern.
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
> ?entity schema:geo ?location
> OPTIONAL {
> ?location schema:latitude ?lat .
> ?location schema:longitude ?long .
> ?location schema:elevation ?alt .
> }
> }
> {code}
> This query translates to the following algebra
> {code}
> (base <http://example/base/>
> (prefix ((schema: <http://schema.org/>)
> (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
> (leftjoin
> (bgp (triple ?entity schema:geo ?location))
> (bgp
> (triple ?location schema:latitude ?lat)
> (triple ?location schema:longitude ?long)
> (triple ?location schema:elevation ?alt)
> ))))
> {code}
> The expected result would have 4 result rows where `lat`, `long` and `alt`
> values are only provided for `geo.1` and `geo.2`.
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,16,17,123
> urn:test.org:place.2,urn:test.org:geo.2,15,16,99
> urn:test.org:place.3,urn:test.org:geo.3,,,
> urn:test.org:place.4,urn:test.org:geo.4,,,
> {code}
> With this query Marmotta behaves very strangely, as the results depend on the
> ordering of the triple patterns in the `OPTIONAL` graph pattern. I will not
> include all variations but just provide two examples:
> {code}
> OPTIONAL {
> ?location schema:latitude ?lat .
> ?location schema:longitude ?long .
> ?location schema:elevation ?alt .
> }
> {code}
> gives
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,1.6E1,1.7E1,123
> urn:test.org:place.2,urn:test.org:geo.2,1.5E1,1.6E1,99
> urn:test.org:place.4,urn:test.org:geo.4,,1.7E1,123
> {code}
> while
> {code}
> OPTIONAL {
> ?location schema:longitude ?long .
> ?location schema:latitude ?lat .
> ?location schema:elevation ?alt .
> }
> {code}
> gives
> {code}
> entity,location,long,lat,alt
> urn:test.org:place.1,urn:test.org:geo.1,1.7E1,1.6E1,123
> urn:test.org:place.2,urn:test.org:geo.2,1.6E1,1.5E1,99
> {code}
> This behavior further indicates that `OPTIONAL` patterns are processed
> incorrectly.
> h2. Test Case 3
> Modifying the query to
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
> ?entity schema:geo ?location
> OPTIONAL {
> ?location schema:latitude ?lat .
> ?location schema:longitude ?long .
> }
> OPTIONAL {
> ?location schema:elevation ?alt .
> }
> }
> {code}
> yields a result similar to _Test Case 1_: there are 4 results, but for
> `geo.4` we get the unexpected value for `?long`.
> h2. Test Case 4
> This test case assumes that the user requires `lat` and `long` and optionally
> wants the `alt` but only for resources that do have a valid location.
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
> ?entity schema:geo ?location
> OPTIONAL {
> ?location schema:latitude ?lat .
> ?location schema:longitude ?long .
> OPTIONAL {
> ?location schema:elevation ?alt .
> }
> }
> }
> {code}
> This translates to the following algebra
> {code}
> (base <http://example/base/>
> (prefix ((schema: <http://schema.org/>)
> (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
> (leftjoin
> (bgp (triple ?entity schema:geo ?location))
> (leftjoin
> (bgp
> (triple ?location schema:latitude ?lat)
> (triple ?location schema:longitude ?long)
> )
> (bgp (triple ?location schema:elevation ?alt))))))
> {code}
> So the `lat` and `long` values are combined with the `alt` in a `leftjoin`.
> Then the result is combined in another `leftjoin` with the results of
> `?entity schema:geo ?location`. The expected results are therefore as follows
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,16,17,123
> urn:test.org:place.2,urn:test.org:geo.2,15,16,99
> urn:test.org:place.3,urn:test.org:geo.3,15,17,
> urn:test.org:place.4,urn:test.org:geo.4,,,
> {code}
> Marmotta however returns
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,16,17,123
> urn:test.org:place.2,urn:test.org:geo.2,15,16,99
> urn:test.org:place.3,urn:test.org:geo.3,15,17,
> urn:test.org:place.4,urn:test.org:geo.4,,17,123
> {code}
> All test cases show that OPTIONAL query segments are not correctly evaluated
> by the SPARQL implementation of the KiWi triple store.
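The all-or-nothing behaviour that the issue description expects from a single OPTIONAL block (test cases 1 and 2 of the original report) can be sketched in Python as follows. This is a simplified illustration, not the KiWi implementation: the dictionary layout and function names are hypothetical, and the optional pattern is evaluated per location instead of through a general join.

```python
# Geo data from the issue description (property -> value per location).
GEO = {
    "geo.1": {"latitude": 16, "longitude": 17, "elevation": 123},
    "geo.2": {"latitude": 15, "longitude": 16, "elevation": 99},
    "geo.3": {"latitude": 15, "longitude": 17},
    "geo.4": {"longitude": 17, "elevation": 123},
}
PLACES = [(f"place.{i}", f"geo.{i}") for i in range(1, 5)]


def match_all(location, props):
    """Match the optional BGP: all properties bind together, or none do."""
    values = GEO[location]
    if all(p in values for p in props):
        return {p: values[p] for p in props}
    return None


def query(optional_props):
    """?entity schema:geo ?location OPTIONAL { all optional_props }."""
    rows = []
    for entity, location in PLACES:
        row = {"entity": entity, "location": location}
        extra = match_all(location, optional_props)
        if extra is not None:
            row.update(extra)   # extend the row only on a full match
        rows.append(row)        # the left side is always kept
    return rows


case1 = query(["latitude", "longitude"])
case2 = query(["latitude", "longitude", "elevation"])

for row in case1 + case2:
    print(row)
```

In this sketch every place keeps its row, `geo.4` never receives a partial binding such as a lone longitude, and reordering the optional properties does not change the rows.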