Hi, I have been experimenting with federated SPARQL queries and I ran into a possible bug.
I am using Fuseki 1.0.0. I have two SPARQL endpoints. The first endpoint contains data about books (modified from the book example in the Fuseki Data directory).

Endpoint: http://localhost:3030/books/query

@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix : <http://example.org/book/> .
@prefix ns: <http://example.org/ns#> .
@prefix vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> .

:book5 dc:creator "J.K. Rowling" ;
       dc:title "Harry Potter and the Order of the Phoenix" .

:book3 dc:creator _:b0 ;
       dc:title "Harry Potter and the Prisoner Of Azkaban" .

:book8 dc:creator <http://www-sop.inria.fr/members/Alice> ;
       dc:title "Distributed Query Processing for Linked Data" .

:book1 dc:creator "J.K. Rowling" ;
       dc:title "Harry Potter and the Philosopher's Stone" .

:book6 dc:creator "J.K. Rowling" ;
       dc:title "Harry Potter and the Half-Blood Prince" .

:book4 dc:title "Harry Potter and the Goblet of Fire" .

_:b0 vcard:FN "J.K. Rowling" ;
     vcard:N [ vcard:Family "Rowling" ;
               vcard:Given "Joanna" ] .

:book2 dc:creator _:b0 ;
       dc:title "Harry Potter and the Chamber of Secrets" .

:book7 dc:creator "J.K. Rowling" ;
       dc:title "Harry Potter and the Deathly Hallows" .

The second endpoint contains data about people:

Endpoint: http://localhost:3031/persons/query

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix cert: <http://www.w3.org/ns/auth/cert#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://www-sop.inria.fr/members/Charlie>
    a foaf:Person ;
    foaf:knows <http://www-sop.inria.fr/members/Alice> , <http://www-sop.inria.fr/members/Bob> ;
    foaf:name "Charlie" .

<http://www-sop.inria.fr/members/Alice>
    a foaf:Person ;
    foaf:knows <http://www-sop.inria.fr/members/Charlie> , <http://www-sop.inria.fr/members/Bob> ;
    foaf:name "Alice" .

<http://www-sop.inria.fr/members/Bob>
    a foaf:Person ;
    foaf:knows <http://www-sop.inria.fr/members/Charlie> , <http://www-sop.inria.fr/members/Alice> ;
    foaf:name "Bob" .

I am trying to run a federated SPARQL query with the SERVICE keyword to find the books whose authors' names are given in the second endpoint. The query looks like this:

PREFIX : <http://example/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT * where {
  ?book dc:title ?title .
  ?book dc:creator ?author .
  SERVICE <http://localhost:3031/persons/query> { ?author foaf:name ?name }
}

This query should return only one result:

<http://example.org/book/book8> "Distributed Query Processing for Linked Data" <http://www-sop.inria.fr/members/Alice> "Alice"

Instead, it incorrectly returns:

------------------------------------------------------------------------------------------------------------------------------------------
| book                            | title                                          | author                                  | name      |
==========================================================================================================================================
| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Bob"     |
| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Alice"   |
| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Charlie" |
| <http://example.org/book/book8> | "Distributed Query Processing for Linked Data" | <http://www-sop.inria.fr/members/Alice> | "Alice"   |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Bob"     |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Alice"   |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Charlie" |
------------------------------------------------------------------------------------------------------------------------------------------

This problem occurs
because a subquery is sent to the second endpoint for each variable binding from the first endpoint, resulting in the following subqueries (taken from the Fuseki log):

SELECT * WHERE { "J.K. Rowling" <http://xmlns.com/foaf/0.1/name> ?name }
SELECT * WHERE { _:b0 <http://xmlns.com/foaf/0.1/name> ?name }
SELECT * WHERE { "J.K. Rowling" <http://xmlns.com/foaf/0.1/name> ?name }
SELECT * WHERE { "J.K. Rowling" <http://xmlns.com/foaf/0.1/name> ?name }
SELECT * WHERE { <http://www-sop.inria.fr/members/Alice> <http://xmlns.com/foaf/0.1/name> ?name }
SELECT * WHERE { _:b0 <http://xmlns.com/foaf/0.1/name> ?name }
SELECT * WHERE { "J.K. Rowling" <http://xmlns.com/foaf/0.1/name> ?name }

The problem is caused by these two subqueries:

SELECT * WHERE { _:b0 <http://xmlns.com/foaf/0.1/name> ?name }
SELECT * WHERE { _:b0 <http://xmlns.com/foaf/0.1/name> ?name }

As blank nodes in a query pattern are treated like variables, this subquery returns Bob, Alice, and Charlie from the second endpoint:

-------------
| name      |
=============
| "Bob"     |
| "Alice"   |
| "Charlie" |
-------------

This is then joined with the first part of the query and produces the following incorrect rows:

| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Bob"     |
| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Alice"   |
| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Charlie" |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Bob"     |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Alice"   |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Charlie" |

I think the bound blank nodes from the first endpoint should not have been sent to the second endpoint in the subqueries. The scope of blank nodes is local to each endpoint, and this should be taken into account here as well.

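To make the mechanism concrete, here is a small Python sketch of what I believe is happening. It is a toy model and not Jena's actual code: the data is cut down to four bindings, and the names remote_names, substitution_join, and skip_blank_nodes are made up for illustration.

```python
# Toy model of the substitution join described above.
# NOT Jena code; all function and variable names are hypothetical.

# foaf:name triples held by the second (persons) endpoint.
PERSONS = [
    ("<http://www-sop.inria.fr/members/Alice>", "foaf:name", '"Alice"'),
    ("<http://www-sop.inria.fr/members/Bob>", "foaf:name", '"Bob"'),
    ("<http://www-sop.inria.fr/members/Charlie>", "foaf:name", '"Charlie"'),
]

# A reduced set of (book, author) bindings from the first (books) endpoint.
LOCAL_BINDINGS = [
    (":book1", '"J.K. Rowling"'),
    (":book2", "_:b0"),
    (":book3", "_:b0"),
    (":book8", "<http://www-sop.inria.fr/members/Alice>"),
]


def is_wildcard(term):
    # In a query pattern, a blank node label matches any term,
    # just like a ?variable does.
    return term.startswith("?") or term.startswith("_:")


def remote_names(subject):
    """Evaluate { subject foaf:name ?name } against the persons data."""
    return [
        obj
        for subj, pred, obj in PERSONS
        if pred == "foaf:name" and (is_wildcard(subject) or subj == subject)
    ]


def substitution_join(bindings, skip_blank_nodes=False):
    """Send one subquery per local binding and join on the author term."""
    rows = []
    for book, author in bindings:
        if skip_blank_nodes and author.startswith("_:"):
            # Assumed fix: a blank node from the first endpoint cannot
            # denote any node in the second endpoint, so send nothing.
            continue
        for name in remote_names(author):
            rows.append((book, author, name))
    return rows
```

With this toy data, substitution_join(LOCAL_BINDINGS) yields seven rows, six of them for _:b0, while substitution_join(LOCAL_BINDINGS, skip_blank_nodes=True) leaves only the book8/Alice row, which is the result I expected.
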
Finally, if I change the order of the SERVICE part in the query, it does return the correct result.

PREFIX : <http://example/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT * where {
  SERVICE <http://localhost:3031/persons/query> { ?author foaf:name ?name }
  ?book dc:title ?title .
  ?book dc:creator ?author .
}

----------------------------------------------------------------------------------------------------------------------------------------
| author                                  | name    | book                            | title                                          |
========================================================================================================================================
| <http://www-sop.inria.fr/members/Alice> | "Alice" | <http://example.org/book/book8> | "Distributed Query Processing for Linked Data" |
----------------------------------------------------------------------------------------------------------------------------------------

I was wondering if it is a bug or if I am missing something here.

Best regards,
Rakebul Hasan
