Hi,
I have been experimenting with federated SPARQL queries and I ran into a 
possible bug.

I am using Fuseki 1.0.0. I have two SPARQL endpoints.

The first endpoint contains data about books (modified from the books example 
in the Fuseki Data directory). Endpoint: http://localhost:3030/books/query

@prefix dc:    <http://purl.org/dc/elements/1.1/> .
@prefix :      <http://example.org/book/> .
@prefix ns:    <http://example.org/ns#> .
@prefix vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> .

:book5  dc:creator  "J.K. Rowling" ;
        dc:title    "Harry Potter and the Order of the Phoenix" .

:book3  dc:creator  _:b0 ;
        dc:title    "Harry Potter and the Prisoner Of Azkaban" .

:book8  dc:creator  <http://www-sop.inria.fr/members/Alice> ;
        dc:title    "Distributed Query Processing for Linked Data" .

:book1  dc:creator  "J.K. Rowling" ;
        dc:title    "Harry Potter and the Philosopher's Stone" .

:book6  dc:creator  "J.K. Rowling" ;
        dc:title    "Harry Potter and the Half-Blood Prince" .

:book4  dc:title  "Harry Potter and the Goblet of Fire" .

_:b0    vcard:FN  "J.K. Rowling" ;
        vcard:N   [ vcard:Family  "Rowling" ;
                    vcard:Given   "Joanna"
                  ] .

:book2  dc:creator  _:b0 ;
        dc:title    "Harry Potter and the Chamber of Secrets" .

:book7  dc:creator  "J.K. Rowling" ;
        dc:title    "Harry Potter and the Deathly Hallows" .


The second endpoint contains data about people. Endpoint: 
http://localhost:3031/persons/query

@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix cert:  <http://www.w3.org/ns/auth/cert#> .
@prefix foaf:  <http://xmlns.com/foaf/0.1/> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://www-sop.inria.fr/members/Charlie>
        a           foaf:Person ;
        foaf:knows  <http://www-sop.inria.fr/members/Alice> ,
                    <http://www-sop.inria.fr/members/Bob> ;
        foaf:name   "Charlie" .

<http://www-sop.inria.fr/members/Alice>
        a           foaf:Person ;
        foaf:knows  <http://www-sop.inria.fr/members/Charlie> ,
                    <http://www-sop.inria.fr/members/Bob> ;
        foaf:name   "Alice" .

<http://www-sop.inria.fr/members/Bob>
        a           foaf:Person ;
        foaf:knows  <http://www-sop.inria.fr/members/Charlie> ,
                    <http://www-sop.inria.fr/members/Alice> ;
        foaf:name   "Bob" .


I am trying to run a federated SPARQL query, using the SERVICE keyword, to find 
the books whose authors have a name recorded in the second endpoint. The query 
looks like this:

PREFIX : <http://example/>
PREFIX  dc:     <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT * where
{
  ?book dc:title ?title .
  ?book dc:creator ?author.
 
  SERVICE <http://localhost:3031/persons/query>
     { ?author foaf:name ?name  }

}

This query should return only one result:

<http://example.org/book/book8> "Distributed Query Processing for Linked Data" <http://www-sop.inria.fr/members/Alice> "Alice"

Instead, it returns:


------------------------------------------------------------------------------------------------------------------------------------------
| book                            | title                                          | author                                  | name      |
==========================================================================================================================================
| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Bob"     |
| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Alice"   |
| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Charlie" |
| <http://example.org/book/book8> | "Distributed Query Processing for Linked Data" | <http://www-sop.inria.fr/members/Alice> | "Alice"   |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Bob"     |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Alice"   |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Charlie" |
------------------------------------------------------------------------------------------------------------------------------------------

This happens because, for each variable binding returned by the first endpoint, 
a separate subquery with the bound value substituted in is sent to the second 
endpoint. The subqueries sent to the second endpoint are:

SELECT  * WHERE   { "J.K. Rowling" <http://xmlns.com/foaf/0.1/name> ?name }
SELECT  * WHERE   { _:b0 <http://xmlns.com/foaf/0.1/name> ?name }
SELECT  * WHERE   { "J.K. Rowling" <http://xmlns.com/foaf/0.1/name> ?name }
SELECT  * WHERE   { "J.K. Rowling" <http://xmlns.com/foaf/0.1/name> ?name }
SELECT  * WHERE   { <http://www-sop.inria.fr/members/Alice> <http://xmlns.com/foaf/0.1/name> ?name }
SELECT  * WHERE   { _:b0 <http://xmlns.com/foaf/0.1/name> ?name }
SELECT  * WHERE   { "J.K. Rowling" <http://xmlns.com/foaf/0.1/name> ?name }

(taken from the Fuseki log)


The problem is caused by these two subqueries:
SELECT  * WHERE   { _:b0 <http://xmlns.com/foaf/0.1/name> ?name }
SELECT  * WHERE   { _:b0 <http://xmlns.com/foaf/0.1/name> ?name }

Because a blank node in a query pattern is treated like a variable, this 
subquery matches every subject at the second endpoint and returns Bob, Alice, 
and Charlie:

-------------
| name      |
=============
| "Bob"     |
| "Alice"   |
| "Charlie" |
-------------
Then this is joined with the first part of the query and produces the incorrect 
results below:
| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Bob"     |
| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Alice"   |
| <http://example.org/book/book2> | "Harry Potter and the Chamber of Secrets"      | _:b0                                    | "Charlie" |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Bob"     |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Alice"   |
| <http://example.org/book/book3> | "Harry Potter and the Prisoner Of Azkaban"     | _:b0                                    | "Charlie" |

I think the blank nodes bound at the first endpoint should not have been sent 
to the second endpoint in the subqueries. The scope of a blank node is local to 
its own store, and this should be respected in this case as well.
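
To make the reasoning above concrete, here is a small Python sketch. It is only 
a toy model of the behaviour described in this message, not Jena/ARQ code: the 
function names (persons_endpoint, federated_join) and the flag 
substitute_blank_nodes are made up for illustration, and the data is copied 
from the two datasets above. It assumes that a blank node label placed in a 
query pattern acts like a variable, which is what the Fuseki log suggests.

```python
# Toy model of the behaviour described above. This is NOT Jena/ARQ code;
# every name below is hypothetical and exists only for illustration.

ALICE = "<http://www-sop.inria.fr/members/Alice>"
BOB = "<http://www-sop.inria.fr/members/Bob>"
CHARLIE = "<http://www-sop.inria.fr/members/Charlie>"
ROWLING = '"J.K. Rowling"'

# (?book, ?author) bindings from the books endpoint (book4 has no dc:creator).
BOOK_AUTHORS = [
    ("book1", ROWLING), ("book2", "_:b0"), ("book3", "_:b0"),
    ("book5", ROWLING), ("book6", ROWLING), ("book7", ROWLING),
    ("book8", ALICE),
]

# (subject, foaf:name) pairs held by the persons endpoint.
PERSON_NAMES = [(CHARLIE, "Charlie"), (ALICE, "Alice"), (BOB, "Bob")]


def persons_endpoint(subject):
    """Answer { subject foaf:name ?name } against the persons data.

    A term starting with "?" is a variable. A blank node label ("_:...")
    inside a query pattern also behaves like a variable, so both match
    every subject.
    """
    matches_anything = subject.startswith(("?", "_:"))
    return [(s, name) for s, name in PERSON_NAMES
            if matches_anything or s == subject]


def federated_join(substitute_blank_nodes):
    """Join the book bindings with the persons endpoint, one subquery per row."""
    rows = []
    for book, author in BOOK_AUTHORS:
        if author.startswith("_:") and not substitute_blank_nodes:
            sent = "?author"  # keep the variable and join locally afterwards
        else:
            sent = author     # substitute the bound value into the pattern
        for subject, name in persons_endpoint(sent):
            if sent == "?author" and subject != author:
                continue      # local join: remote subject is not our blank node
            rows.append((book, author, name))
    return rows


observed = federated_join(substitute_blank_nodes=True)
expected = federated_join(substitute_blank_nodes=False)
print(len(observed))  # 7 rows, like the incorrect result table
print(len(expected))  # 1 row, the book8 / Alice result
```

With substitution of blank nodes, the model reproduces the seven rows shown 
earlier (three spurious rows each for book2 and book3, plus the book8 row). 
When the blank node is kept out of the subquery and the join is done locally, 
only the book8 row remains.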


Finally, if I change the order of the query so that the SERVICE part comes 
first, it does return the correct result.

PREFIX : <http://example/>
PREFIX  dc:     <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT * where
{
 
  SERVICE <http://localhost:3031/persons/query>
     { ?author foaf:name ?name  }
  ?book dc:title ?title .
  ?book dc:creator ?author.

}


----------------------------------------------------------------------------------------------------------------------------------------
| author                                  | name    | book                            | title                                          |
========================================================================================================================================
| <http://www-sop.inria.fr/members/Alice> | "Alice" | <http://example.org/book/book8> | "Distributed Query Processing for Linked Data" |
----------------------------------------------------------------------------------------------------------------------------------------



I was wondering if it is a bug or if I am missing something here.


Best regards,
Rakebul Hasan
