jena-arq: FROM / FROM NAMED clauses of SPARQL queries over in-memory Dataset are ignored

2015-12-09 Thread Claus Stadler

Hi,

It appears that the FROM / FROM NAMED clauses of SPARQL queries are ignored 
when executed over a Dataset.
In the example below, I would expect the first result set to yield the content 
of the file, whereas I expect the second one to be empty as the specified named 
graph does not exist, yet, I get the exact opposite.

Are there some magic switches or algebra transformations that can be applied 
programmatically for changing the behavior?

Similar queries on Virtuoso on http://dbpedia.org/sparql work to my expectation.

Tested with jena-arq 2.13.0 and 3.0.0
A related issue might be this one, but it appears a fix was only provided for 
Fuseki: https://issues.apache.org/jira/browse/JENA-1004

Cheers,
Claus

Code:
public class TestDatasetQuery {
    @Test
    public void test() throws IOException {
        Dataset ds = DatasetFactory.createMem();
        RDFDataMgr.read(ds, new ClassPathResource("test-person.nq").getInputStream(), Lang.NQUADS);

        String graphName = ds.listNames().next();
        Node s = ds.getNamedModel(graphName).listSubjects().toSet().iterator().next().asNode();
        System.out.println("Got subject: " + s + " in graph " + graphName);

        {
            // Should yield some solutions - but actually doesn't
            QueryExecution qe = QueryExecutionFactory.create(
                "SELECT * FROM <" + graphName + "> { ?s ?p ?o }", ds);
            ResultSet rs = qe.execSelect();
            System.out.println(ResultSetFormatter.asText(rs));
        }

        {
            // Should not return anything, as the named graph does not exist,
            // yet, the original data is returned
            QueryExecution qe = QueryExecutionFactory.create(
                "SELECT * FROM NAMED  { Graph ?g { ?s ?p ?o } }", ds);
            ResultSet rs = qe.execSelect();
            System.out.println(ResultSetFormatter.asText(rs));
        }
    }
}

File: test-person.nq
  "John Doe"^^  .
  "20"^^  .

Output:

Got subject: http://ex.org/JohnDoe in graph http://ex.org/graph/
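-------------
| s | p | o |
=============
-------------

--------------------------
| s | p | o          | g |
==========================
|   |   | "20"^^     |   |
|   |   | "John Doe" |   |
--------------------------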


--
Dipl. Inf. Claus Stadler
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org/
Workpage & WebID: http://aksw.org/ClausStadler
Phone: +49 341 97-32260



Re: jena-arq: FROM / FROM NAMED clauses of SPARQL queries over in-memory Dataset are ignored

2015-12-09 Thread Andy Seaborne

On 09/12/15 10:38, Claus Stadler wrote:

> Hi,
>
> It appears that the FROM / FROM NAMED clauses of SPARQL queries are
> ignored when executed over a Dataset.


For the general dataset, supplying the dataset directly to 
QueryExecutionFactory overrides any query dataset description.



> In the example below, I would expect the first result set to yield the
> content of the file, whereas I expect the second one to be empty as the
> specified named graph does not exist, yet, I get the exact opposite.


It is better to use GRAPH to access the named graph in a dataset.
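
To illustrate the GRAPH approach, here is a minimal sketch, assuming Jena 3.0.0 on the classpath (DatasetFactory.createMem() is the factory used in the original post; other releases may call for a different factory method). The class name, the predicate http://ex.org/name and the graph IRI http://ex.org/missing are made up for the example; the subject and graph IRIs come from the original post.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class GraphKeywordSketch {

    // Hypothetical sample data: a single triple stored in the named graph
    // http://ex.org/graph/. The default graph stays empty.
    public static Dataset sampleDataset() {
        Model m = ModelFactory.createDefaultModel();
        m.add(m.createResource("http://ex.org/JohnDoe"),
              m.createProperty("http://ex.org/name"), // assumed predicate
              "John Doe");
        Dataset ds = DatasetFactory.createMem(); // assumption: Jena 3.0.0
        ds.addNamedModel("http://ex.org/graph/", m);
        return ds;
    }

    // Runs a SELECT query directly over the sample dataset and counts the rows.
    public static int countRows(String sparql) {
        QueryExecution qe = QueryExecutionFactory.create(sparql, sampleDataset());
        try {
            return ResultSetFormatter.consume(qe.execSelect());
        } finally {
            qe.close();
        }
    }

    public static void main(String[] args) {
        // GRAPH names the graph inside the dataset; no FROM clause is involved.
        System.out.println(countRows(
            "SELECT * { GRAPH <http://ex.org/graph/> { ?s ?p ?o } }"));
        // A graph that is not in the dataset simply matches nothing.
        System.out.println(countRows(
            "SELECT * { GRAPH <http://ex.org/missing> { ?s ?p ?o } }"));
    }
}
```

Because GRAPH selects among the graphs the dataset already holds, the result does not depend on how a dataset description is interpreted.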


> Are there some magic switches or algebra transformations that can be
> applied programmatically for changing the behavior?
>
> Similar queries on Virtuoso on http://dbpedia.org/sparql work to my
> expectation.


FWIW as does TDB.
https://jena.apache.org/documentation/tdb/dynamic_datasets.html

SPARQL says "there is a dataset description" - how it is interpreted is 
not fixed.  If you execute a query without a dataset, it takes the query 
and uses the FROM/FROM NAMED clauses to load the graphs from the web.


The issue is what is the pool of graphs from which to resolve URIs.

A URL should go to the graph - it really does not name a graph inside 
a dataset if that graph is a copy of somewhere else.



> Tested with jena-arq 2.13.0 and 3.0.0
> A related issue might be this one, but it appears a fix was only
> provided for Fuseki: https://issues.apache.org/jira/browse/JENA-1004


The commit shows the code that processes the query for this feature.

DynamicDatasets has the necessary code.

Dataset ds = ...
Query query = ...
// Get the description from the query
DatasetDescription desc = DatasetDescription.create(query) ;

// Create a new dataset using the original as the source of graphs.
Dataset ds1 = DynamicDatasets.dynamicDataset(desc, ds, false) ;
... QueryExecutionFactory.create(query, ds1);

but the proper way is to use GRAPH which is the SPARQL feature for 
accessing a graph in a dataset.
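
For completeness, the DynamicDatasets steps above can be put together as a self-contained sketch, assuming Jena 3.0.0 on the classpath. The class name, the predicate http://ex.org/name and the graph IRI http://ex.org/missing are made up for the example, and the null check on the description is a defensive assumption for queries without FROM / FROM NAMED.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.sparql.core.DatasetDescription;
import org.apache.jena.sparql.core.DynamicDatasets;

public class DynamicDatasetSketch {

    // Hypothetical sample data: one triple in the named graph http://ex.org/graph/.
    public static Dataset sampleDataset() {
        Model m = ModelFactory.createDefaultModel();
        m.add(m.createResource("http://ex.org/JohnDoe"),
              m.createProperty("http://ex.org/name"), // assumed predicate
              "John Doe");
        Dataset ds = DatasetFactory.createMem(); // assumption: Jena 3.0.0
        ds.addNamedModel("http://ex.org/graph/", m);
        return ds;
    }

    // Resolves the query's FROM / FROM NAMED clauses against the graphs of the
    // sample dataset, runs the query over the resulting view, and counts rows.
    public static int countRows(String sparql) {
        Dataset source = sampleDataset();
        Query query = QueryFactory.create(sparql);

        // Get the description from the query (assumed null if there is none).
        DatasetDescription desc = DatasetDescription.create(query);

        // Create a new dataset using the original as the source of graphs.
        Dataset view = (desc == null)
            ? source
            : DynamicDatasets.dynamicDataset(desc, source, false);

        QueryExecution qe = QueryExecutionFactory.create(query, view);
        try {
            return ResultSetFormatter.consume(qe.execSelect());
        } finally {
            qe.close();
        }
    }

    public static void main(String[] args) {
        // FROM picks the named graph as the default graph of the view.
        System.out.println(countRows(
            "SELECT * FROM <http://ex.org/graph/> { ?s ?p ?o }"));
        // FROM NAMED with a graph that the source does not contain.
        System.out.println(countRows(
            "SELECT * FROM NAMED <http://ex.org/missing> { GRAPH ?g { ?s ?p ?o } }"));
    }
}
```

The third argument of dynamicDataset is left at false, as in the snippet above, so the default graph of the view consists only of the graphs listed in FROM.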


Andy


