Re: results vary for the same query on same dataset for different engine

Andy Seaborne Wed, 28 Jan 2015 03:14:30 -0800

On 28/01/15 10:49, Qiaser Mehmood wrote:

Thanks Andy. I forgot to mention that I am using Jena to query both Fuseki 
and Sesame; moreover, I loaded the same data into both stores.
So you mean that the difference in results over the same data is due to the 
particular engine, which returns either duplicates (i.e. Sesame) or a set with 
no duplicates (i.e. Fuseki)?
Thanks, Qaiser.


So what does

SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }

return in each case?

And how are you counting results? (listOfPropertiesInDataset is not Jena code.)


        Andy


      On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne wrote:


  On 27/01/15 17:32, Qiaser Mehmood wrote:

What could be the reason for a difference in results (listOfPropertiesInDataset) 
for the same query run on two different engines, e.g. Fuseki and Sesame? I 
dumped the Kegg data into Fuseki and Sesame, and when I run the following query 
the results vary.
PREFIX void: <http://rdfs.org/ns/void#>
CONSTRUCT { <datasetUri> void:propertyPartition ?pUri . ?pUri void:property ?p . }
WHERE { ?s ?p ?o . BIND(IRI(CONCAT(STR(<baseUri>),MD5(STR(?p)))) AS ?pUri) }

In Fuseki it returns 42 and in Sesame it returns 740444.
Best, Qaiser.


I guess there are 42 different predicates in the data.

SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }

Jena returns a model, a set of triples.  Set means no duplicates.

It looks like you are using the form of execution in Sesame that returns
an iterator or stream of triples.  No suppression of duplicates.

In your query:

PREFIX  void: <http://rdfs.org/ns/void#>

CONSTRUCT
   { <http://example/base/datasetUri> void:propertyPartition ?pUri .
     ?pUri void:property ?p .}
WHERE
   { ?s ?p ?o
     BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p)))) AS ?pUri)
   }

Your query produces massive duplicates - it projects out ?s and ?o, so every
triple with the same ?p generates the same output triples again.

Many ?s ?p ?o, few distinct ?p
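
To make the set-versus-stream point concrete, here is a minimal sketch in plain Python. It does not use Jena or Sesame; it only imitates how a CONSTRUCT template is instantiated once per solution. The toy triples and the ex: names are hypothetical, and the base and dataset IRIs are the placeholders from the query above.

```python
import hashlib

BASE = "http://example/base/baseUri"        # placeholder IRI from the query
DATASET = "http://example/base/datasetUri"  # placeholder IRI from the query
VOID = "http://rdfs.org/ns/void#"


def construct(solutions):
    """Instantiate the two-triple CONSTRUCT template once per ?s ?p ?o solution."""
    for _s, p, _o in solutions:
        # Same idea as BIND(iri(concat(str(<base>), MD5(str(?p)))) AS ?pUri)
        p_uri = BASE + hashlib.md5(p.encode("utf-8")).hexdigest()
        yield (DATASET, VOID + "propertyPartition", p_uri)
        yield (p_uri, VOID + "property", p)


# Hypothetical toy data: 6 triples, but only 2 distinct predicates.
data = [
    ("ex:s1", "ex:name", "a"),
    ("ex:s2", "ex:name", "b"),
    ("ex:s3", "ex:name", "c"),
    ("ex:s1", "ex:type", "x"),
    ("ex:s2", "ex:type", "y"),
    ("ex:s3", "ex:type", "z"),
]

as_stream = list(construct(data))  # stream of triples: duplicates are kept
as_model = set(construct(data))    # a model is a set: duplicates collapse

print(len(as_stream))  # 12 = 2 template triples x 6 solutions
print(len(as_model))   # 4  = 2 template triples x 2 distinct predicates
```

Counting the stream counts solutions; counting the set counts distinct output triples. The same query can therefore report very different sizes depending on which form the client consumes.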

Try this:

WHERE
   { SELECT DISTINCT ?p ?pUri {
     ?s ?p ?o
     BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p)))) AS ?pUri)
     }
   }
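
As a rough illustration of why the DISTINCT sub-select helps, the sketch below (plain Python, not Jena or Sesame code) removes duplicate (?p, ?pUri) rows before the template is instantiated. The function names and toy data are hypothetical; the IRIs are the placeholders from the query.

```python
import hashlib

BASE = "http://example/base/baseUri"        # placeholder IRI from the query
DATASET = "http://example/base/datasetUri"  # placeholder IRI from the query


def distinct_rows(triples):
    """Imitate SELECT DISTINCT ?p ?pUri: one row per distinct predicate."""
    seen = set()
    for _s, p, _o in triples:
        if p not in seen:
            seen.add(p)
            yield p, BASE + hashlib.md5(p.encode("utf-8")).hexdigest()


def construct(rows):
    """Instantiate the CONSTRUCT template once per (?p, ?pUri) row."""
    for p, p_uri in rows:
        yield (DATASET, "void:propertyPartition", p_uri)
        yield (p_uri, "void:property", p)


# Hypothetical toy data: 5 triples, 2 distinct predicates.
data = [
    ("ex:s1", "ex:name", "a"),
    ("ex:s2", "ex:name", "b"),
    ("ex:s1", "ex:type", "x"),
    ("ex:s2", "ex:type", "y"),
    ("ex:s3", "ex:type", "z"),
]

stream = list(construct(distinct_rows(data)))

# With distinct rows, even a duplicate-preserving stream has no repeats.
print(len(stream) == len(set(stream)))
```

With the duplicates removed inside the query, an engine that streams triples and an engine that builds a set should produce the same number of triples.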


     Andy
