Thanks Andy. I forgot to mention that I am using Jena to query both Fuseki
and Sesame, and I loaded the same data into both stores.
So you mean that the difference in results over the same data is due to the
particular engine: one returns duplicates (i.e. Sesame) and the other returns
a set with no duplicates (i.e. Fuseki)?
Thanks,
Qaiser.
On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne <[email protected]>
wrote:
On 27/01/15 17:32, Qiaser Mehmood wrote:
> What could be the reason for the difference in results
> (listOfPropertiesInDataset) when the same query runs on two different
> engines, e.g. Fuseki and Sesame? I loaded the Kegg data into Fuseki and
> Sesame, and when I run the following query the results vary.
> PREFIX void: <http://rdfs.org/ns/void#>
> CONSTRUCT { <datasetUri> void:propertyPartition ?pUri .
>             ?pUri void:property ?p . }
> WHERE { ?s ?p ?o .
>         BIND(IRI(CONCAT(STR(<baseUri>),MD5(STR(?p)))) AS ?pUri) }
>
> In Fuseki it returns 42 and in Sesame it returns 740444.
> Best,
> Qaiser.
>
I guess there are 42 different predicates in the data.
SELECT (count(distinct ?p) AS ?count) { ?s ?p ?o }
Jena returns a model, a set of triples. Set means no duplicates.
It looks like you are using the form of execution in Sesame that returns
an iterator or stream of triples. No suppression of duplicates.
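
To make the set-versus-stream difference concrete, here is a minimal Python sketch. It does not use the Jena or Sesame APIs: the toy triples and the names `construct_stream`, `stream` and `model` are made up for illustration, and the base and dataset IRIs are the placeholders from the query in this thread.

```python
import hashlib

BASE = "http://example/base/baseUri"        # placeholder from the query in this thread
DATASET = "http://example/base/datasetUri"  # placeholder from the query in this thread
VOID = "http://rdfs.org/ns/void#"

# Hypothetical toy data: 5 triples, but only 2 distinct predicates.
data = [
    ("ex:s1", "ex:p1", "ex:o1"),
    ("ex:s2", "ex:p1", "ex:o2"),
    ("ex:s3", "ex:p1", "ex:o3"),
    ("ex:s1", "ex:p2", "ex:o4"),
    ("ex:s2", "ex:p2", "ex:o5"),
]

def construct_stream(triples):
    """Instantiate the CONSTRUCT template once per WHERE solution, with no de-duplication."""
    for _s, p, _o in triples:
        p_uri = BASE + hashlib.md5(p.encode("utf-8")).hexdigest()
        yield (DATASET, VOID + "propertyPartition", p_uri)
        yield (p_uri, VOID + "property", p)

stream = list(construct_stream(data))  # what a streaming result hands back
model = set(stream)                    # what a set of triples keeps

print(len(stream))  # 10: two template triples for each of the 5 solutions
print(len(model))   # 4: two template triples for each of the 2 distinct predicates
```

Counting the items in the stream and counting the triples in the set therefore give different numbers for the same query over the same data.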
In your query:
PREFIX void: <http://rdfs.org/ns/void#>
CONSTRUCT
{ <http://example/base/datasetUri> void:propertyPartition ?pUri .
?pUri void:property ?p .}
WHERE
{ ?s ?p ?o
BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
AS ?pUri)
}
Your query produces massive duplication: it projects away ?s and ?o, so every
matching triple emits the same constructed triples again for its predicate.
Many ?s ?p ?o, few distinct ?p.
Try this:
WHERE
{ SELECT DISTINCT ?p ?pUri {
?s ?p ?o
BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
AS ?pUri)
}
}
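
As a rough illustration of why the DISTINCT sub-select helps, the Python sketch below removes duplicate predicates before the template is instantiated. It is not the Jena or Sesame API: `construct_distinct` and the toy data are hypothetical, and the IRIs are the placeholders used in the query above.

```python
import hashlib

BASE = "http://example/base/baseUri"        # placeholder from the query above
DATASET = "http://example/base/datasetUri"  # placeholder from the query above
VOID = "http://rdfs.org/ns/void#"

def construct_distinct(triples):
    """Mimic SELECT DISTINCT ?p ?pUri: instantiate the template once per distinct predicate."""
    seen = set()
    out = []
    for _s, p, _o in triples:
        if p in seen:
            continue  # this predicate has already produced its two triples
        seen.add(p)
        p_uri = BASE + hashlib.md5(p.encode("utf-8")).hexdigest()
        out.append((DATASET, VOID + "propertyPartition", p_uri))
        out.append((p_uri, VOID + "property", p))
    return out

# Hypothetical toy data: 5 triples, 2 distinct predicates.
data = [
    ("ex:s1", "ex:p1", "ex:o1"),
    ("ex:s2", "ex:p1", "ex:o2"),
    ("ex:s3", "ex:p1", "ex:o3"),
    ("ex:s1", "ex:p2", "ex:o4"),
    ("ex:s2", "ex:p2", "ex:o5"),
]

result = construct_distinct(data)
print(len(result))       # 4
print(len(set(result)))  # 4: the stream and the set now agree
```

With the duplicates removed inside the WHERE clause, a streaming result and a set of triples should contain the same number of triples.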
Andy