On 28/01/15 10:49, Qiaser Mehmood wrote:
Thanks Andy. I forgot to mention that I am using Jena to query both Fuseki
and Sesame; moreover, I loaded the same data into both stores.
So you mean that the difference in results over the same data is due to the
particular engine, which returns either duplicates (i.e. Sesame) or a set with
no duplicates (i.e. Fuseki)?
Thanks, Qaiser.
So what does
SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }
return in each case?
And how are you counting results? (listOfPropertiesInDataset is not Jena
code.)
Andy
On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne <[email protected]>
wrote:
On 27/01/15 17:32, Qiaser Mehmood wrote:
What could be the reason for a difference in results (listOfPropertiesInDataset)
for the same query run on two different engines, e.g. Fuseki and Sesame? I
loaded the KEGG data into Fuseki and Sesame, and when I run the following query
the results vary.
PREFIX void: <http://rdfs.org/ns/void#>
CONSTRUCT { <datasetUri> void:propertyPartition ?pUri . ?pUri void:property ?p . }
WHERE { ?s ?p ?o . BIND(IRI(CONCAT(STR(<baseUri>),MD5(STR(?p)))) AS ?pUri) }
In Fuseki it returns 42 and in Sesame it returns 740444.
Best, Qaiser.
I guess there are 42 different predicates in the data.
SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }
Jena returns a model, a set of triples. Set means no duplicates.
It looks like you are using the form of execution in Sesame that returns
an iterator over a stream of triples. No suppression of duplicates.
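The set-versus-stream difference can be sketched outside either engine. The following is a minimal Python illustration, not Jena or Sesame code: the toy triples, the `construct` helper and the `ex:` names are all hypothetical, and the template is the one from the query in this thread.

```python
import hashlib

BASE = "http://example/base/baseUri"
DATASET = "http://example/base/datasetUri"
VOID = "http://rdfs.org/ns/void#"

# Toy data (hypothetical): four triples, but only two distinct predicates.
DATA = [
    ("ex:a", "ex:name", "A"),
    ("ex:b", "ex:name", "B"),
    ("ex:c", "ex:name", "C"),
    ("ex:a", "ex:knows", "ex:b"),
]

def construct(triples):
    """Instantiate the CONSTRUCT template once per ?s ?p ?o solution."""
    for _s, p, _o in triples:
        # Mirrors BIND(iri(concat(str(<base>), MD5(str(?p)))) AS ?pUri)
        p_uri = BASE + hashlib.md5(p.encode("utf-8")).hexdigest()
        yield (DATASET, VOID + "propertyPartition", p_uri)
        yield (p_uri, VOID + "property", p)

stream = list(construct(DATA))  # stream-style result: duplicates are kept
model = set(stream)             # model-style result: a set, duplicates collapse

print(len(stream))  # 8: two template triples per matching data triple
print(len(model))   # 4: two template triples per distinct predicate
```

Counting the stream scales with the number of data triples, while counting the set scales with the number of distinct predicates.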
In your query:
PREFIX void: <http://rdfs.org/ns/void#>
CONSTRUCT
{ <http://example/base/datasetUri> void:propertyPartition ?pUri .
?pUri void:property ?p .}
WHERE
{ ?s ?p ?o
BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
AS ?pUri)
}
Your query has massive duplicates: it projects out ?s and ?o, so every
matching triple instantiates the same two template triples for its ?p.
Many ?s ?p ?o, few distinct ?p.
Try this:
WHERE
{ SELECT DISTINCT ?p ?pUri {
?s ?p ?o
BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
AS ?pUri)
}
}
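As a rough check on why the DISTINCT subquery helps, here is a small Python sketch (again with hypothetical toy data and helper names, not engine code): once the bindings are reduced to distinct (?p, ?pUri) rows before the template is applied, even a stream-style result contains no repeated triples.

```python
import hashlib

BASE = "http://example/base/baseUri"
DATASET = "http://example/base/datasetUri"
VOID = "http://rdfs.org/ns/void#"

# Toy data (hypothetical): four triples, two distinct predicates.
DATA = [
    ("ex:a", "ex:name", "A"),
    ("ex:b", "ex:name", "B"),
    ("ex:c", "ex:name", "C"),
    ("ex:a", "ex:knows", "ex:b"),
]

def distinct_bindings(triples):
    """Emulate SELECT DISTINCT ?p ?pUri over the ?s ?p ?o pattern."""
    seen = set()
    for _s, p, _o in triples:
        row = (p, BASE + hashlib.md5(p.encode("utf-8")).hexdigest())
        if row not in seen:
            seen.add(row)
            yield row

def construct(rows):
    """Instantiate the CONSTRUCT template once per distinct row."""
    for p, p_uri in rows:
        yield (DATASET, VOID + "propertyPartition", p_uri)
        yield (p_uri, VOID + "property", p)

stream = list(construct(distinct_bindings(DATA)))
print(len(stream))                      # 4
print(len(stream) == len(set(stream)))  # True: nothing left to suppress
```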
Andy