The query SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o } returns 21 in
both cases. listOfPropertiesInDataset is just the name of the actual query,
which I execute and store in a model: mdl = qry.execConstruct()
However, if I run the following code and get the triple count for that query:

qry = QueryExecutionFactory.sparqlService(endpoint, query);
int count = 0;
Iterator<Triple> triples = qry.execConstructTriples();
while (triples.hasNext()) {
    triples.next();
    count++;
}
System.out.print("Triples count value is" + count);

The count differs: 42 for Fuseki and 740444 for Sesame, although the data is
the same in both stores. What could be the reason for this difference?
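
For illustration, here is a minimal sketch of how the same CONSTRUCT template
can give two different counts. It uses only java.util, not Jena or Sesame; the
class name ConstructDuplicates, its methods, and the sample predicates are all
made up for this example, and the "base/" + p string is just a stand-in for
the MD5-based ?pUri. It assumes one side streams every instantiated triple and
the other collects them into a set first. Note that 21 distinct predicates
times the 2 triples in the template would give 42.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;

public class ConstructDuplicates {

    // Instantiate the two-triple CONSTRUCT template once per ?p binding,
    // as a streaming executor might, without removing duplicates.
    public static List<List<String>> instantiate(List<String> predicates) {
        List<List<String>> triples = new ArrayList<>();
        for (String p : predicates) {
            String pUri = "base/" + p; // hypothetical stand-in for MD5(STR(?p))
            triples.add(Arrays.asList("datasetUri", "void:propertyPartition", pUri));
            triples.add(Arrays.asList(pUri, "void:property", p));
        }
        return triples;
    }

    // Count what an iterator over the raw stream would see.
    public static int streamCount(List<String> predicates) {
        return instantiate(predicates).size();
    }

    // Count after collecting into a set, which is how a model behaves.
    public static int modelCount(List<String> predicates) {
        return new HashSet<>(instantiate(predicates)).size();
    }

    // Mimic SELECT DISTINCT ?p applied before the template is instantiated.
    public static List<String> distinct(List<String> predicates) {
        return new ArrayList<>(new LinkedHashSet<>(predicates));
    }

    public static void main(String[] args) {
        // ?p bindings from five matching triples with two distinct predicates
        List<String> bindings =
            Arrays.asList("ex:name", "ex:name", "ex:type", "ex:name", "ex:type");
        System.out.println("streamed, no dedup:   " + streamCount(bindings));
        System.out.println("collected into a set: " + modelCount(bindings));
        System.out.println("streamed after DISTINCT: " + streamCount(distinct(bindings)));
    }
}
```

With the five sample bindings this prints 10, 4 and 4: the streamed count
grows with the number of matching triples, while the set count depends only on
the number of distinct predicates.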
On Wednesday, January 28, 2015 11:13 AM, Andy Seaborne <[email protected]>
wrote:
On 28/01/15 10:49, Qiaser Mehmood wrote:
> Thanks Andy, I forgot to mention that I am using Jena to query both
> Fuseki and Sesame; moreover, I loaded the same data into both stores.
> So you mean that the difference in results over the same data is due to the
> particular engine, which returns either duplicates (i.e. Sesame) or a set
> with no duplicates (i.e. Fuseki)?
> Thanks,Qaiser.
So what does
SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }
return in each case?
And how are you counting results (listOfPropertiesInDataset is not Jena
code).
Andy
>
> On Tuesday, January 27, 2015 8:50 PM, Andy Seaborne <[email protected]>
>wrote:
>
>
> On 27/01/15 17:32, Qiaser Mehmood wrote:
>> What could be the reason for a difference in results
>> (listOfPropertiesInDataset) for the same query run on two different
>> engines, e.g. Fuseki and Sesame? I loaded the Kegg data into Fuseki and
>> Sesame, and when I run the following query the results vary.
>> PREFIX void: <http://rdfs.org/ns/void#>
>> CONSTRUCT { <datasetUri> void:propertyPartition ?pUri .
>>             ?pUri void:property ?p . }
>> WHERE { ?s ?p ?o .
>>         BIND(IRI(CONCAT(STR(<baseUri>),MD5(STR(?p)))) AS ?pUri) }
>>
>> In Fuseki it returns 42 and in Sesame it returns 740444.
>> Best,Qaiser.
>>
>
> I guess there are 42 different predicates in the data.
>
> SELECT (count(distinct ?p) AS ?count ) { ?s ?p ?o }
>
> Jena returns a model, a set of triples. Set means no duplicates.
>
> It looks like you are using the form of execution in Sesame that returns
> an iterator or stream of triples. No suppression of duplicates.
>
> In your query:
>
> PREFIX void: <http://rdfs.org/ns/void#>
>
> CONSTRUCT
> { <http://example/base/datasetUri> void:propertyPartition ?pUri .
> ?pUri void:property ?p .}
> WHERE
> { ?s ?p ?o
> BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
> AS ?pUri)
> }
>
> Your query produces massive duplicates: it projects out ?s and ?o.
>
> Many ?s ?p ?o, few distinct ?p
>
> Try this:
>
> WHERE
> { SELECT DISTINCT ?p ?pUri {
> ?s ?p ?o
> BIND(iri(concat(str(<http://example/base/baseUri>), MD5(str(?p))))
> AS ?pUri)
> }
> }
>
>
> Andy