[ https://issues.apache.org/jira/browse/JENA-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17495101#comment-17495101 ]
Lorenz Bühmann commented on JENA-2288:
--------------------------------------

Some comments:

1) I don't think it is the same to run
{code:sql}
OPTIONAL { SERVICE <> { } }
{code}
vs
{code:sql}
SERVICE <> { OPTIONAL { } }
{code}
The first is a left join; the latter is a plain join in your query.

2) You should start debugging from within Wikidata. The second query with the value inlined is:
{code:sql}
select ?wikidata_iri (COUNT(?museum) as ?museum_count_in_city)
where {
  OPTIONAL {
    ?museum (wdt:P131)+ <http://www.wikidata.org/entity/Q612> ;
            wdt:P31/(wdt:P279)* wd:Q33506 .
  }
}
group by ?wikidata_iri
{code}
and this returns 204 as the result. The reason is probably how Blazegraph handles the property paths: it just produces lots of duplicates. Not that I understand why you would get 201 instead of 204; that sounds like magic. Doing
{code:sql}
(COUNT(DISTINCT ?museum) as ?museum_count_in_city)
{code}
will solve this issue.

The first query returns 3 because the value is not getting inlined (see your other issue). It just runs
{code:sql}
SELECT ?wikidata_iri ?museum
WHERE {
  OPTIONAL {
    ?museum (<http://www.wikidata.org/prop/direct/P131>)+ ?wikidata_iri .
    ?museum <http://www.wikidata.org/prop/direct/P31>/(<http://www.wikidata.org/prop/direct/P279>)* <http://www.wikidata.org/entity/Q33506>
  }
}
{code}
on the endpoint. So it returns 213201 museums with their locations, and Jena then does a join on your value.

Honestly, I don't know why Blazegraph is producing different results for
{code:sql}
SELECT ?wikidata_iri ?museum
WHERE {
  OPTIONAL {
    ?museum (<http://www.wikidata.org/prop/direct/P131>)+ ?wikidata_iri .
    ?museum <http://www.wikidata.org/prop/direct/P31>/(<http://www.wikidata.org/prop/direct/P279>)* <http://www.wikidata.org/entity/Q33506>
  }
}
{code}
compared to the inlined variant
{code:sql}
SELECT ?wikidata_iri ?museum
WHERE {
  OPTIONAL {
    ?museum (<http://www.wikidata.org/prop/direct/P131>)+ <http://www.wikidata.org/entity/Q612> .
    ?museum <http://www.wikidata.org/prop/direct/P31>/(<http://www.wikidata.org/prop/direct/P279>)* <http://www.wikidata.org/entity/Q33506>
  }
}
{code}
but you can verify the difference by running both queries directly on the Wikidata endpoint.

Fun fact: when you run your whole Q2 on Wikidata it works without using DISTINCT, i.e. when Wikidata does a service request to itself.

Long story short:
- Blazegraph produces different results
- Jena doesn't inline the data for Q1 but gets all results and does a join
- Jena does inline the data for Q2 but Blazegraph produces lots of duplicates; COUNT(DISTINCT ...) will help here
- OPTIONAL inside a SERVICE request is not the same as using a SERVICE request inside an OPTIONAL

> Counting aggregation inside SERVICE provides wrong result
> ---------------------------------------------------------
>
>                 Key: JENA-2288
>                 URL: https://issues.apache.org/jira/browse/JENA-2288
>             Project: Apache Jena
>          Issue Type: Bug
>    Affects Versions: Jena 4.4.0
>            Reporter: Dmitry Zhelobanov
>            Priority: Major
>
> Here is a query which retrieves museums in the specific city:
> {code:java}
> PREFIX wd: <http://www.wikidata.org/entity/>
> PREFIX wdt: <http://www.wikidata.org/prop/direct/>
> SELECT ?wikidata_iri ?museum
> WHERE {
>   VALUES (?wikidata_iri) { (<http://www.wikidata.org/entity/Q612>) } .
>
>   SERVICE <https://query.wikidata.org/sparql> {
>     {
>       select ?wikidata_iri ?museum
>       where {
>         OPTIONAL {
>           ?museum (wdt:P131)+ ?wikidata_iri ;
>                   wdt:P31/(wdt:P279)* wd:Q33506 .
>         }
>       }
>     }
>   }
> } {code}
> This query returns 3 results:
> |<http://www.wikidata.org/entity/Q612>|<http://www.wikidata.org/entity/Q2125281>|
> |<http://www.wikidata.org/entity/Q612>|<http://www.wikidata.org/entity/Q28736367>|
> |<http://www.wikidata.org/entity/Q612>|<http://www.wikidata.org/entity/Q67737768>|
> And here is a query which is supposed to count the number of the same museums
> in the same city:
> {code:java}
> PREFIX wd: <http://www.wikidata.org/entity/>
> PREFIX wdt: <http://www.wikidata.org/prop/direct/>
> SELECT ?wikidata_iri ?museum_count_in_city
> WHERE {
>   VALUES (?wikidata_iri) { (<http://www.wikidata.org/entity/Q612>) } .
>
>   SERVICE <https://query.wikidata.org/sparql> {
>     {
>       select ?wikidata_iri (COUNT(?museum) as ?museum_count_in_city)
>       where {
>         OPTIONAL {
>           ?museum (wdt:P131)+ ?wikidata_iri ;
>                   wdt:P31/(wdt:P279)* wd:Q33506 .
>         }
>       } group by ?wikidata_iri
>     }
>   }
> }{code}
> But the count value produced by the query is wrong:
> |<http://www.wikidata.org/entity/Q612>|"201"^^<http://www.w3.org/2001/XMLSchema#integer>|
> It outputs *201* instead of the expected *3*.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
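
As a rough illustration of the two effects discussed in the comment above (duplicate solutions inflating COUNT, and join versus left join), here is a small Python sketch. It does not query Wikidata or use Jena. The duplicated rows and the city id `Q_none` are made up for the example, and the helper names `count_per_city`, `join` and `left_join` are hypothetical.

```python
from collections import defaultdict

# Toy (city, museum) solution rows. The duplicates are invented to mimic
# a property path reaching the same museum along more than one route.
rows = [
    ("Q612", "Q2125281"),
    ("Q612", "Q2125281"),   # made-up duplicate
    ("Q612", "Q28736367"),
    ("Q612", "Q67737768"),
    ("Q612", "Q67737768"),  # made-up duplicate
]


def count_per_city(rows, distinct=False):
    """Mimic COUNT(?museum) or COUNT(DISTINCT ?museum) grouped by city."""
    groups = defaultdict(list)
    for city, museum in rows:
        groups[city].append(museum)
    return {
        city: len(set(museums)) if distinct else len(museums)
        for city, museums in groups.items()
    }


def join(cities, rows):
    """Plain join: a city with no matching row drops out of the result."""
    return [(c, m) for c in cities for (rc, m) in rows if rc == c]


def left_join(cities, rows):
    """Left join (OPTIONAL): an unmatched city is kept with an unbound museum."""
    out = []
    for c in cities:
        matches = [(c, m) for (rc, m) in rows if rc == c]
        out.extend(matches or [(c, None)])
    return out


if __name__ == "__main__":
    print(count_per_city(rows))                  # duplicates are counted
    print(count_per_city(rows, distinct=True))   # duplicates collapse
    print(join(["Q612", "Q_none"], rows))        # Q_none disappears
    print(left_join(["Q612", "Q_none"], rows))   # Q_none kept, museum unbound
```

With these toy rows the plain count is 5 while the distinct count is 3, which is the same kind of gap as between 201 and 3 in the report, although the real cause on the Blazegraph side is not established here.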