On 28/01/2020 07:50, Lorenz Buehmann wrote:
Yes, the intermediate result is large. I tried it on the CLI:

    bin/rsparql --service https://query.wikidata.org/sparql \
      "select * {?wikidata_link <http://www.wikidata.org/prop/direct/P18> ?image}" \
      > /tmp/res.sparql

It will most likely lead to an OOM error unless you increase the JVM
heap memory. That is because a large JSON or XML object will be
returned here and has to be parsed and processed.
Then one possibility is that it is pushing the Fuseki query execution
into GC overload. It is possible that the GC is working very hard and
making minimal progress, only to be triggered again for a full GC. The
Fuseki start-up script sets the JVM heap size, but not to a huge amount.
3 million rows might trigger GC problems. You can override the heap size
with JVM_ARGS.
The subquery hint from Andy is a nice workaround, but you would indeed
get only partial results. These might not contain the Wikidata resources
from your dataset, so the result would be incomplete or even empty.
The thing is that SERVICE used to execute the other way - sending
several small requests. Except that causes other problems when the
query has a large number of possibilities to try from earlier in the query.
GC thrashing sounds possible.
Andy
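
The "several small requests" behaviour described above can be approximated by hand with a VALUES block inside the SERVICE, so that only the listed entities are requested from the remote endpoint. This is a hand-written illustration, not the query Jena generates. It assumes the three entity IDs from the test data in the original post and the http:// form of the Wikidata entity namespace, as in the PREFIX wd: declaration of the original query.

```sparql
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?wikidata_link ?image
WHERE {
  SERVICE <https://query.wikidata.org/sparql> {
    # Only these entities are looked up at the remote endpoint,
    # so the full set of P18 statements is never transferred.
    VALUES ?wikidata_link { wd:Q180727 wd:Q154556 wd:Q154770 }
    ?wikidata_link wdt:P18 ?image .
  }
}
```

With a handful of entities this keeps the remote result small. As noted above, the approach gets costly when the earlier part of the query yields a large number of candidates.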
On 27.01.20 23:47, Andy Seaborne wrote:
The query will try to pull a lot of data from query.wikidata.org/sparql.
(See the comments on the SO question.)
What might help is to write a subselect with a LIMIT inside the SERVICE.
That pushes a LIMIT to the far end, which does not happen with the query
as written.
Andy
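
A minimal sketch of the subselect suggestion above, applied to the query from the original post. The inner LIMIT value of 1000 is an arbitrary assumption chosen for illustration, not a recommended setting.

```sparql
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?image
WHERE {
  ?s foaf:name ?name .
  ?s owl:sameAs ?wikidata_link .
  FILTER regex(str(?wikidata_link), "wikidata")
  SERVICE <https://query.wikidata.org/sparql> {
    # The subselect is part of the pattern sent to the remote endpoint,
    # so its LIMIT is applied there (1000 is an assumed value).
    SELECT ?wikidata_link ?image
    WHERE { ?wikidata_link wdt:P18 ?image }
    LIMIT 1000
  }
}
LIMIT 10
```

The remote endpoint returns an arbitrary 1000 rows, so the join with the local data can still come back incomplete or empty if none of those rows match a local owl:sameAs link.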
On 27/01/2020 19:50, jani wrote:
Hi everybody,
I am trying to get some image links from Wikidata by running a SPARQL
query from my local Jena Fuseki instance. I want to merge them with data
from my local graph. Unfortunately, the query does not deliver any data;
it just runs and runs without any error message.
The sparql-query:

    PREFIX owl:  <http://www.w3.org/2002/07/owl#>
    PREFIX wd:   <http://www.wikidata.org/entity/>
    PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    SELECT ?name ?image
    WHERE {
      ?s foaf:name ?name.
      ?s owl:sameAs ?wikidata_link.
      FILTER regex(str(?wikidata_link), "wikidata").
      SERVICE <https://query.wikidata.org/sparql> {
        ?wikidata_link wdt:P18 ?image.
      }
    }
    LIMIT 10

The test data I have in my local graph on the Jena Fuseki server:

    @base <http://dmt.de/pages> .
    @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix dbp:  <http://dbpedia.org/resource/> .
    @prefix wd:   <https://www.wikidata.org/entity/> .
    @prefix owl:  <http://www.w3.org/2002/07/owl#> .

    <#john-cage> a foaf:Person ;
        foaf:name "John Cage";
        owl:sameAs dbp:John_Cage, wd:Q180727.

    <#karlheinz-stockhausen> a foaf:Person ;
        foaf:name "Karlheinz Stockhausen";
        owl:sameAs dbp:Karlheinz_Stockhausen, wd:Q154556.

    <#arnold-schoenberg> a foaf:Person;
        foaf:name "Arnold Schönberg";
        owl:sameAs dbp:Arnold_Schoenberg, wd:Q154770.

I tried a similar query for DBpedia data, which ran perfectly:

    PREFIX owl:  <http://www.w3.org/2002/07/owl#>
    PREFIX dbp:  <http://dbpedia.org/resource/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX dbo:  <http://dbpedia.org/ontology/>

    SELECT ?name ?dbpedia_link ?birthplace
    WHERE {
      ?s foaf:name ?name.
      ?s owl:sameAs ?dbpedia_link.
      FILTER regex(str(?dbpedia_link),"dbpedia.org").
      SERVICE <https://dbpedia.org/sparql> {
        ?dbpedia_link dbo:birthPlace ?birthplace.
      }
    }
    LIMIT 10

Any ideas? Thanks in advance!
Jan Seipel
PS: I also posted this question on Stack Overflow:
https://stackoverflow.com/questions/59937684/sparql-query-to-get-data-from-wikidata-not-working