[discovery] Re: Timeout

David Causse Mon, 12 Dec 2022 00:23:38 -0800

Hi Tomer,

Unfortunately your queries do work on a rather large portion of the data
(P625 has ~10 million items) and I could not find an obvious way to
optimize them.
Have you considered using other services like
https://qlever.cs.uni-freiburg.de/wikidata or
https://wikidata.demo.openlinksw.com/sparql to have a comparison of how
they perform?
It is very unlikely that we will allow longer timeouts in the near future,
so if you plan to work on a large subset I think that using dumps (RDF or
JSON) might be a better option for you at the moment. WDQS is not fit to
extract large subsets of Wikidata.
Another option might be to discuss and get advice on
https://www.wikidata.org/wiki/Wikidata:Request_a_query; there might be
different and more performant ways to do what you want.
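
To illustrate the dump-based route, here is a minimal Python sketch, not a
tested pipeline. It assumes the JSON dump layout (a single JSON array with
one entity per line, each line ending in a comma) and the usual entity
structure where coordinates sit under claims -> P625 -> mainsnak ->
datavalue -> value. The function name iter_coordinates and the sample data
are hypothetical and only serve the illustration.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_coordinates(lines: Iterable[str]) -> Iterator[Tuple[str, float, float]]:
    """Yield (item_id, latitude, longitude) for each P625 value found.

    `lines` is any iterable of text lines from a Wikidata JSON dump
    (assumed layout: "[" on the first line, one entity per line with a
    trailing comma, "]" on the last line).
    """
    for raw in lines:
        line = raw.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue  # skip the array brackets and blank lines
        entity = json.loads(line)
        for statement in entity.get("claims", {}).get("P625", []):
            snak = statement.get("mainsnak", {})
            if snak.get("snaktype") != "value":
                continue  # "novalue" / "somevalue" snaks carry no coordinates
            value = snak["datavalue"]["value"]
            yield entity["id"], value["latitude"], value["longitude"]


if __name__ == "__main__":
    # Tiny in-memory stand-in for a dump file (hypothetical data).
    sample = [
        "[",
        '{"id": "Q1", "claims": {"P625": [{"mainsnak": {"snaktype": "value",'
        ' "datavalue": {"value": {"latitude": 1.5, "longitude": 2.5}}}}]}},',
        '{"id": "Q2", "claims": {}}',
        "]",
    ]
    for item_id, lat, lon in iter_coordinates(sample):
        print(item_id, lat, lon)
```

For a real dump the same function could be fed a file opened with
gzip.open(path, "rt", encoding="utf-8"), so that the data is streamed line
by line and never loaded into memory as a whole.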


Hope this helps a bit,

David.


On Fri, Dec 9, 2022 at 1:22 PM <[email protected]> wrote:

> Hi,
> I am getting frequent timeouts trying to use the SPARQL endpoint GUI at
> https://query.wikidata.org/ .
> I'll admit, I have some complex queries, but I really feel like this is
> something that the system should be able to handle or at least allow me to
> request a longer timeout wait.
> For example, this query:
>
>             SELECT ?item ?item2
>             WHERE
>             {
>
>             ?item wdt:P625 ?location .
>             ?item <http://www.w3.org/2002/07/owl#sameAs> ?item2 .
>
>             }
>             LIMIT 10
>
> or this query:
>
> SELECT DISTINCT ?item ?itemname  ?location
>         WHERE {
>                 ?item wdt:P625 ?location ;
>                     wdt:P31 ?type ;
>                     rdfs:label ?itemname.
>                 ?type wdt:P279 ?supertype .
>
>                 FILTER(
>                 LANG(?itemname) = "en" &&
>                 ?supertype NOT IN (wd:Q5, wd:Q4991371, wd:Q7283, wd:Q36180,
>                 wd:Q7094076, wd:Q905511, wd:Q1063801, wd:Q1062856,
>                 wd:Q35127, wd:Q68, wd:Q42848, wd:Q2858615, wd:Q241317,
>                 wd:Q1662611, wd:Q7397, wd:Q151885, wd:Q1301371,
>                 wd:Q1068715, wd:Q7366, wd:Q18602249, wd:Q16521,
>                 wd:Q746549, wd:Q13485782, wd:Q36963)
>                 )
>
>         }
> LIMIT 200000
>
> When I use Python SPARQLWrapper things improve somewhat, but some of my
> queries still time out.
> I tried the first query above on an old Wikidata dump we have from 2021
> that we loaded on Jena TDB and it managed to complete it (0 results, but I
> had to run it to figure that out...).
> Seems strange to get such poor performance.
> Cheers
> Tomer
> _______________________________________________
> Discovery mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
>

