Thanks, Rob, for your quick reply! Hmm, I see. What you are saying does make sense. So what you propose is a query like this, with the LIMIT subquery moved to the start?

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT DISTINCT ?p (GROUP_CONCAT(DISTINCT ?o; separator="-----") AS ?vals) ?id ?pr ?link
WHERE {
  {
    SELECT DISTINCT ?link (STR(?o3) AS ?id) (STR(?o2) AS ?pr)
    WHERE {
      ?link dbo:wikiPageRank ?o2 .
      ?link dbo:wikiPageID ?o3 .
      FILTER NOT EXISTS { ?link dbo:wikiPageRedirects ?x } .
      FILTER NOT EXISTS { ?link dbo:wikiPageDisambiguates ?y } .
    }
    LIMIT 1 OFFSET %offset
  }
  {
    ?link ?p ?o .
    FILTER(DATATYPE(?o) = xsd:string || LANG(?o) = "en") .
  }
  UNION
  {
    VALUES ?p { dbo:wikiPageRedirects dbo:wikiPageDisambiguates } .
    ?x ?p ?link .
    ?x rdfs:label ?o .
  }
  UNION
  {
    VALUES ?p { rdf:type } .
    ?link ?p ?o .
    FILTER(CONTAINS(STR(?o), "http://dbpedia.org/ontology/")) .
  }
}
GROUP BY ?p ?id ?pr ?link

Julien Plu
PhD Student, EURECOM
[email protected] | [email protected]
http://jplu.github.io
Campus SophiaTech
450 route des Chappes
06410 Biot, France
Phone: +33 (0) 4 93008103

> On 2 Oct 2017, at 11:58, Rob Vesse <[email protected]> wrote:
>
> Julien
>
> At a glance your query is very broad in that it effectively selects the
> entire dataset and applies string filters over the data e.g. the CONTAINS
> filter.
>
> This will force TDB to read pretty much the entire dataset on every single
> query. You may be better off moving the subquery with the limit on it to the
> start of your query as then TDB can probably use the single result to limit
> the amount of data it has to read to answer the rest of your query.
>
> Rob
>
> On 02/10/2017 10:30, "Julien Plu" <[email protected] on behalf of
> [email protected]> wrote:
>
> Hello,
>
> The code I'm using can be found here:
> https://gist.github.com/jplu/9d3aa4075145e31c2882f3372b1be3e3
>
> My problem is that one iteration of my loop (line 88) takes a very long
> time (between 3 and 5 minutes), and I don't understand why.
>
> I think it is because I'm certainly missing something in the usage of TDB,
> but I don't see what.
>
> The dataset is DBpedia.
>
> Thanks in advance for any light.
>
> Regards.
>
> *Julien Plu*
> PhD Student, EURECOM
> [email protected] | [email protected]
> http://jplu.github.io
> Campus SophiaTech
> 450 route des Chappes
> 06410 Biot, France
> Phone: +33 (0) 4 93008103
