Hi,

The problem is the ORDER BY clause. I am expecting 8M results, and the ordering makes the query very slow (more than 26h). Without ordering it takes less than 3h.

I need the result set ordered by ?mID because I can't order it in a post-processing step due to memory problems. Do you know if it is possible to improve the performance of the query with the ORDER BY clause?

> prefix fb: <http://rdf.freebase.com/ns/>
> prefix fn: <http://www.w3.org/2005/xpath-functions#>
> select ?mID ?e ?nf ?desc ?wikipedia_url
> where
> {
>   {
>     ?mID fb:type.object.type fb:people.person .
>     ?mID fb:type.object.name ?e .
>     ?mID fb:common.topic.notable_for ?notab_for .
>     ?notab_for fb:common.notable_for.display_name ?nf .
>     ?mID fb:common.topic.description ?desc .
>     FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en") &&
> langMatches(lang(?desc), "en"))
>   }
>
>   OPTIONAL
>   {
>     ?mID fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
>     FILTER (regex (str(?wikipedia_url), "en.wikipedia", "i") && !regex
> (str(?wikipedia_url), "curid=", "i")) .
>   }
> }
>
> ORDER BY ?mID

Regards,

Diego.

On 7 Mar 2014, at 22:16, Andy Seaborne <[email protected]> wrote:

> You can try forcing the scope of the filter to be like your second query then
> do the optional part:
>
> prefix fb: <http://rdf.freebase.com/ns/>
> prefix fn: <http://www.w3.org/2005/xpath-functions#>
> select ?mID ?e ?nf ?desc ?wikipedia_url
> where
> {
>   {
>     ?mID fb:type.object.type fb:people.person .
>     ?mID fb:type.object.name ?e .
>     ?mID fb:common.topic.notable_for ?notab_for .
>     ?notab_for fb:common.notable_for.display_name ?nf .
>     ?mID fb:common.topic.description ?desc .
>     FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en") &&
> langMatches(lang(?desc), "en"))
>   }
>
>   OPTIONAL
>   {
>     ?mID fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
>     FILTER (regex (str(?wikipedia_url), "en.wikipedia", "i") && !regex
> (str(?wikipedia_url), "curid=", "i")) .
>   }
> }
>
> which may be more like the 3h query.
>
> There are improvements in progress for this, but they haven't reached TDB yet.
>
> The hardware you are running will be a big factor.
>
> Andy
>
>
>
> On 07/03/14 11:22, Paton, Diego wrote:
>>
>> Hi,
>>
>> I am working with the Freebase ontology stored in Apache JENA TDB and
>> executing queries using Fuseki.
>>
>> What I want to retrieve is the mID, entity name, description and optionally
>> the wikipedia url if present ( I expect to obtain more than 6M of results ).
>> The problem is the query takes more than 24h to run.
>>
>>
>> prefix fb: <http://rdf.freebase.com/ns/>
>> prefix fn: <http://www.w3.org/2005/xpath-functions#>
>> select ?mID ?e ?nf ?desc ?wikipedia_url
>> where
>> {
>>   {
>>     ?mID fb:type.object.type fb:people.person .
>>     ?mID fb:type.object.name ?e .
>>     ?mID fb:common.topic.notable_for ?notab_for .
>>     ?notab_for fb:common.notable_for.display_name ?nf .
>>     ?mID fb:common.topic.description ?desc .
>>
>>     OPTIONAL
>>     {
>>       ?mID fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
>>       FILTER (regex (str(?wikipedia_url), "en.wikipedia", "i") && !regex
>> (str(?wikipedia_url), "curid=", "i")) .
>>     }
>>
>>     FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en")
>> && langMatches(lang(?desc), "en"))
>>   }
>>
>> }
>>
>> ORDER BY ?mID
>>
>>
>> This modified query below with optional attribute removed takes 3h.
>>
>> prefix fb: <http://rdf.freebase.com/ns/>
>> prefix fn: <http://www.w3.org/2005/xpath-functions#>
>> select ?mID ?e ?nf ?desc
>> where
>> {
>>   {
>>     ?mID fb:type.object.type fb:people.person .
>>     ?mID fb:type.object.name ?e .
>>     ?mID fb:common.topic.notable_for ?notab_for .
>>     ?notab_for fb:common.notable_for.display_name ?nf .
>>     ?mID fb:common.topic.description ?desc .
>>
>>     FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en")
>> && langMatches(lang(?desc), "en"))
>>   }
>> }
>>
>> ORDER BY ?mID
>>
>> And the modified query below with filter removed in the optional clause
>> takes more than 20h ( still running )
>>
>>
>> prefix fb: <http://rdf.freebase.com/ns/>
>> prefix fn: <http://www.w3.org/2005/xpath-functions#>
>> select ?mID ?e ?nf ?desc ?wikipedia_url
>> where
>> {
>>   {
>>     ?mID fb:type.object.type fb:people.person .
>>     ?mID fb:type.object.name ?e .
>>     ?mID fb:common.topic.notable_for ?notab_for .
>>     ?notab_for fb:common.notable_for.display_name ?nf .
>>     ?mID fb:common.topic.description ?desc .
>>
>>     OPTIONAL
>>     {
>>       ?mID fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
>>     }
>>
>>     FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en")
>> && langMatches(lang(?desc), "en"))
>>   }
>>
>> }
>>
>> ORDER BY ?mID
>>
>> Do you have some ideas about how to improve the performance of the first
>> query that is the one meets my requirements?
>>
>> Regards,
>>
>> Diego.
>>
>>
>>
>
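The memory limit on post-processing mentioned at the top of the thread is the kind of constraint an external merge sort is meant for: sort fixed-size chunks in memory, spill each to disk, then merge the sorted runs as streams. The following is only a minimal sketch of that idea, not something proposed in the thread. It assumes the unordered results have been exported as tab-separated lines with ?mID in the first column; the function name `external_sort` and the `chunk_size` default are hypothetical, and the ordering is plain string comparison on that first column.

```python
import heapq
import os
import tempfile
from typing import Iterable, Iterator, List


def _key(line: str) -> str:
    """Sort key: the first tab-separated column (assumed to hold ?mID)."""
    return line.split("\t", 1)[0]


def _write_run(rows: List[str]) -> str:
    """Sort one in-memory chunk and spill it to a temporary file."""
    rows.sort(key=_key)
    fd, path = tempfile.mkstemp(suffix=".run")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.writelines(rows)
    return path


def _read_run(path: str) -> Iterator[str]:
    """Stream one sorted run back from disk, line by line."""
    with open(path, encoding="utf-8") as handle:
        yield from handle


def external_sort(lines: Iterable[str], chunk_size: int = 500_000) -> Iterator[str]:
    """Yield rows ordered by first column, holding at most chunk_size rows in memory."""
    runs: List[str] = []
    chunk: List[str] = []
    for line in lines:
        # Normalise line endings so every spilled row is exactly one line.
        chunk.append(line if line.endswith("\n") else line + "\n")
        if len(chunk) >= chunk_size:
            runs.append(_write_run(chunk))
            chunk = []
    if chunk:
        runs.append(_write_run(chunk))

    streams = [_read_run(path) for path in runs]
    try:
        # heapq.merge keeps only one pending row per run in memory.
        yield from heapq.merge(*streams, key=_key)
    finally:
        for stream in streams:
            stream.close()
        for path in runs:
            os.remove(path)
```

Under those assumptions, the unordered query output could be passed in as an open file object and the sorted rows written straight to another file, so the full result set is never held in memory at once.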
