Hi,

The problem is the ORDER BY clause. I expect about 8M results, and ordering 
them makes the query very slow (more than 26h). Without ordering, it takes 
less than 3h.

I need the result set ordered by ?mID because I can't order it in a 
post-processing step due to memory problems.
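
If exporting the unordered result to a file were an option, the ordering could 
in principle be done outside the query with bounded memory. Below is a minimal 
sketch of an external merge sort, under assumptions that are not from this 
thread: the result is a tab-separated file with one row per line, the ?mID 
value is in the first column, and the function name `external_sort` and the 
`chunk_lines` parameter are hypothetical. It sorts in plain string order, which 
may not match the ordering the SPARQL engine would produce in every case.

```python
import heapq
import itertools
import os
import tempfile


def _sort_key(line):
    # Assumption: the first tab-separated column holds the ?mID value.
    return line.rstrip("\n").split("\t", 1)[0]


def _write_run(sorted_lines):
    """Write one sorted chunk to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".run", text=True)
    with os.fdopen(fd, "w", encoding="utf-8") as run:
        run.writelines(sorted_lines)
    return path


def external_sort(in_path, out_path, chunk_lines=500_000):
    """Sort a large TSV file by its first column using bounded memory."""
    run_paths = []
    with open(in_path, encoding="utf-8") as src:
        while True:
            # Only chunk_lines rows are held in memory at any time.
            chunk = list(itertools.islice(src, chunk_lines))
            if not chunk:
                break
            # Make sure every row ends with a newline before writing it out.
            chunk = [l if l.endswith("\n") else l + "\n" for l in chunk]
            chunk.sort(key=_sort_key)
            run_paths.append(_write_run(chunk))

    # Merge the sorted runs lazily; heapq.merge reads one row per run at a time.
    runs = [open(p, encoding="utf-8") for p in run_paths]
    try:
        with open(out_path, "w", encoding="utf-8") as dst:
            dst.writelines(heapq.merge(*runs, key=_sort_key))
    finally:
        for run in runs:
            run.close()
        for p in run_paths:
            os.remove(p)
```

With about 8M rows and the default chunk size this would create roughly 16 
temporary run files; a smaller `chunk_lines` trades more files for less memory.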

Do you know if it is possible to improve the performance of the query with the 
ORDER BY clause?


> prefix fb: <http://rdf.freebase.com/ns/>
> prefix fn: <http://www.w3.org/2005/xpath-functions#>
> select ?mID ?e ?nf ?desc  ?wikipedia_url
> where
> {
>   {
>       ?mID fb:type.object.type fb:people.person .
>       ?mID fb:type.object.name ?e .
>       ?mID fb:common.topic.notable_for ?notab_for .
>       ?notab_for fb:common.notable_for.display_name ?nf .
>       ?mID fb:common.topic.description ?desc .
>       FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en") && 
> langMatches(lang(?desc), "en"))
>    }
> 
>    OPTIONAL
>    {
>      ?mID fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
>      FILTER (regex (str(?wikipedia_url), "en.wikipedia", "i") && !regex 
> (str(?wikipedia_url), "curid=", "i")) .
>    }
> }
> 
> ORDER BY ?mID


Regards,

Diego.


On 7 Mar 2014, at 22:16, Andy Seaborne <[email protected]> wrote:

> You can try forcing the scope of the filter to be like your second query, 
> and then do the optional part:
> 
> prefix fb: <http://rdf.freebase.com/ns/>
> prefix fn: <http://www.w3.org/2005/xpath-functions#>
> select ?mID ?e ?nf ?desc  ?wikipedia_url
> where
> {
>   {
>       ?mID fb:type.object.type fb:people.person .
>       ?mID fb:type.object.name ?e .
>       ?mID fb:common.topic.notable_for ?notab_for .
>       ?notab_for fb:common.notable_for.display_name ?nf .
>       ?mID fb:common.topic.description ?desc .
>       FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en") && 
> langMatches(lang(?desc), "en"))
>    }
> 
>    OPTIONAL
>    {
>      ?mID fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
>      FILTER (regex (str(?wikipedia_url), "en.wikipedia", "i") && !regex 
> (str(?wikipedia_url), "curid=", "i")) .
>    }
> }
> 
> which may be more like the 3h query.
> 
> There are improvements in progress for this, but they haven't reached TDB yet.
> 
> The hardware you are running will be a big factor.
> 
>       Andy
> 
> 
> 
> On 07/03/14 11:22, Paton, Diego wrote:
>> 
>> Hi,
>> 
>> I am working with the Freebase ontology stored in Apache Jena TDB and 
>> executing queries using Fuseki.
>> 
>> What I want to retrieve is the mID, entity name, description and, optionally, 
>> the Wikipedia URL if present (I expect more than 6M results). 
>> The problem is that the query takes more than 24h to run.
>> 
>> 
>> prefix fb: <http://rdf.freebase.com/ns/>
>> prefix fn: <http://www.w3.org/2005/xpath-functions#>
>> select ?mID ?e ?nf ?desc  ?wikipedia_url
>> where
>> {
>>     {
>>        ?mID fb:type.object.type fb:people.person .
>>        ?mID fb:type.object.name ?e .
>>        ?mID fb:common.topic.notable_for ?notab_for .
>>        ?notab_for fb:common.notable_for.display_name ?nf .
>>        ?mID fb:common.topic.description ?desc .
>> 
>>        OPTIONAL
>>        {
>>           ?mID fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
>>           FILTER (regex (str(?wikipedia_url), "en.wikipedia", "i") && !regex 
>> (str(?wikipedia_url), "curid=", "i")) .
>>        }
>> 
>>        FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en") 
>> && langMatches(lang(?desc), "en"))
>>     }
>> 
>> }
>> 
>> ORDER BY ?mID
>> 
>> 
>> The modified query below, with the OPTIONAL clause removed, takes 3h.
>> 
>> prefix fb: <http://rdf.freebase.com/ns/>
>> prefix fn: <http://www.w3.org/2005/xpath-functions#>
>> select ?mID ?e ?nf ?desc
>> where
>> {
>>     {
>>        ?mID fb:type.object.type fb:people.person .
>>        ?mID fb:type.object.name ?e .
>>        ?mID fb:common.topic.notable_for ?notab_for .
>>        ?notab_for fb:common.notable_for.display_name ?nf .
>>        ?mID_raw fb:common.topic.description ?desc .
>> 
>>        FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en") 
>> && langMatches(lang(?desc), "en"))
>>     }
>>     BIND(REPLACE(str(?mID_raw), "http://rdf.freebase.com/ns/", "") as ?mID)
>> }
>> 
>> ORDER BY ?mID
>> 
>> And the modified query below, with the filter removed from the OPTIONAL 
>> clause, takes more than 20h (still running).
>> 
>> 
>> prefix fb: <http://rdf.freebase.com/ns/>
>> prefix fn: <http://www.w3.org/2005/xpath-functions#>
>> select ?mID ?e ?nf ?desc  ?wikipedia_url
>> where
>> {
>>     {
>>        ?mID fb:type.object.type fb:people.person .
>>        ?mID fb:type.object.name ?e .
>>        ?mID fb:common.topic.notable_for ?notab_for .
>>        ?notab_for fb:common.notable_for.display_name ?nf .
>>        ?mID fb:common.topic.description ?desc .
>> 
>>        OPTIONAL
>>        {
>>           ?mID fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
>>        }
>> 
>>        FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en") 
>> && langMatches(lang(?desc), "en"))
>>     }
>> 
>> }
>> 
>> ORDER BY ?mID
>> 
>> Do you have any ideas about how to improve the performance of the first 
>> query, which is the one that meets my requirements?
>> 
>> Regards,
>> 
>> Diego.
>> 
>> 
>> 
> 
