Hi,

I am working with the Freebase ontology stored in Apache Jena TDB and running
queries through Fuseki.

What I want to retrieve is the mID, entity name, description and, optionally,
the Wikipedia URL if present (I expect to obtain more than 6M results). The
problem is that the query takes more than 24h to run.


prefix fb: <http://rdf.freebase.com/ns/>
prefix fn: <http://www.w3.org/2005/xpath-functions#>
select ?mID ?e ?nf ?desc  ?wikipedia_url
where
{
    {
       ?mID fb:type.object.type fb:people.person .
       ?mID fb:type.object.name ?e .
       ?mID fb:common.topic.notable_for ?notab_for .
       ?notab_for fb:common.notable_for.display_name ?nf .
       ?mID fb:common.topic.description ?desc .

       OPTIONAL
       {
          ?mID fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
          FILTER (regex(str(?wikipedia_url), "en.wikipedia", "i")
                  && !regex(str(?wikipedia_url), "curid=", "i")) .
       }

       FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en")
               && langMatches(lang(?desc), "en"))
    }

}

ORDER BY ?mID


The modified query below, with the OPTIONAL clause removed, takes 3h.

prefix fb: <http://rdf.freebase.com/ns/>
prefix fn: <http://www.w3.org/2005/xpath-functions#>
select ?mID ?e ?nf ?desc
where
{
    {
       ?mID_raw fb:type.object.type fb:people.person .
       ?mID_raw fb:type.object.name ?e .
       ?mID_raw fb:common.topic.notable_for ?notab_for .
       ?notab_for fb:common.notable_for.display_name ?nf .
       ?mID_raw fb:common.topic.description ?desc .

       FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en")
               && langMatches(lang(?desc), "en"))
    }
    BIND(REPLACE(str(?mID_raw), "http://rdf.freebase.com/ns/", "") as ?mID)
}

ORDER BY ?mID

The modified query below, with the FILTER removed from the OPTIONAL clause, has
been running for more than 20h (still running).


prefix fb: <http://rdf.freebase.com/ns/>
prefix fn: <http://www.w3.org/2005/xpath-functions#>
select ?mID ?e ?nf ?desc  ?wikipedia_url
where
{
    {
       ?mID fb:type.object.type fb:people.person .
       ?mID fb:type.object.name ?e .
       ?mID fb:common.topic.notable_for ?notab_for .
       ?notab_for fb:common.notable_for.display_name ?nf .
       ?mID fb:common.topic.description ?desc .

       OPTIONAL
       {
          ?mID fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
       }

       FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en")
               && langMatches(lang(?desc), "en"))
    }

}

ORDER BY ?mID

Do you have any ideas on how to improve the performance of the first query,
which is the one that meets my requirements?
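
One direction that might be worth trying, shown only as an untested sketch: the
first query with the two regex() calls in the OPTIONAL filter replaced by the
SPARQL 1.1 string functions CONTAINS and LCASE. It assumes that matching the
literal substring "en.wikipedia" is close enough to the original pattern (where
"." matches any character). Since the variant without any filter in the
OPTIONAL clause is also slow, this may well not address the main cost.

```sparql
PREFIX fb: <http://rdf.freebase.com/ns/>

SELECT ?mID ?e ?nf ?desc ?wikipedia_url
WHERE
{
    ?mID fb:type.object.type fb:people.person .
    ?mID fb:type.object.name ?e .
    ?mID fb:common.topic.notable_for ?notab_for .
    ?notab_for fb:common.notable_for.display_name ?nf .
    ?mID fb:common.topic.description ?desc .

    OPTIONAL
    {
        ?mID fb:common.topic.topic_equivalent_webpage ?wikipedia_url .
        # Assumption: plain substring tests stand in for the original regex() calls.
        # LCASE keeps the match case-insensitive, as the "i" flag did.
        FILTER (CONTAINS(LCASE(STR(?wikipedia_url)), "en.wikipedia")
                && !CONTAINS(LCASE(STR(?wikipedia_url)), "curid="))
    }

    FILTER (langMatches(lang(?e), "en") && langMatches(lang(?nf), "en")
            && langMatches(lang(?desc), "en"))
}
ORDER BY ?mID
```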

Regards,

Diego.

