Thanks for your help, it works fine :-) I also found this solution, which
works fine too and does not depend on a Virtuoso endpoint:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX yago: <http://dbpedia.org/class/yago/>

SELECT DISTINCT ?s
WHERE {
    {
        SELECT DISTINCT ?s
        WHERE {
            ?s rdfs:label ?o ;
               rdf:type yago:AmericanFilmActors .
            FILTER regex(str(?o), "^Brad") .
        } LIMIT 100
    }
}

I don't get all the results this way, but if I run this query in a loop
with an increasing OFFSET, that should work.
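
A minimal sketch of that loop, in Python. The names page_query, fetch_all
and run_query are hypothetical: run_query stands for whatever function
sends a query string to the endpoint and returns the list of ?s values.
The ORDER BY ?s is my addition, not part of the query above; SPARQL does
not guarantee a stable row order without it, so pages could otherwise
overlap or skip rows. It may also make each page more expensive.

```python
# Sketch only: pages through the query above with LIMIT/OFFSET.
QUERY_TEMPLATE = """\
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX yago: <http://dbpedia.org/class/yago/>

SELECT DISTINCT ?s
WHERE {{
    {{
        SELECT DISTINCT ?s
        WHERE {{
            ?s rdfs:label ?o ;
               rdf:type yago:AmericanFilmActors .
            FILTER regex(str(?o), "^Brad") .
        }}
        ORDER BY ?s
        LIMIT {limit} OFFSET {offset}
    }}
}}
"""


def page_query(page, page_size=100):
    """Return the query text for one zero-based page (hypothetical helper)."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size > 0")
    return QUERY_TEMPLATE.format(limit=page_size, offset=page * page_size)


def fetch_all(run_query, page_size=100, max_pages=1000):
    """Collect rows page by page until a short page signals the end.

    run_query is supplied by the caller (an assumption of this sketch):
    it takes a query string and returns a list of result values.
    """
    results = []
    for page in range(max_pages):
        rows = run_query(page_query(page, page_size))
        results.extend(rows)
        if len(rows) < page_size:  # last page reached
            break
    return results
```

The max_pages cap is only there so that a misbehaving endpoint cannot
keep the loop running forever.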

Thanks again for your quick answer.

Regards.

----------------------------------------------------------------------------
Julien Plu

Second-year Master's student in Artificial Intelligence Engineering at
Université Montpellier 2.
My Semantic Web projects: http://data.lirmm.fr
Personal page: http://jplu.developpez.com
Head of the Semantic Web section at Developpez.com.
FOAF file (RDF version): http://www.pipm.fr/foaf/foaf.rdf
FOAF file (RDFa version): http://www.pipm.fr/foaf/rdfa.html
Personal email address: [email protected]
University email address: [email protected]


-----Original Message-----
From: Patrick van Kleef [mailto:[email protected]]
Sent: Tuesday, January 10, 2012 01:15
To: [email protected]
Cc: [email protected]
Subject: Re: [Dbpedia-discussion] Problem with DBPedia SPARQL endpoint

Hello Julien,

>
> I have a problem with the DBPedia SPARQL endpoint. On the
> http://dbpedia.org/sparql page, I entered this SPARQL query:
>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>
> select distinct ?s where {
>     ?s foaf:name ?o .
>     FILTER regex(str(?o), "^Brad")
> }
>
> And this error occurred :
>
> Virtuoso S1T00 Error SR171: Transaction timed out
>
> SPARQL query:
> define sql:big-data-const 0
> #output-format:text/html
> define sql:signal-void-variables 1
> define input:default-graph-uri <http://dbpedia.org>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>
> select distinct ?s where {
>     ?s foaf:name ?o .
>     FILTER regex(str(?o), "^Brad")
> }
>
> The same thing occurs when I add a LIMIT. Is it because my SPARQL
> query is wrong, or for some other reason?



Your query takes so long that it exceeds the limits currently set on the
dbpedia endpoint.

I ran your original query on the dbpedia.org cluster: it took 779139 msec
(about 13 minutes) and returned 1441 rows.

The problem with your query is the REGEX function, which basically forces
a table scan on the data, because it cannot use an index to speed the
query up.

Note that since you use the DISTINCT keyword, the Virtuoso cluster
basically cannot shortcut the query.

The good news is that Virtuoso has a couple of options that you can  
use to speed up your query:


1. Use an ANYTIME query

    This means that you fill in the "Execution Timeout" field in the
    /sparql form with, say, a value of 5000 (in msec), which means that
    your query is basically transformed into:

       Select all the triples where the predicate is foaf:name and where
       the string of the name begins with Brad, that you can find within
       about 5 seconds of processing, and then return the unique ?subject.

    Obviously this will not return all the triples you might expect, but
    it does return quickly.
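
If you call the endpoint from a program rather than through the form, the
same setting can probably be passed in the request URL. The sketch below
is in Python and assumes that the form field maps to a request parameter
named "timeout" and that the graph is passed as "default-graph-uri";
please check both against the form's generated URL before relying on
them. The function name anytime_url is hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "http://dbpedia.org/sparql"


def anytime_url(query, timeout_ms=5000, default_graph="http://dbpedia.org"):
    """Build a GET URL for the endpoint with an execution timeout.

    Assumption: the endpoint reads the timeout (in msec) from a
    'timeout' request parameter, like the "Execution Timeout" form field.
    """
    params = {
        "default-graph-uri": default_graph,
        "query": query,
        "timeout": str(timeout_ms),
    }
    return ENDPOINT + "?" + urlencode(params)


QUERY = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

select distinct ?s where {
    ?s foaf:name ?o .
    FILTER regex(str(?o), "^Brad")
}
"""

url = anytime_url(QUERY)
```

The sketch only builds the URL and sends nothing, so it can be inspected
before any request is made.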



2. Use BIF:CONTAINS

    Virtuoso can search very efficiently on the ?o, as it maintains a
    complete free-text index on it.

    You can use the following query:

        PREFIX foaf: <http://xmlns.com/foaf/0.1/>

        select distinct ?s where {
                ?s foaf:name ?o .
                ?o bif:contains "Brad" .
        }


    which will basically return every triple that contains the word Brad
    anywhere in the foaf:name.
    This is very, very efficient, but it will get you not only people like:

        "Brad Davis"@en
        "Brad Strickland"@en

    but also

        "Edward Brad Titchener"@en


    You can combine this with the FILTER if you really only want
    foaf:name values that start with Brad, and use:

        PREFIX foaf: <http://xmlns.com/foaf/0.1/>

        select distinct ?s where {
                ?s foaf:name ?o .
                ?o bif:contains "Brad" .
                FILTER regex(str(?o), "^Brad")
        }

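    Since the bif:contains works over a very efficient index, the FILTER
    only has to go through a very small number of triples, and you still
    get the names that start with the word Brad. One caveat: bif:contains
    "Brad" matches the whole word, so a name such as "Bradley ...", which
    your original regex would also match, would likely not be returned by
    this version.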
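
To illustrate why the combined query is cheap, here is a toy sketch in
Python of the same two-step idea: look up candidates in a word index
first, then run the regex only on those candidates. This is not how
Virtuoso implements its free-text index; the word index, the function
names and the "ex:" subject keys are all made up for the illustration.
The names are the ones quoted in this thread.

```python
import re
from collections import defaultdict

# Toy data: subject keys are hypothetical, names are taken from this thread.
NAMES = {
    "ex:Brad_Davis": "Brad Davis",
    "ex:Brad_Strickland": "Brad Strickland",
    "ex:Edward_Brad_Titchener": "Edward Brad Titchener",
    "ex:Julien_Plu": "Julien Plu",
}


def build_word_index(names):
    """Map each lower-cased word to the set of subjects whose name has it."""
    index = defaultdict(set)
    for subject, name in names.items():
        for word in name.lower().split():
            index[word].add(subject)
    return index


def full_scan(names, pattern):
    """Regex over every name: the 'table scan' of the original query."""
    return sorted(s for s, name in names.items() if re.search(pattern, name))


def index_then_filter(names, index, word, pattern):
    """Index lookup first, then the regex only on the few candidates."""
    candidates = index.get(word.lower(), set())
    return sorted(s for s in candidates if re.search(pattern, names[s]))


INDEX = build_word_index(NAMES)
matches = index_then_filter(NAMES, INDEX, "Brad", "^Brad")
```

On this toy data both functions return the same subjects, but the second
one never runs the regex on "Julien Plu". Because the index only knows
whole words, the sketch says nothing about prefixes of longer words.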


Hope this solves your problem.

Patrick 
---
OpenLink Software

