Hello Julien,
>
> I have a problem with the DBpedia SPARQL endpoint. On the
> http://dbpedia.org/sparql
> page, I entered this SPARQL query:
>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>
> select distinct ?s where {
> ?s foaf:name ?o .
> FILTER regex(str(?o), "^Brad")
> }
>
> And this error occurred:
>
> Virtuoso S1T00 Error SR171: Transaction timed out
>
> SPARQL query:
> define sql:big-data-const 0
> #output-format:text/html
> define sql:signal-void-variables 1
> define input:default-graph-uri <http://dbpedia.org>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>
> select distinct ?s where {
> ?s foaf:name ?o .
> FILTER regex(str(?o), "^Brad")
> }
>
> The same thing occurs when I add a LIMIT. Is it because my SPARQL
> query is wrong, or is there some other reason?
Your query runs longer than the execution time limit currently set on
the DBpedia endpoint.

I ran your original query on the dbpedia.org cluster: it took 779139
msec (about 13 minutes) and returned 1441 rows.

The problem with your query is the REGEX function, which basically
forces a table scan on the data because it cannot use an index to
speed the query up.

Note that since you use the DISTINCT keyword, the Virtuoso cluster
basically cannot shortcut the query.

The good news is that Virtuoso has a couple of options that you can
use to speed up your query:

1. Use an ANYTIME query

   Fill in the "Execution Timeout" field in the /sparql form with,
   say, a value of 5000 (in msec). Your query is then basically
   transformed into:

     Select all the triples where the predicate is foaf:name and
     where the string of the name begins with Brad, that you can
     find within about 5 seconds of processing, and then return
     the unique ?subject.

   Obviously this will not return all the triples you might expect,
   but it does return quickly.
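
   As a rough sketch of the same idea outside the web form, the Python
   snippet below builds a request URL that carries a 5000 msec execution
   timeout. It assumes the endpoint accepts the value as a "timeout"
   request parameter (an assumption about what the form field maps to),
   and build_anytime_url is a made-up helper name. The snippet only
   builds the URL; it does not contact the endpoint.

```python
from urllib.parse import urlencode

# Public DBpedia endpoint named in the original question.
ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
select distinct ?s where {
  ?s foaf:name ?o .
  FILTER regex(str(?o), "^Brad")
}"""


def build_anytime_url(query, timeout_msec=5000):
    """Return a GET URL for the query with an execution timeout attached.

    Hypothetical helper for illustration only.
    """
    params = {
        # Default graph as shown in the error output of the question.
        "default-graph-uri": "http://dbpedia.org",
        "query": query,
        # Assumption: "timeout" (in msec) is the parameter behind the
        # "Execution Timeout" form field.
        "timeout": str(timeout_msec),
    }
    return ENDPOINT + "?" + urlencode(params)


url = build_anytime_url(QUERY)
print(url)
```

   Fetching that URL should then behave like submitting the form with
   the timeout filled in: a partial result that comes back quickly.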

2. Use BIF:CONTAINS

   Virtuoso can search very efficiently on ?o because it maintains a
   complete free-text index on it. You can use the following query:

     PREFIX foaf: <http://xmlns.com/foaf/0.1/>
     select distinct ?s where {
       ?s foaf:name ?o .
       ?o bif:contains "Brad" .
     }

   This basically returns every triple that contains the word Brad
   anywhere in the foaf:name.

   This is very efficient, but it will get you not only people like:

     "Brad Davis"@en
     "Brad Strickland"@en

   but also:

     "Edward Brad Titchener"@en

   If you really only want foaf:name values that start with Brad,
   you can combine this with the FILTER:

     PREFIX foaf: <http://xmlns.com/foaf/0.1/>
     select distinct ?s where {
       ?s foaf:name ?o .
       ?o bif:contains "Brad" .
       FILTER regex(str(?o), "^Brad")
     }

   Since bif:contains works over a very efficient index, the FILTER
   only has to go through a very small number of triples, and the
   query still returns the result you would have expected from your
   original query, at least for names in which Brad is a separate
   word.
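
   The effect of narrowing first and filtering second can be
   illustrated with a small Python sketch. This is a toy model, not how
   Virtuoso is implemented: the word_index dictionary is a hypothetical
   stand-in for the free-text index, it only knows whole words, and the
   data is the three example names plus one made-up non-match.

```python
import re

# Toy data: the three example names plus one non-matching name.
names = [
    "Brad Davis",
    "Brad Strickland",
    "Edward Brad Titchener",
    "Angelina Jolie",
]

# Hypothetical stand-in for the free-text index: word -> names containing it.
word_index = {}
for name in names:
    for word in name.split():
        word_index.setdefault(word, []).append(name)

starts_with_brad = re.compile(r"^Brad")

# Original query: the regex is tested against every name (a full scan).
full_scan = [n for n in names if starts_with_brad.search(n)]

# Combined query: the index lookup narrows the candidates first,
# then the regex only runs on that short list.
candidates = word_index.get("Brad", [])
two_stage = [n for n in candidates if starts_with_brad.search(n)]

print(len(names), "names scanned vs", len(candidates), "candidates filtered")
print(two_stage)
```

   On real data the gap between the two counts is what matters: the
   regex runs on the handful of index hits instead of on every
   foaf:name value.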
Hope this solves your problem.
Patrick
---
OpenLink Software
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion