Re: DBpedia: limit of triples

Hugh Williams Tue, 09 Aug 2011 05:29:17 -0700

Hi 

The http://dbpedia.org/sparql endpoint enforces both rate limiting on the number of 
connections per second you can make and restrictions on result set size and query 
time, as per the following settings:


   [SPARQL]
   ResultSetMaxRows           = 2000
   MaxQueryExecutionTime      = 120
   MaxQueryCostEstimationTime = 1500


These are in place to make sure that everyone has an equal chance to 
de-reference data from dbpedia.org, as well as to guard against badly written 
queries/robots.

The following options are at your disposal to get around these limitations:

1. Use the LIMIT and OFFSET keywords

   You can tell a SPARQL query to return a partial result set and how many 
records to skip, e.g.:

        select ?s where { ?s a ?o }
        LIMIT 1000 OFFSET 2000


2. Set up a DBpedia database in your own network

   The DBpedia project provides full datasets, so you can set up your own 
installation on a sufficiently powerful box using Virtuoso Open Source Edition.


3. Set up a preconfigured installation of Virtuoso + database using Amazon EC2 
(not free)

   See: http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtAWSDBpedia351C
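
For option 1, paging through a larger result set could look roughly like the 
Python sketch below. It is only an illustration, not an official client: the 
names paged_query, fetch_all and run_query_http and the page size of 1000 are 
assumptions of mine, and ORDER BY is added to the base query because SPARQL 
does not guarantee a stable row order between requests without it (on very 
large result sets the sort may itself run into the query time limit).

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, List

PAGE_SIZE = 1000  # assumed value; keep it at or below ResultSetMaxRows (2000)


def paged_query(base_query: str, limit: int, offset: int) -> str:
    """Append LIMIT/OFFSET to a query that has no such clauses yet."""
    return f"{base_query}\nLIMIT {limit} OFFSET {offset}"


def fetch_all(
    run_query: Callable[[str], List[dict]],
    base_query: str,
    page_size: int = PAGE_SIZE,
) -> Iterator[dict]:
    """Yield rows page by page until a page comes back short (or empty)."""
    offset = 0
    while True:
        rows = run_query(paged_query(base_query, page_size, offset))
        yield from rows
        if len(rows) < page_size:
            break
        offset += page_size


def run_query_http(query: str, endpoint: str = "http://dbpedia.org/sparql") -> List[dict]:
    """Send one query via HTTP GET and return the JSON result bindings."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

Usage would be something like 
fetch_all(run_query_http, "select ?s where { ?s a ?o } order by ?s"). 
Since the endpoint also rate limits connections, a real client should pause 
between pages.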


Best Regards
Hugh Williams
Professional Services
OpenLink Software
Web: http://www.openlinksw.com
Support: http://support.openlinksw.com
Forums: http://boards.openlinksw.com/support
Twitter: http://twitter.com/OpenLink

On 9 Aug 2011, at 13:04, Jörn Hees wrote:

> On 9. Aug. 2011, at 13:15, Pablo Mendes wrote:
>>> 'yes, i also consider DBpedia buggy in this sense (hence the crossposting)'
>> Just a small note.
>> I think you mean that the SPARQL engine behind a particular deployment of 
>> DBpedia is behaving differently from what you would desire. Although there 
>> are bugs in DBpedia, this is not one of them. :) I think it is important to 
>> make this distinction between DBpedia and the SPARQL endpoints serving its 
>> contents exactly to point out that you could provide your own 
>> implementation/wrapper that sorts/limits results the way you want.
> 
> Yes, this was imprecise. I was not talking about the SPARQL endpoint (which 
> in fact is able to return more than 2001 triples per subject). I was talking 
> about the standard thing that many people do with a http URI: dereference it.
> 
> I agree that other / local SPARQL endpoints are useful for mass queries and 
> to take load off the DBpedia servers, but I don't see how they help in my 
> case, as dereferencing still goes to the server(s) at dbpedia.org.
> 
> Cheers,
> Jörn
> 
>
