Michael Haas wrote:
> Hello,
>
> as pointed out earlier, I'm having some issues with the new SPARQL
> endpoint. I'm currently using DBpedia to generate dictionaries for a
> task in an information extraction class I'm taking.
>
> For this task, I need a list of entities, e.g. actors. Consider the
> following query:
>
> SELECT ?name WHERE { ?a
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://dbpedia.org/ontology/Actor> . ?a
> <http://www.w3.org/2000/01/rdf-schema#label> ?name }
>
> With DBpedia 3.2, this would work just fine. With the current release,
> the query will time out after a while, giving me a partial result list.
> This actually is a feature called "Anytime" queries. [0]
>
>
> I wonder if enabling the Anytime feature is a good idea - not because I 
> can't get my list of actors, but because it's broken, undocumented and 
> proprietary:
>
> Kingsley Idehen wrote:
>
>> There is a bit of a doc mess up right now. New docs are in progress etc..
>
>> Re. SPARQL protocol, this should come through HTTP response, but we are
>> still working on this part.
>
> That's not exactly in the "SPARQL Protocol for RDF" recommendation.
>
>
> There is no way right now to let a SPARQL-compliant client know there
> are more results. AFAIK, it is also impossible to set these timeouts
> using the SPARQL Protocol. I don't think proprietary protocol extensions
> are the right thing for an Open project.
>
>
> Additionally, handing out different result sets for the same query
> depending on what kind of data is cached and how far subordinate clauses
> from *previous* queries have been evaluated (see [0]) sounds broken. In
> fact, I don't believe the SPARQL W3C recommendation allows that (section
> 12.5, "Evaluation Semantics").
>
>
>
> I do acknowledge that handling web-scale data sets presents a problem,
> but I'd rather see a query language which can do proper chunking of
> results instead of breaking SPARQL.
>
>
> Anyways - I tried to work around this issue by using the LIMIT and
> OFFSET solution sequence modifiers. The W3C recommendation states:
> "Using LIMIT and OFFSET to select different subsets of the query
> solutions will not be useful unless the order is made predictable by
> using ORDER BY." - so throw in an ORDER BY as well. This will break
> after some iterations:
>
> 22023 Error SR353: Sorted TOP clause specifies more then 10100 rows to
> sort. Only 10000 are allowed. Either decrease the offset and/or row
> count or use a scrollable cursor
>
> SPARQL query:
> define sql:signal-void-variables 1 define input:default-graph-uri
> <http://dbpedia.org> SELECT ?name WHERE { ?a
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://dbpedia.org/ontology/Actor> . ?a
> <http://www.w3.org/2000/01/rdf-schema#label> ?name } ORDER BY ?name
> LIMIT 100 OFFSET 10000
>
>
> Any ideas?
>
>
> Regards,
>
> Michael
>
>
>
>
>
> [0] http://www.openlinksw.com/weblog/oerling/?id=1494
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Dbpedia-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
>   

Michael,

BTW - if you set the timeout to "0", the "Anytime Query" feature is
disabled for the endpoint.
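
As a rough illustration, here is a minimal sketch of how a client might
build such a request. It assumes the endpoint lives at
http://dbpedia.org/sparql and that the timeout is passed as a "timeout"
request parameter alongside the standard "query" and "default-graph-uri"
parameters; the helper name build_request_url is hypothetical. Check the
endpoint's own form for the exact parameter name before relying on this.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: endpoint URL of the public DBpedia SPARQL service.
ENDPOINT = "http://dbpedia.org/sparql"

# The actor-label query from the quoted message.
QUERY = """
SELECT ?name WHERE {
  ?a <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
     <http://dbpedia.org/ontology/Actor> .
  ?a <http://www.w3.org/2000/01/rdf-schema#label> ?name
}
"""


def build_request_url(query, timeout="0", default_graph="http://dbpedia.org"):
    """Build a GET URL for the endpoint (hypothetical helper).

    Assumption: the server reads the timeout from a "timeout" request
    parameter, and "0" turns off the partial-result behaviour.
    """
    params = {
        "default-graph-uri": default_graph,
        "query": query,
        "timeout": timeout,
    }
    return ENDPOINT + "?" + urlencode(params)


url = build_request_url(QUERY)

# Parse the URL back to confirm which parameters would be sent.
sent = parse_qs(urlsplit(url).query)
print(url)
print("timeout parameter:", sent["timeout"][0])
```

The sketch only constructs the URL and does not contact the server, so
it can be inspected before any request is actually made.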

-- 


Regards,

Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com




