On 2013-10-14 20:05, Kingsley Idehen wrote:
Plus:
7. query timeout (in milliseconds) -- which determines the
processing-time threshold per query.
Ideally, you want to use a combination of timeouts, result size
(max. results per query), offset, and limit to enable paging through
data.
Yes, enabling paging is the main thing I was thinking about. For that
to work, one needs to know the maximum page size allowed. SPARQL
ORDER BY, OFFSET and LIMIT can be used to build requests for pages.
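As a sketch of such a paging request (the triple pattern and the page size of 100 are illustrative, not anything prescribed by the thread):

```sparql
# Page 3 of the results, 100 rows per page (rows 200-299).
# A stable ORDER BY is needed so that successive pages
# neither overlap nor skip rows.
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
ORDER BY ?s ?p ?o
LIMIT 100
OFFSET 200
```

Without the ORDER BY, an endpoint is free to return solutions in any order, so consecutive OFFSET/LIMIT windows are not guaranteed to partition the result set consistently.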
But how does the query timeout setting of the server come into play?
In our case (i.e., Virtuoso) it's the time taken to produce a solution,
bearing in mind the LIMIT and OFFSET query values. If a complete
query solution isn't produced, you get a partial solution, and the
ability to retry with an extended timeout.
Ah, I understand, thank you. I can see that this way of handling
timeouts has its merits. And now I see that this limit is much like the
maximum results per request: The client gets a response, but if it is
not aware that the response is partial then it could have undesirable
effects.
Is this behaviour standardized? Or do different flavours of endpoints
handle timeouts differently? If the latter is the case, maybe it makes
sense to also publish something about how timeouts are handled in the
endpoint description.
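The SPARQL 1.1 Service Description vocabulary has no term for timeout behaviour, so publishing it would require a custom vocabulary. A minimal sketch, in which the ex: terms are entirely hypothetical:

```turtle
@prefix sd: <http://www.w3.org/ns/sparql-service-description#> .
@prefix ex: <http://example.org/limits#> .   # hypothetical vocabulary

<http://example.org/sparql> a sd:Service ;
    sd:endpoint <http://example.org/sparql> ;
    # Hypothetical terms -- not part of any standard:
    ex:queryTimeoutMillis 30000 ;
    ex:timeoutBehaviour   ex:PartialResults .   # vs. ex:ErrorResponse
```

A client reading this could learn, before issuing any query, both the timeout value and whether a timeout yields a partial result or an error.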
Regards,
Frans
I assume that if a request for a page of data times out you would get
a timeout error code (probably 504 Gateway Timeout).
Partial result. The idea is that it's like a quiz whereby you have an
answer provided within the allotted time, or you attempt to answer in
extended time, i.e., you request that or it cycles back to you because
the opponent couldn't provide an answer, etc.
What would be gained with prior knowledge of the timeout setting?
The timeout is the set time for producing a complete or partial query
solution.
What we need to do, which will help others, is get all the controls
for paging properly returned via HTTP responses.
This matter has been discussed in the past (on this list) and we are
committed to getting the HTTP responses in line, as I've described.
Action item for us: demonstrate what I am describing using RESTful
interaction patterns via cURL. Once in place, we would have the
foundation of a pattern that anyone could *optionally* incorporate.
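A rough illustration of such a cURL interaction (the endpoint URL is a placeholder, and the command is printed as a dry run rather than executed):

```shell
#!/bin/sh
# Hypothetical endpoint; substitute a real one to try this.
ENDPOINT="http://example.org/sparql"
# One page of an ordered result set: rows 0-99.
QUERY='SELECT ?s ?p ?o WHERE { ?s ?p ?o } ORDER BY ?s LIMIT 100 OFFSET 0'
# --get with --data-urlencode percent-encodes the query into the URL;
# the Accept header requests SPARQL JSON results.
# Printed as a dry run; remove the echo to issue the request.
echo curl --get "$ENDPOINT" \
  --data-urlencode "query=$QUERY" \
  -H "Accept: application/sparql-results+json"
```

Whether a timed-out request then yields a partial result body, a distinctive status code, or a response header is exactly the server behaviour this thread suggests surfacing.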
Kingsley
Regards,
Frans
Kingsley
On 8-10-2013 17:45, Leigh Dodds wrote:
Hi,
As others have suggested, extending service descriptions would be the
best way to do this. This might make a nice little community project.
It would be useful to itemise the types of limits that might be
encountered, then look at how best to model them.
Perhaps something we could do on the list?
Cheers,
L.
On Tue, Oct 8, 2013 at 10:46 AM, Frans Knibbe | Geodan
<[email protected]> wrote:
Hello,
I am experimenting with running SPARQL endpoints and I notice the need to
impose some limits to prevent overloading/abuse. The easiest and I believe
fairly common way to do that is to LIMIT the number of results that the
endpoint will return for a single query.
I now wonder how I can publish the fact that my SPARQL endpoint has a LIMIT
and that it has a certain value.
I have read the thread "Public SPARQL endpoints: managing (mis)-use and
communicating limits to users", but that seemed to be about how to
communicate limits during querying. I would like to know if there is a way
to communicate limits before querying is started.
It seems to me that a logical place to publish a limit would be in the
metadata of the SPARQL endpoint. Those metadata could contain all limits
imposed on the endpoint, and perhaps other things like an SLA or a
maintenance schedule... data that could help in the proper use of the
endpoint by both software agents and human users.
So perhaps my enquiry really is about a standard for publishing SPARQL
endpoint metadata, and how to access them.
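The SPARQL 1.1 Service Description vocabulary is the obvious starting point for such metadata, though it has no term for result limits; in the sketch below, everything in the ex: namespace is a hypothetical extension:

```turtle
@prefix sd: <http://www.w3.org/ns/sparql-service-description#> .
@prefix ex: <http://example.org/limits#> .   # hypothetical extension

<http://example.org/sparql> a sd:Service ;
    sd:endpoint <http://example.org/sparql> ;
    sd:supportedLanguage sd:SPARQL11Query ;
    # Hypothetical term for the limit discussed here:
    ex:maxResultsPerQuery 10000 .
```

A service description like this is conventionally retrievable from the endpoint URI itself, so a client could fetch the limits before composing its first query.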
Greetings,
Frans