On 30/08/2019 11:57, Élie Roux wrote:
Hi all,
It is "graceful termination" - the query engine checks whether the
timeout has gone off, rather than having something interrupt a thread.
Looking at the code, the Lucene call is returning an array of ScoreDocs
in one go, so all of that is going to happen regardless of the timeout.
The rest of the query processing will terminate on timeout.
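
To illustrate what "checks whether the timeout has gone off" means, here
is a minimal sketch in plain Java. It is not the Jena code: the names
CooperativeTimeout, DeadlineIterator and CancelledException are made up
for the example, and the clock is injected only to keep it testable. The
deadline is polled between rows, so a single call that returns everything
in one go (like the ScoreDocs array) is never cut short.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

public class CooperativeTimeout {

    /** Hypothetical stand-in for the engine's cancellation exception. */
    static class CancelledException extends RuntimeException {
        CancelledException(String message) { super(message); }
    }

    /** Wraps a row iterator and polls the clock before handing out each row. */
    static class DeadlineIterator<T> implements Iterator<T> {
        private final Iterator<T> inner;
        private final LongSupplier clock;   // injected so the sketch is testable
        private final long deadline;

        DeadlineIterator(Iterator<T> inner, LongSupplier clock, long deadline) {
            this.inner = inner;
            this.clock = clock;
            this.deadline = deadline;
        }

        // Nothing interrupts the thread; the iterator itself notices the timeout.
        private void checkDeadline() {
            if (clock.getAsLong() >= deadline) {
                throw new CancelledException("query timed out");
            }
        }

        @Override public boolean hasNext() { checkDeadline(); return inner.hasNext(); }
        @Override public T next()          { checkDeadline(); return inner.next(); }
    }

    /** Counts the rows read; returns -1 if the deadline cancelled the iteration. */
    static int countRows(List<String> rows, LongSupplier clock, long deadline) {
        Iterator<String> it = new DeadlineIterator<>(rows.iterator(), clock, deadline);
        int n = 0;
        try {
            while (it.hasNext()) { it.next(); n++; }
            return n;
        } catch (CancelledException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a", "b", "c");
        System.out.println(countRows(rows, () -> 0L, 10L));   // deadline not reached
        System.out.println(countRows(rows, () -> 10L, 10L));  // deadline already passed
    }
}
```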
I see, thanks for the answer! There seems to be some way to have a
timeout in the Lucene API, but it doesn't look very straightforward...
should I open a separate issue to track that?
Yes
Are you able to contribute this additional feature?
But are you finding it is the Lucene step that is taking a long time?
For some requests yes, for others it can be just a simple misspelling
in a variable name that creates a query that will gratuitously go
through millions of triples in a very inefficient way...
The query exits with a "QueryCancelledException" which Fuseki tries to
send back to the client. It can only do so if no results have been
streamed back (so works nicely for ORDER BY).
Unfortunately, in HTTP, the response code is sent back first. So it is
possible that the client can get the 200, then some rows, then the
result stream is truncated. Fuseki writes syntactically illegal results
to force an error processing the results. This is something SPARQL 1.2
CG may address. There are several issues that touch on this known aspect
of HTTP.
Hmmm... that's an interesting problem indeed... good idea to have the
CG address that! What does the Jena code (in the client) do when it
receives such a malformed result? Is there any chance the exception
can be spotted and returned?
The client side also streams - so it returns results until it hits the
bad syntax and throws an exception.
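
A toy model of that exchange, as a sketch only: this is plain Java, not
Fuseki or the Jena client, and the line-based format, the POISON marker
and the names serve/consume are invented for the example. The server
commits the 200 before any row, so a cancellation mid-stream can only
append something illegal. The client hands out rows as it reads them and
fails when it reaches the bad line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class TruncatedStream {

    /** Hypothetical marker standing in for "syntactically illegal results". */
    static final String POISON = "<<ERROR: query cancelled>>";

    /**
     * Toy server: writes the status, then rows. If the query is cancelled
     * after cancelAfter rows have been written, it appends a poison line.
     * A negative cancelAfter means the query is never cancelled.
     */
    static String serve(List<String> rows, int cancelAfter) {
        StringBuilder out = new StringBuilder("200\n");  // status goes out first
        for (int i = 0; i < rows.size(); i++) {
            if (i == cancelAfter) {
                out.append(POISON).append('\n');         // too late to change the status
                return out.toString();
            }
            out.append("row:").append(rows.get(i)).append('\n');
        }
        out.append("end\n");
        return out.toString();
    }

    /** Toy client: streams rows into sink and throws on a malformed line. */
    static void consume(String response, List<String> sink) throws IOException {
        BufferedReader in = new BufferedReader(new StringReader(response));
        String status = in.readLine();
        if (!"200".equals(status)) throw new IOException("HTTP status " + status);
        String line;
        while ((line = in.readLine()) != null) {
            if (line.equals("end")) return;
            if (!line.startsWith("row:")) throw new IOException("malformed result: " + line);
            sink.add(line.substring(4));                 // row already handed to the caller
        }
        throw new IOException("result stream ended unexpectedly");
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        try {
            consume(serve(List.of("a", "b", "c"), 2), seen);
        } catch (IOException e) {
            System.out.println("failed after " + seen.size() + " rows: " + e.getMessage());
        }
    }
}
```

The point for calling code is that rows seen before the exception were
already delivered, so the application has to be ready to discard them.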
The only way to have the server return an HTTP status code 4xx is to run
the query to completion, buffering the results, before starting the
response so it can send the status code. That is a big impact on large
results.
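
As a rough sketch of that buffering trade-off (plain Java, not Fuseki
code; BufferedResponse, Response, query and the 400 value are
placeholders made up for the example, the 400 standing in for "4xx"):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class BufferedResponse {

    /** Placeholder error status; stands in for "4xx". */
    static final int ERROR_STATUS = 400;

    /** Hypothetical stand-in for the engine's cancellation exception. */
    static class CancelledException extends RuntimeException {}

    /** Hypothetical response: a status code plus the rows of the body. */
    static class Response {
        final int status;
        final List<String> rows;
        Response(int status, List<String> rows) { this.status = status; this.rows = rows; }
    }

    /** Toy query: yields the rows but is cancelled once limit rows were produced. */
    static Iterator<String> query(List<String> rows, int limit) {
        Iterator<String> inner = rows.iterator();
        return new Iterator<String>() {
            int produced = 0;
            @Override public boolean hasNext() { return inner.hasNext(); }
            @Override public String next() {
                if (produced == limit) throw new CancelledException();
                produced++;
                return inner.next();
            }
        };
    }

    /** Runs the query to completion first, then decides on the status code. */
    static Response respond(Iterator<String> results) {
        List<String> buffer = new ArrayList<>();   // memory grows with the result size
        try {
            while (results.hasNext()) buffer.add(results.next());
        } catch (CancelledException e) {
            return new Response(ERROR_STATUS, List.of());  // nothing was sent yet
        }
        return new Response(200, buffer);
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a", "b", "c");
        System.out.println(respond(query(rows, 2)).status);   // cancelled mid-query
        System.out.println(respond(query(rows, 10)).status);  // ran to completion
    }
}
```

The status is always accurate here, but the first byte of the response
cannot leave until the last row has been computed and held in memory.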
Andy
Best,