[
https://issues.apache.org/jira/browse/JENA-29?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994141#comment-12994141
]
Andy Seaborne commented on JENA-29:
-----------------------------------
Excellent patch. Looking through it, I have some questions and comments. This
is an important feature that touches several areas so I want to make sure I
understand the implications across the system.
Timeouts:
We need to add a timeout mechanism on top of this patch. Timing out a query is
(I think) going to be the #1 use case. One way to do it is to set off a thread
when the query starts to call cancel() when the timer goes off. But most
queries won't timeout (!!!) so this seems a little heavy. Instead, I suggest
that QuerIteratorBase periodically check the clocks if a timeout is set.
QueryExecution.setTimeout(millis)
QueryIterator.setTimeout(millis)
It might as well be possible to set/reset during query execution. Timeouts are
"best effort" anyway, a good contract is that an execution will not be timeout
out before the timeout setting.
It's going to be hard to implement timeouts everywhere (e.g. SDB will be
influenced by the DB capabilities).
QueryIterator is an interface and is implemented by QueryIteratorBase. This
has the machinery for hasNextBinding, moveToNextBinding. Even though there are
non-execution query iterators, this seems to me to be one of the places to add
the mechanism.
The alternative is that QueryIter does the work - it is the root of iterators
associated with an execution, with registration and tracking
We'll need to timeout on sorting and grouping as well, and maybe any
materializing iterators.
What to do on timeout?
For API use, that's relatively easy. An exception can be thrown on .hasNext()
or .next().
The effect of this on the results output (XML, JSON) needs to be consider
though. The catch is that HTTP needs the status code as the first line of the
HTTP response. And if it's 2xx, it means success (of the operation).
"206 Partial Content" is for partial GETs so is not applicable
This leaves, taking the XML form, I see the following choices:
1/ Silent truncation - just return less (and wrong) results.
2/ Return illegal XML - truncate when the exception occurs.
3/ Buffer - get all the results before sending any, then the HTTP status code
can be set.
(3) is ideal functionally but looses streaming.
Some questions, discussion points:
1/ Do we need a separate "cancel" from "abort", given that abort currently does
a close and has to be called on the execution thread. We could extend the
contract of abort to cover non-execution threads, making it cancel. The cancel
mechanism of the patch would be the mechanism.
2/ I didn't understand the difference between cancel, requestCancel and
requestSubCancel. As far as I can see, just a way to pass cancel down, with
the contract it's potentially asynchronous. This is what the patch for
QueryIter2 does, QueryIter1 has requestCancel/requestSubCancel and
QueryIterBase has requestCancel. Can these all be "cancel". Interface
QueryIterator already has abort.
3/ QueryIterSort , QueryIterGroup
In trunk, these do all the work in constructing function, then return an
iterator over the results. In the patch they return return an iterator that
delays the work until hasNext is called. But hasNext is going to be called
immediately anyway because the results are pulled by the application. I didn't
undertand the need for the iterator - what am I missing?.
For sort, the sorted is pulling on an underlying query iterator so the cancel
mechanism will there. Maybe we need the comparator to also test the cancel
flags - the javadoc for Arrays.sort does not say much here.
Note the extenal sort patch in JENA-44.
For group, the grouping is pulling on an underlying query iterator and the
group operation is not a significant cost.
> cancellation during query execution
> -----------------------------------
>
> Key: JENA-29
> URL: https://issues.apache.org/jira/browse/JENA-29
> Project: Jena
> Issue Type: Improvement
> Components: ARQ, TDB
> Reporter: Simon Helsen
> Assignee: Andy Seaborne
> Attachments: JENA-29_ARQ_r8489.patch, JENA-29_TDB_r8489.patch,
> JENA-29_tests_ARQ_r8489.patch, jena.patch, jenaAddition.patch
>
>
> The requested improvement and proposed patch is made by Simon Helsen on
> behalf of IBM
> ARQ query execution currently does not have a satisfactory way to cancel a
> running query in a safe way. Moreover, cancel (unlike a hard abort) is
> especially useful if it is able to provide partial result sets (i.e. all the
> results it managed to compute up to when the cancellation was requested).
> Although the exact cancellation behavior depends on the capabilities of the
> underlying triple store, the proposed patch merely relies on the iterators
> used by ARQ.
> Here is a more detailed explanation of the proposed changes:
> 1) the cancel() method in the QueryIterator initiates a cancellation request
> (first boolean flag). In analogy with closeIterator(), it propagates through
> all chained iterators, so the entire calculation is aware that a cancellation
> is requested
> 2) to ensure a thread-safe semantics, the cancelRequest becomes a real cancel
> once nextBinding() has been called. It sets the second boolean which is used
> in hasNext(). This 2-phase approach is critical since the cancel() method can
> be called at any time during a query execution by the external thread. And
> because the behavior of hasNext() is such that it has to return the *same*
> value until next() is called, this is the only way to guarantee semantic
> safety when cancel() is invoked (let me re-phrase this: it is the only way I
> was able to make it actually work)
> 3) cancel() does not close anything since it allows execution to finish
> normally and the client is responsible to call close() just like with a
> regular execution. Note that the client has to call cancel() explicitly
> (typically in another thread) and has to assume that the returning result set
> may be incomplete if this method is called (it is undetermined whether the
> result is _actually_ incomplete)
> 4) in order to deal with order-by and groups, I had to make two more changes.
> First, I had to make QueryIterSort and QueryIterGroup a slightly bit more
> lazy. Currently, the full result set is calculated during plan calculation.
> With my proposed adjustments, this full result set is called on the first
> call to any of its Iterator methods (e.g. hasNext). This change does not
> AFAIK affect the semantics. Second, because the desired behavior of
> cancelling a sort or group query is to make sure everything is sorted/grouped
> even if the total result set is not completed, I added an exception which
> reverses the cancellation request of the encompassing iterator (as an example
> see cancel() in QueryIterSort). This makes sure that the entire subset of
> found and sorted elements is returned, not just the first element. However,
> it also implies in the case of sort that when a query is cancelled, it will
> first sort the partially complete result set before returning to the client.
> the attached patch is based on ARQ 2.8.5 (and a few classes in TDB 0.8.7 ->
> possibly the other triple store implementations need adjustement as well)
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira