In order to come up with a design and an implementation for Statement.setQueryTimeout() and Statement.cancel(), we need to agree on the semantics.

For the rest of the discussion, I will use the term "cancellation" to refer to cancelling a statement either through the use of setQueryTimeout() or cancel().

There has been some discussion on this topic on this mailing list earlier, with different opinions and open questions:
- Should cancellation affect select queries only?
- Should it affect updates (executeUpdate)?
- What if the statement is done executing and has an open result set? Should it limit the time rows can be fetched from the result set?
- How does it affect the thread executing the statement?
- What is the effect on the state of the database, connection, transaction, statement, result set and resources used?
- What if the query executes user-defined functions?


Also, I would like to add these questions regarding performance requirements:
1) What is an acceptable level of responsiveness to cancellation? (How long can a statement keep executing after it has timed out or been cancelled?)
2) What is an acceptable performance degradation when enabling statement cancellation


There will most likely be a tradeoff between the two measures above. Also, the first measure may be difficult to quantify, it may relate to execution operations of some kind (such as tree nodes), rather than seconds.

By performance degradation, I'm referring to throughput as well as reponse time.

My suggestions:
* Cancellation should apply to any type of statement (UPDATE, SELECT, DELETE, INSERT).
* Cancellation is only active in the execute phase - when a result set has been returned, the application can for itself decide to stop fetching rows from it (and close it) at any time. (*)
* Cancellation makes the statement's execute method throw an exception. SQLSTATE will be set to indicate what happened.
* Cancellation has the same effect to the executing thread, the database, connection, transaction, statement and used resources as any other typical exception that can be thrown from the statement's execute method (such as if the statement performs a division by zero): The work done by the statement is rolled back. The transaction is in the same state as before the statement was executed. The transaction is not rolled back.
* While the executing thread is executing user-defined functions, it is unresponsive to cancellation.


As to the performance requirements, I think what we should do is decide on point 2) first: what is an acceptable performance degradation? Then we should do a best-effort on point 1) within these restrictions.

So, what are sensible performance requirements? Maximum 0,5 percent drop in throughput and 0,5 percent increase in response time? I'm just throwing some numbers out in the air here, as a basis for discussion.

Other ideas:
* The tradeoff between cancellation responsiveness and performance degradation could be made configurable with a property set by the user.
* There are typically two reasons for wanting to set a query timeout:
a) The application is impatient, and will do something else if it does not get an answer within a certain time
b) Disallow "runaway" queries, hogging resources
Considering b), one could have a system property defining the maximum allowed query timeout. An application can not set a query timeout larger than the maximum.


----------
(*) It is possible to write long-running statements which a certain database implementation, in theory, can defer execution of, until the application calls ResultSet.next(). In such a case, the execute phase will be quick, but fetching a single record may take longer time than the query timeout that was set for the statement. In practice, this will rarely be the case. If it should prove to be a problem later, one could consider applying the query timeout to the execution of ResultSet.next(), but for now I don't think we should worry about it.


--
Oyvind Bakksjo
[EMAIL PROTECTED]
Office: +47 73842119
Mobile: +47 92283522



Reply via email to