On 25/04/13 18:48, Simon Sperl wrote:
Hi,

@Andy
You running a 32 bit JVM? yes
You are trying to stop a long running query from another thread? yes

My usecase is that I am trying to write a jena plugin for Rapidminer.
What that means is that I have gui components representing "sparql query",
"sparql service query", "open tdb", "constant model",.. and stick their
inputs/outputs together to form a rapidminer-process.
And these processes can be halted (from the gui), which Rapidminer does by
calling Thread.interrupt(), so in essence I don't have full control over
the execution/interruptions.


I can understand/empathize not supporting Thread.interrupt(), but once the
system is inconsistent can I recover somehow?

There's nothing wrong with Thread.interrupt (unlike Thread.stop) except that Jena doesn't do it that way.

It looks like the thread interrupt has caused a java.nio.channels.ClosedByInterruptException exception so any I/O operation is open to an exception. Sometimes the system is performing two or more I/O operations in a coordinated way and coping with an interrupt at any point would be tedious and hard to get right.

Transactions don't help - the ideal of transactions is that the system is either working or not. This is a half-way, partial death where the internal state of TDB is an unknown.

I daresay it could have been written to use Thread.interrupt but would be hard when any I/O operation can return incomplete. At least the cancellation flag is only tested at convenient (but quite fine grained) points in query execution.

The other point is that it relies on query execution being single-threaded/same-thread. That is not guaranteed and indeed hasn't always been true (RDQL used to be two-threaded). ARQ (and TDB) uses a lot of iterators - running them across threads is quite natural to do (caution on granularity and overheads costing more than any gains).

Systems need to support multiple independent requests. There are only so many real threads (although the number is going up quite rapidly these days) so splitting the workload to make requests fairer makes sense currently.


What to do about it:

If you need a design for rapidminer where it can use Thread.interrupt, then maybe this will work:

On the thread where rapidminer thinks the request is, have an ExecutorService to fork query execution and return the result set in a Future<ResultSet> (e.g. FutureTask<>).

It must be a copy of the ResultSet, not the ResultSet returned by execSelect because that is tied to query execution.

Wait on the Future to get the results.

If the Future.get receives InterruptedException, call QueryExecution.abort to kill the query on the second thread.
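
A minimal sketch of that arrangement, using only java.util.concurrent so it runs without Jena on the classpath. The names InterruptibleQueryRunner, AbortableQuery, execAndCopy and SlowQuery are made up for illustration. In a real plugin, execAndCopy would presumably call QueryExecution.execSelect and copy the rows (for example with ResultSetFactory.copyResults), and abort would delegate to QueryExecution.abort. Treat it as one possible shape, not tested against TDB.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

class InterruptibleQueryRunner {

    /** Hypothetical stand-in for the parts of Jena's QueryExecution used here. */
    interface AbortableQuery {
        /** Run the query and return a detached copy of the rows. */
        List<String> execAndCopy() throws Exception;

        /** Cooperative cancel, playing the role of QueryExecution.abort. */
        void abort();
    }

    /** Called on the thread that the host application may interrupt. */
    static List<String> run(AbortableQuery query, ExecutorService pool)
            throws InterruptedException, ExecutionException {
        Callable<List<String>> task = query::execAndCopy;
        Future<List<String>> future = pool.submit(task);
        try {
            // Only this waiting thread is exposed to Thread.interrupt.
            return future.get();
        } catch (InterruptedException e) {
            // Ask the query to stop at its next cancellation check.
            // Deliberately not future.cancel(true): that would interrupt
            // the worker thread, which is what this design avoids.
            query.abort();
            throw e;
        }
    }

    /** Fake long-running query that polls a cancellation flag. */
    static class SlowQuery implements AbortableQuery {
        final AtomicBoolean aborted = new AtomicBoolean(false);
        final CountDownLatch started = new CountDownLatch(1);

        @Override
        public List<String> execAndCopy() throws Exception {
            started.countDown();
            while (!aborted.get()) {
                Thread.sleep(5); // pretend to do a slice of work
            }
            throw new IllegalStateException("query cancelled");
        }

        @Override
        public void abort() {
            aborted.set(true);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        SlowQuery query = new SlowQuery();
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);

        // This thread plays the part of the rapidminer operator thread.
        Thread caller = new Thread(() -> {
            try {
                run(query, pool);
            } catch (InterruptedException e) {
                sawInterrupt.set(true);
            } catch (ExecutionException e) {
                // not expected in this demo
            }
        });
        caller.start();
        query.started.await(); // wait until the worker is really running
        caller.interrupt();    // what the GUI "stop" button would do
        caller.join();
        pool.shutdown();
        System.out.println("interrupted=" + sawInterrupt.get()
                + " aborted=" + query.aborted.get());
    }
}
```

The worker thread never sees the interrupt, so no I/O channel on it is closed underneath the query; it only notices the flag at its next check.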

        Andy


-Simon
