"It's not clear to me if the problem exists in HSQLDB, the test, or tail step"
This had nothing to do with the TailStep bug. That one is resolved for the most part. For the rest, where the problem lies is itself part of the problem. Thread.interrupt() has rather weak semantics, and code that receives an interrupt responds in many different ways. Some code resets the flag, some throws an exception, some swallows it, and some does a combination of all of these.

I don't think engaging 3rd parties about this is an option. Firstly, there are way too many 3rd parties catching InterruptedException to even start. Secondly, because the semantics are weak, I reckon every 3rd party engagement will turn into a discussion of its own. From what I gather, many 3rd parties call Object.wait, Thread.join or Thread.sleep and handle the InterruptedException for the interrupt that they are expecting. I imagine they are swallowing the exception and not resetting the flag with good cause.

Regarding delegating the query to a separate thread: even if Sqlg executes the sql in a different thread, there are still many other 3rd party libraries that might interfere with the expected interrupt logic outside of just the sql query. This makes me of the opinion that Thread.interrupt() is an unreliable mechanism for interrupting a traversal.

Regarding asynchronous or synchronous: I'd say the interrupt request should be asynchronous, with a Future that returns on a successful cancellation. That way you can wait for it or not.

From what I understand, the complexity is more in GremlinServer, which executes scripts and has no real concept of a traversal. It does not even really want to interrupt a traversal as such, but rather a script, which may itself contain many traversals. I reckon it will have to pass in an object when executing the script, which the graph will store in a thread-local variable. The graph can then register all traversals executing in the thread on that object. When the time comes to interrupt a script, GremlinServer will call interrupt on that object, which in turn will interrupt the currently executing traversal.
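A minimal Java sketch of that idea, to make the moving parts concrete. Every name in it (ScriptInterruptSketch, ScriptInterruptHandle, CancellableTraversal, runScript) is made up for illustration; none of them are TinkerPop, GremlinServer or Sqlg APIs, and the "traversal" is just a counting loop. It assumes a traversal can poll its own cancel flag between steps instead of relying on the thread's interrupt flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch only: these are not TinkerPop or Sqlg classes.
public class ScriptInterruptSketch {

    /** Stand-in for a traversal that polls its own flag, not the thread's interrupt flag. */
    static final class CancellableTraversal {
        private volatile boolean cancelRequested = false;
        // Completes with true if the traversal stopped because of a cancel request,
        // false if it ran to completion first.
        private final CompletableFuture<Boolean> done = new CompletableFuture<>();

        void requestCancel() { cancelRequested = true; }

        CompletableFuture<Boolean> done() { return done; }

        /** Runs up to maxSteps steps; returns how many were actually executed. */
        long run(long maxSteps) {
            long steps = 0;
            while (steps < maxSteps) {
                if (cancelRequested) {
                    done.complete(true);
                    return steps;
                }
                steps++; // one unit of traversal work would happen here
            }
            done.complete(false);
            return steps;
        }
    }

    /** The object the server would pass in when it executes a script. */
    static final class ScriptInterruptHandle {
        private final List<CancellableTraversal> traversals = new ArrayList<>();
        private boolean interrupted = false;

        /** Called by the graph for every traversal started by the script's thread. */
        synchronized void register(CancellableTraversal traversal) {
            traversals.add(traversal);
            if (interrupted) {
                traversal.requestCancel(); // script was already interrupted
            }
        }

        /** Asynchronous request: the caller may wait on the returned future or ignore it. */
        synchronized CompletableFuture<Void> interrupt() {
            interrupted = true;
            List<CompletableFuture<Boolean>> pending = new ArrayList<>();
            for (CancellableTraversal traversal : traversals) {
                traversal.requestCancel();
                pending.add(traversal.done());
            }
            return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]));
        }
    }

    /** Where the graph would look up the handle for the current script thread. */
    static final ThreadLocal<ScriptInterruptHandle> CURRENT = new ThreadLocal<>();

    /** Stand-in for the server running a script that contains a single traversal. */
    static long runScript(ScriptInterruptHandle handle, long maxSteps) {
        CURRENT.set(handle);
        try {
            CancellableTraversal traversal = new CancellableTraversal();
            CURRENT.get().register(traversal); // the graph would do this internally
            return traversal.run(maxSteps);
        } finally {
            CURRENT.remove();
        }
    }

    public static void main(String[] args) throws Exception {
        ScriptInterruptHandle handle = new ScriptInterruptHandle();
        Thread worker = new Thread(() -> runScript(handle, Long.MAX_VALUE));
        worker.start();
        Thread.sleep(50);                            // let the script run for a bit
        handle.interrupt().get(5, TimeUnit.SECONDS); // wait for the cancellation, or don't
        worker.join();
        System.out.println("script stopped");
    }
}
```

The point of the sketch is that nothing here depends on Thread.interrupt(), so a 3rd party library swallowing InterruptedException cannot lose the request. A real provider would still have to decide what to do while it is blocked inside a backend call, which this sketch does not address.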
Something like that is what I am thinking of.

Cheers
Pieter

On 22/07/2016 14:24, Robert Dale wrote:
> Trying to summarize the concerns I think I'm hearing:
> 1. cancelling the gremlin job
> 2. cancelling the task in the backend database, this implies handling
> at minimum:
> a. commit state: interruptable
> b: rollback state: probably not interruptable
> 3. responding to the client, returning the thread
>
> Should these things done synchronously or asynchronously or some
> combination? The answer may depend on how decoupled they are.
>
> Separately, are tests doing the right thing? It's not clear to me if
> the problem exists in HSQLDB, the test, or tail step.
>
> I think if Thread.interrupt() is the right way, then that's the way it
> should be done regardless of bad citizen libraries.
>
> Handle 3rd party bad citizens by:
> - filing a bug with them. Maybe they will fix or justify the behavior.
> - tracking them in a Known Issues list
> - workaround them as close as possible to the problem:
> I'm not familiar with how providers work so I don't know how generally
> applicable this would be, but in the case of Sqlg, the sql query
> itself could be delegated to a separate thread in which special
> interrupt strategies could be implemented such as the while loop.
>
> Side question: are there management tools in gremlin server to see
> currently running tasks and kill them? Or is that something that would
> be delegated to the backend database?
>