I refactored a bit and built parameterized tests, so we have better coverage over the different variations in the steps and can easily add new ones:
https://github.com/apache/incubator-tinkerpop/blob/7a9f7de6ebed5d4293708524c772db0f0b2c3bac/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/TraversalInterruptionTest.java I've run the tests for TinkerGraph and Neo4j - both pass nicely. On Tue, Apr 19, 2016 at 7:13 PM, Stephen Mallette <spmalle...@gmail.com> wrote: > > It can be done, but would require the provider implementations to > change as they would need to message a "job kill." > > Again, one thread is interrupted, but what about all the other threads… > Its not that its "not possible," just that we will have design around this > like for OLAP above. > > Agreed - should be possible - just something to think about as we move > forward. > > > We should also add test cases to VertexStep, PropertyStep, GraphStep, > etc. that use interrupt > > I added some more test cases to cover proper interrupt semantics for > implementations. It took some thinking to come up with a way to test steps > other than GraphStep though. To test, you have to start iteration of a > Traversal in one thread and interrupt it in another. It's a bit imperfect > because it's not immediately obvious how to ensure that my call to > interrupt the thread will trigger a TraversalInterruptedException in a > specific step. In other words, i can test: > > g.V() > > because that will use some variation of GraphStep and that's all there is, > but when you have: > > g.V().out() > > I won't know for sure if if the test passes because the thread may have > interrupted in GraphStep or VertexStep. Anyway, I came up with a way to do > it with what I think is a clever use of sideEffect() to block at the right > point and hold the traversal to force it to fail on the right step. You > can see that work here: > > > https://github.com/apache/incubator-tinkerpop/commit/fd16fabd7595470bd33f30b882b1f0297d05b55a > > I covered VertexStep, PropertyStep, GraphStep as you suggested. I didn't > do a lot of variations on them (e.g. didn't do g.E()). Any thoughts on how > much coverage is "right" for this? > > as a side note, I was going to retarget the branch at tp31 but I'm > starting to feel like this change is sufficiently big a feature that it > should probably exist on the 3.2.x line where it can live with the > benchmarks. > > On Mon, Apr 18, 2016 at 12:06 PM, Marko Rodriguez <okramma...@gmail.com> > wrote: > >> Hi, >> >> > Well - I don't think this code is highly specialized. It's good general >> > practice to respect Thread.interrupted(). I think you'd find that >> sentiment >> > in just about any java concurrency programming book. … >> >> Huh, I didn't know this was a "standard pattern." If so, cool. Then that >> solves that. >> >> >> 1. In OLAP, where there can be multiple threads how does this work? >> >> 2. In Giraph/Spark, how does this effect job execution and failure >> > responses? >> > >> > I ended my initial post in this thread by deleting the last paragraph i >> > wrote about OLAP. :) I guess there's still some question there as to >> how >> > that will work. If I interrupt the thread that was executing the OLAP >> > traversal, it's only going to kill it waiting for the result from Spark >> or >> > wherever. The traversal will still be executing in the context of >> spark. >> > I assume the way to deal with this is on an implementation specific >> basis >> > where I assume there is a way to cancel a running spark job (or running >> > giraph job or whatever). If the Traversal that waits for interrupt could >> > signal that cancellation somehow, i guess that would be the way to >> > implement that. I don't know enough about the specifics of spark for how >> > that would work but it sounds plausible, no? >> >> Yea, I don't know how this would work either as it would be master/slave >> traversals needing to coordinate. It can be done, but would require the >> provider implementations to change as they would need to message a "job >> kill." We could add test cases to GraphComputerTest that ensure that all >> OLAP engines handle such interrupts correctly. We should also add test >> cases to VertexStep, PropertyStep, GraphStep, etc. that use interrupt as >> these are the steps that most providers will implement/extend and we need >> to ensure they are doing the interruptions correctly. >> >> >> 3. When we move into threaded OLTP, how will this be >> triggered/effected? >> > >> > I'm not sure how that feature will be implemented - so i'm not sure how >> to >> > comment on that. >> >> It would be similar to OLAP, where you have a master traversal and >> slave/parallel traversals. Again, one thread is interrupted, but what about >> all the other threads… Its not that its "not possible," just that we will >> have design around this like for OLAP above. >> >> Marko. >> >> >> > On Mon, Apr 18, 2016 at 11:28 AM, Marko Rodriguez <okramma...@gmail.com >> > >> > wrote: >> > >> >> Hi, >> >> >> >> I think a problem with this is that it requires every step >> implementation >> >> to have this construct in it -- though many steps simply extend the >> base >> >> FlatMapStep, MapStep, FilterStep, etc. However, not all and thus, this >> >> requires all providers to know what this about and write their code >> >> accordingly. >> >> >> >> A few questions: >> >> >> >> 1. In OLAP, where there can be multiple threads how does this >> work? >> >> 2. In Giraph/Spark, how does this effect job execution and >> failure >> >> responses? >> >> 3. When we move into threaded OLTP, how will this be >> >> triggered/effected? >> >> 4. This doesn't work for "infinite loop" lambdas or "hung >> >> databases." >> >> >> >> I know this is the oldest ticket in the books and a million solutions >> have >> >> been proposed, but it would be nice if this didn't require specialized >> code >> >> in all the steps. We are bound to "forget." >> >> >> >> Thanks, >> >> Marko. >> >> >> >> http://markorodriguez.com >> >> >> >> On Apr 18, 2016, at 8:56 AM, Stephen Mallette <spmalle...@gmail.com> >> >> wrote: >> >> >> >>> Do you mean: >> >>> >> >>> if (Thread.currentThread().isInterrupted()) throw new >> >>> TraversalInterruptedException(); >> >>> >> >>> If so, Thread.interrupted() basically does that under the covers >> >>> >> >>> On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <twil...@gmail.com> >> wrote: >> >>> >> >>>> Yeah, looks like benchmark-wise it's a wash, which is good. I wasn't >> >> aware >> >>>> of the difference between the static interrupted() and non-static >> >>>> isInterrupted(). I was wondering if in this case it should be >> >>>> isInterrupted(), but I think how you did it is good because it'll be >> >>>> evaluated within the traversal thread regardless. >> >>>> >> >>>> --Ted >> >>>> >> >>>> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette < >> spmalle...@gmail.com >> >>> >> >>>> wrote: >> >>>> >> >>>>> A while back, I brought up the issue of being able to interrupt >> >>>> traversals: >> >>>>> >> >>>>> >> >>>>> >> >>>> >> >> >> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E >> >>>>> https://issues.apache.org/jira/browse/TINKERPOP-946 >> >>>>> >> >>>>> As a quick refresher, making Traversal respect Thread.interrupted() >> is >> >>>>> important as you otherwise can quite easily lock up applications >> like >> >>>>> Gremlin Server with a few poorly conceived or errant queries. We'd >> left >> >>>>> that last thread with liking the idea, but there were concerns about >> >> the >> >>>>> complexity of the changes and performance hits. >> >>>>> >> >>>>> Given that we now have gremlin-benchmark, I decided to see what the >> >>>>> performance hit would be for making this change. I took a rough >> stab at >> >>>> it >> >>>>> introducing Thread.interrupted() in all steps where it seemed to >> make >> >>>> sense >> >>>>> to do so and then ran the benchmark before and after the change. >> >>>>> >> >>>>> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b >> >>>>> >> >>>>> I'm not seeing a whole lot of difference between supporting this >> >> feature >> >>>>> and not supporting this feature. Here's the branch I implemented >> this >> >> in >> >>>>> in case you want to look around: >> >>>>> >> >>>>> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946 >> >>>>> >> >>>>> I'm not sure that my changes are completely bulletproof at this >> point, >> >>>> but >> >>>>> I'm reasonably sure that these changes would handle a good majority >> of >> >>>>> calls for thread interruption. I expect to re-target my branch at >> tp31 >> >>>>> (currently from master so that i could use the benchmark suite) if >> this >> >>>>> becomes a pull request. >> >>>>> >> >>>>> Any thoughts on the benchmark, the implementation, etc? >> >>>>> >> >>>> >> >> >> >> >> >> >