> It can be done, but would require the provider implementations to change
as they would need to message a "job kill."
> Again, one thread is interrupted, but what about all the other threads…
Its not that its "not possible," just that we will have design around this
like for OLAP above.

Agreed - should be possible - just something to think about as we move
forward.

> We should also add test cases to VertexStep, PropertyStep, GraphStep,
etc. that use interrupt

I added some more test cases to cover proper interrupt semantics for
implementations. It took some thinking to come up with a way to test steps
other than GraphStep though. To test, you have to start iteration of a
Traversal in one thread and interrupt it in another. It's a bit imperfect
because it's not immediately obvious how to ensure that my call to
interrupt the thread will trigger a TraversalInterruptedException in a
specific step. In other words, i can test:

g.V()

because that will use some variation of GraphStep and that's all there is,
but when you have:

g.V().out()

I won't know for sure if if the test passes because the thread may have
interrupted in GraphStep or VertexStep. Anyway, I came up with a way to do
it with what I think is a clever use of sideEffect() to block at the right
point and hold the traversal to force it to fail on the right step.  You
can see that work here:

https://github.com/apache/incubator-tinkerpop/commit/fd16fabd7595470bd33f30b882b1f0297d05b55a

I covered VertexStep, PropertyStep, GraphStep as you suggested. I didn't do
a lot of variations on them (e.g. didn't do g.E()). Any thoughts on how
much coverage is "right" for this?

as a side note, I was going to retarget the branch at tp31 but I'm starting
to feel like this change is sufficiently big a feature that it should
probably exist on the 3.2.x line where it can live with the benchmarks.

On Mon, Apr 18, 2016 at 12:06 PM, Marko Rodriguez <okramma...@gmail.com>
wrote:

> Hi,
>
> > Well - I don't think this code is highly specialized. It's good general
> > practice to respect Thread.interrupted(). I think you'd find that
> sentiment
> > in just about any java concurrency programming book.  …
>
> Huh, I didn't know this was a "standard pattern." If so, cool. Then that
> solves that.
>
> >> 1. In OLAP, where there can be multiple threads how does this work?
> >> 2. In Giraph/Spark, how does this effect job execution and failure
> > responses?
> >
> > I ended my initial post in this thread by deleting the last paragraph i
> > wrote about OLAP.  :)  I guess there's still some question there as to
> how
> > that will work. If I interrupt the thread that was executing the OLAP
> > traversal, it's only going to kill it waiting for the result from Spark
> or
> > wherever.  The traversal will still be executing in the context of spark.
> > I assume the way to deal with this is on an implementation specific basis
> > where I assume there is a way to cancel a running spark job (or running
> > giraph job or whatever). If the Traversal that waits for interrupt could
> > signal that cancellation somehow, i guess that would be the way to
> > implement that. I don't know enough about the specifics of spark for how
> > that would work but it sounds plausible, no?
>
> Yea, I don't know how this would work either as it would be master/slave
> traversals needing to coordinate. It can be done, but would require the
> provider implementations to change as they would need to message a "job
> kill." We could add test cases to GraphComputerTest that ensure that all
> OLAP engines handle such interrupts correctly. We should also add test
> cases to VertexStep, PropertyStep, GraphStep, etc. that use interrupt as
> these are the steps that most providers will implement/extend and we need
> to ensure they are doing the interruptions correctly.
>
> >> 3. When we move into threaded OLTP, how will this be triggered/effected?
> >
> > I'm not sure how that feature will be implemented - so i'm not sure how
> to
> > comment on that.
>
> It would be similar to OLAP, where you have a master traversal and
> slave/parallel traversals. Again, one thread is interrupted, but what about
> all the other threads… Its not that its "not possible," just that we will
> have design around this like for OLAP above.
>
> Marko.
>
>
> > On Mon, Apr 18, 2016 at 11:28 AM, Marko Rodriguez <okramma...@gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> I think a problem with this is that it requires every step
> implementation
> >> to have this construct in it -- though many steps simply extend the base
> >> FlatMapStep, MapStep, FilterStep, etc. However, not all and thus, this
> >> requires all providers to know what this about and write their code
> >> accordingly.
> >>
> >> A few questions:
> >>
> >>        1. In OLAP, where there can be multiple threads how does this
> work?
> >>        2. In Giraph/Spark, how does this effect job execution and
> failure
> >> responses?
> >>        3. When we move into threaded OLTP, how will this be
> >> triggered/effected?
> >>        4. This doesn't work for "infinite loop" lambdas or "hung
> >> databases."
> >>
> >> I know this is the oldest ticket in the books and a million solutions
> have
> >> been proposed, but it would be nice if this didn't require specialized
> code
> >> in all the steps. We are bound to "forget."
> >>
> >> Thanks,
> >> Marko.
> >>
> >> http://markorodriguez.com
> >>
> >> On Apr 18, 2016, at 8:56 AM, Stephen Mallette <spmalle...@gmail.com>
> >> wrote:
> >>
> >>> Do you mean:
> >>>
> >>> if (Thread.currentThread().isInterrupted()) throw new
> >>> TraversalInterruptedException();
> >>>
> >>> If so, Thread.interrupted() basically does that under the covers
> >>>
> >>> On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <twil...@gmail.com>
> wrote:
> >>>
> >>>> Yeah, looks like benchmark-wise it's a wash, which is good.  I wasn't
> >> aware
> >>>> of the difference between the static interrupted() and non-static
> >>>> isInterrupted().  I was wondering if in this case it should be
> >>>> isInterrupted(), but I think how you did it is good because it'll be
> >>>> evaluated within the traversal thread regardless.
> >>>>
> >>>> --Ted
> >>>>
> >>>> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <
> spmalle...@gmail.com
> >>>
> >>>> wrote:
> >>>>
> >>>>> A while back, I brought up the issue of being able to interrupt
> >>>> traversals:
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>
> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
> >>>>> https://issues.apache.org/jira/browse/TINKERPOP-946
> >>>>>
> >>>>> As a quick refresher, making Traversal respect Thread.interrupted()
> is
> >>>>> important as you otherwise can quite easily lock up applications like
> >>>>> Gremlin Server with a few poorly conceived or errant queries. We'd
> left
> >>>>> that last thread with liking the idea, but there were concerns about
> >> the
> >>>>> complexity of the changes and performance hits.
> >>>>>
> >>>>> Given that we now have gremlin-benchmark, I decided to see what the
> >>>>> performance hit would be for making this change. I took a rough stab
> at
> >>>> it
> >>>>> introducing Thread.interrupted() in all steps where it seemed to make
> >>>> sense
> >>>>> to do so and then ran the benchmark before and after the change.
> >>>>>
> >>>>> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
> >>>>>
> >>>>> I'm not seeing a whole lot of difference between supporting this
> >> feature
> >>>>> and not supporting this feature.  Here's the branch I implemented
> this
> >> in
> >>>>> in case you want to look around:
> >>>>>
> >>>>> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
> >>>>>
> >>>>> I'm not sure that my changes are completely bulletproof at this
> point,
> >>>> but
> >>>>> I'm reasonably sure that these changes would handle a good majority
> of
> >>>>> calls for thread interruption. I expect to re-target my branch at
> tp31
> >>>>> (currently from master so that i could use the benchmark suite) if
> this
> >>>>> becomes a pull request.
> >>>>>
> >>>>> Any thoughts on the benchmark, the implementation, etc?
> >>>>>
> >>>>
> >>
> >>
>
>

Reply via email to