Re: [DISCUSS] Interrupting Traversals

Marko Rodriguez Mon, 18 Apr 2016 09:07:07 -0700

Hi,

> Well - I don't think this code is highly specialized. It's good general
> practice to respect Thread.interrupted(). I think you'd find that sentiment
> in just about any java concurrency programming book.  …


Huh, I didn't know this was a "standard pattern." If so, cool. Then that solves 
that.

>> 1. In OLAP, where there can be multiple threads how does this work?
>> 2. In Giraph/Spark, how does this effect job execution and failure
> responses?
> 
> I ended my initial post in this thread by deleting the last paragraph i
> wrote about OLAP.  :)  I guess there's still some question there as to how
> that will work. If I interrupt the thread that was executing the OLAP
> traversal, it's only going to kill it waiting for the result from Spark or
> wherever.  The traversal will still be executing in the context of spark.
> I assume the way to deal with this is on an implementation specific basis
> where I assume there is a way to cancel a running spark job (or running
> giraph job or whatever). If the Traversal that waits for interrupt could
> signal that cancellation somehow, i guess that would be the way to
> implement that. I don't know enough about the specifics of spark for how
> that would work but it sounds plausible, no?

Yea, I don't know how this would work either as it would be master/slave 
traversals needing to coordinate. It can be done, but would require the 
provider implementations to change as they would need to message a "job kill." 
We could add test cases to GraphComputerTest that ensure that all OLAP engines 
handle such interrupts correctly. We should also add test cases to VertexStep, 
PropertyStep, GraphStep, etc. that use interrupt as these are the steps that 
most providers will implement/extend and we need to ensure they are doing the 
interruptions correctly.

>> 3. When we move into threaded OLTP, how will this be triggered/effected?
> 
> I'm not sure how that feature will be implemented - so i'm not sure how to
> comment on that.

It would be similar to OLAP, where you have a master traversal and 
slave/parallel traversals. Again, one thread is interrupted, but what about all 
the other threads… Its not that its "not possible," just that we will have 
design around this like for OLAP above.

Marko.


> On Mon, Apr 18, 2016 at 11:28 AM, Marko Rodriguez <okramma...@gmail.com>
> wrote:
> 
>> Hi,
>> 
>> I think a problem with this is that it requires every step implementation
>> to have this construct in it -- though many steps simply extend the base
>> FlatMapStep, MapStep, FilterStep, etc. However, not all and thus, this
>> requires all providers to know what this about and write their code
>> accordingly.
>> 
>> A few questions:
>> 
>>        1. In OLAP, where there can be multiple threads how does this work?
>>        2. In Giraph/Spark, how does this effect job execution and failure
>> responses?
>>        3. When we move into threaded OLTP, how will this be
>> triggered/effected?
>>        4. This doesn't work for "infinite loop" lambdas or "hung
>> databases."
>> 
>> I know this is the oldest ticket in the books and a million solutions have
>> been proposed, but it would be nice if this didn't require specialized code
>> in all the steps. We are bound to "forget."
>> 
>> Thanks,
>> Marko.
>> 
>> http://markorodriguez.com
>> 
>> On Apr 18, 2016, at 8:56 AM, Stephen Mallette <spmalle...@gmail.com>
>> wrote:
>> 
>>> Do you mean:
>>> 
>>> if (Thread.currentThread().isInterrupted()) throw new
>>> TraversalInterruptedException();
>>> 
>>> If so, Thread.interrupted() basically does that under the covers
>>> 
>>> On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <twil...@gmail.com> wrote:
>>> 
>>>> Yeah, looks like benchmark-wise it's a wash, which is good.  I wasn't
>> aware
>>>> of the difference between the static interrupted() and non-static
>>>> isInterrupted().  I was wondering if in this case it should be
>>>> isInterrupted(), but I think how you did it is good because it'll be
>>>> evaluated within the traversal thread regardless.
>>>> 
>>>> --Ted
>>>> 
>>>> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <spmalle...@gmail.com
>>> 
>>>> wrote:
>>>> 
>>>>> A while back, I brought up the issue of being able to interrupt
>>>> traversals:
>>>>> 
>>>>> 
>>>>> 
>>>> 
>> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
>>>>> https://issues.apache.org/jira/browse/TINKERPOP-946
>>>>> 
>>>>> As a quick refresher, making Traversal respect Thread.interrupted() is
>>>>> important as you otherwise can quite easily lock up applications like
>>>>> Gremlin Server with a few poorly conceived or errant queries. We'd left
>>>>> that last thread with liking the idea, but there were concerns about
>> the
>>>>> complexity of the changes and performance hits.
>>>>> 
>>>>> Given that we now have gremlin-benchmark, I decided to see what the
>>>>> performance hit would be for making this change. I took a rough stab at
>>>> it
>>>>> introducing Thread.interrupted() in all steps where it seemed to make
>>>> sense
>>>>> to do so and then ran the benchmark before and after the change.
>>>>> 
>>>>> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
>>>>> 
>>>>> I'm not seeing a whole lot of difference between supporting this
>> feature
>>>>> and not supporting this feature.  Here's the branch I implemented this
>> in
>>>>> in case you want to look around:
>>>>> 
>>>>> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
>>>>> 
>>>>> I'm not sure that my changes are completely bulletproof at this point,
>>>> but
>>>>> I'm reasonably sure that these changes would handle a good majority of
>>>>> calls for thread interruption. I expect to re-target my branch at tp31
>>>>> (currently from master so that i could use the benchmark suite) if this
>>>>> becomes a pull request.
>>>>> 
>>>>> Any thoughts on the benchmark, the implementation, etc?
>>>>> 
>>>> 
>> 
>>

Re: [DISCUSS] Interrupting Traversals

Reply via email to