I'm going to frame this in the context of Gremlin Server but I think this
issue generally applies to any threaded application development done with
TinkerPop.

Gremlin Server does a number of things to protect itself from crazy scripts
(i.e. long run scripts that may have been accidentally or maliciously sent
to it).  The one thing it doesn't do perfectly is properly kill running
scripts in the midst of a long run traversal.  It attempts to stop a
running script by interrupting the thread that is processing it, but if the
thread is at a point in the script that doesn't respect Thread.interrupt(),
it basically just chews up that thread until it reaches a spot that does.

I think Traversal could do a better job respecting interrupts.  Put in code
speak, we should look to get this test to pass for all steps:

@Test
public void shouldRespectThreadInterruption() throws Exception {
    final AtomicBoolean exceptionThrown = new AtomicBoolean(false);
    final CountDownLatch startedIterating = new CountDownLatch(1000);
    final List<Integer> l = IntStream.range(0,
1000000).boxed().collect(Collectors.toList());

    final Thread t = new Thread(() -> {
        try {
            __.inject(l).unfold().sideEffect(i ->
startedIterating.countDown()).iterate();
            fail("Should have respected the thread interrupt");
        } catch (Exception ie) {
            exceptionThrown.set(ie instanceof RuntimeException);
        }
    });

    t.start();

    startedIterating.await();
    t.interrupt();

    t.join();
    assertThat(exceptionThrown.get(), CoreMatchers.is(true));
}

Two things to note...First, the changes to make this test pass shouldn't be
"big".  I got this test to pass with the addition of this line of code in
FlatMapStep.processNextStart() on the first line after the start of the
while(true).

if(Thread.interrupted) throw new RuntimeException();

That single line inserted in the right places should get interrupt working
properly across Traversal.

Second, note that I'm throwing/asserting RuntimeException.  I probably
should be throwing a InterruptedException but that is a checked exception
and that would really pollute up our Traversal code.  So, my suggestion is
to create our own "InterruptedRuntimeException" that we can trap separately
that would potentially wrap an actual InterruptedException.

I'd like to tack this issue in to 3.1.0 if possible - please let me know
your thoughts if you have any.

Reply via email to