GitHub user dalaro opened a pull request:
https://github.com/apache/incubator-tinkerpop/pull/164
TINKERPOP3-988 Change GraphComputer executors
This supersedes https://github.com/apache/incubator-tinkerpop/pull/154.
This PR is similar to its predecessor, but with two differences:
* This PR shuts down executors it creates in a finally block instead of
leaking them
* This PR covers both SparkGraphComputer and GiraphGraphComputer
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dalaro/incubator-tinkerpop TINKERPOP3-988-bis
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-tinkerpop/pull/164.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #164
----
commit 562d43af8f0063aa5a77d730480c4efa9b8b992f
Author: Dan LaRocque <[email protected]>
Date: 2015-12-01T13:23:42Z
TINKERPOP3-988 Change GraphComputer executors
Changed (Giraph|Spark)GraphComputer's submit methods to use a custom
executor instead of ForkJoinPool.commonPool.
These submit methods do very little in the calling thread; most of the
work is done in an asynchronously-executed completable future.
If the async task executes on ForkJoinPool.commonPool (the default if
no executor is supplied), then two problems can arise:
1. Loss of context classloader
The context classloader of the thread that called submit is not
necessarily the same as the context classloader common forkjoin pool
threads. This matters because multiple bits of code reachable from
submit's async task rely on the context classloader. SparkMemory is
one; Hadoop's UserGroupInformation is another, depending on the
credentials configuration (UGI is reached indirectly via
FileSystem.get). This basically means that the caller has to use
whatever context classloader is currently in use by the fork join
common pool, or else bad things can happen, such as
nonsensical-looking ClassCastExceptions.
2. Inability to set context classloader
When System.getSecurityManager() != null, the common forkjoin pool
switches from its default worker thread factory implementation to a
more restrictive alternative called
InnocuousForkJoinWorkerThreadFactory. Threads created by this factory
can't call setContextClassLoader. Attempting to do so throws a
SecurityException. However, UserGroupInformation.newLoginContext must
be able to call setContextClassLoader. It saves the CCL to a variable,
does some work, then restores the CCL from a variable. This is
impossible if the method throws a SecurityException. So, if a security
manager is present in the VM, submit's async task can die in
FileSystem.get -> UGI before any useful work even begins.
This commit introduces shared logic around making and shutting down a
Executors.newSingleThreadExecutor. I considered pulling it up to
AbstractHadoopGraphComputer, the common public base class of both
Spark- and GiraphGraphComputer, but I couldn't see a way to do it that
would preserve compatibility for a third-party computer implementation
that extends the same base class. That's why I put the common logic
in a new static helper method instead.
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---