GitHub user dalaro opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/154

    TINKERPOP3-988 Change SparkGraphComputer executor

    https://issues.apache.org/jira/browse/TINKERPOP3-988 describes the problem 
and my justification of this change.
    
    ForkJoinPool.commonPool shouldn't be used here, but I'm also not certain 
that creating one single-threaded executor per `submit` call is a reasonable 
alternative.  My assumption is that `SparkGraphComputer.submit` is an operation 
with pretty heavyweight processing implications, so this isn't something you'd 
want to fire off ten thousand times a second, and the overhead of one thread 
per call isn't that bad.  I can rework it if you think this is not acceptable 
though.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dalaro/incubator-tinkerpop master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/154.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #154
    
----
commit df930bb5b3f36daea33f40e13996cf75fe307979
Author: Dan LaRocque <[email protected]>
Date:   2015-11-23T21:42:46Z

    TINKERPOP3-988 Change SparkGraphComputer executor
    
    Changed SparkGraphComputer's submit method to use a custom executor
    instead of ForkJoinPool.commonPool.
    
    The submit method does very little in the calling thread; most of its
    work is done in an asynchronously-executed completable future.
    
    If the async task executes on ForkJoinPool.commonPool (the default if
    no executor is supplied), then two problems can arise:
    
    1. Loss of context classloader
    
    The context classloader of the thread that called submit is not
    necessarily the same as the context classloader common forkjoin pool
    threads. This matters because multiple bits of code reachable from
    submit's async task rely on the context classloader. SparkMemory is
    one; Hadoop's UserGroupInformation is another, depending on the
    credentials configuration (UGI is reached indirectly via
    FileSystem.get). This basically means that the caller has to use
    whatever context classloader is currently in use by the fork join
    common pool, or else bad things can happen, such as
    nonsensical-looking ClassCastExceptions.
    
    2. Inability to set context classloader
    
    When System.getSecurityManager() != null, the common forkjoin pool
    switches from its default worker thread factory implementation to a
    more restrictive alternative called
    InnocuousForkJoinWorkerThreadFactory. Threads created by this factory
    can't call setContextClassLoader. Attempting to do so throws a
    SecurityException. However, UserGroupInformation.newLoginContext must
    be able to call setContextClassLoader. It saves the CCL to a variable,
    does some work, then restores the CCL from a variable. This is
    impossible if the method throws a SecurityException. So, if a security
    manager is present in the VM, submit's async task can die in
    FileSystem.get -> UGI before any useful work even begins.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to