[
https://issues.apache.org/jira/browse/TINKERPOP3-988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024724#comment-15024724
]
ASF GitHub Bot commented on TINKERPOP3-988:
-------------------------------------------
Github user dalaro commented on the pull request:
https://github.com/apache/incubator-tinkerpop/pull/154#issuecomment-159313974
I think this problem also applies to GiraphGraphComputer. Consider the
following sequence of operations in SparkGraphComputer when `null !=
System.getSecurityManager()`.
* submit passes a task to CompletableFuture.supplyAsync (served by the
forkjoin common pool)
* submitted task calls FileSystem.get
* FileSystem.get calls through several layers of Hadoop permission code and
eventually reaches UserGroupInformation.newLoginContext
* UserGroupInformation.newLoginContext calls setContextClassLoader on the
current thread
* setContextClassLoader throws SecurityException if `null !=
System.getSecurityManager()` and the task is running in a common forkjoin pool
thread, killing the task
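The sequence above can be probed with a small standalone sketch. It is not TinkerPop or Hadoop code: the class and method names are made up for illustration, and the save / set / restore steps only imitate what UserGroupInformation.newLoginContext is described as doing. With no security manager the probe should report that the swap works; with one installed on a JVM that still supports it, the common pool hands out innocuous threads and the setter may throw instead.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical probe class; not part of TinkerPop, Spark, or Hadoop.
public class CommonPoolCclProbe {

    // Imitates the save / set / restore pattern on whichever thread runs it.
    // Returns false if setContextClassLoader is rejected with a SecurityException.
    static boolean canSwapContextClassLoader() {
        Thread current = Thread.currentThread();
        ClassLoader saved = current.getContextClassLoader();
        try {
            current.setContextClassLoader(CommonPoolCclProbe.class.getClassLoader());
            return true;
        } catch (SecurityException e) {
            return false;
        } finally {
            try {
                current.setContextClassLoader(saved);
            } catch (SecurityException ignored) {
                // Restoring is also forbidden on restricted threads.
            }
        }
    }

    // supplyAsync without an explicit executor uses the common forkjoin pool
    // (or a thread-per-task fallback when that pool has no parallelism).
    static boolean probeOnCommonPool() {
        return CompletableFuture
                .supplyAsync(CommonPoolCclProbe::canSwapContextClassLoader)
                .join();
    }

    public static void main(String[] args) {
        System.out.println("can swap CCL on common pool: " + probeOnCommonPool());
    }
}
```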
The FileSystem.get call is visible in SGC's source. The subsequent calls
happen inside FileSystem.get and aren't visible in TP's source. But GGC calls
FileSystem.get with the same parameter signature as SGC, so both should be
affected by this problem.
I don't know whether the same problem applies to TinkerGraphComputer, but
my guess is no, unless TinkerGraphComputer's internals touch the context
classloader or something else that does, like Hadoop.
I'll rework the PR to apply to both SGC and GGC (but not TGC).
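For illustration only, here is one way to keep such a task off the common pool: hand CompletableFuture.supplyAsync a dedicated executor whose threads are ordinary threads with an explicitly chosen context classloader. This is a sketch under assumptions, not the code in the PR; the class name, method names, and thread name are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

// Hypothetical helper; not the actual SparkGraphComputer or GiraphGraphComputer change.
public class DedicatedSubmitExecutor {

    // Builds a single-thread executor whose worker is a plain Thread carrying
    // the given context classloader, so code it runs may call setContextClassLoader.
    static ExecutorService newExecutor(ClassLoader contextClassLoader) {
        ThreadFactory factory = runnable -> {
            Thread worker = new Thread(runnable, "graph-computer-submit");
            worker.setDaemon(true);
            worker.setContextClassLoader(contextClassLoader);
            return worker;
        };
        return Executors.newSingleThreadExecutor(factory);
    }

    // Runs a stand-in task on the dedicated executor and reports the worker's
    // name plus whether it sees the requested context classloader.
    static String runOffCommonPool(ClassLoader contextClassLoader) {
        ExecutorService executor = newExecutor(contextClassLoader);
        try {
            return CompletableFuture.supplyAsync(() -> {
                Thread worker = Thread.currentThread();
                boolean sameLoader = worker.getContextClassLoader() == contextClassLoader;
                return worker.getName() + ":" + sameLoader;
            }, executor).join();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        ClassLoader callerLoader = Thread.currentThread().getContextClassLoader();
        System.out.println(runOffCommonPool(callerLoader));
    }
}
```

Passing the submitting thread's context classloader, as `main` does, is just one possible choice; the point is that the caller picks it rather than inheriting whatever the common pool uses.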
> SparkGraphComputer.submit shouldn't use ForkJoinPool.commonPool
> ---------------------------------------------------------------
>
> Key: TINKERPOP3-988
> URL: https://issues.apache.org/jira/browse/TINKERPOP3-988
> Project: TinkerPop 3
> Issue Type: Improvement
> Components: hadoop
> Affects Versions: 3.1.0-incubating
> Reporter: Dan LaRocque
> Assignee: Dan LaRocque
> Fix For: 3.1.1-incubating
>
>
> {{SparkGraphComputer.submit}} delegates most of its work to a closure that
> executes on the common forkjoin pool. The closure does a lot of stuff. It
> calls into both Spark and Hadoop.
> This approach has two problems:
> 1. Inability to customize the context classloader used within the closure
> The context classloader of the thread that called {{submit}} is not
> necessarily the same as the context classloader of the common forkjoin pool threads.
> This matters because multiple bits of code reachable from {{submit}}'s
> closure rely on the context classloader. SparkMemory is one; Hadoop's
> UserGroupInformation is another, depending on the credentials configuration
> (UGI is reached indirectly via {{FileSystem.get}}). This basically means
> that the caller has to use whatever context classloader is currently in use
> by the fork join common pool, or else bad things can happen, such as
> nonsensical-looking ClassCastExceptions.
> 2. Inability to override the context classloader inside the closure
> When {{System.getSecurityManager() != null}}, the common forkjoin pool
> switches from its default worker thread factory implementation to a more
> restrictive alternative called InnocuousForkJoinWorkerThreadFactory. Threads
> created by this factory can't call {{setContextClassLoader}}. Attempting to
> do so throws a SecurityException. However,
> UserGroupInformation.newLoginContext must be able to call
> {{setContextClassLoader}}. It saves the CCL to a variable, does some work,
> then restores the CCL from that variable. This is impossible if
> {{setContextClassLoader}} throws a SecurityException. So, if a security manager is present in the VM,
> {{submit}}'s closure can die in {{FileSystem.get}} -> UGI before any useful
> work even begins.
> I set the Affects Version to the version on which I observed it, but it might
> affect earlier versions too.