[
https://issues.apache.org/jira/browse/SPARK-52536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhen Wang updated SPARK-52536:
------------------------------
Description:
AsyncProfilerLoader uses `user.home` by default to store the extracted
libraries:
[https://github.com/jvm-profiling-tools/ap-loader/blob/main/src/main/java/one/profiler/AsyncProfilerLoader.java#L139-L152]
The `user.home` directory on the DataNodes in our YARN cluster was not
initialized, causing executor startup to fail:
{code:java}
25/06/20 11:54:26 ERROR YarnCoarseGrainedExecutorBackend: Executor self-exiting due to : Unable to create executor due to /home/pilot
java.nio.file.AccessDeniedException: /home/pilot
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
    at java.nio.file.Files.createDirectory(Files.java:674)
    at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
    at java.nio.file.Files.createDirectories(Files.java:767)
    at one.profiler.AsyncProfilerLoader.getExtractionDirectory(AsyncProfilerLoader.java:133)
    at one.profiler.AsyncProfilerLoader.getAsyncProfilerPath(AsyncProfilerLoader.java:562)
    at one.profiler.AsyncProfilerLoader.load(AsyncProfilerLoader.java:861)
    at org.apache.spark.profiler.SparkAsyncProfiler.<init>(SparkAsyncProfiler.scala:70)
    at org.apache.spark.profiler.ProfilerExecutorPlugin.init(ProfilerPlugin.scala:82)
    at org.apache.spark.internal.plugin.ExecutorPluginContainer.$anonfun$executorPlugins$1(PluginContainer.scala:125)
    at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
    at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
    at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
    at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
    at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
    at org.apache.spark.internal.plugin.ExecutorPluginContainer.<init>(PluginContainer.scala:113)
    at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:211)
    at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:199)
    at org.apache.spark.executor.Executor.$anonfun$plugins$1(Executor.scala:337)
    at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:178)
    at org.apache.spark.executor.Executor.<init>(Executor.scala:337)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:181)
    at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
    at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
    at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
    at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
    at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
25/06/20 11:54:26 INFO YarnCoarseGrainedExecutorBackend: Driver commanded a shutdown {code}
We can set `AsyncProfilerLoader.extractionDir` to the Spark temp dir to avoid
this issue.
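A minimal sketch of the idea: instead of letting ap-loader default to `user.home`, pick a directory that is guaranteed to exist and be writable on the node (here `java.io.tmpdir` stands in for the executor's Spark local dir), create it up front, and hand that path to the loader. The `sparkLocalDir` parameter and the `chooseExtractionDir` helper are illustrative, not part of any Spark or ap-loader API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ExtractionDirFallback {
    /**
     * Pick a writable extraction directory: prefer the Spark local dir if
     * one is given, otherwise fall back to java.io.tmpdir. Unlike user.home,
     * both are expected to exist on a YARN node.
     */
    static Path chooseExtractionDir(String sparkLocalDir) throws IOException {
        Path base = (sparkLocalDir != null)
                ? Paths.get(sparkLocalDir)
                : Paths.get(System.getProperty("java.io.tmpdir"));
        Path dir = base.resolve("profiler-extraction");
        // Creating the directory eagerly surfaces permission problems here,
        // rather than deep inside AsyncProfilerLoader at executor startup.
        Files.createDirectories(dir);
        return dir;
    }

    public static void main(String[] args) throws IOException {
        Path dir = chooseExtractionDir(null);
        System.out.println(Files.isWritable(dir));
        // The chosen path would then be passed to ap-loader before load(),
        // e.g. via AsyncProfilerLoader.setExtractionDirectory(dir) -- the
        // setter name is assumed here; verify against the ap-loader version
        // in use.
    }
}
```

The key point is only that the extraction directory is derived from a per-node temp location Spark already manages, so nothing depends on `/home/<user>` being provisioned.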
was:
AsyncProfilerLoader uses `user.home` by default to store the extracted
libraries:
https://github.com/jvm-profiling-tools/ap-loader/blob/main/src/main/java/one/profiler/AsyncProfilerLoader.java#L139-L152
The `user.home` directory on the DataNodes in our YARN cluster was not
initialized, causing executor startup to fail:
```
25/06/20 11:54:26 ERROR YarnCoarseGrainedExecutorBackend: Executor self-exiting due to : Unable to create executor due to /home/test
java.nio.file.AccessDeniedException: /home/pilot
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
    at java.nio.file.Files.createDirectory(Files.java:674)
    at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
    at java.nio.file.Files.createDirectories(Files.java:767)
    at one.profiler.AsyncProfilerLoader.getExtractionDirectory(AsyncProfilerLoader.java:133)
    at one.profiler.AsyncProfilerLoader.getAsyncProfilerPath(AsyncProfilerLoader.java:562)
    at one.profiler.AsyncProfilerLoader.load(AsyncProfilerLoader.java:861)
    at org.apache.spark.profiler.SparkAsyncProfiler.<init>(SparkAsyncProfiler.scala:70)
    at org.apache.spark.profiler.ProfilerExecutorPlugin.init(ProfilerPlugin.scala:82)
    at org.apache.spark.internal.plugin.ExecutorPluginContainer.$anonfun$executorPlugins$1(PluginContainer.scala:125)
    at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
    at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
    at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
    at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
    at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
    at org.apache.spark.internal.plugin.ExecutorPluginContainer.<init>(PluginContainer.scala:113)
    at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:211)
    at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:199)
    at org.apache.spark.executor.Executor.$anonfun$plugins$1(Executor.scala:337)
    at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:178)
    at org.apache.spark.executor.Executor.<init>(Executor.scala:337)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:181)
    at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
    at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
    at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
    at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
    at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
25/06/20 11:54:26 INFO YarnCoarseGrainedExecutorBackend: Driver commanded a shutdown
```
We can set `AsyncProfilerLoader.extractionDir` to the Spark temp dir to avoid
this issue.
> Specify AsyncProfilerLoader.extractionDir to spark temp dir
> -----------------------------------------------------------
>
> Key: SPARK-52536
> URL: https://issues.apache.org/jira/browse/SPARK-52536
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Zhen Wang
> Priority: Major
>
> AsyncProfilerLoader uses `user.home` by default to store the extracted
> libraries:
> [https://github.com/jvm-profiling-tools/ap-loader/blob/main/src/main/java/one/profiler/AsyncProfilerLoader.java#L139-L152]
> The `user.home` directory on the DataNodes in our YARN cluster was not
> initialized, causing executor startup to fail:
> {code:java}
> 25/06/20 11:54:26 ERROR YarnCoarseGrainedExecutorBackend: Executor self-exiting due to : Unable to create executor due to /home/pilot
> java.nio.file.AccessDeniedException: /home/pilot
>     at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>     at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
>     at java.nio.file.Files.createDirectory(Files.java:674)
>     at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
>     at java.nio.file.Files.createDirectories(Files.java:767)
>     at one.profiler.AsyncProfilerLoader.getExtractionDirectory(AsyncProfilerLoader.java:133)
>     at one.profiler.AsyncProfilerLoader.getAsyncProfilerPath(AsyncProfilerLoader.java:562)
>     at one.profiler.AsyncProfilerLoader.load(AsyncProfilerLoader.java:861)
>     at org.apache.spark.profiler.SparkAsyncProfiler.<init>(SparkAsyncProfiler.scala:70)
>     at org.apache.spark.profiler.ProfilerExecutorPlugin.init(ProfilerPlugin.scala:82)
>     at org.apache.spark.internal.plugin.ExecutorPluginContainer.$anonfun$executorPlugins$1(PluginContainer.scala:125)
>     at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
>     at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>     at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>     at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
>     at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
>     at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
>     at org.apache.spark.internal.plugin.ExecutorPluginContainer.<init>(PluginContainer.scala:113)
>     at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:211)
>     at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:199)
>     at org.apache.spark.executor.Executor.$anonfun$plugins$1(Executor.scala:337)
>     at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:178)
>     at org.apache.spark.executor.Executor.<init>(Executor.scala:337)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:181)
>     at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
>     at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
>     at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
>     at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
>     at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 25/06/20 11:54:26 INFO YarnCoarseGrainedExecutorBackend: Driver commanded a shutdown {code}
>
> We can set `AsyncProfilerLoader.extractionDir` to the Spark temp dir to avoid
> this issue.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]