Parth Chandra created SPARK-46094:
-------------------------------------
Summary: Add support for code profiling executors
Key: SPARK-46094
URL: https://issues.apache.org/jira/browse/SPARK-46094
Project: Spark
Issue Type: New Feature
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Parth Chandra
To profile a Spark application, a user or developer has to run a Spark job
locally on the development machine and use a tool like Java Flight Recorder,
YourKit, or async-profiler to record profiling information. Because profiling
can be expensive, the profiler is typically attached to the Spark JVM process
after the process has started and detached once sufficient profiling data has
been collected.
The developer's environment is frequently different from the production
environment, so profiles collected there may not yield accurate information.
However, profiling is difficult when a Spark application runs as a
distributed job on a cluster where the developer may have limited access to the
actual nodes where the executor processes are running. Also, in environments
like Kubernetes where the executor pods may be removed as soon as the job
completes, retrieving the profiling information from each executor pod can
become quite tricky.
This feature would add a low-overhead sampling profiler such as async-profiler
as a built-in capability of a Spark job, one that can be turned on using only
user-configurable parameters. (async-profiler is a low-overhead profiler that
can be invoked programmatically and is available as a single multi-platform jar
for Linux and macOS.)
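As a rough illustration of what "invoked programmatically" could look like:
async-profiler accepts comma-separated command strings such as
"start,event=cpu,file=...". The sketch below only assembles such strings from a
user-supplied option string. The class name ProfilerCommands and its methods
are hypothetical, and the actual call into the profiler is left as a comment
because it needs the async-profiler jar on the classpath.

```java
import java.util.StringJoiner;

// Hypothetical helper (not part of Spark or async-profiler): builds the
// command strings that would be handed to the profiler.
final class ProfilerCommands {
    private ProfilerCommands() {}

    // userOptions: extra comma-separated profiler options taken from
    // configuration, e.g. "event=cpu,interval=10ms"; may be null or empty.
    static String start(String userOptions, String outputFile) {
        StringJoiner cmd = new StringJoiner(",");
        cmd.add("start");
        if (userOptions != null && !userOptions.trim().isEmpty()) {
            cmd.add(userOptions.trim());
        }
        cmd.add("file=" + outputFile);
        return cmd.toString();
    }

    static String stop(String outputFile) {
        return "stop,file=" + outputFile;
    }

    public static void main(String[] args) {
        String file = "/tmp/profile-exec-1.jfr";
        // With async-profiler available, these strings would be passed to
        // something like AsyncProfiler.getInstance().execute(command).
        System.out.println(start("event=cpu,interval=10ms", file));
        System.out.println(stop(file));
    }
}
```

Keeping the profiler arguments as one opaque string means new profiler options
would not need matching new Spark configuration entries.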
In addition, for convenience, the feature would save profiling output files to
the distributed file system so that information from all executors can be
available in a single place.
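One possible way to collect output from all executors in a single place is to
give each executor its own file name under a shared, per-application directory.
The sketch below assumes a layout of <sharedDir>/<appId>/profile-exec-<id>.jfr;
that layout, the class name ProfileUploader, and the use of java.nio.file as a
stand-in for a distributed file system client are all assumptions made for
illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: java.nio.file stands in for a distributed file system
// client so that the sketch stays self-contained.
final class ProfileUploader {
    private ProfileUploader() {}

    // Assumed layout: <sharedDir>/<appId>/profile-exec-<executorId>.jfr
    static Path destinationFor(Path sharedDir, String appId, String executorId) {
        return sharedDir.resolve(appId).resolve("profile-exec-" + executorId + ".jfr");
    }

    // Copies the executor's local profile into the shared directory,
    // creating the per-application directory if it does not exist yet.
    static Path upload(Path localFile, Path sharedDir, String appId, String executorId)
            throws IOException {
        Path dest = destinationFor(sharedDir, appId, executorId);
        Files.createDirectories(dest.getParent());
        return Files.copy(localFile, dest, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Because the executor id is part of the file name, executors never write to the
same path, and the files remain available after an executor pod is removed.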
The feature would add an executor plugin that does not add any overhead unless
enabled and can be configured to accept profiler arguments as a configuration
parameter.
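A minimal sketch of the "no overhead unless enabled" behaviour follows. It
loosely mirrors the init/shutdown lifecycle of an executor plugin but does not
depend on Spark. The Profiler interface, the class ProfilingPluginSketch, and
the configuration keys are hypothetical placeholders, not names proposed by
this issue.

```java
import java.util.Map;

// Hypothetical stand-in for the profiler; a real plugin would wrap
// async-profiler here.
interface Profiler {
    void execute(String command);
}

final class ProfilingPluginSketch {
    // Hypothetical configuration keys, chosen only for this example.
    static final String ENABLED_KEY = "example.profiling.enabled";
    static final String OPTIONS_KEY = "example.profiling.options";

    private final Profiler profiler;
    private boolean running = false;

    ProfilingPluginSketch(Profiler profiler) {
        this.profiler = profiler;
    }

    // Called once when the executor starts.
    void init(Map<String, String> conf) {
        if (!Boolean.parseBoolean(conf.getOrDefault(ENABLED_KEY, "false"))) {
            return; // disabled: the profiler is never touched
        }
        // Profiler arguments are passed through as a single string.
        String options = conf.getOrDefault(OPTIONS_KEY, "event=cpu");
        profiler.execute("start," + options);
        running = true;
    }

    // Called once when the executor shuts down.
    void shutdown() {
        if (running) {
            profiler.execute("stop");
            running = false;
        }
    }

    boolean isRunning() {
        return running;
    }
}
```

When the enabled flag is absent or false, init returns before any profiler
call, so the only cost left is a single configuration lookup at startup.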