pan3793 commented on PR #49483:
URL: https://github.com/apache/spark/pull/49483#issuecomment-2591070767

   @parthchandra I did some homework on integrating the profiler with the Spark 
UI flame graph.
   
   The first important question is what the pipeline for collecting and aggregating profiling events should look like.
   
   The current Spark UI building pipeline is:
   
   Live UI: events from the Spark event bus => aggregated data in KVStore => Spark UI
   History UI: events from event logs persisted on DFS => aggregated data in KVStore => Spark UI
   
   The JDK's built-in JFR provides APIs to read JFR events both from disk and in-process, so we could follow the current Spark UI approach: use in-process JFR monitoring for the live UI flame graph, and read JFR results from DFS for the History UI flame graph. See the sketch after the quoted excerpt below.
   
   https://openjdk.org/jeps/349
   
   > There are three factory methods to create a stream.
   > `EventStream::openRepository(Path)` constructs a stream from a disk 
repository. This is a way to monitor other processes by working directly 
against the file system. The location of the disk repository is stored in the 
system property `jdk.jfr.repository` that can be read using the attach API. It 
is also possible to perform in-process monitoring using the 
`EventStream::openRepository()` method. Unlike `RecordingStream`, it does not 
start a recording. Instead, the stream receives events only when recordings are 
started by external means, for example using JCMD or JMX. The method 
`EventStream::openFile(Path)` creates a stream from a recording file. It 
complements the `RecordingFile` class that already exists today.
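   To make that concrete, here is a minimal sketch (illustrative only, not Spark code) of the two consumption modes the excerpt describes: in-process streaming for the live UI, and replaying a persisted recording for the History UI. The event name and handler are just examples.
   
   ```java
   import java.nio.file.Path;
   import jdk.jfr.consumer.EventStream;
   import jdk.jfr.consumer.RecordedEvent;
   
   public class JfrStreamSketch {
   
       // Live mode: attach to this JVM's own disk repository. Events arrive
       // only while a recording is running (started by external means, e.g.
       // -XX:StartFlightRecording, jcmd, or JMX); no recording is started here.
       static EventStream live() throws Exception {
           EventStream es = EventStream.openRepository();
           es.onEvent("jdk.ExecutionSample", JfrStreamSketch::handle);
           es.startAsync(); // non-blocking; the caller keeps the stream open
           return es;
       }
   
       // History mode: replay a completed .jfr file, e.g. one fetched from DFS.
       static void history(Path jfrFile) throws Exception {
           try (EventStream es = EventStream.openFile(jfrFile)) {
               es.onEvent("jdk.ExecutionSample", JfrStreamSketch::handle);
               es.start(); // blocks until the whole file is consumed
           }
       }
   
       static void handle(RecordedEvent e) {
           // aggregate e.getStackTrace() into flame graph data here
       }
   }
   ```
   
   Either mode feeds the same handler, so the aggregation logic could be shared between the live and history paths.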
   
   However, I think `async-profiler` does not support in-process monitoring (correct me if I'm wrong), so we must persist the results to disk first and then read them back to replay the events, aggregate them, and draw the flame graph. The pipeline would therefore be unified to:
   
   JFR results persisted on DFS => aggregated data in KVStore => flame graph in the Spark UI (live and history)
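   For the "replay and aggregate" step, here is a minimal sketch (my assumption of the shape, not code from this PR) that folds `jdk.ExecutionSample` stack traces from a persisted `.jfr` file into collapsed-stack counts, which is the data model a flame graph renders:
   
   ```java
   import java.nio.file.Path;
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;
   import jdk.jfr.consumer.RecordedEvent;
   import jdk.jfr.consumer.RecordedFrame;
   import jdk.jfr.consumer.RecordingFile;
   
   public class FlameGraphFold {
   
       // Returns "rootFrame;...;leafFrame" -> sample count, the collapsed-stack
       // format that common flame graph renderers accept.
       static Map<String, Long> fold(Path jfrFile) throws Exception {
           Map<String, Long> counts = new HashMap<>();
           try (RecordingFile rf = new RecordingFile(jfrFile)) {
               while (rf.hasMoreEvents()) {
                   RecordedEvent e = rf.readEvent();
                   if (!"jdk.ExecutionSample".equals(e.getEventType().getName())
                           || e.getStackTrace() == null) {
                       continue;
                   }
                   // JFR frames are leaf-first; reverse to root-first for folding.
                   List<RecordedFrame> frames = e.getStackTrace().getFrames();
                   StringBuilder sb = new StringBuilder();
                   for (int i = frames.size() - 1; i >= 0; i--) {
                       RecordedFrame f = frames.get(i);
                       if (sb.length() > 0) sb.append(';');
                       sb.append(f.getMethod().getType().getName())
                         .append('.')
                         .append(f.getMethod().getName());
                   }
                   counts.merge(sb.toString(), 1L, Long::sum);
               }
           }
           return counts;
       }
   }
   ```
   
   In the pipeline above, the resulting counts would be what gets aggregated into the KVStore.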
   
   If so, rendering the flame graph is decoupled from how we collect and generate the JFR results, as long as the JFR results follow a stable folder layout and naming pattern on the DFS. As you can see, the proposed refactor does not change that:
   
   ```
   <baseDir>/{{APP_ID}}/profile-driver.jfr                -- newly added for the driver
   <baseDir>/{{APP_ID}}/profile-exec-{{EXECUTOR_ID}}.jfr  -- unchanged for executors
   ```
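   As a reader-side illustration (a hypothetical helper, not part of this PR), enumerating an application's profiler outputs then only needs the naming pattern. I use `java.nio` here for brevity; a real implementation would go through Hadoop's `FileSystem` API for DFS:
   
   ```java
   import java.io.IOException;
   import java.nio.file.DirectoryStream;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.util.ArrayList;
   import java.util.List;
   
   public class ProfileLayout {
   
       // All profiler outputs for one application: the driver file plus one
       // file per executor, matched purely by the stable naming pattern.
       static List<Path> profileFiles(Path baseDir, String appId) throws IOException {
           Path appDir = baseDir.resolve(appId);
           List<Path> result = new ArrayList<>();
           try (DirectoryStream<Path> ds =
                    Files.newDirectoryStream(appDir, "profile-*.jfr")) {
               ds.forEach(result::add);
           }
           return result;
       }
   }
   ```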
   
   Before making the Spark UI display the flame graph directly, I'd like to allow users to download the JFR results from the SHS listing page, so that they can import them into local tools such as JDK Mission Control or IntelliJ IDEA to analyze their jobs.
   

