mridulm commented on PR #44021:
URL: https://github.com/apache/spark/pull/44021#issuecomment-1863873954

   > `AsyncGetCallTrace` is used precisely to map calls in the native thread to 
calls in the java thread. Not sure exactly what you are looking for here. Are 
you looking to profile individual tasks? It certainly can be done, but would 
require some changes similar to 
[SPARK-45151](https://issues.apache.org/jira/browse/SPARK-45151) and some 
additional work if you want the profile available thru the UI. Or are you 
looking to enhance 
[SPARK-45151](https://issues.apache.org/jira/browse/SPARK-45151) and get a 
stack trace that includes native calls? This is a little harder via 
async_profiler since there is no API to get a snapshot. Note that getting a 
profile needs to be collected over a period of time and so is different from 
getting a snapshot as 
[SPARK-45151](https://issues.apache.org/jira/browse/SPARK-45151) is doing.
   
   
   There is a difference between native thread IDs and Java thread IDs.
   Given the async profiler output, can we map it to the corresponding task 
(given the task's Java thread ID)?
   My understanding is that we currently cannot - but if I am missing 
something, do let me know.
   
   Assuming no, this means the generated stack traces cover all threads in 
the executor JVM - and so do not allow us to get stack traces and/or 
flamegraphs for a particular task, the tasks of a stage, etc.
   
   If yes, this would be very useful - and would allow for future evolution 
as part of SPARK-44893 [1].
   
   
   > 
   > > Simply dumping per executor flamegraphs or stack traces has limited 
utility (and can be done today).
   > 
   > I would suggest that this PR makes it trivially simple to profile with no 
setup required. On K8s, with ephemeral storage, it is not a simple task to dump 
a profile to disk and get it off the pod before the pod is destroyed (it was in 
fact the original motivation behind doing this).
   
   I am not seeing a lot of value in including this in Apache Spark itself - 
the plugin API is public, and users can leverage it to do precisely what the 
PR is proposing.
   On the other hand, if the PR integrates well with SPARK-44893 [1] - and/or 
there is a path to leveraging it in that work - it would be more useful.
   
   I am not exactly -1 on this, @dongjoon-hyun, but I am not seeing a lot of 
value in it: I will let you make the call.
   
   
   [1] This is the JIRA I was trying to paste, but GitHub mobile messed it up 
and I ended up referencing a subtask!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

