mridulm commented on PR #44021:
URL: https://github.com/apache/spark/pull/44021#issuecomment-1866074886

   That sounds promising!
   What is unclear to me is how we are going to do the mapping without 
something which ends up introducing safepoint bias (essentially, the cost of 
this operation)...
   For example, if the native-to-Java thread mapping requires 
`mxbean.getThreadInfo` and/or similar approaches, it becomes fairly expensive.
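   To make the cost concern concrete, here is a minimal sketch of the mxbean-based approach. It uses the real `ThreadMXBean` API; note that `ThreadInfo` only exposes the Java thread id and name, not the native TID, so an extra correlation step (e.g. by thread name) would still be needed, and taking such a snapshot per profiling sample is what makes this path expensive:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadIdProbe {
    public static void main(String[] args) {
        ThreadMXBean mxbean = ManagementFactory.getThreadMXBean();
        // One bulk snapshot of all live Java threads. Calling this (or
        // getThreadInfo on individual ids) once per profiling sample is
        // the cost being discussed above.
        ThreadInfo[] infos = mxbean.getThreadInfo(mxbean.getAllThreadIds());
        for (ThreadInfo info : infos) {
            if (info != null) {
                // Java thread id and name only; the native TID is not
                // available through this API.
                System.out.println(info.getThreadId() + " -> " + info.getThreadName());
            }
        }
    }
}
```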
   
   Essentially, what I am trying to confirm is: given `(native-thread-id -> 
timestamp -> stack_dumps+)*`, can we identify the `native-thread-id -> 
java-thread-id` mapping?
   If yes, we can build the `java-thread-id -> task-id` mapping in Spark, and 
essentially get to `(task-id -> stack_dumps+)*` for all (most?) tasks.
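   The composition described above can be sketched as a simple join. All ids and dump strings here are made-up placeholders: the `nativeToJava` map is exactly the open question in this thread, and `javaToTask` is the mapping Spark could maintain (e.g. by recording `Thread.currentThread().getId()` at task start):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TaskStackJoin {
    // native-thread-id -> stack_dumps+, composed through
    // native-thread-id -> java-thread-id and java-thread-id -> task-id,
    // yielding task-id -> stack_dumps+.
    static Map<Long, List<String>> joinDumps(
            Map<Long, List<String>> nativeTidToDumps,
            Map<Long, Long> nativeToJava,
            Map<Long, Long> javaToTask) {
        Map<Long, List<String>> taskToDumps = new HashMap<>();
        for (Map.Entry<Long, List<String>> e : nativeTidToDumps.entrySet()) {
            Long javaTid = nativeToJava.get(e.getKey());
            Long taskId = (javaTid == null) ? null : javaToTask.get(javaTid);
            if (taskId != null) {
                // Samples whose thread cannot be attributed to a task are dropped.
                taskToDumps.computeIfAbsent(taskId, k -> new ArrayList<>())
                           .addAll(e.getValue());
            }
        }
        return taskToDumps;
    }

    public static void main(String[] args) {
        Map<Long, List<String>> dumps = Map.of(
                101L, List.of("dumpA"),
                102L, List.of("dumpB"));
        Map<Long, Long> nativeToJava = Map.of(101L, 1L, 102L, 2L);
        // Task mapping known for one thread only; native tid 102 is dropped.
        Map<Long, Long> javaToTask = Map.of(1L, 7L);
        System.out.println(joinDumps(dumps, nativeToJava, javaToTask));
        // prints {7=[dumpA]}
    }
}
```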
   
   When we built 
[Safari](https://blog.cloudera.com/demystifying-spark-jobs-to-optimize-for-cost-and-performance/),
 this is what ended up being extremely powerful for understanding application 
performance: per-task stack dumps, correlated across all tasks for a stage. 
This allowed us to understand what the stack dump for a particular stage looks 
like, how the 'expensive' tasks in a stage differ from the average task, etc.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

