Ngone51 commented on PR #39410: URL: https://github.com/apache/spark/pull/39410#issuecomment-1376610318
@mridulm That might be an alternative; I was thinking about it too. But, checking the code, I found that a `TaskSet` might already have been cleaned up by the time a task finishes, e.g., when an executor-lost event arrives before the `StatusUpdate`. `TaskSchedulerImpl.statusUpdate()` already accounts for this case: https://github.com/apache/spark/blob/c13fea90595cb1489046135c45c92d3bacb85818/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L843. In that case, we'd have trouble knowing the exact number of cores the task used. We might need to maintain an extra structure at the driver to track the resources used by each task if we wanted to go with this alternative.

> Instead of passing it from executor ?

The current way is actually consistent with how we handle custom resources, which are also assigned at the driver and returned with `StatusUpdate`. That said, custom resources have the same issue you raised for task cores.
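
To make the driver-side alternative concrete, here is a minimal sketch of the extra bookkeeping it would require. All names here (`DriverSideResourceTracker`, `recordTaskLaunched`, `releaseTask`, `TaskResources`) are hypothetical, not existing Spark APIs; the point is only that the lookup in `releaseTask` can legitimately miss when the `TaskSet` was cleaned up before the `StatusUpdate` arrived, which is the gap described above.

```scala
import scala.collection.mutable

// Hypothetical per-task resource record: cores plus custom resource amounts.
case class TaskResources(cores: Int, customResources: Map[String, Long])

class DriverSideResourceTracker {
  private val taskIdToResources = mutable.HashMap[Long, TaskResources]()

  // Called when the driver launches a task, while the TaskSet is still alive.
  def recordTaskLaunched(taskId: Long, resources: TaskResources): Unit =
    synchronized { taskIdToResources(taskId) = resources }

  // Called from statusUpdate(). The TaskSet may already have been cleaned up
  // (e.g. executor lost arrived before the StatusUpdate), so this lookup can
  // return None, and the driver would no longer know the task's exact cores.
  def releaseTask(taskId: Long): Option[TaskResources] =
    synchronized { taskIdToResources.remove(taskId) }
}
```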
