ron8hu commented on pull request #31165: URL: https://github.com/apache/spark/pull/31165#issuecomment-808645855
> Seems OK. Does this need to be documented, and, it shouldn't change any existing behavior right?

@srowen The benefit of this feature is that it lets users target the subset of tasks in a specific task status. I often see a stage running with 5000+ tasks on a large Spark cluster. For example, a user may want to analyze the killed tasks after enabling speculation, and the number of killed tasks is usually small. Without the taskStatus parameter, the REST API output for such a stage is too big, because it covers all 5000+ tasks. With the taskStatus parameter, the output is small and contains only the killed tasks.

Yes, we should document this feature. @AngersZhuuuu made the changes to the docs/monitoring.md file.

This feature is local to the specified stage. It does not change any existing behavior.
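As a rough illustration of the use case described above, the sketch below builds a stage URL that asks only for killed tasks. It is not taken from the PR. The `taskStatus` parameter name comes from this comment, while the endpoint path, the `details=true` parameter, the host, the application id, and the helper name `stage_tasks_url` are assumptions or placeholders, so check them against the updated monitoring documentation.

```python
from urllib.parse import urlencode


def stage_tasks_url(base_url, app_id, stage_id, task_status=None):
    """Build a stage URL for the Spark monitoring REST API.

    Hypothetical helper: the path layout and the details=true parameter
    are assumptions, not confirmed by this comment.
    """
    # Assumption: task data is only returned when details=true is set.
    params = {"details": "true"}
    if task_status is not None:
        # taskStatus is the parameter discussed in this PR.
        params["taskStatus"] = task_status
    path = f"{base_url.rstrip('/')}/api/v1/applications/{app_id}/stages/{stage_id}"
    return f"{path}?{urlencode(params)}"


# Placeholder host, application id, and stage id.
url = stage_tasks_url("http://localhost:4040", "app-20210326-0001", 3, "KILLED")
print(url)
```

Omitting `task_status` produces the unfiltered request, which for a stage with 5000+ tasks is the large response the parameter is meant to avoid.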
