[ https://issues.apache.org/jira/browse/SPARK-26399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ron Hu updated SPARK-26399:
---------------------------
    Description:
Add the peak values for the metrics to the stages REST API. Also add a new executorSummary REST API, which will return executor summary metrics for a specified stage:
{code:java}
curl http://<spark history server>:18080/api/v1/applications/<application_id>/<application_attempt>/stages/<stage_id>/<stage_attempt>/executorMetricsSummary{code}
Add parameters to the stages REST API to specify:
* filtering for task status, and returning tasks that match (for example, FAILED tasks).
* task metric quantiles, and adding the task summary if specified
* executor metric quantiles, and adding the executor summary if specified

Note that the above description is too brief to be clear. [~angerszhuuu] and [~ron8hu] discussed a generic and consistent way for endpoint /application/\{app-id}/stages. It can be:

/application/\{app-id}/stages?details=[true|false]&status=[ACTIVE|COMPLETE|FAILED|PENDING|SKIPPED]&withSummaries=[true|false]&taskStatus=[RUNNING|SUCCESS|FAILED|PENDING]

where
* query parameter details=true shows the detailed task information within each stage. The default value is details=false;
* query parameter status selects those stages with the specified status;
* query parameter withSummaries=true shows both the task summary and the executor summary as percentile distributions. The default value is withSummaries=false;
* query parameter taskStatus shows only those tasks with the specified status within their corresponding stages.

  was:
Add the peak values for the metrics to the stages REST API.
Also add a new executorSummary REST API, which will return executor summary metrics for a specified stage:
{code:java}
curl http://<spark history server>:18080/api/v1/applications/<application_id>/<application_attempt>/stages/<stage_id>/<stage_attempt>/executorMetricsSummary{code}
Add parameters to the stages REST API to specify:
* filtering for task status, and returning tasks that match (for example, FAILED tasks).
* task metric quantiles, and adding the task summary if specified
* executor metric quantiles, and adding the executor summary if specified

Note that the above description is too brief to be clear. Ron Hu added additional details to explain the use cases from the downstream products. See the comments dated 1/07/2021 with a couple of sample JSON files.


> Add new stage-level REST APIs and parameters
> --------------------------------------------
>
>                 Key: SPARK-26399
>                 URL: https://issues.apache.org/jira/browse/SPARK-26399
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Edward Lu
>            Priority: Major
>         Attachments: executorMetricsSummary.json,
> lispark230_restapi_ex2_stages_failedTasks.json,
> lispark230_restapi_ex2_stages_withSummaries.json,
> stage_executorSummary_image1.png
>
>
> Add the peak values for the metrics to the stages REST API. Also add a new
> executorSummary REST API, which will return executor summary metrics for a
> specified stage:
> {code:java}
> curl http://<spark history server>:18080/api/v1/applications/<application_id>/<application_attempt>/stages/<stage_id>/<stage_attempt>/executorMetricsSummary{code}
> Add parameters to the stages REST API to specify:
> * filtering for task status, and returning tasks that match (for example,
> FAILED tasks).
> * task metric quantiles, and adding the task summary if specified
> * executor metric quantiles, and adding the executor summary if specified
> Note that the above description is too brief to be clear.
> [~angerszhuuu] and [~ron8hu] discussed a generic and consistent way for
> endpoint /application/\{app-id}/stages. It can be:
> /application/\{app-id}/stages?details=[true|false]&status=[ACTIVE|COMPLETE|FAILED|PENDING|SKIPPED]&withSummaries=[true|false]&taskStatus=[RUNNING|SUCCESS|FAILED|PENDING]
> where query parameter details=true is to show the detailed task information
> within each stage. The default value is details=false;
> query parameter status can select those stages with the specified status;
> query parameter withSummaries=true is to show both task summary information
> in percentile distribution and executor summary information in percentile
> distribution. The default value is withSummaries=false.
> taskStatus is to show only those tasks with the specified status within their
> corresponding stages.
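
For illustration only, here is a minimal Python sketch of how a client might assemble the proposed stages query string. The function name build_stages_url, the host localhost, and the application id app-123 are hypothetical. The /api/v1/applications/ path prefix is assumed from the curl example in the description, and the allowed status values are copied from the proposal. This is not Spark code, and the final API may differ.

```python
from urllib.parse import urlencode

# Allowed values copied from the proposal text in this issue.
STAGE_STATUSES = {"ACTIVE", "COMPLETE", "FAILED", "PENDING", "SKIPPED"}
TASK_STATUSES = {"RUNNING", "SUCCESS", "FAILED", "PENDING"}


def build_stages_url(base_url, app_id, details=False, status=None,
                     with_summaries=False, task_status=None):
    """Build the stages endpoint URL (hypothetical helper, not part of Spark)."""
    if status is not None and status not in STAGE_STATUSES:
        raise ValueError(f"unknown stage status: {status}")
    if task_status is not None and task_status not in TASK_STATUSES:
        raise ValueError(f"unknown task status: {task_status}")

    # details and withSummaries both default to false in the proposal.
    params = [
        ("details", "true" if details else "false"),
        ("withSummaries", "true" if with_summaries else "false"),
    ]
    # The two status filters are only sent when the caller asks for them.
    if status is not None:
        params.append(("status", status))
    if task_status is not None:
        params.append(("taskStatus", task_status))

    # Path prefix assumed from the curl example in the description.
    path = f"{base_url.rstrip('/')}/api/v1/applications/{app_id}/stages"
    return f"{path}?{urlencode(params)}"


# Example: detailed task information, restricted to FAILED tasks.
print(build_stages_url("http://localhost:18080", "app-123",
                       details=True, task_status="FAILED"))
```

Rejecting unknown status values on the client side is a design choice of this sketch; the issue does not say how the server treats invalid values.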