[ https://issues.apache.org/jira/browse/SPARK-38309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mridul Muralidharan reassigned SPARK-38309:
-------------------------------------------

    Assignee: Rob Reeves

> SHS has incorrect percentiles for shuffle read bytes and shuffle total blocks 
> metrics
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-38309
>                 URL: https://issues.apache.org/jira/browse/SPARK-38309
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Rob Reeves
>            Assignee: Rob Reeves
>            Priority: Major
>              Labels: correctness
>         Attachments: image-2022-02-23-14-19-33-255.png
>
>
> *Background*
> In [this PR|https://github.com/apache/spark/pull/26508] (SPARK-26260) the SHS 
> stage metric percentiles were updated to only include successful tasks when 
> using disk storage. It did this by making the values for each metric negative 
> when the task is not in a successful state. This approach was chosen to avoid 
> breaking changes to disk storage. See [this 
> comment|https://github.com/apache/spark/pull/26508#issuecomment-554540314] 
> for context.
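> As a hedged illustration of that encoding (a hypothetical helper, not the 
> actual TaskDataWrapper code), assuming non-successful values are stored as 
> -(value + 1) so that even a metric of 0 sorts below every successful value:
> {code:scala}
> // Sketch only: encode a metric for the disk index by task status.
> def encodeForIndex(value: Long, succeeded: Boolean): Long =
>   if (succeeded) value  // successful tasks keep their real (>= 0) value
>   else -value - 1       // non-successful tasks become negative; 0 -> -1
>
> encodeForIndex(1024L, succeeded = true)   // 1024
> encodeForIndex(1024L, succeeded = false)  // -1025
> encodeForIndex(0L, succeeded = false)     // -1, excluded by the ascending scan
> {code}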
> To get the percentiles, it reads the metric values in ascending order, 
> starting at 0. This filters out all tasks that are not successful, because 
> their values are less than 0. To get the percentile values, it scales each 
> percentile to an index into the list of successful tasks. For example, if 
> there are 200 tasks and you want percentiles [0, 25, 50, 75, 100], the 
> lookup indices in the task collection are [0, 50, 100, 150, 199].
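> The index scaling is plain arithmetic over the successful-task count. A 
> minimal sketch of it (a hypothetical helper, not the actual AppStatusStore 
> code):
> {code:scala}
> // Scale quantiles in [0, 1] to indices into the sorted successful-task values.
> def lookupIndices(successCount: Int, quantiles: Seq[Double]): Seq[Int] =
>   quantiles.map(q => math.min((q * successCount).toInt, successCount - 1))
>
> lookupIndices(200, Seq(0.0, 0.25, 0.5, 0.75, 1.0))
> // => Seq(0, 50, 100, 150, 199)
> {code}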
> *Issue*
> For metrics 1) shuffle total reads and 2) shuffle total blocks, the above PR 
> incorrectly left the metric index values positive for tasks that are not 
> successful, so those tasks are included in the percentile calculations. The 
> percentile lookup index calculation is still based on the number of 
> successful tasks, so the wrong task metric is returned for a given 
> percentile. This was not caught because the unit test only verified values 
> for one metric, executorRunTime.
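> A simplified numeric sketch (not actual Spark code) of how the wrong value 
> comes back, assuming 150 successful and 50 failed tasks whose index values 
> were all stored positive:
> {code:scala}
> val successful = (1L to 150L).toSeq  // successful tasks' shuffle read bytes
> val failed = Seq.fill(50)(0L)        // failed tasks, incorrectly stored >= 0
> val scanned = (failed ++ successful).sorted  // all 200 values pass the scan
>
> val p100Index = 149    // scaled from the 150 successful tasks only
> scanned(p100Index)     // 100 -- reported as the max, but the true max is 150
> {code}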
> *Steps to Reproduce*
> _SHS UI_
>  # Find a Spark application in the SHS that has failed tasks for a stage 
> with shuffle read.
>  # Navigate to the stage UI.
>  # Look at the max shuffle read size in the summary metrics.
>  # Sort the tasks by shuffle read size descending. You'll see it doesn't 
> match the max from step 3.
>  
> !image-2022-02-23-14-19-33-255.png|width=789,height=389!
> _API_
>  # For the same stage in the above repro steps, make a request to the task 
> summary endpoint (e.g. 
> /api/v1/applications/application_1632281309592_21294517/1/stages/6/0/taskSummary?quantiles=0,0.25,0.5,0.75,1.0)
>  # Look at the shuffleReadMetrics.readBytes and 
> shuffleReadMetrics.totalBlocksFetched. You will see -2 for at least some of 
> the lower percentiles, and the positive values will also be wrong. A hedged 
> programmatic version of this check is sketched below.
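> The same request from Scala (the host, application ID, and stage ID below 
> are placeholders, not values from this report):
> {code:scala}
> import scala.io.Source
>
> // Substitute a real SHS host, application ID, and stage/attempt IDs.
> val url = "http://localhost:18080/api/v1/applications/<appId>/stages/" +
>   "<stageId>/0/taskSummary?quantiles=0,0.25,0.5,0.75,1.0"
> println(Source.fromURL(url).mkString)
> {code}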


