virrrat opened a new pull request, #47516:
URL: https://github.com/apache/spark/pull/47516

   ### What changes were proposed in this pull request?
   This patch fixes an issue in the DAG view of the `SQL` tab in the 
driver-hosted Spark UI, where invalid SQL metric values are not filtered out 
correctly, so the UI shows incorrect `minimum` and `median` metric values. The 
regression was introduced in #39311.
   
   ### Why are the changes needed?
   `SIZE`, `TIMING` and `NS_TIMING` metrics are 
[created](https://github.com/apache/spark/blob/9f22fa4d2acfcbc42d6d76a28778885cbdad733d/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala#L155-L182)
 with the initial value `-1` (because `0` is a valid metric value for them). The 
`SQLMetrics.stringValue` method filters out invalid values with the condition 
`value >= 0` before calculating the `min`, `med` and `max` values. However, 
#39311, which landed in Spark `3.4.0`, introduced a regression in which 
`SQLMetric.value` is always `>= 0`. As a result, invalid accumulators with the 
initial value `-1` are no longer recognized as invalid and are not filtered 
out. This needs to be fixed.
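
To illustrate the effect on the displayed statistics, here is a minimal sketch in Python. It is a simplified model, not Spark's code: the helper names `clamped` and `min_med_max` and the sample values are hypothetical, and the clamping function only assumes the behaviour described above (a reported value that is never negative).

```python
from statistics import median

# Sample raw accumulator updates: two tasks never set the metric (-1),
# three tasks reported real values. These numbers are made up.
raw_updates = [-1, -1, 40, 60, 100]


def clamped(raw):
    """Hypothetical model of the regression: the reported value is never negative."""
    return max(raw, 0)


def min_med_max(values):
    """Filter with `value >= 0`, then summarize the remaining values."""
    valid = sorted(v for v in values if v >= 0)
    if not valid:
        return None
    return valid[0], median(valid), valid[-1]


# Raw values: the -1 entries are dropped before summarizing.
correct = min_med_max(raw_updates)

# Clamped values: -1 becomes 0, passes the filter, and skews min and median.
regressed = min_med_max([clamped(v) for v in raw_updates])

print(correct)    # (40, 60, 100)
print(regressed)  # (0, 40, 100)
```

In this model the maximum is unaffected, while the minimum and median are pulled toward `0`, which matches the symptom reported for the UI.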
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, because end users can access accumulator values directly. Users who 
read metric values from the physical plan programmatically should check 
`SQLMetric.isZero` before consuming the value.
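
The guard recommended above can be sketched as follows. This is a hedged Python model only: `MetricModel`, its fields, and `usable_values` are hypothetical stand-ins, and the sketch assumes that a metric counts as "zero" while it still holds its initial value. It does not reproduce the actual `SQLMetric` class.

```python
from dataclasses import dataclass


@dataclass
class MetricModel:
    """Hypothetical stand-in for a metric with a sentinel initial value."""
    raw: int
    zero_value: int = -1  # assumed initial value, as described for SIZE/TIMING metrics

    @property
    def is_zero(self):
        # Assumption: "zero" means the metric was never updated.
        return self.raw == self.zero_value

    @property
    def value(self):
        return self.raw


def usable_values(metrics):
    """Keep only metrics that were actually updated, checking is_zero first."""
    return {name: m.value for name, m in metrics.items() if not m.is_zero}


metrics = {
    "spill size": MetricModel(-1),   # never set, should be skipped
    "sort time": MetricModel(0),     # a legitimate zero, should be kept
    "peak memory": MetricModel(25),
}

usable = usable_values(metrics)
print(usable)  # {'sort time': 0, 'peak memory': 25}
```

Checking the flag rather than comparing the value against a threshold keeps a legitimate `0` distinct from a metric that was never set.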
   
   ### How was this patch tested?
   Existing tests. I also built a new jar for Spark `3.5.1` and confirmed that 
the previously incorrect values are now shown correctly in the Spark UI.
   
   Old UI view: 
   
[old_spark_ui_view_3_5_1.pdf](https://github.com/user-attachments/files/16411445/old_spark_ui_view_3_5_1.pdf)
   
   Fixed UI view: 
   
[new_spark_ui_view_3_5_1.pdf](https://github.com/user-attachments/files/16411447/new_spark_ui_view_3_5_1.pdf)
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

