virrrat opened a new pull request, #47516: URL: https://github.com/apache/spark/pull/47516
### What changes were proposed in this pull request?

This patch fixes an issue in the driver-hosted Spark UI `SQL` tab DAG view where invalid SQL metric values are not filtered out correctly, so the UI shows incorrect `minimum` and `median` metric values. The regression was introduced in #39311.

### Why are the changes needed?

`SIZE`, `TIMING` and `NS_TIMING` metrics are [created](https://github.com/apache/spark/blob/9f22fa4d2acfcbc42d6d76a28778885cbdad733d/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala#L155-L182) with the initial value `-1`, because `0` is a valid metric value for them. The `SQLMetrics.stringValue` method filters out invalid values with the condition `value >= 0` before calculating the `min`, `med` and `max` values. However, #39311, released in Spark `3.4.0`, introduced a regression where `SQLMetric.value` is always `>= 0`. As a result, accumulators that still hold the initial value `-1` are no longer recognized as invalid and pass the filter instead of being dropped. This needs to be fixed.

### Does this PR introduce _any_ user-facing change?

Yes, as end users can access accumulator values directly. Users who access the values in the physical plan programmatically should check `SQLMetric.isZero` before consuming a metric's value.

### How was this patch tested?

Existing tests. In addition, a new jar was built for Spark `3.5.1` to confirm that the previously incorrect data is now shown correctly in the Spark UI.

Old UI view: [old_spark_ui_view_3_5_1.pdf](https://github.com/user-attachments/files/16411445/old_spark_ui_view_3_5_1.pdf)

Fixed UI view: [new_spark_ui_view_3_5_1.pdf](https://github.com/user-attachments/files/16411447/new_spark_ui_view_3_5_1.pdf)

### Was this patch authored or co-authored using generative AI tooling?

No

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
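The filtering problem described under "Why are the changes needed?" can be illustrated with a small, self-contained model. This is only a sketch in Python, not Spark's Scala code: `INVALID`, `clamped` and `summarize` are hypothetical names, the sample values are made up, and `statistics.median` is assumed as a stand-in for however Spark computes `med`.

```python
from statistics import median

# Initial value for SIZE/TIMING/NS_TIMING metrics, per the PR description.
INVALID = -1


def clamped(raw):
    """Hypothetical model of an accessor that never returns a negative value."""
    return max(raw, 0)


def summarize(values):
    """Keep values >= 0, then return (min, med, max), or None if none remain."""
    valid = sorted(v for v in values if v >= 0)
    if not valid:
        return None
    return (valid[0], median(valid), valid[-1])


# Made-up per-task values: two tasks never updated the metric.
raw = [INVALID, 40, INVALID, 10, 30]

# Filtering the raw values drops both sentinels.
print(summarize(raw))                        # -> (10, 30, 40)

# Filtering after clamping keeps them as 0, which skews min and med.
print(summarize([clamped(v) for v in raw]))  # -> (0, 10, 40)
```

In this model the maximum is unaffected, while the minimum and median are pulled toward zero, which matches the symptom reported for the UI.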
To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
