virrrat commented on PR #47516: URL: https://github.com/apache/spark/pull/47516#issuecomment-2262298606
> Are you sure it may contain -1? You can run the tpc-ds benchmark q1 and check the SQL tab DAG views in driver hosted Spark UI. The minimum and median values are not correct for multiple operator metrics and they don't match with history server values either (as pointed out by @abmodi in the JIRA [ticket](https://issues.apache.org/jira/browse/SPARK-49038)). For incorrect metric cases, the minimum value is always zero while the median value is always less than the actual median value. The reason for this is that there are extra zeros in the `values` parameter of `SQLMetrics.stringValue` method which were actually `-1` and should have filtered out. I'm not a spark expert but given `SQLMetrics.stringValue` method has [this](https://github.com/apache/spark/blob/9f22fa4d2acfcbc42d6d76a28778885cbdad733d/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala#L251) logic to fetch `validValues` itself indicates that the `values` parameter may contain `-1`. > People can access SQLMetric instances in physical plan programmatically. Agree. If users are consuming values directly from listener, they may use `SQLMetric.isZero` before consuming it. We can't compromise the integrity of the `SQLMetric.value` just for this given it is causing regression in the data shown in Spark UI. Either we need to expose two different variables, one to be used for Spark UI metric data calculation and other to be used by users programmatically but not sure if it's a good idea. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
