HeartSaVioR commented on PR #48517: URL: https://github.com/apache/spark/pull/48517#issuecomment-2428109032
UPDATE: @hvanhovell and I had an offline talk. I wasn't very clear about the semantics of the API, and he clarified that its intent is not to cope with default values, but to ensure the node is executed in any case. For example, it is a wrong optimization if any optimization ends up dropping the node. I only see the issue with PruneFilters, so it is easier to make a point fix there, but I need to discuss further with other folks to aim for a better fix.

That said, he also stated that providing a default value is worse than not having the metrics at all, which I agree with given this new understanding of the API's semantics. Users should be able to tell whether Spark failed to calculate the metrics, and that is at least possible before this fix. I'll revert the commit and look for a better fix.
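For context, here is a minimal sketch of the kind of situation being discussed. It assumes the metrics come from `Dataset.observe` (a `CollectMetrics` node); the query shape is only an illustration, not the exact reproduction from this PR:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("prune-filters-metrics-sketch")
  .getOrCreate()
import spark.implicits._

// The observe node sits below an always-false filter. PruneFilters may
// replace the Filter (and its whole child subtree) with an empty
// LocalRelation, so the CollectMetrics node would never be executed.
val df = Seq(1, 2, 3).toDF("value")
  .observe("my_metrics", count(lit(1)).as("rows_seen"))
  .filter(lit(false))

df.collect()
// Whether "my_metrics" is ever reported (e.g. via a QueryExecutionListener
// or the Observation helper) depends on whether the CollectMetrics node
// survived optimization.
```

Under the clarified semantics, dropping the node in a case like this is itself the bug to fix in the optimizer, rather than something to paper over by emitting default metric values.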
