huonw opened a new pull request #24414: [SPARK-22044][SQL] Add `cost` and `codegen` arguments to `explain`
URL: https://github.com/apache/spark/pull/24414

## What changes were proposed in this pull request?

In SQL it is easy to see the inferred statistics (`EXPLAIN COST`) and the generated code (`EXPLAIN CODEGEN`), but doing so via the Dataset/DataFrame APIs was much more annoying. It was worse from PySpark, and worse still from SparkR, because the workaround in each case required dropping down to call JVM functions directly.

This patch exposes this information via an overload of `explain` that takes three boolean arguments (`extended`, `cost` and `codegen`). It does not replace the old `explain` overloads (to keep backwards compatibility), and it uses booleans so that PySpark and SparkR callers can use it easily. The PySpark and SparkR `explain` functions are extended to include these extra arguments too.

## How was this patch tested?

Added unit tests.
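
For comparison, the SQL route mentioned in the description can be sketched in plain Python. This is only an illustration and not code from the patch: the helper name `explain_statement` is hypothetical, its boolean parameters merely mirror the argument names listed above, and rejecting more than one flag at a time is a simplifying assumption (the description does not say how the new overload combines them).

```python
def explain_statement(query, extended=False, cost=False, codegen=False):
    """Build a Spark SQL EXPLAIN statement for `query`.

    Hypothetical helper for illustration only; it is not part of the patch.
    """
    # Map each boolean flag to the SQL keyword it stands for.
    flags = {"EXTENDED": extended, "COST": cost, "CODEGEN": codegen}
    chosen = [keyword for keyword, enabled in flags.items() if enabled]

    # Assumption: allow at most one mode per statement.
    if len(chosen) > 1:
        raise ValueError("choose at most one of: extended, cost, codegen")

    return " ".join(["EXPLAIN"] + chosen + [query])


print(explain_statement("SELECT 1"))             # EXPLAIN SELECT 1
print(explain_statement("SELECT 1", cost=True))  # EXPLAIN COST SELECT 1
```

Given an existing `SparkSession` named `spark`, the resulting string could be passed to `spark.sql(...)` and the returned plan displayed with `.show(truncate=False)`.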
