Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/23096#discussion_r235309483
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ---
@@ -648,7 +648,11 @@ class SparkSession private(
* @since 2.0.0
*/
def sql(sqlText: String): DataFrame = {
- Dataset.ofRows(self, sessionState.sqlParser.parsePlan(sqlText))
+ val tracker = new QueryPlanningTracker
--- End diff ---
I don't think it makes sense to add random flags for everything. If the
argument is that this change has a decent chance of introducing regressions
(e.g. due to higher memory usage, or CPU overhead), then it would make a lot
of sense to put it behind a flag so it can be disabled in production if that
happens.
That said, the overhead on the hot code path here is substantially smaller
than even transforming the simplest Catalyst plan (a hash map lookup is orders
of magnitude cheaper than calling a partial function to transform a Scala
collection for TreeNode), so I think the risk is low enough that it does not
warrant adding a config.
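For readers following the thread: the cost profile being argued about can be
sketched as below. This is a hypothetical illustration, not the actual diff
(which is truncated above) — the class name `PhaseTracker`, the method
`measurePhase`, and the phase key are all illustrative stand-ins for a
QueryPlanningTracker-style helper. The point is that the per-query cost of
tracking is one closure invocation plus a hash map write, which is far cheaper
than walking a Catalyst tree.

```scala
import scala.collection.mutable

// Hypothetical sketch of a tracker-style helper; the real class in the PR
// may differ. Records per-phase elapsed wall-clock time in a hash map, so
// each tracked phase costs one closure call plus one map insert.
class PhaseTracker {
  // phase name -> accumulated elapsed nanoseconds
  private val phases = mutable.HashMap.empty[String, Long]

  def measurePhase[T](phase: String)(f: => T): T = {
    val start = System.nanoTime()
    val result = f
    phases.put(phase, phases.getOrElse(phase, 0L) + (System.nanoTime() - start))
    result
  }

  def phaseTimes: Map[String, Long] = phases.toMap
}

// Usage sketch: wrap the parsing call, then hand the plan onward.
// val tracker = new PhaseTracker
// val plan = tracker.measurePhase("parsing") { sqlParser.parsePlan(sqlText) }
```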
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]