Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/11466#discussion_r55118884
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
@@ -78,6 +78,27 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
}
}
+ private def toSQL(node: LogicalPlan, topNode: Boolean): String = {
--- End diff --
@viirya @gatorsmile Thanks for your work and discussions! The initial
motivation for implementing SQL generation was better native view support.
That makes the following constraints reasonable for the initial version of SQL
generation in Spark 2.0:
1. The target logical plan must be parsed from a valid HiveQL query
statement
This preserves as much of the structure of the original SQL statement as
possible (for example, subquery scoping information), which makes mapping
from plan fragments to their SQL representations much easier.
2. The target logical plan must be _fully_ resolved
There is no guarantee that an unresolved or partially resolved logical plan
is actually valid. Such plans may also contain unwanted auxiliary
expressions / operators like `UnresolvedAlias`, which further complicate SQL
generation.
3. The target logical plan should NOT be optimized
The reasoning is similar to constraint 1: an optimized plan no longer
preserves the structure of the original SQL statement.
Also, for native view support, generating optimal SQL statements is NOT a
requirement. It's OK that we generate verbose and inefficient SQL statements
as long as the Catalyst optimizer can optimize them at runtime.
Ideally, I'd like to remove constraints 1 and 3 in the future, so that SQL
generation can be applied to a wider range of scenarios, e.g., random query
testing. But for Spark 2.0, let's focus first on fully resolved, non-optimized
logical plans parsed from valid HiveQL. Correctness and test coverage are more
important at the current stage.
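To make the scope concrete, here is a toy sketch of what "fully resolved, non-optimized, verbose output is fine" could look like. Everything in it is hypothetical: `Plan`, `Table`, `UnresolvedTable`, `Filter`, `Project`, and `ToySQLBuilder` are stand-ins I made up for illustration, not Catalyst classes and not the code in this PR. It only shows two ideas: reject a plan that is not fully resolved, and wrap every non-leaf child in a subquery without trying to flatten it.

```scala
// Toy model only: these types are hypothetical stand-ins, not Catalyst classes.
sealed trait Plan { def resolved: Boolean }
case class Table(name: String) extends Plan { val resolved = true }
case class UnresolvedTable(name: String) extends Plan { val resolved = false }
case class Filter(condition: String, child: Plan) extends Plan {
  def resolved: Boolean = child.resolved
}
case class Project(columns: Seq[String], child: Plan) extends Plan {
  def resolved: Boolean = child.resolved
}

object ToySQLBuilder {
  // Refuse anything that is not fully resolved (constraint 2 above).
  def toSQL(plan: Plan): Either[String, String] =
    if (!plan.resolved) Left("plan is not fully resolved")
    else Right(build(plan, 0))

  // One SELECT per operator; no attempt to merge Project and Filter.
  private def build(plan: Plan, depth: Int): String = plan match {
    case Table(name) => name
    case Filter(cond, child) =>
      s"SELECT * FROM ${source(child, depth)} WHERE $cond"
    case Project(cols, child) =>
      s"SELECT ${cols.mkString(", ")} FROM ${source(child, depth)}"
    case UnresolvedTable(name) =>
      // Unreachable through toSQL, which checks `resolved` first.
      throw new IllegalStateException(s"unresolved table: $name")
  }

  // Leaf tables are referenced directly; anything else becomes an aliased subquery.
  private def source(child: Plan, depth: Int): String = child match {
    case Table(name) => name
    case other => s"(${build(other, depth + 1)}) AS sub_$depth"
  }
}
```

With these toy definitions, `ToySQLBuilder.toSQL(Project(Seq("a"), Filter("a > 1", Table("t"))))` yields `Right("SELECT a FROM (SELECT * FROM t WHERE a > 1) AS sub_0")`. The output is more verbose than a hand-written query, which is acceptable here because the optimizer can collapse it once the generated SQL is parsed again.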
So my suggestion is:
1. Revert this change
2. Revisit it after we finish SQL generation support for all major language
structures
Window functions and generators are not supported yet, so I'm afraid we
may miss important cases if we try to do this work now.