davidm-db commented on code in PR #47403:
URL: https://github.com/apache/spark/pull/47403#discussion_r1698602295
##########
sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##########
@@ -650,14 +657,27 @@ class SparkSession private(
private[sql] def sql(sqlText: String, args: Array[_], tracker: QueryPlanningTracker): DataFrame =
withActive {
val plan = tracker.measurePhase(QueryPlanningTracker.PARSING) {
- val parsedPlan = sessionState.sqlParser.parsePlan(sqlText)
- if (args.nonEmpty) {
- PosParameterizedQuery(parsedPlan, args.map(lit(_).expr).toImmutableArraySeq)
- } else {
- parsedPlan
+ val parsedPlan = sessionState.sqlParser.parseScript(sqlText)
+ parsedPlan match {
+ case CompoundBody(Seq(singleStmtPlan: SingleStatement), label) if args.nonEmpty =>
+ CompoundBody(Seq(SingleStatement(
+ PosParameterizedQuery(
+ singleStmtPlan.parsedPlan, args.map(lit(_).expr).toImmutableArraySeq))), label)
+ case p =>
+ assert(args.isEmpty, "Named parameters are not supported for batch queries")
+ p
}
}
- Dataset.ofRows(self, plan, tracker)
+
+ plan match {
+ case CompoundBody(Seq(singleStmtPlan: SingleStatement), _) =>
+ Dataset.ofRows(self, singleStmtPlan.parsedPlan, tracker)
+ case _ =>
+ // execute the plan directly if it is not a single statement
+ val lastRow = executeScript(plan).foldLeft(Array.empty[Row])((_, next) => next)
Review Comment:
Let's consider whether we want to do it exactly this way, because:
- `executeScript` is basically a one-liner that aliases the interpreter's `execute` function.
- When we introduce multiple results in the future, it seems best to:
  - have `executeMultipleResults` in the interpreter;
  - have each function (`execute`, `executeMultipleResults`, and maybe something new?) collect data based on the type of data it needs to return.

I propose that the `execute` family of methods in the interpreter be responsible for deciding which data is returned, instead of fetching the last row here in `SparkSession`.

I haven't written out many details here; this comment is a reminder, and we can discuss more offline.
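
To make the proposal a bit more concrete, here is a rough sketch of the shape I have in mind. Everything in it is a hypothetical stand-in for illustration only: `Row`, `Statement`, and `ScriptInterpreter` are simplified placeholders, not the actual classes from this PR or from Spark, and the real signatures would likely differ.

```scala
// Hypothetical stand-in for a result row; not org.apache.spark.sql.Row.
final case class Row(values: Seq[Any])

// Hypothetical stand-in for one statement of a script.
// `run` yields the rows that the statement produces when executed.
final case class Statement(text: String, run: () => Array[Row])

// Hypothetical interpreter: each `execute*` method decides what it returns,
// so the caller never has to pick the "last row" itself.
class ScriptInterpreter {

  // Lazily executes the statements in order, one result set per statement.
  private def results(script: Seq[Statement]): Iterator[Array[Row]] =
    script.iterator.map(_.run())

  // Runs every statement and keeps only the last statement's rows.
  // An empty script yields an empty array.
  def execute(script: Seq[Statement]): Array[Row] =
    results(script).foldLeft(Array.empty[Row])((_, next) => next)

  // Runs every statement and returns all result sets, in script order.
  def executeMultipleResults(script: Seq[Statement]): Seq[Array[Row]] =
    results(script).toList
}
```

With something along these lines, the non-single-statement branch in `SparkSession.sql` would only call the interpreter's `execute` and wrap whatever it returns, and the `foldLeft` would move out of `SparkSession`.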