Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


miland-db commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856780654


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecution.scala:
##
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.catalyst.plans.logical.{CommandResult, CompoundBody}
+
+/**
+ * SQL scripting executor - executes script and returns result statements.
+ */
+class SqlScriptingExecution(
+    sqlScript: CompoundBody,
+    session: SparkSession,
+    args: Map[String, Expression]) extends Iterator[DataFrame] {
+
+  // Build the execution plan for the script
+  private val executionPlan: Iterator[CompoundStatementExec] =
+    SqlScriptingInterpreter(session).buildExecutionPlan(sqlScript, args)
+
+  private var current = getNextResult
+
+  override def hasNext: Boolean = current.isDefined
+
+  override def next(): DataFrame = {
+    if (!hasNext) {
+      throw new NoSuchElementException("No more statements to execute")
+    }
+    val nextDataFrame = current.get
+    current = getNextResult
+    nextDataFrame
+  }
+
+  /** Helper method to iterate through statements until next result statement is encountered */
+  private def getNextResult: Option[DataFrame] = {
+    var currentStatement = if (executionPlan.hasNext) Some(executionPlan.next()) else None
+    // While we don't have a result statement, execute the statements
+    while (currentStatement.isDefined) {
+      currentStatement match {
+        case Some(stmt: SingleStatementExec) if !stmt.isExecuted =>
+          withErrorHandling() {
+            val df = stmt.buildDataFrame(session)
+            if (df.logicalPlan.isInstanceOf[CommandResult]) {
+              // If the statement is not a result, we need to write it to a noop sink to execute it
+              df.write.format("noop").mode("overwrite").save()
+            } else {
+              // If the statement is a result, we need to return it to the caller
+              return Some(df)
+            }
+          }
+        case _ => // pass
+      }
+      currentStatement = if (executionPlan.hasNext) Some(executionPlan.next()) else None
+    }
+    None
+  }
+
+  private def handleException(e: Exception): Unit = {
+    // Rethrow the exception
+    // TODO: SPARK-48353 Add error handling for SQL scripts
+    throw e
+  }
+
+  def withErrorHandling()(f: => Unit): Unit = {

Review Comment:
   Done.






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


dusantism-db commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856718243


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecution.scala:
##
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.catalyst.plans.logical.{CommandResult, CompoundBody}
+
+/**
+ * SQL scripting executor - executes script and returns result statements.
+ */
+class SqlScriptingExecution(
+    sqlScript: CompoundBody,
+    session: SparkSession,
+    args: Map[String, Expression]) extends Iterator[DataFrame] {
+
+  // Build the execution plan for the script
+  private val executionPlan: Iterator[CompoundStatementExec] =
+    SqlScriptingInterpreter(session).buildExecutionPlan(sqlScript, args)
+
+  private var current = getNextResult
+
+  override def hasNext: Boolean = current.isDefined
+
+  override def next(): DataFrame = {
+    if (!hasNext) {
+      throw new NoSuchElementException("No more statements to execute")
+    }
+    val nextDataFrame = current.get
+    current = getNextResult
+    nextDataFrame
+  }
+
+  /** Helper method to iterate through statements until next result statement is encountered */
+  private def getNextResult: Option[DataFrame] = {
+    var currentStatement = if (executionPlan.hasNext) Some(executionPlan.next()) else None
+    // While we don't have a result statement, execute the statements
+    while (currentStatement.isDefined) {
+      currentStatement match {
+        case Some(stmt: SingleStatementExec) if !stmt.isExecuted =>
+          withErrorHandling() {
+            val df = stmt.buildDataFrame(session)
+            if (df.logicalPlan.isInstanceOf[CommandResult]) {
+              // If the statement is not a result, we need to write it to a noop sink to execute it
+              df.write.format("noop").mode("overwrite").save()
+            } else {
+              // If the statement is a result, we need to return it to the caller
+              return Some(df)
+            }
+          }
+        case _ => // pass
+      }
+      currentStatement = if (executionPlan.hasNext) Some(executionPlan.next()) else None
+    }
+    None
+  }
+
+  private def handleException(e: Exception): Unit = {
+    // Rethrow the exception
+    // TODO: SPARK-48353 Add error handling for SQL scripts
+    throw e
+  }
+
+  def withErrorHandling()(f: => Unit): Unit = {

Review Comment:
   nit:
   ```suggestion
  def withErrorHandling(f: => Unit): Unit = {
   ```






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


miland-db commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856649688


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecution.scala:
##
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.catalyst.plans.logical.{CommandResult, CompoundBody}
+
+/**
+ * SQL scripting executor - executes script and returns result statements.
+ */
+class SqlScriptingExecution(
+    sqlScript: CompoundBody,
+    session: SparkSession,
+    args: Map[String, Expression]) extends Iterator[DataFrame] {
+
+  // Build the execution plan for the script
+  private val executionPlan: Iterator[CompoundStatementExec] =
+    SqlScriptingInterpreter(session).buildExecutionPlan(sqlScript, args)
+
+  private var current = getNextResult
+
+  override def hasNext: Boolean = current.isDefined
+
+  override def next(): DataFrame = {
+    if (!hasNext) {
+      throw new NoSuchElementException("No more statements to execute")
+    }
+    val nextDataFrame = current.get
+    current = getNextResult
+    nextDataFrame
+  }
+
+  /** Helper method to iterate through statements until next result statement is encountered */
+  private def getNextResult: Option[DataFrame] = {
+    var currentStatement = if (executionPlan.hasNext) Some(executionPlan.next()) else None
+    // While we don't have a result statement, execute the statements
+    while (currentStatement.isDefined) {
+      currentStatement match {
+        case Some(stmt: SingleStatementExec) if !stmt.isExecuted =>
+          withErrorHandling() {
+            val df = stmt.buildDataFrame(session)
+            if (df.logicalPlan.isInstanceOf[CommandResult]) {
+              // If the statement is not a result, we need to write it to a noop sink to execute it

Review Comment:
   Should we still have `df.write.format("noop").save()` for the `MultiResult`, or is it not needed?






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856646155


##
sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingE2eSuite.scala:
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.SparkConf
+import org.apache.spark.sql.{QueryTest, Row}
+import org.apache.spark.sql.catalyst.plans.logical.CompoundBody
+import org.apache.spark.sql.catalyst.util.QuotingUtils.toSQLConf
+import org.apache.spark.sql.exceptions.SqlScriptingException
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSparkSession
+
+
+/**
+ * End-to-end tests for SQL Scripting.
+ * This suite is not intended to heavily test the SQL scripting (parser & interpreter) logic.
+ * It is rather focused on testing the sql() API - whether it can handle SQL scripts correctly,
+ * results are returned in expected manner, config flags are applied properly, etc.
+ * For full functionality tests, see SqlScriptingParserSuite and SqlScriptingInterpreterSuite.
+ */
+class SqlScriptingE2eSuite extends QueryTest with SharedSparkSession {
+  // Helpers
+  private def verifySqlScriptResult(sqlText: String, expected: Seq[Row]): Unit = {
+    val df = spark.sql(sqlText)
+    checkAnswer(df, expected)
+  }
+
+  private def verifySqlScriptResultWithNamedParams(
+      sqlText: String,
+      expected: Seq[Row],
+      args: Map[String, Any]): Unit = {
+    val df = spark.sql(sqlText, args)
+    checkAnswer(df, expected)
+  }
+
+  // Tests setup
+  override protected def sparkConf: SparkConf = {
+    super.sparkConf.set(SQLConf.SQL_SCRIPTING_ENABLED.key, "true")
+  }
+
+  // Tests
+  test("SQL Scripting not enabled") {
+    withSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key -> "false") {
+      val sqlScriptText =
+        """
+          |BEGIN
+          |  SELECT 1;
+          |END""".stripMargin
+      checkError(
+        exception = intercept[SqlScriptingException] {
+          spark.sql(sqlScriptText).asInstanceOf[CompoundBody]
+        },
+        condition = "UNSUPPORTED_FEATURE.SQL_SCRIPTING",
+        parameters = Map("sqlScriptingEnabled" -> toSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key)))
+    }
+  }
+
+  test("single select") {
+    val sqlText = "SELECT 1;"
+    verifySqlScriptResult(sqlText, Seq(Row(1)))
+  }
+
+  test("multiple selects") {
+    val sqlText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  SELECT 2;
+        |END""".stripMargin
+    verifySqlScriptResult(sqlText, Seq(Row(2)))
+  }
+
+  test("multi statement - simple") {
+    withTable("t") {
+      val sqlScript =
+        """
+          |BEGIN
+          |  CREATE TABLE t (a INT, b STRING, c DOUBLE) USING parquet;
+          |  INSERT INTO t VALUES (1, 'a', 1.0);
+          |  SELECT a FROM t;
+          |END
+          |""".stripMargin
+      verifySqlScriptResult(sqlScript, Seq(Row(1)))
+    }
+  }
+
+  test("last statement without result") {
+    val sqlScript =
+      """
+        |BEGIN
+        |  DECLARE x INT;
+        |  SET x = 1;
+        |  DROP TEMPORARY VARIABLE x;
+        |END
+        |""".stripMargin
+    verifySqlScriptResult(sqlScript, Seq.empty)
+  }
+
+  test("named params") {
+    val sqlScriptText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  IF :param_1 > 10 THEN
+        |    SELECT :param_2;
+        |  ELSE
+        |    SELECT :param_3;
+        |  END IF;
+        |END""".stripMargin
+    // Define a map with SQL parameters
+    val args: Map[String, Any] = Map(
+      "param_1" -> 5,
+      "param_2" -> "greater",
+      "param_3" -> "smaller"
+    )
+    verifySqlScriptResultWithNamedParams(sqlScriptText, Seq(Row("smaller")), args)
+  }
+
+  test("positional params") {
+    val sqlScriptText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  IF ? > 10 THEN
+        |    SELECT ?;
+        |  ELSE
+        |    SELECT ?;
+        |  END IF;
+        |END""".stripMargin
+    // Define an array with SQL parameters in the correct order
+    val args: Array[Any] = Array(5, "greater", "smaller")
+    checkError(
+      exception = intercept[SqlScriptingException

Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856645162


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecution.scala:
##
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.catalyst.plans.logical.{CommandResult, CompoundBody}
+
+/**
+ * SQL scripting executor - executes script and returns result statements.
+ */
+class SqlScriptingExecution(
+    sqlScript: CompoundBody,
+    session: SparkSession,
+    args: Map[String, Expression]) extends Iterator[DataFrame] {
+
+  // Build the execution plan for the script
+  private val executionPlan: Iterator[CompoundStatementExec] =
+    SqlScriptingInterpreter(session).buildExecutionPlan(sqlScript, args)
+
+  private var current = getNextResult
+
+  override def hasNext: Boolean = current.isDefined
+
+  override def next(): DataFrame = {
+    if (!hasNext) {
+      throw new NoSuchElementException("No more statements to execute")
+    }
+    val nextDataFrame = current.get
+    current = getNextResult
+    nextDataFrame
+  }
+
+  /** Helper method to iterate through statements until next result statement is encountered */
+  private def getNextResult: Option[DataFrame] = {
+    var currentStatement = if (executionPlan.hasNext) Some(executionPlan.next()) else None
+    // While we don't have a result statement, execute the statements
+    while (currentStatement.isDefined) {
+      currentStatement match {
+        case Some(stmt: SingleStatementExec) if !stmt.isExecuted =>
+          withErrorHandling() {
+            val df = stmt.buildDataFrame(session)
+            if (df.logicalPlan.isInstanceOf[CommandResult]) {
+              // If the statement is not a result, we need to write it to a noop sink to execute it

Review Comment:
   BTW let's also exclude `MultiResult` here, which is the result of invoking stored procedures, and not a query either.
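
   For illustration, a minimal sketch of what that check could look like as a helper inside `SqlScriptingExecution` (a sketch only, assuming `MultiResult` is importable from `org.apache.spark.sql.catalyst.plans.logical` and reusing the package-private `df.logicalPlan` accessor the file already uses):
   ```scala
   import org.apache.spark.sql.DataFrame
   import org.apache.spark.sql.catalyst.plans.logical.{CommandResult, MultiResult}

   // Sketch: neither eagerly executed commands (CommandResult) nor
   // stored-procedure results (MultiResult) count as query results.
   private def isQueryResult(df: DataFrame): Boolean = df.logicalPlan match {
     case _: CommandResult | _: MultiResult => false
     case _ => true
   }
   ```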






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856643188


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecution.scala:
##
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.catalyst.plans.logical.{CommandResult, CompoundBody}
+
+/**
+ * SQL scripting executor - executes script and returns result statements.
+ */
+class SqlScriptingExecution(
+    sqlScript: CompoundBody,
+    session: SparkSession,
+    args: Map[String, Expression]) extends Iterator[DataFrame] {
+
+  // Build the execution plan for the script
+  private val executionPlan: Iterator[CompoundStatementExec] =
+    SqlScriptingInterpreter(session).buildExecutionPlan(sqlScript, args)
+
+  private var current = getNextResult
+
+  override def hasNext: Boolean = current.isDefined
+
+  override def next(): DataFrame = {
+    if (!hasNext) {
+      throw new NoSuchElementException("No more statements to execute")
+    }
+    val nextDataFrame = current.get
+    current = getNextResult
+    nextDataFrame
+  }
+
+  /** Helper method to iterate through statements until next result statement is encountered */
+  private def getNextResult: Option[DataFrame] = {
+    var currentStatement = if (executionPlan.hasNext) Some(executionPlan.next()) else None
+    // While we don't have a result statement, execute the statements
+    while (currentStatement.isDefined) {
+      currentStatement match {
+        case Some(stmt: SingleStatementExec) if !stmt.isExecuted =>
+          withErrorHandling() {
+            val df = stmt.buildDataFrame(session)
+            if (df.logicalPlan.isInstanceOf[CommandResult]) {
+              // If the statement is not a result, we need to write it to a noop sink to execute it

Review Comment:
   since commands are eagerly executed, I think we can just do nothing here and move to the next iteration. `CommandResult` indicates that the command has been executed and the result is kept here.
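
   A sketch of how `getNextResult` could look with the no-op write dropped (hypothetical, error handling elided):
   ```scala
   // Commands arrive already executed - their analyzed plan is CommandResult -
   // so the loop simply skips them and keeps scanning for the next query.
   private def getNextResult: Option[DataFrame] = {
     var currentStatement = if (executionPlan.hasNext) Some(executionPlan.next()) else None
     while (currentStatement.isDefined) {
       currentStatement match {
         case Some(stmt: SingleStatementExec) if !stmt.isExecuted =>
           val df = stmt.buildDataFrame(session)
           if (!df.logicalPlan.isInstanceOf[CommandResult]) {
             return Some(df) // a query: this is a result statement
           }
           // a command: already executed eagerly, nothing more to do
         case _ => // pass
       }
       currentStatement = if (executionPlan.hasNext) Some(executionPlan.next()) else None
     }
     None
   }
   ```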






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


miland-db commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856507402


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala:
##
@@ -122,6 +130,20 @@ class SingleStatementExec(
    */
   var isExecuted = false
 
+  /**
+   * Plan with named parameters.
+   */
+  lazy val resolvedPlan: LogicalPlan = {
+    if (args.nonEmpty) {
+      NameParameterizedQuery(parsedPlan, args)
+    } else {
+      parsedPlan
+    }
+  }
+
+  /** Statement is result if it is a SELECT query, and it is not in control flow condition */
+  override def isResult: Boolean = parsedPlan.isInstanceOf[Project] && !isExecuted

Review Comment:
   IIUC, `SqlScriptingExecution` should be responsible for checking whether the current `DataFrame.logicalPlan` is `CommandResult` or not, instead of `SingleStatementExec` deciding based on its logical plan?






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856466335


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala:
##
@@ -122,6 +130,20 @@ class SingleStatementExec(
    */
   var isExecuted = false
 
+  /**
+   * Plan with named parameters.
+   */
+  lazy val resolvedPlan: LogicalPlan = {
+    if (args.nonEmpty) {
+      NameParameterizedQuery(parsedPlan, args)
+    } else {
+      parsedPlan
+    }
+  }
+
+  /** Statement is result if it is a SELECT query, and it is not in control flow condition */
+  override def isResult: Boolean = parsedPlan.isInstanceOf[Project] && !isExecuted

Review Comment:
   With SQL pipe syntax, the root node may not always be `Project`. I think it's simpler to let the caller side check `df.logicalPlan`. If it's `CommandResult`, then it's not a query.
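
   A minimal sketch of that caller-side check (hypothetical helper, reusing the package-private `df.logicalPlan` accessor):
   ```scala
   // Classify on the analyzed plan rather than the parsed plan's root node:
   // pipe-syntax queries (root not Project) still count as results, while
   // eagerly executed commands (CommandResult) do not.
   def isQuery(df: DataFrame): Boolean =
     !df.logicalPlan.isInstanceOf[CommandResult]
   ```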






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


miland-db commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856507402


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala:
##
@@ -122,6 +130,20 @@ class SingleStatementExec(
    */
   var isExecuted = false
 
+  /**
+   * Plan with named parameters.
+   */
+  lazy val resolvedPlan: LogicalPlan = {
+    if (args.nonEmpty) {
+      NameParameterizedQuery(parsedPlan, args)
+    } else {
+      parsedPlan
+    }
+  }
+
+  /** Statement is result if it is a SELECT query, and it is not in control flow condition */
+  override def isResult: Boolean = parsedPlan.isInstanceOf[Project] && !isExecuted

Review Comment:
   IIUC, `SqlScriptingExecution` should be responsible for checking whether the current `DataFrame.logicalPlan` is `Project` or not, instead of `SingleStatementExec` deciding based on its logical plan?
   






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


miland-db commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856495434


##
sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingE2eSuite.scala:
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.SparkConf
+import org.apache.spark.sql.{QueryTest, Row}
+import org.apache.spark.sql.catalyst.plans.logical.CompoundBody
+import org.apache.spark.sql.catalyst.util.QuotingUtils.toSQLConf
+import org.apache.spark.sql.exceptions.SqlScriptingException
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSparkSession
+
+
+/**
+ * End-to-end tests for SQL Scripting.
+ * This suite is not intended to heavily test the SQL scripting (parser & interpreter) logic.
+ * It is rather focused on testing the sql() API - whether it can handle SQL scripts correctly,
+ * results are returned in expected manner, config flags are applied properly, etc.
+ * For full functionality tests, see SqlScriptingParserSuite and SqlScriptingInterpreterSuite.
+ */
+class SqlScriptingE2eSuite extends QueryTest with SharedSparkSession {
+  // Helpers
+  private def verifySqlScriptResult(sqlText: String, expected: Seq[Row]): Unit = {
+    val df = spark.sql(sqlText)
+    checkAnswer(df, expected)
+  }
+
+  private def verifySqlScriptResultWithNamedParams(
+      sqlText: String,
+      expected: Seq[Row],
+      args: Map[String, Any]): Unit = {
+    val df = spark.sql(sqlText, args)
+    checkAnswer(df, expected)
+  }
+
+  // Tests setup
+  override protected def sparkConf: SparkConf = {
+    super.sparkConf.set(SQLConf.SQL_SCRIPTING_ENABLED.key, "true")
+  }
+
+  // Tests
+  test("SQL Scripting not enabled") {
+    withSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key -> "false") {
+      val sqlScriptText =
+        """
+          |BEGIN
+          |  SELECT 1;
+          |END""".stripMargin
+      checkError(
+        exception = intercept[SqlScriptingException] {
+          spark.sql(sqlScriptText).asInstanceOf[CompoundBody]
+        },
+        condition = "UNSUPPORTED_FEATURE.SQL_SCRIPTING",
+        parameters = Map("sqlScriptingEnabled" -> toSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key)))
+    }
+  }
+
+  test("single select") {
+    val sqlText = "SELECT 1;"
+    verifySqlScriptResult(sqlText, Seq(Row(1)))
+  }
+
+  test("multiple selects") {
+    val sqlText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  SELECT 2;
+        |END""".stripMargin
+    verifySqlScriptResult(sqlText, Seq(Row(2)))
+  }
+
+  test("multi statement - simple") {
+    withTable("t") {
+      val sqlScript =
+        """
+          |BEGIN
+          |  CREATE TABLE t (a INT, b STRING, c DOUBLE) USING parquet;
+          |  INSERT INTO t VALUES (1, 'a', 1.0);
+          |  SELECT a FROM t;
+          |END
+          |""".stripMargin
+      verifySqlScriptResult(sqlScript, Seq(Row(1)))
+    }
+  }
+
+  test("last statement without result") {
+    val sqlScript =
+      """
+        |BEGIN
+        |  DECLARE x INT;
+        |  SET x = 1;
+        |  DROP TEMPORARY VARIABLE x;
+        |END
+        |""".stripMargin
+    verifySqlScriptResult(sqlScript, Seq.empty)
+  }
+
+  test("named params") {
+    val sqlScriptText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  IF :param_1 > 10 THEN
+        |    SELECT :param_2;
+        |  ELSE
+        |    SELECT :param_3;
+        |  END IF;
+        |END""".stripMargin
+    // Define a map with SQL parameters
+    val args: Map[String, Any] = Map(
+      "param_1" -> 5,
+      "param_2" -> "greater",
+      "param_3" -> "smaller"
+    )
+    verifySqlScriptResultWithNamedParams(sqlScriptText, Seq(Row("smaller")), args)
+  }
+
+  test("positional params") {
+    val sqlScriptText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  IF ? > 10 THEN
+        |    SELECT ?;
+        |  ELSE
+        |    SELECT ?;
+        |  END IF;
+        |END""".stripMargin
+    // Define an array with SQL parameters in the correct order
+    val args: Array[Any] = Array(5, "greater", "smaller")
+    checkError(
+      exception = intercept[SqlScriptingException

Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


miland-db commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856491695


##
sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##
@@ -411,6 +413,34 @@ class SparkSession private(
    |  Everything else  |
    * ----------------- */
 
+  private def executeSqlScript(
+      script: CompoundBody,
+      args: Map[String, Expression] = Map.empty): DataFrame = {
+    val sse = new SqlScriptingExecution(script, this, args)
+    var df: DataFrame = null

Review Comment:
   `SqlScriptingExecution`'s `next` method returns a `DataFrame`. We can remove this `var` and use a `val` inside the while loop.
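
   A sketch of that shape (hypothetical rewrite, keeping only `result` mutable):
   ```scala
   val sse = new SqlScriptingExecution(script, this, args)
   var result: Option[Seq[Row]] = None
   while (sse.hasNext) {
     val df = sse.next() // val, scoped to a single iteration
     if (sse.hasNext) {
       // Intermediate result statements are executed for side effects only.
       df.write.format("noop").mode("overwrite").save()
     } else {
       result = Some(df.collect().toSeq)
     }
   }
   ```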






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


miland-db commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856481204


##
sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingE2eSuite.scala:
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.SparkConf
+import org.apache.spark.sql.{QueryTest, Row}
+import org.apache.spark.sql.catalyst.plans.logical.CompoundBody
+import org.apache.spark.sql.catalyst.util.QuotingUtils.toSQLConf
+import org.apache.spark.sql.exceptions.SqlScriptingException
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSparkSession
+
+
+/**
+ * End-to-end tests for SQL Scripting.
+ * This suite is not intended to heavily test the SQL scripting (parser & interpreter) logic.
+ * It is rather focused on testing the sql() API - whether it can handle SQL scripts correctly,
+ * results are returned in expected manner, config flags are applied properly, etc.
+ * For full functionality tests, see SqlScriptingParserSuite and SqlScriptingInterpreterSuite.
+ */
+class SqlScriptingE2eSuite extends QueryTest with SharedSparkSession {
+  // Helpers
+  private def verifySqlScriptResult(sqlText: String, expected: Seq[Row]): Unit = {
+    val df = spark.sql(sqlText)
+    checkAnswer(df, expected)
+  }
+
+  private def verifySqlScriptResultWithNamedParams(
+      sqlText: String,
+      expected: Seq[Row],
+      args: Map[String, Any]): Unit = {
+    val df = spark.sql(sqlText, args)
+    checkAnswer(df, expected)
+  }
+
+  // Tests setup
+  override protected def sparkConf: SparkConf = {
+    super.sparkConf.set(SQLConf.SQL_SCRIPTING_ENABLED.key, "true")
+  }
+
+  // Tests
+  test("SQL Scripting not enabled") {
+    withSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key -> "false") {
+      val sqlScriptText =
+        """
+          |BEGIN
+          |  SELECT 1;
+          |END""".stripMargin
+      checkError(
+        exception = intercept[SqlScriptingException] {
+          spark.sql(sqlScriptText).asInstanceOf[CompoundBody]
+        },
+        condition = "UNSUPPORTED_FEATURE.SQL_SCRIPTING",
+        parameters = Map("sqlScriptingEnabled" -> toSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key)))
+    }
+  }
+
+  test("single select") {
+    val sqlText = "SELECT 1;"
+    verifySqlScriptResult(sqlText, Seq(Row(1)))
+  }
+
+  test("multiple selects") {
+    val sqlText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  SELECT 2;
+        |END""".stripMargin
+    verifySqlScriptResult(sqlText, Seq(Row(2)))
+  }
+
+  test("multi statement - simple") {
+    withTable("t") {
+      val sqlScript =
+        """
+          |BEGIN
+          |  CREATE TABLE t (a INT, b STRING, c DOUBLE) USING parquet;
+          |  INSERT INTO t VALUES (1, 'a', 1.0);
+          |  SELECT a FROM t;
+          |END
+          |""".stripMargin
+      verifySqlScriptResult(sqlScript, Seq(Row(1)))
+    }
+  }
+
+  test("last statement without result") {

Review Comment:
   Added a new E2E test with an empty script.






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


miland-db commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856480336


##
sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingE2eSuite.scala:
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.SparkConf
+import org.apache.spark.sql.{QueryTest, Row}
+import org.apache.spark.sql.catalyst.plans.logical.CompoundBody
+import org.apache.spark.sql.catalyst.util.QuotingUtils.toSQLConf
+import org.apache.spark.sql.exceptions.SqlScriptingException
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSparkSession
+
+
+/**
+ * End-to-end tests for SQL Scripting.
+ * This suite is not intended to heavily test the SQL scripting (parser & interpreter) logic.
+ * It is rather focused on testing the sql() API - whether it can handle SQL scripts correctly,
+ * results are returned in expected manner, config flags are applied properly, etc.
+ * For full functionality tests, see SqlScriptingParserSuite and SqlScriptingInterpreterSuite.
+ */
+class SqlScriptingE2eSuite extends QueryTest with SharedSparkSession {
+  // Helpers
+  private def verifySqlScriptResult(sqlText: String, expected: Seq[Row]): Unit = {
+    val df = spark.sql(sqlText)
+    checkAnswer(df, expected)
+  }
+
+  private def verifySqlScriptResultWithNamedParams(
+      sqlText: String,
+      expected: Seq[Row],
+      args: Map[String, Any]): Unit = {
+    val df = spark.sql(sqlText, args)
+    checkAnswer(df, expected)
+  }
+
+  // Tests setup
+  override protected def sparkConf: SparkConf = {
+    super.sparkConf.set(SQLConf.SQL_SCRIPTING_ENABLED.key, "true")
+  }
+
+  // Tests
+  test("SQL Scripting not enabled") {
+    withSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key -> "false") {
+      val sqlScriptText =
+        """
+          |BEGIN
+          |  SELECT 1;
+          |END""".stripMargin
+      checkError(
+        exception = intercept[SqlScriptingException] {
+          spark.sql(sqlScriptText).asInstanceOf[CompoundBody]
+        },
+        condition = "UNSUPPORTED_FEATURE.SQL_SCRIPTING",
+        parameters = Map("sqlScriptingEnabled" -> toSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key)))
+    }
+  }
+
+  test("single select") {
+    val sqlText = "SELECT 1;"
+    verifySqlScriptResult(sqlText, Seq(Row(1)))
+  }
+
+  test("multiple selects") {
+    val sqlText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  SELECT 2;
+        |END""".stripMargin
+    verifySqlScriptResult(sqlText, Seq(Row(2)))
+  }
+
+  test("multi statement - simple") {
+    withTable("t") {
+      val sqlScript =
+        """
+          |BEGIN
+          |  CREATE TABLE t (a INT, b STRING, c DOUBLE) USING parquet;
+          |  INSERT INTO t VALUES (1, 'a', 1.0);
+          |  SELECT a FROM t;
+          |END
+          |""".stripMargin
+      verifySqlScriptResult(sqlScript, Seq(Row(1)))
+    }
+  }
+
+  test("last statement without result") {

Review Comment:
   It will return the `emptyDataFrame` that is defined in `SparkSession`.
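
   A quick way to see that shape (a sketch, assuming scripting is enabled):
   ```scala
   // A script whose last statement produces no result comes back as the
   // session's empty frame - zero columns and zero rows.
   val df = spark.sql(
     """
       |BEGIN
       |  DECLARE x INT;
       |  SET x = 1;
       |  DROP TEMPORARY VARIABLE x;
       |END
       |""".stripMargin)
   assert(df.schema.isEmpty && df.isEmpty)
   ```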






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856479952


##
sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingE2eSuite.scala:
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.SparkConf
+import org.apache.spark.sql.{QueryTest, Row}
+import org.apache.spark.sql.catalyst.plans.logical.CompoundBody
+import org.apache.spark.sql.catalyst.util.QuotingUtils.toSQLConf
+import org.apache.spark.sql.exceptions.SqlScriptingException
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSparkSession
+
+
+/**
+ * End-to-end tests for SQL Scripting.
+ * This suite is not intended to heavily test the SQL scripting (parser & interpreter) logic.
+ * It is rather focused on testing the sql() API - whether it can handle SQL scripts correctly,
+ * results are returned in expected manner, config flags are applied properly, etc.
+ * For full functionality tests, see SqlScriptingParserSuite and SqlScriptingInterpreterSuite.
+ */
+class SqlScriptingE2eSuite extends QueryTest with SharedSparkSession {
+  // Helpers
+  private def verifySqlScriptResult(sqlText: String, expected: Seq[Row]): Unit = {
+    val df = spark.sql(sqlText)
+    checkAnswer(df, expected)
+  }
+
+  private def verifySqlScriptResultWithNamedParams(
+      sqlText: String,
+      expected: Seq[Row],
+      args: Map[String, Any]): Unit = {
+    val df = spark.sql(sqlText, args)
+    checkAnswer(df, expected)
+  }
+
+  // Tests setup
+  override protected def sparkConf: SparkConf = {
+    super.sparkConf.set(SQLConf.SQL_SCRIPTING_ENABLED.key, "true")
+  }
+
+  // Tests
+  test("SQL Scripting not enabled") {
+    withSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key -> "false") {
+      val sqlScriptText =
+        """
+          |BEGIN
+          |  SELECT 1;
+          |END""".stripMargin
+      checkError(
+        exception = intercept[SqlScriptingException] {
+          spark.sql(sqlScriptText).asInstanceOf[CompoundBody]
+        },
+        condition = "UNSUPPORTED_FEATURE.SQL_SCRIPTING",
+        parameters = Map("sqlScriptingEnabled" -> toSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key)))
+    }
+  }
+
+  test("single select") {
+    val sqlText = "SELECT 1;"
+    verifySqlScriptResult(sqlText, Seq(Row(1)))
+  }
+
+  test("multiple selects") {
+    val sqlText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  SELECT 2;
+        |END""".stripMargin
+    verifySqlScriptResult(sqlText, Seq(Row(2)))
+  }
+
+  test("multi statement - simple") {
+    withTable("t") {
+      val sqlScript =
+        """
+          |BEGIN
+          |  CREATE TABLE t (a INT, b STRING, c DOUBLE) USING parquet;
+          |  INSERT INTO t VALUES (1, 'a', 1.0);
+          |  SELECT a FROM t;
+          |END
+          |""".stripMargin
+      verifySqlScriptResult(sqlScript, Seq(Row(1)))
+    }
+  }
+
+  test("last statement without result") {
+    val sqlScript =
+      """
+        |BEGIN
+        |  DECLARE x INT;
+        |  SET x = 1;
+        |  DROP TEMPORARY VARIABLE x;
+        |END
+        |""".stripMargin
+    verifySqlScriptResult(sqlScript, Seq.empty)
+  }
+
+  test("named params") {
+    val sqlScriptText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  IF :param_1 > 10 THEN
+        |    SELECT :param_2;
+        |  ELSE
+        |    SELECT :param_3;
+        |  END IF;
+        |END""".stripMargin
+    // Define a map with SQL parameters
+    val args: Map[String, Any] = Map(
+      "param_1" -> 5,
+      "param_2" -> "greater",
+      "param_3" -> "smaller"
+    )
+    verifySqlScriptResultWithNamedParams(sqlScriptText, Seq(Row("smaller")), args)
+  }
+
+  test("positional params") {
+    val sqlScriptText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  IF ? > 10 THEN
+        |    SELECT ?;
+        |  ELSE
+        |    SELECT ?;
+        |  END IF;
+        |END""".stripMargin
+    // Define an array with SQL parameters in the correct order
+    val args: Array[Any] = Array(5, "greater", "smaller")
+    checkError(
+      exception = intercept[SqlScriptingException

Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856478045


##
sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##
@@ -411,6 +413,34 @@ class SparkSession private(
    |  Everything else  |
    * ----------------- */
 
+  private def executeSqlScript(
+      script: CompoundBody,
+      args: Map[String, Expression] = Map.empty): DataFrame = {
+    val sse = new SqlScriptingExecution(script, this, args)
+    var df: DataFrame = null

Review Comment:
   why do we need this `var`?






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856477473


##
sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##
@@ -411,6 +413,34 @@ class SparkSession private(
    |  Everything else  |
    * ----------------- */
 
+  private def executeSqlScript(
+      script: CompoundBody,
+      args: Map[String, Expression] = Map.empty): DataFrame = {
+    val sse = new SqlScriptingExecution(script, this, args)
+    var df: DataFrame = null
+    var result: Option[Seq[Row]] = null

Review Comment:
   Since we use `Option`, the default value should be `None`
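
   That is, something like:
   ```scala
   // None, not null: Option methods (map, getOrElse, isDefined) then behave
   // as expected instead of risking an NPE.
   var result: Option[Seq[Row]] = None
   ```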






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


miland-db commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856477096


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala:
##
@@ -132,6 +154,17 @@ class SingleStatementExec(
     origin.sqlText.get.substring(origin.startIndex.get, origin.stopIndex.get + 1)
   }
 
+  /**
+   * Builds a DataFrame from the parsedPlan of this SingleStatementExec,
+   * logging Origin.sqlText if it exists

Review Comment:
   Removed.






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856476461


##
sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##
@@ -411,6 +413,34 @@ class SparkSession private(
    |  Everything else  |
    * ----------------- */
 
+  private def executeSqlScript(
+      script: CompoundBody,
+      args: Map[String, Expression] = Map.empty): DataFrame = {
+    val sse = new SqlScriptingExecution(script, this, args)
+    var df: DataFrame = null
+    var result: Option[Seq[Row]] = null
+
+    while (sse.hasNext) {
+      sse.withErrorHandling() {
+        df = sse.next()
+        if (sse.hasNext) {
+          df.collect()

Review Comment:
   ```suggestion
          df.write.format("noop").save()
   ```






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856473627


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala:
##
@@ -132,6 +154,17 @@ class SingleStatementExec(
     origin.sqlText.get.substring(origin.startIndex.get, origin.stopIndex.get + 1)
   }
 
+  /**
+   * Builds a DataFrame from the parsedPlan of this SingleStatementExec,
+   * logging Origin.sqlText if it exists

Review Comment:
   what does this mean?






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856470827


##
sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingE2eSuite.scala:
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.SparkConf
+import org.apache.spark.sql.{QueryTest, Row}
+import org.apache.spark.sql.catalyst.plans.logical.CompoundBody
+import org.apache.spark.sql.catalyst.util.QuotingUtils.toSQLConf
+import org.apache.spark.sql.exceptions.SqlScriptingException
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSparkSession
+
+
+/**
+ * End-to-end tests for SQL Scripting.
+ * This suite is not intended to heavily test the SQL scripting (parser & interpreter) logic.
+ * It is rather focused on testing the sql() API - whether it can handle SQL scripts correctly,
+ * results are returned in expected manner, config flags are applied properly, etc.
+ * For full functionality tests, see SqlScriptingParserSuite and SqlScriptingInterpreterSuite.
+ */
+class SqlScriptingE2eSuite extends QueryTest with SharedSparkSession {
+  // Helpers
+  private def verifySqlScriptResult(sqlText: String, expected: Seq[Row]): Unit = {
+    val df = spark.sql(sqlText)
+    checkAnswer(df, expected)
+  }
+
+  private def verifySqlScriptResultWithNamedParams(
+      sqlText: String,
+      expected: Seq[Row],
+      args: Map[String, Any]): Unit = {
+    val df = spark.sql(sqlText, args)
+    checkAnswer(df, expected)
+  }
+
+  // Tests setup
+  override protected def sparkConf: SparkConf = {
+    super.sparkConf.set(SQLConf.SQL_SCRIPTING_ENABLED.key, "true")
+  }
+
+  // Tests
+  test("SQL Scripting not enabled") {
+    withSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key -> "false") {
+      val sqlScriptText =
+        """
+          |BEGIN
+          |  SELECT 1;
+          |END""".stripMargin
+      checkError(
+        exception = intercept[SqlScriptingException] {
+          spark.sql(sqlScriptText).asInstanceOf[CompoundBody]
+        },
+        condition = "UNSUPPORTED_FEATURE.SQL_SCRIPTING",
+        parameters = Map("sqlScriptingEnabled" -> toSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key)))
+    }
+  }
+
+  test("single select") {
+    val sqlText = "SELECT 1;"
+    verifySqlScriptResult(sqlText, Seq(Row(1)))
+  }
+
+  test("multiple selects") {
+    val sqlText =
+      """
+        |BEGIN
+        |  SELECT 1;
+        |  SELECT 2;
+        |END""".stripMargin
+    verifySqlScriptResult(sqlText, Seq(Row(2)))
+  }
+
+  test("multi statement - simple") {
+    withTable("t") {
+      val sqlScript =
+        """
+          |BEGIN
+          |  CREATE TABLE t (a INT, b STRING, c DOUBLE) USING parquet;
+          |  INSERT INTO t VALUES (1, 'a', 1.0);
+          |  SELECT a FROM t;
+          |END
+          |""".stripMargin
+      verifySqlScriptResult(sqlScript, Seq(Row(1)))
+    }
+  }
+
+  test("last statement without result") {

Review Comment:
   This test covers that case, but it doesn't check the result schema.






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856469772


##
sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingE2eSuite.scala:
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.SparkConf
+import org.apache.spark.sql.{QueryTest, Row}
+import org.apache.spark.sql.catalyst.plans.logical.CompoundBody
+import org.apache.spark.sql.catalyst.util.QuotingUtils.toSQLConf
+import org.apache.spark.sql.exceptions.SqlScriptingException
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSparkSession
+
+
+/**
+ * End-to-end tests for SQL Scripting.
+ * This suite is not intended to heavily test the SQL scripting (parser & interpreter) logic.
+ * It is rather focused on testing the sql() API: whether it can handle SQL scripts correctly,
+ * whether results are returned in the expected manner, config flags are applied properly, etc.
+ * For full functionality tests, see SqlScriptingParserSuite and SqlScriptingInterpreterSuite.
+ */
+class SqlScriptingE2eSuite extends QueryTest with SharedSparkSession {
+  // Helpers
+  private def verifySqlScriptResult(sqlText: String, expected: Seq[Row]): Unit = {
+val df = spark.sql(sqlText)
+checkAnswer(df, expected)
+  }
+
+  private def verifySqlScriptResultWithNamedParams(
+  sqlText: String,
+  expected: Seq[Row],
+  args: Map[String, Any]): Unit = {
+val df = spark.sql(sqlText, args)
+checkAnswer(df, expected)
+  }
+
+  // Tests setup
+  override protected def sparkConf: SparkConf = {
+super.sparkConf.set(SQLConf.SQL_SCRIPTING_ENABLED.key, "true")
+  }
+
+  // Tests
+  test("SQL Scripting not enabled") {
+withSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key -> "false") {
+  val sqlScriptText =
+"""
+  |BEGIN
+  |  SELECT 1;
+  |END""".stripMargin
+  checkError(
+exception = intercept[SqlScriptingException] {
+  spark.sql(sqlScriptText).asInstanceOf[CompoundBody]
+},
+condition = "UNSUPPORTED_FEATURE.SQL_SCRIPTING",
+parameters = Map("sqlScriptingEnabled" -> toSQLConf(SQLConf.SQL_SCRIPTING_ENABLED.key)))
+}
+  }
+
+  test("single select") {
+val sqlText = "SELECT 1;"
+verifySqlScriptResult(sqlText, Seq(Row(1)))
+  }
+
+  test("multiple selects") {
+val sqlText =
+  """
+|BEGIN
+|  SELECT 1;
+|  SELECT 2;
+|END""".stripMargin
+verifySqlScriptResult(sqlText, Seq(Row(2)))
+  }
+
+  test("multi statement - simple") {
+withTable("t") {
+  val sqlScript =
+"""
+  |BEGIN
+  |  CREATE TABLE t (a INT, b STRING, c DOUBLE) USING parquet;
+  |  INSERT INTO t VALUES (1, 'a', 1.0);
+  |  SELECT a FROM t;
+  |END
+  |""".stripMargin
+  verifySqlScriptResult(sqlScript, Seq(Row(1)))
+}
+  }
+
+  test("last statement without result") {

Review Comment:
   What if the script has no query at all? What should be the DataFrame returned by `.sql()`? A special DataFrame that has no columns and no output rows?
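
   For illustration, a minimal sketch of that case under the convention the question suggests (empty schema, zero rows); the test name and both assertions are assumptions, not necessarily the merged behavior.

   ```scala
   test("script with no result statement") {
     withTable("t") {
       val sqlScript =
         """
           |BEGIN
           |  CREATE TABLE t (a INT) USING parquet;
           |  INSERT INTO t VALUES (1);
           |END
           |""".stripMargin
       val df = spark.sql(sqlScript)
       assert(df.schema.isEmpty) // assumed: no columns
       assert(df.isEmpty)        // assumed: no rows
     }
   }
   ```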






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856463771


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala:
##
@@ -122,6 +130,20 @@ class SingleStatementExec(
*/
   var isExecuted = false
 
+  /**
+   * Plan with named parameters.
+   */
+  lazy val resolvedPlan: LogicalPlan = {

Review Comment:
   it's not resolved, maybe `preparedPlan`?
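
   A sketch of the suggested rename; the `NameParameterizedQuery` wrapping below is the assumed mechanism for binding named parameters, and both its constructor shape and the `parsedPlan` field name are assumptions rather than the merged code.

   ```scala
   /**
    * Parsed plan with named parameters bound, if any were supplied.
    * Called `preparedPlan` rather than `resolvedPlan`: parameters are only
    * substituted here; analysis/resolution happens later, at execution time.
    */
   lazy val preparedPlan: LogicalPlan = {
     if (args.nonEmpty) {
       // Assumed constructor shape: parameter names plus value expressions.
       NameParameterizedQuery(parsedPlan, args.keys.toSeq, args.values.toSeq)
     } else {
       parsedPlan
     }
   }
   ```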






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


cloud-fan commented on code in PR #48950:
URL: https://github.com/apache/spark/pull/48950#discussion_r1856460713


##
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecution.scala:
##
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.scripting
+
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.catalyst.plans.logical.CompoundBody
+
+/**
+ * SQL scripting executor - executes script and returns result statements.
+ */
+class SqlScriptingExecution(
+sqlScript: CompoundBody,
+session: SparkSession,
+args: Map[String, Expression]) extends Iterator[DataFrame] {
+
+  // Build the execution plan for the script
+  private val executionPlan: Iterator[CompoundStatementExec] =
+SqlScriptingInterpreter(session).buildExecutionPlan(sqlScript, args)
+
+  private var current = getNextResult
+
+  override def hasNext: Boolean = current.isDefined
+
+  override def next(): DataFrame = {
+if (!hasNext) {
+  throw new NoSuchElementException("No more statements to execute")
+}
+val nextDataFrame = current.get.asInstanceOf[SingleStatementExec].buildDataFrame(session)
+current = getNextResult
+nextDataFrame
+  }
+
+  /** Helper method to iterate through statements until the next result statement is encountered. */
+  private def getNextResult: Option[CompoundStatementExec] = {
+var currentStatement = if (executionPlan.hasNext) Some(executionPlan.next()) else None
+// While we don't have a result statement, execute the statements
+while (currentStatement.isDefined && !currentStatement.get.isResult) {
+  currentStatement match {
+case Some(stmt: SingleStatementExec) if !stmt.isExecuted =>
+  withErrorHandling() {
+stmt.buildDataFrame(session).collect()

Review Comment:
   Since we discard the result, we can do `stmt.buildDataFrame(session).write.format("noop").save()`
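
   For context, a sketch of that change with an explicit save mode, matching the noop writes used elsewhere in this PR:

   ```scala
   // Execute the statement for its side effects only: the built-in "noop"
   // sink runs the whole plan but discards every row, avoiding the
   // driver-side materialization that collect() would cause.
   stmt.buildDataFrame(session).write.format("noop").mode("overwrite").save()
   ```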






Re: [PR] [SPARK-48344][SQL] Prepare SQL Scripting for addition of Execution Framework - part 2 [spark]

2024-11-25 Thread via GitHub


miland-db commented on PR #48950:
URL: https://github.com/apache/spark/pull/48950#issuecomment-2497614926

   Adding @cloud-fan @dejankrak-db @davidm-db @dusantism-db @MaxGekk

