Re: [PR] [SPARK-48346][SQL] Support for IF ELSE statements in SQL scripts [spark]

via GitHub Thu, 01 Aug 2024 15:24:54 -0700


davidm-db commented on code in PR #47442:
URL: https://github.com/apache/spark/pull/47442#discussion_r1700938528



##########
sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNodeSuite.scala:
##########
@@ -77,11 +93,157 @@ class SqlScriptingExecutionNodeSuite extends SparkFunSuite 
{
         TestLeafStatement("three"),
         TestNestedStatementIterator(Seq(TestLeafStatement("four"), 
TestLeafStatement("five")))))
       .getTreeIterator
-    val statements = iter.map {
-      case TestLeafStatement(v) => v
-      case _ => fail("Unexpected statement type")
-    }.toList
-
+    val statements = iter.map(extractStatementValue).toSeq
     assert(statements === Seq("one", "two", "three", "four", "five"))
   }
+
+  test("if else - enter body of the IF clause") {
+    val iter = TestNestedStatementIterator(Seq(
+      TestIfElse(
+        conditions = Seq(
+          TestSingleStatement("con1")

Review Comment:
   I'm not sure if we understand each other here - `StatementBooleanEvaluator` 
**is NOT used only in tests**, it is used in `IfElseStatementExec` to evaluate 
the statement in condition to `true` or `false`, based on the output format and 
values.
   
   For unit tests in `SqlScriptingExecutionNodeSuite` this is overridden with 
`RepEval` (for now, more will be added with `while` loops). `RepEval` is a 
simple and a bit hardcoded implementation for the sake of the tests.
   
   If I were to change the tests to not use the evaluator, I don't know how 
would I change the evaluator used in `IfElseStatementExec` and have the same 
logic applied (i.e. checking the statement output format and values). This 
evaluator approach seems like the cleanest one - we have removed any kind of 
abstractions and we will introduce it in the future if needed, but I don't see 
a way how to do this simpler now.
   
   **Any suggestions or further comments on the topic?** Everything that I can 
come up with is to actually execute the statement and take first row/first 
column value from the collected DataFrame and check if it's `true` or `false` - 
but this is literally what evaluator is doing at the moment (with a few format 
checks as well).
   
   If we decide to keep the evaluator, it also makes tests simpler - we simply 
create a dummy evaluator (`RepEval`) instead of adding logic to the while loops 
that adds statements that work with session variables etc. Evaluator 
implementation for tests is literally 5 lines whereas any other logic would 
require adjusting on the level of each separate test (i.e. creating session 
vars and adding statements that set their values based on the number of loop 
iterations that we want to achieve).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [SPARK-48346][SQL] Support for IF ELSE statements in SQL scripts [spark]

Reply via email to