andygrove commented on code in PR #342:
URL: https://github.com/apache/datafusion-comet/pull/342#discussion_r1593255210


##########
spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala:
##########
@@ -1024,7 +1024,25 @@ class CometExpressionSuite extends CometTestBase with AdaptiveSparkPlanHelper {
     }
   }
 }
+  test("unhex") {
+    val table = "unhex_table"
+    withTable(table) {
+      sql(s"create table $table(col string) using parquet")
+
+      sql(s"""INSERT INTO $table VALUES
+           |('537061726B2053514C'),
+           |('737472696E67'),
+           |('\0'),
+           |(''),
+           |('###'),
+           |('G123'),
+           |('hello'),
+           |('A1B'),
+           |('0A1B')""".stripMargin)
+
+      checkSparkAnswerAndOperator(s"SELECT unhex(col) FROM $table")

Review Comment:
   Actually, adding the `ORDER BY` causes the test to fail because it no longer runs fully native, so ignore that suggestion.

   Also, I do see code differences in the unhex algorithm between Spark 3.2 and 3.4: 3.2 does not have the `oddShift` variable. `oddShift` was added in https://github.com/apache/spark/commit/276abe3c3311300cffc6570b66f3977ea8172ff0

   I see two options:
   - mark `unhex` as incompat just for 3.2 and skip the test on 3.2 (probably the easiest path)
   - implement per-Spark-version logic for unhex in Rust

   Let me know what you think or if you have any questions on this. This is a good example of the challenge of supporting multiple Spark versions 😓

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
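To make the version split concrete, here is an illustrative Rust sketch of the two unhex behaviors. This is not Comet's actual kernel and the `pad_odd` flag is hypothetical; it assumes that the `oddShift` change linked above makes Spark 3.3+ treat an odd-length input as if it were left-padded with `'0'`, whereas Spark 3.2 returns NULL for odd-length input.

```rust
/// Hypothetical sketch of per-Spark-version `unhex` semantics.
/// `pad_odd = true`  ~ Spark 3.3+/3.4 (`oddShift`): the first character of an
///                     odd-length string becomes the low nibble of a leading byte.
/// `pad_odd = false` ~ Spark 3.2: odd-length input is rejected (NULL / `None`).
fn unhex(s: &str, pad_odd: bool) -> Option<Vec<u8>> {
    // Map one ASCII hex digit to its 4-bit value, or None if invalid.
    fn nibble(b: u8) -> Option<u8> {
        match b {
            b'0'..=b'9' => Some(b - b'0'),
            b'a'..=b'f' => Some(b - b'a' + 10),
            b'A'..=b'F' => Some(b - b'A' + 10),
            _ => None,
        }
    }

    let bytes = s.as_bytes();
    let odd = bytes.len() % 2 == 1;
    if odd && !pad_odd {
        return None; // Spark 3.2 behavior: odd-length input is not valid hex
    }

    let mut out = Vec::with_capacity((bytes.len() + 1) / 2);
    let mut i = 0;
    if odd {
        // Spark 3.3+ behavior: lone leading digit, as if prefixed with '0'
        out.push(nibble(bytes[0])?);
        i = 1;
    }
    while i < bytes.len() {
        let hi = nibble(bytes[i])?;
        let lo = nibble(bytes[i + 1])?;
        out.push((hi << 4) | lo);
        i += 2;
    }
    Some(out)
}

fn main() {
    // 'A1B' (odd length, from the test data above): NULL under 3.2 semantics,
    // same result as '0A1B' under 3.3+ semantics.
    assert_eq!(unhex("A1B", false), None);
    assert_eq!(unhex("A1B", true), Some(vec![0x0A, 0x1B]));
    assert_eq!(unhex("0A1B", true), Some(vec![0x0A, 0x1B]));
    // Invalid characters yield NULL in either mode.
    assert_eq!(unhex("G123", true), None);
}
```

If the per-version route were taken, the flag could be driven by the Spark version the plan was serialized from, keeping a single kernel with a small branch rather than duplicating the function.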
########## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ########## @@ -1024,7 +1024,25 @@ class CometExpressionSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } } + test("unhex") { + val table = "unhex_table" + withTable(table) { + sql(s"create table $table(col string) using parquet") + + sql(s"""INSERT INTO $table VALUES + |('537061726B2053514C'), + |('737472696E67'), + |('\0'), + |(''), + |('###'), + |('G123'), + |('hello'), + |('A1B'), + |('0A1B')""".stripMargin) + checkSparkAnswerAndOperator(s"SELECT unhex(col) FROM $table") Review Comment: Actually adding the `ORDER BY` causes the test to fail because it no longer runs fully native, so ignore that suggestion. Also, I do see code differences in Spark between 3.2 and 3.4 in the unhex algorithm. 3.2 does not have the `oddShift` variable. `oddShift` was added in https://github.com/apache/spark/commit/276abe3c3311300cffc6570b66f3977ea8172ff0 I guess the options are: - mark `unhex` as incompat just for 3.2 and skip the test for 3.2 (probably the easiest path) - implement per-spark-version logic in Rust for unhex Let me know what you think or if you have any questions on this. This is a good example of the challenge of supporting multiple Spark versions 😓 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org