Re: [PR] feat: ic for native unhex [datafusion-comet]

via GitHub Wed, 01 May 2024 01:10:14 -0700


kazuyukitanimura commented on code in PR #342:
URL: https://github.com/apache/datafusion-comet/pull/342#discussion_r1585982628



##########
pom.xml:
##########
@@ -88,7 +88,8 @@ under the License.
     <argLine>-ea -Xmx4g -Xss4m ${extraJavaTestArgs}</argLine>
     <additional.3_3.test.source>spark-3.3-plus</additional.3_3.test.source>
     <additional.3_4.test.source>spark-3.4</additional.3_4.test.source>
-    <shims.source>spark-3.x</shims.source>
+    <shims.majorSource>spark-3.x</shims.majorSource>
+    <shims.minorSource>spark-3.4.x</shims.minorSource>

Review Comment:
   FYI `spark-3.x` will be gone and supposed to be replaced by independent 
`spark-3.2`, `spark-3.3`, and `spark-3.4` dirs
   So let's go ahead and start using `spark-3.4` instead of `spark-3.4.x`? I.e.
   ```
   <shims.source.shared>spark-3.x</shims.source.shared>
   <shims.source>spark-3.4</shims.source>
   ```



##########
pom.xml:
##########
@@ -500,6 +501,7 @@ under the License.
         <!-- we don't add special test suits for spark-3.2, so a not existed 
dir is specified-->
         <additional.3_3.test.source>not-needed-yet</additional.3_3.test.source>
         <additional.3_4.test.source>not-needed-yet</additional.3_4.test.source>
+        <shims.minorSource>spark-3.3.x</shims.minorSource>

Review Comment:
   I guess this should be `spark-3.2`?



##########
spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala:
##########
@@ -1396,6 +1397,19 @@ object QueryPlanSerde extends Logging with 
ShimQueryPlanSerde {
           val optExpr = scalarExprToProto("atan2", leftExpr, rightExpr)
           optExprWithInfo(optExpr, expr, left, right)
 
+        case e: Unhex =>
+          val unHex = unhexSerde(e)
+
+          val childCast = Cast(unHex._1, StringType)
+          val failOnErrorCast = Cast(unHex._2, BooleanType)

Review Comment:
   What would happen if we do not use `Cast`?



##########
spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala:
##########
@@ -45,12 +45,13 @@ import 
org.apache.comet.CometSparkSessionExtensions.{isCometOperatorEnabled, isC
 import org.apache.comet.serde.ExprOuterClass.{AggExpr, DataType => 
ProtoDataType, Expr, ScalarFunc}
 import org.apache.comet.serde.ExprOuterClass.DataType.{DataTypeInfo, 
DecimalInfo, ListInfo, MapInfo, StructInfo}
 import org.apache.comet.serde.OperatorOuterClass.{AggregateMode => 
CometAggregateMode, JoinType, Operator}
+import org.apache.comet.shims.ShimCometUnhexExpr
 import org.apache.comet.shims.ShimQueryPlanSerde
 
 /**
  * An utility object for query plan and expression serialization.
  */
-object QueryPlanSerde extends Logging with ShimQueryPlanSerde {
+object QueryPlanSerde extends Logging with ShimQueryPlanSerde with 
ShimCometUnhexExpr {

Review Comment:
   I would say `ShimCometUnhexExpr` should be more generic name, not unhex 
specific. What about `ShimCometExpr ` (or if you have other idea, please feel 
free to propose)?
   Otherwise, we will have to keep adding trait class per function.
   
   Eventually we should merge  `ShimQueryPlanSerde` and `ShimCometExpr` into 
one when we remove `spark-3.x` dir.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat: ic for native unhex [datafusion-comet]

Reply via email to