LuciferYang commented on code in PR #40654:
URL: https://github.com/apache/spark/pull/40654#discussion_r1188376220


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##########
@@ -1567,6 +1613,19 @@ class SparkConnectPlanner(val session: SparkSession) {
         val expr = transformExpression(fun.getArguments(0))
         Some(transformUnregisteredUDF(MLFunctions.arrayToVectorUdf, Seq(expr)))
 
+      // Protobuf-specific functions
+      case "from_protobuf" if Seq(2, 3, 4).contains(fun.getArgumentsCount) =>
+        val children = fun.getArgumentsList.asScala.toSeq.map(transformExpression)
+        val (messageClassName, descFilePathOpt, options) =
+          extractArgsOfProtobufFunction("from_protobuf", fun.getArgumentsCount, children)
+        Some(ProtobufDataToCatalyst(children.head, messageClassName, descFilePathOpt, options))

Review Comment:
   You mean something like this?
   
   ```scala
          import org.apache.spark.sql.protobuf.{functions => pb}
          descFilePathOpt match {
            case Some(descFilePath) =>
              Some(pb.from_protobuf(Column(children.head),
                messageClassName, descFilePath, options.asJava).expr)
            case _ =>
              Some(pb.from_protobuf(Column(children.head), messageClassName, options.asJava).expr)
          }
   ```
   
   This does not seem simple enough:
   
   1. We would have to wrap `children.head` in a new `Column` and then retrieve the `.expr` of the return value again.
   
   2. We would need to pattern match on `descFilePathOpt` again.
   
   3. The current code keeps the same calling convention as the other cases in `transformUnregisteredFunction`.
   
   So for now I tend to prefer the current invocation style. Do you have a simpler way?
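
   For context on why a single helper is convenient here: `from_protobuf` can arrive with 2, 3, or 4 arguments, and the optional descriptor-file path and options map have to be disambiguated before either construction style can be used. The sketch below is a hypothetical, simplified model of that argument extraction (the types `Expr`, `StrLit`, `MapLit`, `Col` and the function `extractProtobufArgs` are stand-ins for Catalyst expressions and the real `extractArgsOfProtobufFunction`, not the actual Spark code):

   ```scala
   // Simplified stand-ins for Catalyst expressions (hypothetical).
   sealed trait Expr
   case class StrLit(value: String) extends Expr
   case class MapLit(value: Map[String, String]) extends Expr
   case class Col(name: String) extends Expr

   // Sketch of disambiguating the 2/3/4-argument calling convention:
   // (data, messageName [, descFilePath] [, options])
   def extractProtobufArgs(
       fnName: String,
       children: Seq[Expr]): (String, Option[String], Map[String, String]) = {
     // children(0) is the data column; children(1) must be the message name.
     val messageClassName = children(1) match {
       case StrLit(s) => s
       case other => throw new IllegalArgumentException(
         s"$fnName: message name must be a string literal, got $other")
     }
     // Remaining arguments decide which overload was called.
     children.drop(2) match {
       case Seq()                           => (messageClassName, None, Map.empty)
       case Seq(StrLit(path))               => (messageClassName, Some(path), Map.empty)
       case Seq(MapLit(opts))               => (messageClassName, None, opts)
       case Seq(StrLit(path), MapLit(opts)) => (messageClassName, Some(path), opts)
       case other => throw new IllegalArgumentException(
         s"$fnName: unexpected arguments $other")
     }
   }

   // Usage:
   val r = extractProtobufArgs("from_protobuf",
     Seq(Col("data"), StrLit("com.example.Person"), StrLit("/tmp/person.desc")))
   println(r)  // (com.example.Person,Some(/tmp/person.desc),Map())
   ```

   With the tuple already extracted this way, constructing `ProtobufDataToCatalyst` directly avoids the extra `Column` wrapping and the second pattern match that the `pb.from_protobuf` route would require.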
   
   
   
   
   



##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##########
@@ -1567,6 +1613,19 @@ class SparkConnectPlanner(val session: SparkSession) {
         val expr = transformExpression(fun.getArguments(0))
         Some(transformUnregisteredUDF(MLFunctions.arrayToVectorUdf, Seq(expr)))
 
+      // Protobuf-specific functions
+      case "from_protobuf" if Seq(2, 3, 4).contains(fun.getArgumentsCount) =>
+        val children = 
fun.getArgumentsList.asScala.toSeq.map(transformExpression)
+        val (messageClassName, descFilePathOpt, options) =
+          extractArgsOfProtobufFunction("from_protobuf", 
fun.getArgumentsCount, children)
+        Some(ProtobufDataToCatalyst(children.head, messageClassName, 
descFilePathOpt, options))

Review Comment:
   like
   
   ```scala
          import org.apache.spark.sql.protobuf.{functions => pb}
           descFilePathOpt match {
             case Some(descFilePath) =>
               Some(pb.from_protobuf(Column(children.head),
                 messageClassName, descFilePath, options.asJava).expr)
             case _ =>
               Some(pb.from_protobuf(Column(children.head), messageClassName, 
options.asJava).expr)
           }
   ```
   ?
   
   This seem not simple enough:
   
   1. We had to new a Column with `children.head` and retrieve the `.expr` of 
the return value again
   
   2. Need to pattern match `descFilePathOpt` again
   
   3. Keep the same calling way as others in `transformUnregisteredFunction` now
   
   
   So I tend to prefer the current invocation way now, do you have a simpler 
way?
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

