AngersZhuuuu commented on a change in pull request #29085:
URL: https://github.com/apache/spark/pull/29085#discussion_r454864040



##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveScriptTransformationExec.scala
##########
@@ -78,17 +78,25 @@ case class HiveScriptTransformationExec(
       stderrBuffer,
       "Thread-ScriptTransformation-STDERR-Consumer").start()
 
-    val outputProjection = new InterpretedProjection(input, child.output)
-
     // This nullability is a performance optimization in order to avoid an Option.foreach() call
     // inside of a loop
     @Nullable val (inputSerde, inputSoi) = ioschema.initInputSerDe(input).getOrElse((null, null))
 
+    // For HiveScriptTransformationExec, if inputSerde == null, but outputSerde != null
+    // We will use StringBuffer to pass data, in this case, we should cast data as string too.
+    val finalInput = if (inputSerde == null) {
+      input.map(Cast(_, StringType).withTimeZone(conf.sessionLocalTimeZone))

Review comment:
       > This suggested `CAST` approach looks good, but we get a little behaviour changes for some types?
   
   To be honest, TRANSFORM can't support array/map/struct types, etc.; for now it's enough to keep the normal data types correct.
   And here we handle them the way the default LazySimpleSerDe does.
   
   For input and output data serialization, adding Spark's own SerDe would be the best way; with that PR, this code can be removed.
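
   To illustrate, the no-SerDe fallback in the diff above amounts to roughly the following (a sketch only, not the exact committed code; it assumes the surrounding `HiveScriptTransformationExec` context, where `input: Seq[Expression]`, `inputSerde`, and `conf` are already in scope):
   
   ```scala
   import org.apache.spark.sql.catalyst.expressions.Cast
   import org.apache.spark.sql.types.StringType
   
   // When no input SerDe is configured, rows are fed to the script's stdin
   // as delimited text (LazySimpleSerDe-style), so each input expression is
   // cast to a string first, using the session time zone for temporal types.
   val finalInput = if (inputSerde == null) {
     input.map(Cast(_, StringType).withTimeZone(conf.sessionLocalTimeZone))
   } else {
     input
   }
   ```
   
   With a SerDe present the expressions are passed through unchanged, since the SerDe handles serialization itself.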




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
