AngersZhuuuu commented on a change in pull request #30957:
URL: https://github.com/apache/spark/pull/30957#discussion_r570797337
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/BaseScriptTransformationExec.scala
##########
@@ -220,6 +226,9 @@ trait BaseScriptTransformationExec extends UnaryExecNode {
case CalendarIntervalType => wrapperConvertException(
data => IntervalUtils.stringToInterval(UTF8String.fromString(data)),
converter)
+ case _: ArrayType | _: MapType | _: StructType =>
+ wrapperConvertException(data => JsonToStructs(attr.dataType,
Map.empty[String, String],
Review comment:
> This can cause significant overhead because it creates a new object
> (`JsonToStructs`) for each call. Could you avoid it?

The same problem also occurs on the input side with `Cast` and `StructsToJson`.
To avoid it, we may need to extract a common method from these expressions or
write something specific to `ScriptTransform`. WDYT @cloud-fan
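
A minimal sketch of the allocation concern, in plain Scala with no Spark dependency. `ExpensiveParser` is a hypothetical stand-in for an expression such as `JsonToStructs`, and `perCallConverter` / `hoistedConverter` are illustrative names, not Spark APIs. It only shows the difference between building the object inside the per-row closure and building it once outside.

```scala
// Hypothetical stand-in for a costly-to-construct expression (NOT the Spark class).
final class ExpensiveParser(schema: String) {
  // Count constructions so the difference between the two styles is observable.
  ExpensiveParser.instances += 1

  def parse(data: String): String = s"$schema:$data"
}

object ExpensiveParser {
  var instances: Int = 0
}

object ConverterSketch {
  // Per-call construction: a new parser is allocated for every input value.
  def perCallConverter(schema: String): String => String =
    data => new ExpensiveParser(schema).parse(data)

  // Hoisted construction: the parser is built once when the converter is
  // created, and the returned closure reuses it for every input value.
  def hoistedConverter(schema: String): String => String = {
    val parser = new ExpensiveParser(schema)
    data => parser.parse(data)
  }
}
```

With the hoisted form, the number of parser objects is independent of the number of rows processed. Whether the real expressions can be reused this way depends on their internal state, which this sketch does not model.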