HeartSaVioR commented on a change in pull request #26173: [SPARK-29503][SQL]
Copy result row from RowWriter in GenerateUnsafeProjection when the expression
is lambdaFunction in MapObject
URL: https://github.com/apache/spark/pull/26173#discussion_r337266895
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
##########
@@ -885,13 +887,15 @@ case class MapObjects private(
)
}
- // Make a copy of the data if it's unsafe-backed
- def makeCopyIfInstanceOf(clazz: Class[_ <: Any], value: String) =
- s"$value instanceof ${clazz.getSimpleName}? ${value}.copy() : $value"
+ // Make a copy of the unsafe data if the result contains any
+ def makeCopyUnsafeData(dataType: DataType, value: String) = {
+ s"""${value}.copyUnsafeData("${dataType.catalogString}")"""
Review comment:
We may be able to let CodeGenerator walk the schema and generate code based on
it, but I suspect the overall amount of generated code would be larger: the
generated code cannot leverage schema information at runtime, so we would have
to generate all the code that accesses and replaces fields recursively.
Also, it looks like a string literal has a 64k limit, which doesn't seem short;
the same limit applies to method length as well. We may even be able to apply a
trick here: define the value in a static field or a field of the class, split
it into chunks (each less than 64k), and concatenate the chunks into one String
instance to get around the limitation.
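
To make the chunking trick concrete, here is a rough sketch of what the Scala
side could emit. It is only an illustration under assumptions:
`ChunkedLiteral`, `javaFieldDecl`, and the field name `SCHEMA` are hypothetical
names, not existing Spark APIs. The chunk size assumes the class-file limit of
65535 bytes of modified UTF-8 per string constant, with at most 3 bytes per
Java char.

```scala
object ChunkedLiteral {
  // A class-file string constant holds at most 65535 bytes of modified UTF-8,
  // and one Java char needs at most 3 bytes, so this many chars always fits.
  val MaxCharsPerChunk: Int = 65535 / 3

  // Escape one chunk so it can sit between double quotes in Java source.
  // Simplified on purpose: quotes, backslashes, CR/LF, and non-printable or
  // non-ASCII chars (written as unicode escapes) are the only cases handled.
  private def escape(chunk: String): String = {
    val sb = new StringBuilder
    chunk.foreach {
      case '"'  => sb.append('\\').append('"')
      case '\\' => sb.append('\\').append('\\')
      case '\n' => sb.append('\\').append('n')
      case '\r' => sb.append('\\').append('r')
      case c if c < ' ' || c > '~' =>
        sb.append('\\').append('u').append("%04x".format(c.toInt))
      case c => sb.append(c)
    }
    sb.toString
  }

  // Build a Java field declaration whose value is assembled at class-init time
  // from several small literals. Chunks are cut before escaping, so an escape
  // sequence is never split across two literals.
  def javaFieldDecl(
      name: String,
      value: String,
      maxChars: Int = MaxCharsPerChunk): String = {
    val appends = value
      .grouped(maxChars)
      .map(chunk => s""".append("${escape(chunk)}")""")
      .mkString
    s"private static final String $name = new StringBuilder()$appends.toString();"
  }
}
```

For example, `ChunkedLiteral.javaFieldDecl("SCHEMA", dataType.catalogString)`
would yield a declaration that could be added to the generated class, and the
call site would then pass `SCHEMA` instead of an inline literal. The sketch
uses `StringBuilder` rather than `"a" + "b"` because a Java compiler may fold
a concatenation of literals back into a single constant, which would hit the
same limit again; I have not checked how Janino behaves here.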