Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19797#discussion_r152866870
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
---
@@ -87,14 +87,13 @@ private [sql] object GenArrayData {
elementType: DataType,
elementsCode: Seq[ExprCode],
isMapKey: Boolean): (String, Seq[String], String, String) = {
- val arrayName = ctx.freshName("array")
val arrayDataName = ctx.freshName("arrayData")
val numElements = elementsCode.length
if (!ctx.isPrimitiveType(elementType)) {
+ val arrayName = "arrayObject"
val genericArrayClass = classOf[GenericArrayData].getName
- ctx.addMutableState("Object[]", arrayName,
- s"$arrayName = new Object[$numElements];")
+ ctx.reuseOrAddMutableState("Object[]", arrayName)
--- End diff --
Sorry, I only now understand your comment there. Honestly, I may have only a
partial view of this, so please correct me if I am wrong, but my preference
is for a local variable, for two reasons:
- when we have a lot of methods and they are split into inner classes,
using a variable of the outer class adds (at least) one constant pool entry
to each inner class that references it, while using a local variable doesn't
(as per my understanding);
- since we reinitialize the variable with a newly allocated array every
time, I think we are not saving any GC work (we are basically reusing only
the reference, which I don't think is a big gain).
What do you think?
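To make the two options concrete, here is a minimal Java sketch of what the generated code could look like in each case. It is only an illustration under my own assumptions: the names `ArrayStateSketch`, `Nested`, `buildWithField` and `buildWithLocal` are hypothetical and are not Spark's actual generated code; only the `arrayObject` name comes from the diff.

```java
import java.util.Arrays;

// Hypothetical stand-in for a generated class; not Spark's real codegen output.
class ArrayStateSketch {
    // Option A: the array reference is mutable state of the outer class.
    private Object[] arrayObject;

    // Stand-in for an inner class that holds split-out methods.
    class Nested {
        // Reads and writes the outer field, so this inner class has to
        // reference the outer instance and its member.
        Object[] buildWithField(Object a, Object b) {
            arrayObject = new Object[2]; // a new array is allocated on every call
            arrayObject[0] = a;
            arrayObject[1] = b;
            return arrayObject;
        }

        // Option B: the array reference is a plain local variable.
        Object[] buildWithLocal(Object a, Object b) {
            Object[] localArray = new Object[2]; // same allocation, no outer state
            localArray[0] = a;
            localArray[1] = b;
            return localArray;
        }
    }

    public static void main(String[] args) {
        ArrayStateSketch outer = new ArrayStateSketch();
        ArrayStateSketch.Nested nested = outer.new Nested();
        Object[] first = nested.buildWithField("k", "v");
        Object[] second = nested.buildWithField("k", "v");
        Object[] local = nested.buildWithLocal("k", "v");
        // The field variant hands back a different array on each call.
        System.out.println("field variant reuses the array: " + (first == second));
        System.out.println("same contents in both variants: " + Arrays.equals(first, local));
    }
}
```

In this sketch both variants allocate a fresh `Object[]` on every call, so keeping the reference in a field does not avoid any allocation; the only difference is where the reference lives.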
---