GitHub user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19797#discussion_r152866870
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
    @@ -87,14 +87,13 @@ private [sql] object GenArrayData {
           elementType: DataType,
           elementsCode: Seq[ExprCode],
           isMapKey: Boolean): (String, Seq[String], String, String) = {
    -    val arrayName = ctx.freshName("array")
         val arrayDataName = ctx.freshName("arrayData")
         val numElements = elementsCode.length
     
         if (!ctx.isPrimitiveType(elementType)) {
    +      val arrayName = "arrayObject"
           val genericArrayClass = classOf[GenericArrayData].getName
    -      ctx.addMutableState("Object[]", arrayName,
    -        s"$arrayName = new Object[$numElements];")
    +      ctx.reuseOrAddMutableState("Object[]", arrayName)
    --- End diff --
    
    Sorry, I only now understand your comment there. I may have only a 
partial view of this, so please correct me if I am wrong, but I would prefer 
a local variable, for two reasons:
     - when there are many methods and they are split into inner classes, 
using a variable of the outer class adds (at least) one constant pool entry, 
while a local variable does not (as far as I understand);
     - since we assign a newly created object to the variable every time, I 
don't think we save any GC work (only the pointer itself is reused, and I 
don't think losing that is a big problem).
    What do you think?
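
    The two shapes being compared can be sketched in plain Java, roughly as
follows. This is only an illustration under stated assumptions: the names
`ArrayStateSketch`, `NestedClass`, `fillViaField` and `fillViaLocal` are
hypothetical and are not what Spark's code generator emits. It is meant to
show that both variants allocate a new array on every call, so keeping the
reference in an outer-class field reuses only the reference, not the array.

```java
// Hypothetical stand-in for a generated class; not Spark's actual output.
class ArrayStateSketch {
    // Variant A: the array reference is mutable state on the outer class.
    Object[] arrayObject;

    // Stand-in for a nested class that holds split-out methods.
    class NestedClass {
        Object[] fillViaField(int numElements) {
            // Accessing the outer field requires a field reference from this
            // class, yet a brand-new array is still allocated on every call.
            arrayObject = new Object[numElements];
            for (int i = 0; i < numElements; i++) {
                arrayObject[i] = i;
            }
            return arrayObject;
        }

        Object[] fillViaLocal(int numElements) {
            // Variant B: the reference lives only in a local variable.
            Object[] array = new Object[numElements];
            for (int i = 0; i < numElements; i++) {
                array[i] = i;
            }
            return array;
        }
    }

    Object[] viaField(int numElements) {
        return new NestedClass().fillViaField(numElements);
    }

    Object[] viaLocal(int numElements) {
        return new NestedClass().fillViaLocal(numElements);
    }

    public static void main(String[] args) {
        ArrayStateSketch sketch = new ArrayStateSketch();
        Object[] first = sketch.viaField(3);
        Object[] second = sketch.viaField(3);
        // Only the field is reused; the two arrays are distinct objects.
        System.out.println("same array reused: " + (first == second));
        System.out.println("local variant length: " + sketch.viaLocal(3).length);
    }
}
```

    Whether the extra field reference matters in practice depends on how many
such fields and nested classes the generated code ends up with; the sketch
does not measure that.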


---
