Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19480#discussion_r144263743
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
    @@ -798,10 +830,35 @@ class CodegenContext {
                |  ${makeSplitFunction(body)}
                |}
              """.stripMargin
    -        addNewFunction(name, code)
    +        addNewFunctionInternal(name, code, inlineToOuterClass = false)
           }
     
    -      foldFunctions(functions.map(name => 
s"$name(${arguments.map(_._2).mkString(", ")})"))
    +      val outerClassFunctions = functions
    +        .filter(_.subclassName.isEmpty)
    +        .map(_.functionName)
    +
    +      val innerClassFunctions = functions
    +        .filter(_.subclassName.isDefined)
    +        .foldLeft(Map.empty[(String, String), Seq[String]]) { case (acc, 
f) =>
    +          val key = (f.subclassName.get, f.subclassInstance.get)
    +          acc.updated(key, acc.getOrElse(key, Seq.empty[String]) ++ 
Seq(f.functionName))
    +        }
    +        .map { case ((subclassName, subclassInstance), subclassFunctions) 
=>
    +          // Adding a new function to each subclass which contains
    +          // the invocation of all the ones which have been added to
    +          // that subclass
    +          val code = s"""
    +              |private $returnType $func($argString) {
    +              |  
${makeSplitFunction(foldFunctions(subclassFunctions.map(name =>
    +                    s"$name(${arguments.map(_._2).mkString(", ")})")))}
    +              |}
    +            """.stripMargin
    +          addNewFunctionToClass(func, code, subclassName)
    --- End diff --
    
    yes, but we are likely to recreate the same issue the PR is trying to solve 
in the last `NestedClass`. The problem is that a method call, despite it is few 
bytes of code, introduce multiple entries in the constant pool. If we use 
`addNewFunction` here, we will end up creating all the wrapper methods in the 
last `NestedClass`, which will have the same issues as the outer class, 
becoming a new bottelneck.
    Instead, in this way, we are splitting the method call among the several 
nested classes and each of them has its owns, thus it is reusing some constant 
pool entries.
    As you pointed out, nested classes are less critical than the outer one, 
because the attributes are declared in the outer class and currently all the 
many small methods are called there. If we put `addNewFunction` here, instead, 
we are making critical the last nested function because we are adding there all 
the many small function calls.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to