Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19480#discussion_r144263743 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -798,10 +830,35 @@ class CodegenContext { | ${makeSplitFunction(body)} |} """.stripMargin - addNewFunction(name, code) + addNewFunctionInternal(name, code, inlineToOuterClass = false) } - foldFunctions(functions.map(name => s"$name(${arguments.map(_._2).mkString(", ")})")) + val outerClassFunctions = functions + .filter(_.subclassName.isEmpty) + .map(_.functionName) + + val innerClassFunctions = functions + .filter(_.subclassName.isDefined) + .foldLeft(Map.empty[(String, String), Seq[String]]) { case (acc, f) => + val key = (f.subclassName.get, f.subclassInstance.get) + acc.updated(key, acc.getOrElse(key, Seq.empty[String]) ++ Seq(f.functionName)) + } + .map { case ((subclassName, subclassInstance), subclassFunctions) => + // Adding a new function to each subclass which contains + // the invocation of all the ones which have been added to + // that subclass + val code = s""" + |private $returnType $func($argString) { + | ${makeSplitFunction(foldFunctions(subclassFunctions.map(name => + s"$name(${arguments.map(_._2).mkString(", ")})")))} + |} + """.stripMargin + addNewFunctionToClass(func, code, subclassName) --- End diff -- yes, but we are likely to recreate the same issue the PR is trying to solve in the last `NestedClass`. The problem is that a method call, despite it is few bytes of code, introduce multiple entries in the constant pool. If we use `addNewFunction` here, we will end up creating all the wrapper methods in the last `NestedClass`, which will have the same issues as the outer class, becoming a new bottelneck. Instead, in this way, we are splitting the method call among the several nested classes and each of them has its owns, thus it is reusing some constant pool entries. As you pointed out, nested classes are less critical than the outer one, because the attributes are declared in the outer class and currently all the many small methods are called there. If we put `addNewFunction` here, instead, we are making critical the last nested function because we are adding there all the many small function calls.
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org