Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19480#discussion_r144263743
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
---
@@ -798,10 +830,35 @@ class CodegenContext {
| ${makeSplitFunction(body)}
|}
""".stripMargin
- addNewFunction(name, code)
+ addNewFunctionInternal(name, code, inlineToOuterClass = false)
}
- foldFunctions(functions.map(name =>
s"$name(${arguments.map(_._2).mkString(", ")})"))
+ val outerClassFunctions = functions
+ .filter(_.subclassName.isEmpty)
+ .map(_.functionName)
+
+ val innerClassFunctions = functions
+ .filter(_.subclassName.isDefined)
+ .foldLeft(Map.empty[(String, String), Seq[String]]) { case (acc,
f) =>
+ val key = (f.subclassName.get, f.subclassInstance.get)
+ acc.updated(key, acc.getOrElse(key, Seq.empty[String]) ++
Seq(f.functionName))
+ }
+ .map { case ((subclassName, subclassInstance), subclassFunctions)
=>
+ // Adding a new function to each subclass which contains
+ // the invocation of all the ones which have been added to
+ // that subclass
+ val code = s"""
+ |private $returnType $func($argString) {
+ |
${makeSplitFunction(foldFunctions(subclassFunctions.map(name =>
+ s"$name(${arguments.map(_._2).mkString(", ")})")))}
+ |}
+ """.stripMargin
+ addNewFunctionToClass(func, code, subclassName)
--- End diff --
yes, but we are likely to recreate the same issue the PR is trying to solve
in the last `NestedClass`. The problem is that a method call, despite it is few
bytes of code, introduce multiple entries in the constant pool. If we use
`addNewFunction` here, we will end up creating all the wrapper methods in the
last `NestedClass`, which will have the same issues as the outer class,
becoming a new bottelneck.
Instead, in this way, we are splitting the method call among the several
nested classes and each of them has its owns, thus it is reusing some constant
pool entries.
As you pointed out, nested classes are less critical than the outer one,
because the attributes are declared in the outer class and currently all the
many small methods are called there. If we put `addNewFunction` here, instead,
we are making critical the last nested function because we are adding there all
the many small function calls.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]