GitHub user bdrillard commented on a diff in the pull request:
https://github.com/apache/spark/pull/18075#discussion_r121968338
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
---
@@ -233,10 +222,124 @@ class CodegenContext {
// The collection of sub-expression result resetting methods that need to be called on each row.
val subexprFunctions = mutable.ArrayBuffer.empty[String]
- def declareAddedFunctions(): String = {
- addedFunctions.map { case (funcName, funcCode) => funcCode }.mkString("\n")
+ /**
+ * Holds the class and instance names to be generated. `OuterClass` is a placeholder standing for
+ * whichever class is generated as the outermost class and which will contain any nested
+ * sub-classes. All other classes and instance names in this list will represent private, nested
+ * sub-classes.
+ */
+ private val classes: mutable.ListBuffer[(String, String)] =
+ mutable.ListBuffer[(String, String)]("OuterClass" -> null)
+
+ // A map holding the current size in bytes of each class to be generated.
+ private val classSize: mutable.Map[String, Int] =
+ mutable.Map[String, Int]("OuterClass" -> 0)
+
+ // Nested maps holding function names and their code belonging to each class.
+ private val classFunctions: mutable.Map[String, mutable.Map[String, String]] =
--- End diff --
I had originally thought so, but it turns out that there's at least one
instance where the code for a given function name is updated during the
code-generation process. The generated
[`stopEarly`](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala#L75)
function can actually be inserted twice, each time returning a different
`stopEarly` variable. Otherwise, two functions with the same signature would
exist in the class, causing a compile error. So we need to use a map to make
sure the implementation gets _updated_ for a given function when necessary.
Note also that the old implementation of
[`addedFunctions`](https://github.com/apache/spark/pull/18075/files#diff-8bcc5aea39c73d4bf38aef6f6951d42cL208)
was also a map.
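
To illustrate the point, here is a minimal sketch rather than the actual
`CodegenContext` code. The class `FunctionRegistrySketch`, its members, and
the variable names `stopEarly_0` and `stopEarly_1` are all hypothetical. It
contrasts keying generated functions by name with simply appending them:

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for a per-class function registry.
// This is an illustration only and does not mirror Spark's real API.
class FunctionRegistrySketch {
  // Keyed by function name: adding the same name again replaces the old body.
  val byName: mutable.Map[String, String] =
    mutable.LinkedHashMap.empty[String, String]

  // Append-only: every insertion is kept, including duplicates.
  val appended: mutable.ArrayBuffer[(String, String)] =
    mutable.ArrayBuffer.empty[(String, String)]

  def addFunction(name: String, code: String): Unit = {
    byName(name) = code
    appended += ((name, code))
  }

  // Concatenates the function bodies, as a declaration step might.
  def declare(functions: Iterable[(String, String)]): String =
    functions.map { case (_, code) => code }.mkString("\n")
}

object FunctionRegistrySketch {
  def main(args: Array[String]): Unit = {
    val registry = new FunctionRegistrySketch
    // The same function name is registered twice with different bodies.
    registry.addFunction(
      "stopEarly", "protected boolean stopEarly() { return stopEarly_0; }")
    registry.addFunction(
      "stopEarly", "protected boolean stopEarly() { return stopEarly_1; }")

    // One definition survives in the map, two in the buffer.
    println(registry.declare(registry.byName))
    println(registry.declare(registry.appended))
  }
}
```

With the map, only the latest `stopEarly` body is emitted. With the append-only
buffer, both bodies are emitted, which would give the generated class two
methods with the same signature.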