fqaiser94 commented on pull request #27066:
URL: https://github.com/apache/spark/pull/27066#issuecomment-642984217


   > BTW Spark does common subexpressions elimination during codegen, so 
repeated expressions don't mean they are evaluated repeatedly.
   
   You're right, I forgot about this. 
   
   Even so, the amount of generated code grows exponentially with each column you add to a nullable struct. For example, adding just 5 columns to a nullable struct generates more than 150,000 lines of Java code, and this naturally fails with `Cause: org.codehaus.janino.InternalCompilerException: Code of method "..." of class "..." grows beyond 64 KB`. Here is a simple reproducible example:
   
   ```scala
     test("withField temp") {
       // nullStructLevel1 is a test-suite fixture: a DataFrame with a
       // nullable struct column "a".
       val newColumn = (1 to 5).foldLeft(col("a")) {
         (column, num) => column.withField(s"col$num", lit(num))
       }
       val result = nullStructLevel1.withColumn("a", newColumn)
       result.show(false)
     }
   ```
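   To see the code-size blowup directly, the whole-stage-generated Java source can be dumped for inspection. A sketch, assuming the same `nullStructLevel1` fixture is in scope; `debugCodegen()` comes from `org.apache.spark.sql.execution.debug`:
   
   ```scala
   import org.apache.spark.sql.execution.debug._
   import org.apache.spark.sql.functions.{col, lit}
   
   // Same reproducer as above.
   val newColumn = (1 to 5).foldLeft(col("a")) {
     (column, num) => column.withField(s"col$num", lit(num))
   }
   val result = nullStructLevel1.withColumn("a", newColumn)
   
   // Prints the generated Java source for each whole-stage codegen
   // subtree, which makes the growth in code size easy to observe.
   result.debugCodegen()
   ```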




