Marco Gaido created SPARK-22226:
-----------------------------------

             Summary: Code generation fails for dataframes with 10000 columns
                 Key: SPARK-22226
                 URL: https://issues.apache.org/jira/browse/SPARK-22226
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Marco Gaido


Code generation for very wide datasets can fail because the constant pool 
limit is reached.

There are several possible causes. One of them is that we currently split 
the definitions of the generated methods among several {{NestedClass}} classes, 
but all of these methods are called from the main class. Since entries are added 
to the main class's constant pool for each method invocation, this limits the 
number of columns that can be handled and, for very wide datasets, leads to:

{noformat}
org.codehaus.janino.JaninoRuntimeException: Constant pool for class org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificMutableProjection has grown past JVM limit of 0xFFFF
{noformat}
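
To make the reasoning above concrete, here is a rough back-of-the-envelope 
sketch in Java. It is not Spark or Janino code. The class and method names are 
hypothetical, and it assumes that every distinct method called from the main 
class costs about three constant pool entries in that class (a Methodref, a 
NameAndType and a Utf8 method name), with class and descriptor entries shared 
between calls. The actual number of generated methods per column depends on how 
the expressions are split.

```java
// Rough model of constant pool growth in a main class that invokes many
// methods defined in nested classes. All names here are hypothetical.
final class ConstantPoolEstimate {
    // Ceiling quoted in the Janino error message (the count field is 16 bits).
    static final int JVM_LIMIT = 0xFFFF;

    // Assumption: one Methodref + one NameAndType + one Utf8 name per callee.
    static final int ENTRIES_PER_DISTINCT_CALL = 3;

    // Entries the main class needs just to call this many distinct methods.
    static long entriesForCalls(long distinctMethods) {
        return distinctMethods * ENTRIES_PER_DISTINCT_CALL;
    }

    // How many distinct callees still fit, given entries already in use.
    static long maxDistinctCalls(long entriesUsedElsewhere) {
        long remaining = JVM_LIMIT - entriesUsedElsewhere;
        return Math.max(0, remaining / ENTRIES_PER_DISTINCT_CALL);
    }

    public static void main(String[] args) {
        long methods = 10_000; // assumption: about one split method per column
        System.out.println("entries needed for calls: " + entriesForCalls(methods));
        System.out.println("callees that fit in an empty pool: " + maxDistinctCalls(0));
    }
}
```

Under these assumptions the call sites alone consume a large share of the pool, 
before counting the fields, literals and other references the main class also 
needs, which is why moving method definitions into nested classes does not help 
while the invocations stay in the main class.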



