[
https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aleksander Eskilson reopened SPARK-18016:
-----------------------------------------
After digging through the stack traces more closely, it appears that for certain
very wide or deeply nested schemas, the generated code attempts to declare more
variables than the Janino compiler will allow: the class's constant pool grows
past the JVM limit of 0xFFFF (65535) entries. This convinces me that the error
is not in the class of "64 KB" method-size errors found in other Jiras (e.g.
SPARK-17702, SPARK-16845), although it is related to the size of the generated
code in the sense of the number of variables declared.
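To make the limit concrete, here is a rough back-of-the-envelope sketch. It is not Spark's or Janino's actual accounting. The class name ConstantPoolEstimate and the figure of three pool entries per referenced field (a Utf8 name, a NameAndType, and a Fieldref) are assumptions for illustration; a real class also carries method references, string constants, and shared descriptors, so the true count is higher.

```java
public class ConstantPoolEstimate {
    // The class file stores constant_pool_count as an unsigned 16-bit value,
    // so it cannot exceed 0xFFFF (65535). Valid indices run from 1 to count - 1.
    static final int MAX_POOL_COUNT = 0xFFFF;

    // Assumption for illustration: each referenced field needs its own
    // Utf8 name entry, NameAndType entry, and Fieldref entry.
    static final int ENTRIES_PER_FIELD = 3;

    // Rough lower bound on the pool entries consumed by the declared fields.
    static long estimateEntries(long fieldCount) {
        return fieldCount * ENTRIES_PER_FIELD;
    }

    // True when the estimated entries no longer fit below the 0xFFFF count.
    static boolean exceedsLimit(long fieldCount) {
        return estimateEntries(fieldCount) >= MAX_POOL_COUNT;
    }

    public static void main(String[] args) {
        for (long fields : new long[] {1_000, 10_000, 25_000}) {
            System.out.println(fields + " fields -> ~" + estimateEntries(fields)
                + " entries, exceeds limit: " + exceedsLimit(fields));
        }
    }
}
```

Under these assumptions, a generated class with a few tens of thousands of mutable fields would exhaust the pool on field references alone, regardless of how small each method is.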
> Code Generation: Constant Pool Past Limit for Wide/Nested Dataset
> -----------------------------------------------------------------
>
> Key: SPARK-18016
> URL: https://issues.apache.org/jira/browse/SPARK-18016
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: Aleksander Eskilson
>
> When attempting to encode collections of large Java objects to Datasets
> having very wide or deeply nested schemas, code generation can fail, yielding:
> {code}
> Caused by: org.codehaus.janino.JaninoRuntimeException: Constant pool for class org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection has grown past JVM limit of 0xFFFF
> at org.codehaus.janino.util.ClassFile.addToConstantPool(ClassFile.java:499)
> at org.codehaus.janino.util.ClassFile.addConstantNameAndTypeInfo(ClassFile.java:439)
> at org.codehaus.janino.util.ClassFile.addConstantMethodrefInfo(ClassFile.java:358)
> at org.codehaus.janino.UnitCompiler.writeConstantMethodrefInfo(UnitCompiler.java:11114)
> at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:4547)
> at org.codehaus.janino.UnitCompiler.access$7500(UnitCompiler.java:206)
> at org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3774)
> at org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3762)
> at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328)
> at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3762)
> at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4933)
> at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:3180)
> at org.codehaus.janino.UnitCompiler.access$5000(UnitCompiler.java:206)
> at org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3151)
> at org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3139)
> at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328)
> at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3139)
> at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2112)
> at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:206)
> at org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1377)
> at org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1370)
> at org.codehaus.janino.Java$ExpressionStatement.accept(Java.java:2558)
> at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1370)
> at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1450)
> at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2811)
> at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1262)
> at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1234)
> at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:538)
> at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:890)
> at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:894)
> at org.codehaus.janino.UnitCompiler.access$600(UnitCompiler.java:206)
> at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:377)
> at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:369)
> at org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1128)
> at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369)
> at org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1209)
> at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:564)
> at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:420)
> at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:206)
> at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:374)
> at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:369)
> at org.codehaus.janino.Java$AbstractPackageMemberClassDeclaration.accept(Java.java:1309)
> at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369)
> at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:345)
> at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:396)
> at org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:311)
> at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:229)
> at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:196)
> at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:91)
> at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:905)
> ... 35 more
> {code}
> During generation of the code for SpecificUnsafeProjection, all the mutable
> variables are declared up front. If there are too many, the class's constant
> pool appears to exceed the 0xFFFF limit reported in the exception above.
> This issue seems related to (but is not fixed by) SPARK-17702, which
> addressed individual methods growing beyond the 64 KB limit. SPARK-17702
> was resolved by breaking extractions into smaller methods, a change that
> reduces method size but not the number of variables declared in the class.
> I've created a small project [1] where I declare a list of "wide" and
> "nested" Bean objects that I attempt to encode to a Dataset. This code can
> trigger the failure for Spark 2.1.0-SNAPSHOT.
> [1] - https://github.com/bdrillard/spark-codegen-error
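As a rough illustration of why width and nesting together reach such limits quickly, the sketch below counts the leaf fields of a hypothetical bean in which every one of `width` properties is itself a bean, down to `depth` levels. The SchemaFieldCount class and the uniform schema shape are assumptions made for illustration; they are not taken from the linked project, and the link between leaf fields and generated mutable variables is likewise assumed rather than measured.

```java
public class SchemaFieldCount {
    // Leaf fields of a uniform schema: each of `width` properties nests
    // another bean of the same width, `depth` levels deep (width^depth).
    static long leafFields(int width, int depth) {
        long count = 1;
        for (int level = 0; level < depth; level++) {
            // multiplyExact throws ArithmeticException on long overflow
            // instead of silently wrapping around.
            count = Math.multiplyExact(count, (long) width);
        }
        return count;
    }

    public static void main(String[] args) {
        int[][] shapes = {{10, 1}, {10, 4}, {20, 4}};
        for (int[] shape : shapes) {
            long leaves = leafFields(shape[0], shape[1]);
            System.out.println("width=" + shape[0] + " depth=" + shape[1]
                + " -> " + leaves + " leaf fields, past 0xFFFF: "
                + (leaves > 0xFFFF));
        }
    }
}
```

If the generated projection needs at least one mutable variable per leaf field, a schema of modest width can pass 0xFFFF after only a few levels of nesting.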