[ https://issues.apache.org/jira/browse/SPARK-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16269354#comment-16269354 ]

Leigh Klotz edited comment on SPARK-22373 at 11/28/17 7:51 PM:
----------------------------------------------------------------

This happens to me regularly enough on 2.1.1.20, with Avro and more than one 
executor core, that I have abandoned using multiple cores with Avro.

I've attempted to make a 100% reproducible test case but failed, so I'm 
reporting the contributing factors here.

1. set --conf spark.executor.cores=2 (or any higher number)
2. reading in a certain large Avro file
3. spark 2.1.1.20
4. spark.read.avro(fn).cache.count or another action that involves writes; a 
plain count doesn't trigger it.
5. The Avro file contains a key of type Map[String->Array[Byte]], though the 
values can all be empty arrays. The cardinality of the keyspace is high and 
the number of keys per map is tens to hundreds.
6. Multiple partitions are necessary to trigger the error.
7. Before the stack trace reported above, I see "ERROR CodeGenerator: failed to 
compile: java.lang.NullPointerException" followed by a dump of generated Java 
code.  
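
The factors above can be sketched as a minimal repro, assuming the 
com.databricks:spark-avro package is on the classpath; the file name and app 
name below are placeholders, not the actual data I used:

{code}
// Hypothetical repro sketch for the factors listed above (not a guaranteed
// reproducer; the failure is intermittent). Submit with:
//   spark-submit --conf spark.executor.cores=2 ...
import org.apache.spark.sql.SparkSession
import com.databricks.spark.avro._

object Spark22373Repro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("SPARK-22373-repro").getOrCreate()

    // The Avro schema contains a Map[String, Array[Byte]] column with a
    // high-cardinality keyspace and tens to hundreds of keys per map.
    val df = spark.read.avro("large.avro")

    // .cache.count (an action involving writes) triggers the intermittent
    // NPE across multiple partitions; a plain .count does not.
    df.cache.count()
  }
}
{code}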

The nodes that fail have "INFO CodeGenerator: Code generated in ##.##### ms" 
messages, and my theory is that the code generator being used here has a 
thread-safety issue.
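
Until the root cause is found, the workaround I've settled on, plus one other 
standard Spark SQL knob worth trying, can be expressed as:

{code}
# Workaround: avoid concurrent codegen within an executor by using one core.
spark-submit --conf spark.executor.cores=1 ...

# Alternatively, disable whole-stage codegen entirely (trades performance
# for stability; spark.sql.codegen.wholeStage is a standard Spark SQL conf):
spark-submit --conf spark.sql.codegen.wholeStage=false ...
{code}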




> Intermittent NullPointerException in 
> org.codehaus.janino.IClass.isAssignableFrom
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-22373
>                 URL: https://issues.apache.org/jira/browse/SPARK-22373
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.1
>         Environment: Hortonworks distribution: HDP 2.6.2.0-205 , 
> /usr/hdp/current/spark2-client/jars/spark-core_2.11-2.1.1.2.6.2.0-205.jar
>            Reporter: Dan Meany
>            Priority: Minor
>
> Very occasional and retry works.
> Full stack:
> 17/10/27 21:06:15 ERROR Executor: Exception in task 29.0 in stage 12.0 (TID 
> 758)
> java.lang.NullPointerException
>       at org.codehaus.janino.IClass.isAssignableFrom(IClass.java:569)
>       at 
> org.codehaus.janino.UnitCompiler.isWideningReferenceConvertible(UnitCompiler.java:10347)
>       at 
> org.codehaus.janino.UnitCompiler.isMethodInvocationConvertible(UnitCompiler.java:8636)
>       at 
> org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8427)
>       at 
> org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8285)
>       at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:8169)
>       at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:8071)
>       at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:4421)
>       at org.codehaus.janino.UnitCompiler.access$7500(UnitCompiler.java:206)
>       at 
> org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3774)
>       at 
> org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3762)
>       at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328)
>       at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3762)
>       at 
> org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4933)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:3180)
>       at org.codehaus.janino.UnitCompiler.access$5000(UnitCompiler.java:206)
>       at 
> org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3151)
>       at 
> org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3139)
>       at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328)
>       at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3139)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2112)
>       at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:206)
>       at 
> org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1377)
>       at 
> org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1370)
>       at org.codehaus.janino.Java$ExpressionStatement.accept(Java.java:2558)
>       at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1370)
>       at 
> org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1450)
>       at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2811)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:550)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:890)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:894)
>       at org.codehaus.janino.UnitCompiler.access$600(UnitCompiler.java:206)
>       at 
> org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:377)
>       at 
> org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:369)
>       at 
> org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1128)
>       at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369)
>       at 
> org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1209)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:564)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:890)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:894)
>       at org.codehaus.janino.UnitCompiler.access$600(UnitCompiler.java:206)
>       at 
> org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:377)
>       at 
> org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:369)
>       at 
> org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1128)
>       at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369)
>       at 
> org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1209)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:564)
>       at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:420)
>       at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:206)
>       at 
> org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:374)
>       at 
> org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:369)
>       at 
> org.codehaus.janino.Java$AbstractPackageMemberClassDeclaration.accept(Java.java:1309)
>       at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369)
>       at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:345)
>       at 
> org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:396)
>       at 
> org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:311)
>       at 
> org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:229)
>       at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:196)
>       at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:91)
>       at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:959)
>       at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1026)
>       at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1023)
>       at 
> org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
>       at 
> org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
>       at 
> org.spark_project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
>       at 
> org.spark_project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
>       at org.spark_project.guava.cache.LocalCache.get(LocalCache.java:4000)
>       at 
> org.spark_project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
>       at 
> org.spark_project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
>       at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:908)
>       at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8.apply(WholeStageCodegenExec.scala:372)
>       at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8.apply(WholeStageCodegenExec.scala:371)
>       at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$26.apply(RDD.scala:844)
>       at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$26.apply(RDD.scala:844)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>       at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>       at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>       at org.apache.spark.scheduler.Task.run(Task.scala:99)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> 17/10/27 21:06:15 INFO CodeGenerator: Code generated in 8.896831 ms
> Intermittent nature of the problem makes me suspect the cache or a 
> thread-related issue.
> Some of the SQL that appears in the area of the code line reported in Spark UI:
>      dense_rank() over (partition by itemid, type order by 
> sum(col_a)+(sum(col_b)/1000000000000000.0) desc) as rank, 
>              ...where cast(mytimestampfield as String) >= '$mydate'
>            



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
