[
https://issues.apache.org/jira/browse/SPARK-33792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268202#comment-17268202
]
Jungtaek Lim commented on SPARK-33792:
--------------------------------------
I have no idea about the possibility, but the new stack trace concerns me. If I
understand correctly, unless there's other possibility, that's likely saying
key or value in map is null, which doesn't look to be possible.
I'm not sure the failure is consistent, as you said the failure as "time to
time". Does the problematic checkpoint always fail consistently? Or does it
simply work with another run? Reproducing the issue is quite important to solve
the problem.
And probably add Java vendor & version to issue description as well. Also if
the issue is vendor specific (as you've mentioned EMR) I'm not sure we'd like
to investigate here. Please check the problem also appears with "Apache" Spark
2.4.4.
> NullPointerException in HDFSBackedStateStoreProvider
> ----------------------------------------------------
>
> Key: SPARK-33792
> URL: https://issues.apache.org/jira/browse/SPARK-33792
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 2.4.4
> Environment: emr-5.28.0
> Reporter: Gabor Barna
> Priority: Major
>
> Hi,
> We are getting NPEs with spark structured streaming from time-to-time. Here's
> the stacktrace:
> {code:java}
> java.lang.NullPointerException
> at
> org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$HDFSBackedStateStore$$anonfun$iterator$1.apply(HDFSBackedStateStoreProvider.scala:164)
> at
> org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$HDFSBackedStateStore$$anonfun$iterator$1.apply(HDFSBackedStateStoreProvider.scala:163)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
> at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
> at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
> at scala.collection.Iterator$JoinIterator.hasNext(Iterator.scala:220)
> at
> org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
> at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown
> Source)
> at
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
> at
> org.apache.spark.sql.kinesis.KinesisWriteTask.execute(KinesisWriteTask.scala:50)
> at
> org.apache.spark.sql.kinesis.KinesisWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$1.apply$mcV$sp(KinesisWriter.scala:40)
> at
> org.apache.spark.sql.kinesis.KinesisWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$1.apply(KinesisWriter.scala:40)
> at
> org.apache.spark.sql.kinesis.KinesisWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$1.apply(KinesisWriter.scala:40)
> at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
> at
> org.apache.spark.sql.kinesis.KinesisWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(KinesisWriter.scala:40)
> at
> org.apache.spark.sql.kinesis.KinesisWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(KinesisWriter.scala:38)
> at
> org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
> at
> org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
> at org.apache.spark.scheduler.Task.run(Task.scala:123)
> at
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
> at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748) {code}
> As you can see we are using [https://github.com/qubole/kinesis-sql/] for
> writing the kinesis queue, the state backend is HDFS.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]