[
https://issues.apache.org/jira/browse/SPARK-30745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033348#comment-17033348
]
Hyukjin Kwon commented on SPARK-30745:
--------------------------------------
Spark 2.0.x is EOL. Can you try it in a higher version?
> Spark streaming, kafka broker error, "Failed to get records for
> spark-executor- .... after polling for 512"
> -----------------------------------------------------------------------------------------------------------
>
> Key: SPARK-30745
> URL: https://issues.apache.org/jira/browse/SPARK-30745
> Project: Spark
> Issue Type: Bug
> Components: Build, Deploy, DStreams, Kubernetes
> Affects Versions: 2.0.2
> Environment: Spark 2.0.2, Kafka 0.10
> Reporter: Harneet K
> Priority: Major
> Labels: Spark2.0.2, cluster, kafka-0.10, spark-streaming-kafka
>
> We have a Spark Streaming application reading data from Kafka.
> Data size: 15 million
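> A minimal sketch of the consumer setup, with hypothetical names and values
> (app name, broker, group id, topic, and batch interval are illustrative,
> not the real application's):
>
>     import org.apache.kafka.common.serialization.StringDeserializer
>     import org.apache.spark.SparkConf
>     import org.apache.spark.streaming.{Seconds, StreamingContext}
>     import org.apache.spark.streaming.kafka010.KafkaUtils
>     import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
>     import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
>
>     // Hypothetical sketch of the Kafka 0.10 direct stream used by the app.
>     val ssc = new StreamingContext(
>       new SparkConf().setAppName("kafka-streaming-app"), Seconds(30))
>
>     val kafkaParams = Map[String, Object](
>       "bootstrap.servers" -> "broker1:9092",
>       "key.deserializer" -> classOf[StringDeserializer],
>       "value.deserializer" -> classOf[StringDeserializer],
>       "group.id" -> "mygroup",
>       "auto.offset.reset" -> "latest",
>       "enable.auto.commit" -> (false: java.lang.Boolean)
>     )
>
>     val stream = KafkaUtils.createDirectStream[String, String](
>       ssc, PreferConsistent, Subscribe[String, String](Seq("mytopic"), kafkaParams))
>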
> The following errors were seen:
> java.lang.AssertionError: assertion failed: Failed to get records for spark-executor- .... after polling for 512
> at scala.Predef$.assert(Predef.scala:170)
>
> There were more errors seen pertaining to CachedKafkaConsumer:
> at org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:74)
> at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:227)
> at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:193)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:214)
> at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:935)
> at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:926)
> at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
> at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:926)
> at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:670)
> at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:330)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:281)
> at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:105)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
> at org.apache.spark.scheduler.Task.run(Task.scala:86)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
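>
> For context, the assertion fires in CachedKafkaConsumer when a poll returns
> no records within the configured timeout. A self-contained illustration of
> that pattern with a plain Kafka 0.10 consumer (an assumption for clarity,
> not Spark's actual source; broker, group id, and topic are hypothetical):
>
>     import java.util.{Collections, Properties}
>     import org.apache.kafka.clients.consumer.KafkaConsumer
>
>     object PollTimeoutIllustration {
>       def main(args: Array[String]): Unit = {
>         val props = new Properties()
>         props.put("bootstrap.servers", "broker1:9092")
>         props.put("group.id", "spark-executor-mygroup") // executors prefix the group id
>         props.put("key.deserializer",
>           "org.apache.kafka.common.serialization.StringDeserializer")
>         props.put("value.deserializer",
>           "org.apache.kafka.common.serialization.StringDeserializer")
>
>         val consumer = new KafkaConsumer[String, String](props)
>         consumer.subscribe(Collections.singletonList("mytopic"))
>
>         val pollMs = 512L // mirrors the spark.streaming.kafka.consumer.poll.ms default
>         val records = consumer.poll(pollMs)
>         // If the broker cannot return records within pollMs, an assertion of
>         // this shape is what produces the AssertionError seen above.
>         assert(!records.isEmpty, s"Failed to get records after polling for $pollMs")
>         consumer.close()
>       }
>     }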
>
> spark.streaming.kafka.consumer.poll.ms is set to its default of 512 ms, and
> the other Kafka consumer timeout settings are also at their defaults (a
> sketch of raising the poll timeout follows the list):
> "request.timeout.ms"
> "heartbeat.interval.ms"
> "session.timeout.ms"
> "max.poll.interval.ms"
> Also, Kafka was recently upgraded from 0.8 to 0.10; this behavior was not
> seen with 0.8. No resource issues are observed.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]