[ https://issues.apache.org/jira/browse/SPARK-30745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033348#comment-17033348 ]
Hyukjin Kwon commented on SPARK-30745:
--------------------------------------

Spark 2.0.x is EOL. Can you try it in a higher version?

> Spark streaming, kafka broker error, "Failed to get records for spark-executor- .... after polling for 512"
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-30745
>                 URL: https://issues.apache.org/jira/browse/SPARK-30745
>             Project: Spark
>          Issue Type: Bug
>          Components: Build, Deploy, DStreams, Kubernetes
>    Affects Versions: 2.0.2
>         Environment: Spark 2.0.2, Kafka 0.10
>            Reporter: Harneet K
>            Priority: Major
>              Labels: Spark2.0.2, cluster, kafka-0.10, spark-streaming-kafka
>
> We have a Spark Streaming application reading data from Kafka.
> Data size: 15 million
> The following errors were seen:
>
> java.lang.AssertionError: assertion failed: Failed to get records for spark-executor- .... after polling for 512
>     at scala.Predef$.assert(Predef.scala:170)
>
> There were more errors pertaining to CachedKafkaConsumer:
>
>     at org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:74)
>     at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:227)
>     at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:193)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>     at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:214)
>     at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:935)
>     at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:926)
>     at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
>     at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:926)
>     at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:670)
>     at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:330)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:281)
>     at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:105)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>     at org.apache.spark.scheduler.Task.run(Task.scala:86)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>
> spark.streaming.kafka.consumer.poll.ms is set to the default of 512 ms, and the other Kafka stream timeout settings are at their defaults:
> "request.timeout.ms"
> "heartbeat.interval.ms"
> "session.timeout.ms"
> "max.poll.interval.ms"
>
> Also, Kafka was recently upgraded from 0.8 to 0.10. This behavior was not seen on 0.8.
> No resource issues are seen.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
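For context, a commonly discussed mitigation for this assertion failure is to raise the Spark-side poll timeout (the 512 ms in the error message) and make sure the Kafka consumer timeouts named above stay in a consistent order. A minimal sketch of such a configuration, expressed as plain dicts — the values and the broker address are illustrative assumptions, not settings taken from this issue:

```python
# Illustrative configuration sketch (values are assumptions, not from the
# issue). In a real job these would go into SparkConf and into the
# kafkaParams map passed to the spark-streaming-kafka-0-10 direct stream.

spark_conf = {
    # Default is 512 ms -- the timeout visible in the assertion error.
    # Raising it gives consumer.poll() more time when the broker is slow.
    "spark.streaming.kafka.consumer.poll.ms": "10000",
}

kafka_params = {
    "bootstrap.servers": "broker1:9092",  # placeholder address
    # Keep the usual ordering: heartbeat < session <= request timeout.
    "heartbeat.interval.ms": "3000",
    "session.timeout.ms": "30000",
    "request.timeout.ms": "40000",
    "max.poll.interval.ms": "300000",
}

print(spark_conf["spark.streaming.kafka.consumer.poll.ms"])
```

Whether this helps depends on why the broker is slow to respond; it only widens the window the consumer waits, it does not fix broker-side delays.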