Dmitry Ochnev created SPARK-19275:
-------------------------------------
Summary: Spark Streaming, Kafka receiver, "Failed to get records
for ... after polling for 512"
Key: SPARK-19275
URL: https://issues.apache.org/jira/browse/SPARK-19275
Project: Spark
Issue Type: Bug
Components: DStreams
Affects Versions: 2.0.0
Environment: Apache Spark 2.0.0, Kafka 0.10 for Scala 2.11
Reporter: Dmitry Ochnev
Priority: Minor
We have a Spark Streaming application reading records from Kafka 0.10.
Some tasks are failed because of the following error:
"java.lang.AssertionError: assertion failed: Failed to get records for (...)
after polling for 512"
The first attempt fails and the second attempt (retry) completes successfully,
- this is the pattern that we see for many tasks in our logs.
A similar case with a stack trace are described here:
https://www.mail-archive.com/[email protected]/msg56564.html
https://gist.github.com/SrikanthTati/c2e95c4ac689cd49aab817e24ec42767
Here is the line from the stack trace where the error is raised:
org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:74)
We tried several values for "spark.streaming.kafka.consumer.poll.ms", - 2, 5,
10, 30 and 60 seconds, but the error appeared in all the cases except the last
one. Moreover, increasing the threshold led to increasing total Spark stage
duration.
In other words, increasing "spark.streaming.kafka.consumer.poll.ms" led to
fewer task failures but with cost of total stage duration. So, it is bad for
performance when processing data streams.
We have a suspicion that there is a bug in CachedKafkaConsumer (and/or other
related classes) which inhibits the reading process.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]