No. I think increasing the timeout should work. Spark 2.1.0 changed this timeout to 120 seconds because we found the default value in 2.0.2 was too small.
On Wed, Jan 11, 2017 at 12:01 PM, Timothy Chan <tc...@lumoslabs.com> wrote:
> We're currently using EMR and they are still on Spark 2.0.2.
>
> Do you have any other suggestions for additional parameters to adjust
> besides "kafkaConsumer.pollTimeoutMs"?
>
> On Wed, Jan 11, 2017 at 11:17 AM Shixiong(Ryan) Zhu <
> shixi...@databricks.com> wrote:
>
>> You can increase the timeout using the option
>> "kafkaConsumer.pollTimeoutMs". In addition, I would recommend you try
>> Spark 2.1.0 as there are many improvements in Structured Streaming.
>>
>> On Wed, Jan 11, 2017 at 11:05 AM, Timothy Chan <tc...@lumoslabs.com>
>> wrote:
>>
>> I'm using Spark 2.0.2 and running a structured streaming query. When I
>> set startingOffsets to earliest I get the following timeout errors:
>>
>> java.lang.AssertionError: assertion failed: Failed to get records for
>> spark-kafka-source-be89d84c-f6e9-4d2b-b6cd-570942dc7d5d-185814897-executor
>> my-favorite-topic-0 1127918 after polling for 2048
>>
>> I do not get these errors when I set startingOffsets to latest.
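For reference, a minimal sketch of how the suggested option can be set on the Structured Streaming Kafka source. The broker address, topic name, and 120000 ms value are illustrative assumptions, not values from this thread; "kafka.bootstrap.servers", "subscribe", "startingOffsets", and "kafkaConsumer.pollTimeoutMs" are standard options of the spark-sql-kafka source:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("kafka-poll-timeout").getOrCreate()

// Sketch only: raise the per-poll timeout on the Kafka source.
// The error above shows the 2.0.2 default of 2048 ms being exceeded;
// 120000 ms matches the default that Spark 2.1.0 later adopted.
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092")   // illustrative broker
  .option("subscribe", "my-favorite-topic")            // topic from the error message
  .option("startingOffsets", "earliest")
  .option("kafkaConsumer.pollTimeoutMs", "120000")     // raised poll timeout
  .load()
```

This only lengthens how long each executor waits when polling Kafka; if the brokers are genuinely slow to serve old (earliest) offsets, the underlying fetch latency still needs to be investigated separately.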