We're currently on EMR, which is still on Spark 2.0.2. Are there any other parameters you would suggest adjusting besides "kafkaConsumer.pollTimeoutMs"?

On Wed, Jan 11, 2017 at 11:17 AM Shixiong(Ryan) Zhu <shixi...@databricks.com> wrote:

> You can increase the timeout using the option
> "kafkaConsumer.pollTimeoutMs". In addition, I would recommend you try Spark
> 2.1.0 as there are many improvements in Structured Streaming.
>
> On Wed, Jan 11, 2017 at 11:05 AM, Timothy Chan <tc...@lumoslabs.com>
> wrote:
>
> > I'm using Spark 2.0.2 and running a structured streaming query. When I set
> > startingOffsets to earliest I get the following timeout errors:
> >
> > java.lang.AssertionError: assertion failed: Failed to get records for
> > spark-kafka-source-be89d84c-f6e9-4d2b-b6cd-570942dc7d5d-185814897-executor
> > my-favorite-topic-0 1127918 after polling for 2048
> >
> > I do not get these errors when I set startingOffsets to latest.
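
For reference, here is a minimal sketch of how the option suggested above might be passed to the Kafka source from PySpark. The broker address and the 10000 ms timeout are placeholder assumptions, not values from this thread, and `kafka_source_options` is a hypothetical helper rather than part of Spark. The actual `readStream` call is left as a comment because it needs a live SparkSession and the Kafka source package on the classpath.

```python
def kafka_source_options(bootstrap_servers, topic, poll_timeout_ms,
                         starting_offsets="earliest"):
    """Build the option map for a Structured Streaming Kafka source.

    Hypothetical helper: it only assembles a dict of string options.
    """
    if poll_timeout_ms <= 0:
        raise ValueError("poll_timeout_ms must be positive")
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
        # Executor-side consumer poll timeout, in milliseconds.
        "kafkaConsumer.pollTimeoutMs": str(poll_timeout_ms),
    }


# Placeholder broker address and timeout; the topic name is taken from the
# error message quoted above.
options = kafka_source_options("broker1:9092", "my-favorite-topic", 10000)

# With an existing SparkSession named `spark` (not created here):
# df = spark.readStream.format("kafka").options(**options).load()
```

Keeping the options in a plain dict makes it easy to try different timeout values without touching the rest of the query.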