Re: structured streaming polling timeouts

2017-01-11 Thread Shixiong(Ryan) Zhu
No. I think increasing the timeout should work. Spark 2.1.0 raised this
timeout to 120 seconds because we found the default value in 2.0.2 was too
small.

On Wed, Jan 11, 2017 at 12:01 PM, Timothy Chan wrote:

> We're currently using EMR and they are still on Spark 2.0.2.
>
> Do you have any other suggestions for additional parameters to adjust
> besides "kafkaConsumer.pollTimeoutMs"?
>
> On Wed, Jan 11, 2017 at 11:17 AM Shixiong(Ryan) Zhu <
> shixi...@databricks.com> wrote:
>
>> You can increase the timeout using the option
>> "kafkaConsumer.pollTimeoutMs". In addition, I would recommend you try Spark
>> 2.1.0 as there are many improvements in Structured Streaming.
>>
>> On Wed, Jan 11, 2017 at 11:05 AM, Timothy Chan wrote:
>>
>> I'm using Spark 2.0.2 and running a structured streaming query. When I
>> set startingOffsets to earliest I get the following timeout errors:
>>
>> java.lang.AssertionError: assertion failed: Failed to get records for
>> spark-kafka-source-be89d84c-f6e9-4d2b-b6cd-570942dc7d5d-185814897-executor
>> my-favorite-topic-0 1127918 after polling for 2048
>>
>> I do not get these errors when I set startingOffsets to latest.


Re: structured streaming polling timeouts

2017-01-11 Thread Timothy Chan
We're currently using EMR and they are still on Spark 2.0.2.

Do you have any other suggestions for additional parameters to adjust
besides "kafkaConsumer.pollTimeoutMs"?

On Wed, Jan 11, 2017 at 11:17 AM Shixiong(Ryan) Zhu wrote:

> You can increase the timeout using the option
> "kafkaConsumer.pollTimeoutMs". In addition, I would recommend you try Spark
> 2.1.0 as there are many improvements in Structured Streaming.
>
> On Wed, Jan 11, 2017 at 11:05 AM, Timothy Chan wrote:
>
> I'm using Spark 2.0.2 and running a structured streaming query. When I
> set startingOffsets to earliest I get the following timeout errors:
>
> java.lang.AssertionError: assertion failed: Failed to get records for
> spark-kafka-source-be89d84c-f6e9-4d2b-b6cd-570942dc7d5d-185814897-executor
> my-favorite-topic-0 1127918 after polling for 2048
>
> I do not get these errors when I set startingOffsets to latest.


Re: structured streaming polling timeouts

2017-01-11 Thread Shixiong(Ryan) Zhu
You can increase the timeout using the option
"kafkaConsumer.pollTimeoutMs". In addition, I would recommend you try Spark
2.1.0 as there are many improvements in Structured Streaming.
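
For reference, here is a minimal PySpark-style sketch of where that option
would go. It is an illustration, not code from this thread: the helper name
kafka_stream_reader, the broker address, and the 10000 ms default are all
hypothetical, and the option names "startingOffsets" and
"kafkaConsumer.pollTimeoutMs" are used exactly as they are written in this
thread. The helper only configures a reader on an existing SparkSession, so
the Kafka source package for your Spark version still has to be on the
classpath before load() will work.

```python
def kafka_stream_reader(spark, bootstrap_servers, topic,
                        starting_offsets="earliest", poll_timeout_ms=10000):
    """Configure a streaming Kafka reader with a longer executor poll timeout.

    `spark` is an existing SparkSession. The helper name and the default
    values are illustrative assumptions; pick a timeout that suits your
    cluster and backlog size.
    """
    return (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        # Option name as written in this thread.
        .option("startingOffsets", starting_offsets)
        # Milliseconds an executor waits when polling Kafka for records.
        .option("kafkaConsumer.pollTimeoutMs", str(poll_timeout_ms))
    )
```

Assuming a session named spark and a hypothetical broker at broker:9092, the
reader could be used as
kafka_stream_reader(spark, "broker:9092", "my-favorite-topic").load().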

On Wed, Jan 11, 2017 at 11:05 AM, Timothy Chan wrote:

> I'm using Spark 2.0.2 and running a structured streaming query. When I
> set startingOffsets to earliest I get the following timeout errors:
>
> java.lang.AssertionError: assertion failed: Failed to get records for
> spark-kafka-source-be89d84c-f6e9-4d2b-b6cd-570942dc7d5d-185814897-executor
> my-favorite-topic-0 1127918 after polling for 2048
>
> I do not get these errors when I set startingOffsets to latest.


structured streaming polling timeouts

2017-01-11 Thread Timothy Chan
I'm using Spark 2.0.2 and running a structured streaming query. When I
set startingOffsets to earliest I get the following timeout errors:

java.lang.AssertionError: assertion failed: Failed to get records for
spark-kafka-source-be89d84c-f6e9-4d2b-b6cd-570942dc7d5d-185814897-executor
my-favorite-topic-0 1127918 after polling for 2048

I do not get these errors when I set startingOffsets to latest.