[
https://issues.apache.org/jira/browse/SPARK-21558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon updated SPARK-21558:
---------------------------------
Labels: bulk-closed (was: )
> Kinesis lease failover time should be increased or made configurable
> --------------------------------------------------------------------
>
> Key: SPARK-21558
> URL: https://issues.apache.org/jira/browse/SPARK-21558
> Project: Spark
> Issue Type: Bug
> Components: DStreams
> Affects Versions: 2.0.2
> Reporter: Clément MATHIEU
> Priority: Major
> Labels: bulk-closed
>
> I have a Spark Streaming application reading from a Kinesis stream which
> exhibits serious shard lease fickleness. The root cause has been identified as
> the KCL default failover time being too low for our typical JVM pause times:
> # KinesisClientLibConfiguration#DEFAULT_FAILOVER_TIME_MILLIS is 10 seconds,
> meaning that if a worker does not renew a lease within 10s, other workers
> will steal it
> # spark-streaming-kinesis-asl uses the default KCL failover time and does not
> allow configuring it
> # Executor JVM logs show frequent pauses of 10+ seconds
> While we could spend some time fine-tuning the GC configuration to reduce pause
> times, I wonder whether 10 seconds is too low. Typical Spark executors
> have very large heaps and GCs available in HotSpot are not great at ensuring
> low and deterministic pause times. One might also want to use ParallelGC.
> What do you think about:
> # Increasing the failover time (it might hurt applications with low latency
> requirements)
> # Making it configurable
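
To illustrate the two proposals, here is a minimal sketch of how a configurable failover time changes which JVM pauses cost a worker its lease. It is a toy model, not the KCL or spark-streaming-kinesis-asl implementation: the setting name `kinesis.failoverTimeMs`, both function names, and the pause durations are hypothetical, and the model assumes a lease is lost whenever a single pause lasts at least the failover time (it ignores the time elapsed since the last renewal, so it likely underestimates losses).

```python
# Default taken from the issue text: KCL's DEFAULT_FAILOVER_TIME_MILLIS is 10 seconds.
DEFAULT_FAILOVER_TIME_MS = 10_000

# Hypothetical setting name, used only for this illustration.
FAILOVER_KEY = "kinesis.failoverTimeMs"


def resolve_failover_ms(conf):
    """Return the configured failover time in ms, falling back to the default."""
    value = int(conf.get(FAILOVER_KEY, DEFAULT_FAILOVER_TIME_MS))
    if value <= 0:
        raise ValueError("failover time must be positive")
    return value


def pauses_losing_lease(pause_ms, failover_ms):
    """Return the pauses long enough that the lease is not renewed in time.

    Simplification: a pause alone must reach failover_ms to lose the lease.
    """
    return [p for p in pause_ms if p >= failover_ms]


if __name__ == "__main__":
    # Made-up pause durations (ms), standing in for values read from GC logs.
    pauses = [2_000, 11_500, 9_000, 14_000, 25_000]

    default_ms = resolve_failover_ms({})
    raised_ms = resolve_failover_ms({FAILOVER_KEY: "30000"})

    print(default_ms, pauses_losing_lease(pauses, default_ms))
    print(raised_ms, pauses_losing_lease(pauses, raised_ms))
```

Under these assumptions, three of the five sample pauses exceed the 10-second default, while none reach a 30-second setting. The trade-off raised in the issue remains: a longer failover time also delays takeover when a worker has genuinely died.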