[ https://issues.apache.org/jira/browse/SPARK-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-11211.
-------------------------------
    Resolution: Not A Problem

> Kafka - offsetOutOfRange forces to largest
> ------------------------------------------
>
>                 Key: SPARK-11211
>                 URL: https://issues.apache.org/jira/browse/SPARK-11211
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.3.1, 1.5.1
>            Reporter: Daniel Strassler
>
> This problem concerns how DStreams using the direct approach of connecting
> to a Kafka topic behave when they request an offset that does not exist on
> the topic. Currently, the "auto.offset.reset" configuration value appears
> to be ignored, and the default value of "largest" is always used.
>
> When connecting to a Kafka topic with a DStream via the direct approach,
> even if the Kafka configuration "auto.offset.reset" is set to "smallest",
> the behavior on a kafka.common.OffsetOutOfRangeException is to move the
> next offset to be consumed to the largest offset on the Kafka topic. The
> exception also appears to be swallowed rather than propagated to the
> driver, so a workaround triggered by propagation of the error cannot be
> implemented either.
>
> The current behavior of resetting to largest means that any data already on
> the Kafka topic when the exception is thrown is skipped (lost), and only
> data produced to the topic after the exception will be consumed. Two
> possible fixes are listed below.
>
> Fix 1: When "auto.offset.reset" is set to "smallest", the DStream should
> set the next consumed offset to the smallest offset available on the Kafka
> topic.
>
> Fix 2: Propagate the error to the driver and let it react as it deems
> appropriate.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
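For context, the setting in question is supplied in the Kafka parameter map when the direct stream is created. A minimal sketch of those parameters (the broker address and topic name are placeholders; the `KafkaUtils.createDirectStream` call from the Spark 1.x Kafka integration is shown only in a comment, since it requires a running cluster):

```python
# Kafka parameters for a Spark 1.x direct-approach stream.
# "auto.offset.reset" is the setting the issue reports as being
# ignored when an OffsetOutOfRangeException occurs.
kafkaParams = {
    "metadata.broker.list": "localhost:9092",  # placeholder broker address
    "auto.offset.reset": "smallest",           # reporter expects replay from earliest
}

# With pyspark this map would be passed when creating the stream, e.g.:
#   from pyspark.streaming.kafka import KafkaUtils
#   stream = KafkaUtils.createDirectStream(ssc, ["mytopic"], kafkaParams)
# The reported problem is that on an out-of-range offset the value
# above is not honored and consumption jumps to the largest offset.
```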
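Fix 1 amounts to clamping an out-of-range requested offset according to "auto.offset.reset" instead of unconditionally jumping to the head of the topic. A hypothetical sketch of that decision logic (function and parameter names are illustrative, not Spark or Kafka code):

```python
def reset_offset(requested, smallest, largest, auto_offset_reset="largest"):
    """Return the offset to resume from when `requested` may be out of range.

    Models the behavior Fix 1 asks for: honor auto.offset.reset rather
    than always resetting to the largest offset. Illustrative only.
    """
    if smallest <= requested <= largest:
        return requested  # offset still valid; no reset needed
    if auto_offset_reset == "smallest":
        return smallest   # replay everything still retained on the topic
    return largest        # reported current behavior: skip to the head
```

For example, with a retention window of offsets [100, 500], a stale request for offset 40 would resume at 100 under "smallest" but at 500 under "largest"; the latter skips offsets 100-500, which is the data loss the reporter describes.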