-1 and -2 come from kafka.api.OffsetRequest: https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/api/OffsetRequest.scala
-1 is the latest time, -2 is the earliest time.

In order to be sure you always start from the most recent offset in kafka, you need to set up your KafkaConfig (https://github.com/apache/storm/blob/master/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java) so that forceFromStart is set to true and startOffsetTime is -1:

    config.forceFromStart = true; // this might be what you're missing
    config.startOffsetTime = kafka.api.OffsetRequest.LatestTime(); // i.e. -1

On Thu, Dec 4, 2014 at 5:34 PM, Filipa Moura <[email protected]> wrote:

> Hi,
> I'm trying to get my KafkaSpout to read the latest offset from Kafka.
>
> With my standard configurations, I see startOffsetTime being set to -2 on the logs:
> 2014-12-04 23:19:34 s.k.PartitionManager [INFO] Read last commit offset from zookeeper: 2924325359; old topology_id: de65d4f8-a8e6-4f72-99bd-2d66e95fd293 - new topology_id: 89cf7268-9db1-423a-978b-3fe214d64e8e
> 2014-12-04 23:19:34 s.k.PartitionManager [INFO] Read last commit offset from zookeeper: 3217013339; old topology_id: de65d4f8-a8e6-4f72-99bd-2d66e95fd293 - new topology_id: 89cf7268-9db1-423a-978b-3fe214d64e8e
> 2014-12-04 23:19:34 s.k.PartitionManager [INFO] Last commit offset from zookeeper: 2924325359
> 2014-12-04 23:19:34 s.k.PartitionManager [INFO] Starting Kafka xx3.com:1 from offset 3217013339
> 2014-12-04 23:19:34 s.k.PartitionManager [INFO] Commit offset 2924497976 is more than 100000 behind, resetting to startOffsetTime=-2
> 2014-12-04 23:19:34 s.k.PartitionManager [INFO] Starting Kafka xxx3.com:0 from offset 2924497976
>
> Adding the following on the code "spoutConfig.startOffsetTime = -1;" :
> 2014-12-04 23:14:21 s.k.PartitionManager [INFO] Read last commit offset from zookeeper: 3217004355; old topology_id: a7010f51-9fea-43e4-ba16-c9ad1f0ec245 - new topology_id: de65d4f8-a8e6-4f72-99bd-2d66e95fd293
> 2014-12-04 23:14:21 s.k.PartitionManager [INFO] Read last commit offset from zookeeper: 2923885086; old topology_id: a7010f51-9fea-43e4-ba16-c9ad1f0ec245 - new topology_id: de65d4f8-a8e6-4f72-99bd-2d66e95fd293
> 2014-12-04 23:14:21 s.k.PartitionManager [INFO] Starting Kafka xxx3.com:1 from offset 3217004355
> 2014-12-04 23:14:21 s.k.PartitionManager [INFO] Last commit offset from zookeeper: 2923885086
> 2014-12-04 23:14:21 s.k.PartitionManager [INFO] Commit offset 2924022915 is more than 100000 behind, resetting to startOffsetTime=-1
> 2014-12-04 23:14:21 s.k.PartitionManager [INFO] Starting Kafka xxx3.com:0 from offset 2924022915
>
> What is the difference? And how can I be sure it's using the most recent offset from Kafka?
>
> Thank you,
> Filipa
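
To make the difference between the two settings concrete, here is a minimal Java sketch of how a starting offset could be chosen. It is a simplified model and not the storm-kafka source: the class StartOffsetSketch, its method names, and the sample offsets are all hypothetical. Only the values -1 and -2 and the forceFromStart / startOffsetTime settings come from the message above. It assumes that forcing from the start means the offset stored in ZooKeeper is ignored, and it leaves out other checks the real spout performs, such as the "more than 100000 behind" reset visible in the quoted logs.

```java
// Hypothetical, simplified model of start-offset selection.
// This is NOT the storm-kafka implementation; names and logic are assumptions.
final class StartOffsetSketch {
    // Sentinel values as described above for kafka.api.OffsetRequest.
    static final long LATEST_TIME = -1L;
    static final long EARLIEST_TIME = -2L;

    /** Maps a sentinel to a concrete offset for one partition. */
    static long resolve(long startOffsetTime, long earliestOffset, long latestOffset) {
        if (startOffsetTime == LATEST_TIME) {
            return latestOffset;
        }
        if (startOffsetTime == EARLIEST_TIME) {
            return earliestOffset;
        }
        throw new IllegalArgumentException("this sketch only handles -1 and -2");
    }

    /**
     * Picks the offset to start reading from.
     * committedOffset is the offset previously stored in ZooKeeper, or null if none.
     */
    static long chooseStartOffset(boolean forceFromStart, long startOffsetTime,
                                  Long committedOffset,
                                  long earliestOffset, long latestOffset) {
        if (forceFromStart || committedOffset == null) {
            // Ignore any stored offset and honor startOffsetTime.
            return resolve(startOffsetTime, earliestOffset, latestOffset);
        }
        // Otherwise resume from the stored offset; startOffsetTime is not consulted.
        return committedOffset;
    }

    public static void main(String[] args) {
        long earliest = 100L;   // made-up partition bounds
        long latest = 500L;
        Long committed = 250L;  // made-up stored offset

        // Without forceFromStart the stored offset wins, even with startOffsetTime = -1.
        System.out.println(chooseStartOffset(false, LATEST_TIME, committed, earliest, latest)); // 250
        // With forceFromStart and startOffsetTime = -1 the spout starts at the latest offset.
        System.out.println(chooseStartOffset(true, LATEST_TIME, committed, earliest, latest));  // 500
    }
}
```

Under these assumptions, setting only startOffsetTime = -1 does not change where a partition with a stored offset resumes, which is why both settings are needed together.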
