If your topology has saved Kafka offsets in ZooKeeper, it will start processing from those. Otherwise, it checks whether spoutConfig.forceFromStart is set to true; in that case it will try to fetch data from the beginning of the queue, i.e. kafka.api.OffsetRequest.EarliestTime(). If neither of the above applies, it will use the user's spoutConfig.startOffsetTime.

"kafka.api.OffsetRequest.EarliestTime() finds the beginning of the data in the logs and starts streaming from there, kafka.api.OffsetRequest.LatestTime() will only stream new messages."

If you are stopping and redeploying the topology, make sure you use the same topology name, because KafkaSpout uses the topology name to store and retrieve the offsets from ZooKeeper.

-- Harsha

(In reply to Tousif ([email protected]), March 9, 2015 at 7:30:38 AM)

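The start-offset order described in this message can be sketched in plain Java. This is only an illustration under stated assumptions: it does not call storm-kafka, the names StartOffsetSketch, Start, and resolve are hypothetical, and the logic mirrors the description in this reply rather than the KafkaSpout source. The two constants are the sentinel values returned by kafka.api.OffsetRequest.EarliestTime() and LatestTime() in Kafka 0.8.

```java
import java.util.OptionalLong;

public class StartOffsetSketch {
    // Sentinel "times" from kafka.api.OffsetRequest (Kafka 0.8):
    // EarliestTime() is -2, LatestTime() is -1.
    static final long EARLIEST_TIME = -2L;
    static final long LATEST_TIME = -1L;

    /** Hypothetical result type: where the spout would begin reading. */
    static final class Start {
        final boolean fromSavedOffset; // true: value is a concrete offset saved in ZooKeeper
        final long value;              // saved offset, or a time sentinel to resolve against Kafka

        Start(boolean fromSavedOffset, long value) {
            this.fromSavedOffset = fromSavedOffset;
            this.value = value;
        }
    }

    /**
     * Applies the order described in the message:
     * 1. a saved offset in ZooKeeper wins;
     * 2. otherwise forceFromStart means "beginning of the queue";
     * 3. otherwise fall back to the configured startOffsetTime.
     */
    static Start resolve(OptionalLong savedOffset, boolean forceFromStart, long startOffsetTime) {
        if (savedOffset.isPresent()) {
            return new Start(true, savedOffset.getAsLong());
        }
        if (forceFromStart) {
            return new Start(false, EARLIEST_TIME);
        }
        return new Start(false, startOffsetTime);
    }

    public static void main(String[] args) {
        // Redeployed under the same name: the saved offset is found and used.
        Start resumed = resolve(OptionalLong.of(1200L), false, LATEST_TIME);
        System.out.println("saved offset used: " + resumed.fromSavedOffset + ", value: " + resumed.value);

        // Nothing saved, forceFromStart set: start from the earliest data.
        Start forced = resolve(OptionalLong.empty(), true, LATEST_TIME);
        System.out.println("saved offset used: " + forced.fromSavedOffset + ", value: " + forced.value);
    }
}
```

In this model, redeploying under a different topology name corresponds to passing OptionalLong.empty(): no saved offset is found, so the forceFromStart and startOffsetTime settings decide where reading begins.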