The only data loss I've seen is when a topology with a KafkaSpout falls so far behind that the Kafka log segment holding its offset for a given partition is rotated out by retention. In that scenario you'll see an OffsetOutOfRangeException.
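To make the failure mode above concrete, here is a minimal, self-contained sketch of the broker-side check that produces it. This is an illustration only, not Kafka's or storm-kafka's actual code: once retention deletes old segments, the oldest retained offset moves forward, and a fetch for an offset below it is rejected (the real broker returns an OffsetOutOfRange error code, which the old storm-kafka spout surfaces as an OffsetOutOfRangeException).

```java
// Hedged sketch: simulates how a consumer whose saved offset falls
// behind Kafka's retention window hits an out-of-range condition.
// Class, method, and variable names here are invented for illustration.
public class OffsetRangeDemo {

    // Broker-side view of one partition after segment rotation:
    // offsets below logStartOffset have been deleted by retention,
    // logEndOffset is the next offset to be written.
    static long fetch(long requestedOffset, long logStartOffset, long logEndOffset) {
        if (requestedOffset < logStartOffset || requestedOffset > logEndOffset) {
            // Stand-in for the broker's OffsetOutOfRange error.
            throw new IllegalStateException("OffsetOutOfRange: " + requestedOffset);
        }
        return requestedOffset; // fetch proceeds normally from this offset
    }

    public static void main(String[] args) {
        long logStart = 5_000; // oldest retained offset after rotation
        long logEnd = 9_000;   // next offset to be written

        // A consumer that kept up fetches fine:
        System.out.println(fetch(8_500, logStart, logEnd));

        // A consumer that fell behind retention does not:
        try {
            fetch(1_200, logStart, logEnd);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```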
--John

On Tue, Jan 19, 2016 at 5:21 PM, Milind Vaidya <[email protected]> wrote:

> Yes. In a sunny-day scenario there is no data loss. But we are trying to
> list some cases where there will be data loss, or at least we want to
> consider different scenarios in which one or more components fail, and see
> how the Kafka-Storm setup reacts to that and whether there is any data loss.
>
> We had some scenarios like you mentioned, where the maxOffsetBehind setting
> led to problems due to slow downstream operations. But we are not worried
> about the Kafka retention period either; that is a configuration issue.
> What we are looking at is, say, some thread accidentally dying (e.g. the
> Kafka spout), or a Kafka host containing all partitions for a topic going
> down, etc.
>
> On Sat, Jan 16, 2016 at 5:32 AM, Abhishek Agarwal <[email protected]> wrote:
>
>> The Kafka spout doesn't have a data-loss scenario as long as you have not
>> modified the maxOffsetBehind setting (Long.MAX_VALUE by default) and
>> acks/fails are being done properly. Data could still be lost once
>> retention kicks in on the Kafka side, though: the topology will keep
>> retrying a timed-out message, but Kafka is not going to keep it forever.
>>
>> On Fri, Jan 15, 2016 at 12:21 AM, Milind Vaidya <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I have been using a Kafka-Storm setup for more than a year, running
>>> almost 10 different topologies.
>>>
>>> The flow is something like this:
>>>
>>> Producer --> Kafka cluster --> Storm cluster --> MongoDB
>>>
>>> ZooKeeper keeps the metadata.
>>>
>>> So far the approach has been a little ad hoc, and we want it to be more
>>> disciplined. We are trying to achieve no data loss and automated failure
>>> handling.
>>>
>>> What are the failure scenarios for a Storm cluster, failure as in data
>>> loss? We will try to cover them once we know what they are.
>>
>> --
>> Regards,
>> Abhishek Agarwal
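For reference, the maxOffsetBehind setting discussed in this thread lives on the old storm-kafka module's SpoutConfig, alongside the knobs that govern what the spout does when it hits the out-of-range condition. The sketch below shows the relevant fields as they appear in pre-1.0 storm-kafka; the package name (storm.kafka vs. org.apache.storm.kafka) varies by Storm version, and the ZooKeeper host string is a placeholder:

```java
// Hedged config sketch for the classic storm-kafka spout (not storm-kafka-client).
// Adjust the package prefix to match your Storm version.
import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;

public class SpoutConfigSketch {
    public static KafkaSpout buildSpout() {
        BrokerHosts hosts = new ZkHosts("zkhost:2181"); // placeholder ZK ensemble
        SpoutConfig spoutConfig = new SpoutConfig(hosts, "my-topic", "/kafka-spout", "my-spout-id");

        // Default is Long.MAX_VALUE: failed tuples are always replayed, no matter
        // how far behind the latest offset they are. Lowering this trades replay
        // (at-least-once delivery) for skipping, i.e. deliberate data loss.
        spoutConfig.maxOffsetBehind = Long.MAX_VALUE;

        // When the saved offset has been rotated out by retention and the broker
        // answers OffsetOutOfRange, fall back to startOffsetTime instead of dying.
        spoutConfig.useStartOffsetTimeIfOffsetOutOfRange = true;

        return new KafkaSpout(spoutConfig);
    }
}
```

Note this does not eliminate the loss John describes: falling back to a valid start offset keeps the topology running, but the rotated-out messages are gone regardless.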
