The Kafka spout doesn't have a data-loss scenario as long as maxOffsetBehind is left at its default (Long.MAX_VALUE) and acks/fails are being done properly. Data could still be lost if Kafka's retention policy kicks in: the topology will keep retrying a timed-out message, but Kafka won't keep it around forever.
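For reference, here is a minimal sketch of where maxOffsetBehind lives when wiring up the old storm-kafka spout (Storm 0.x package names). The ZooKeeper host, topic, zkRoot, and consumer id below are placeholders, not values from this thread:

```java
// Sketch: configuring a KafkaSpout with the storm-kafka client (Storm 0.x).
// Hostnames, topic, zkRoot, and id are illustrative placeholders.
import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;
import backtype.storm.topology.TopologyBuilder;

public class KafkaSpoutSetup {
    public static void main(String[] args) {
        BrokerHosts hosts = new ZkHosts("zkhost:2181");
        SpoutConfig spoutConfig =
                new SpoutConfig(hosts, "events", "/kafka-spout", "event-reader");

        // Default is Long.MAX_VALUE: the spout never skips past unacked
        // messages, so nothing is dropped as long as tuples are acked/failed
        // properly. Lowering this lets the spout abandon messages that have
        // fallen too far behind the latest offset -- a data-loss trade-off.
        spoutConfig.maxOffsetBehind = Long.MAX_VALUE;

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1);
        // ... attach bolts and submit the topology as usual.
    }
}
```

Note that even with this default, messages can still disappear if Kafka's log retention (retention.ms / retention.bytes on the topic) deletes them before the topology finishes retrying.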
On Fri, Jan 15, 2016 at 12:21 AM, Milind Vaidya <[email protected]> wrote:
> Hi
>
> I have been using a Kafka-Storm setup for more than a year, running almost
> 10 different topologies.
>
> The flow is something like this:
>
> Producer --> Kafka cluster --> Storm cluster --> MongoDB.
>
> ZooKeeper keeps the metadata.
>
> So far the approach has been a little ad hoc, and we want it to be more
> disciplined. We are trying to achieve no data loss and automated failure
> handling.
>
> What are the failure scenarios in the case of a Storm cluster? Failure as
> in data loss. We will try to cover them once we know what they are.

--
Regards,
Abhishek Agarwal
