Hi everyone,

 

We are designing a highly available system with zero tolerance for data loss. 
The plan is for the spouts to read messages from a queue, process them through 
several specialized bolts, and then flush the results to the DB. How can we 
guarantee no data loss here? Should we keep the queue transactions open until 
the data is committed to the DB? Should we persist the state of all the bolts? 
What happens to the intermediate data if the whole cluster fails?
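
To make the first option concrete, here is a rough sketch in plain Python (not 
Storm code) of what I mean by holding a queue message until the DB commit has 
succeeded. AckQueue, run and fail_once_on are made-up names for illustration, 
the "DB" is just a dict, and the crash is simulated, so please read it as a 
picture of the idea rather than a proposed implementation.

```python
from collections import deque


class AckQueue:
    """Toy queue: a delivered message stays pending until it is acked."""

    def __init__(self, messages):
        self.ready = deque(enumerate(messages))  # (msg_id, payload) pairs
        self.pending = {}                         # delivered, not yet acked

    def receive(self):
        if not self.ready:
            return None
        msg_id, payload = self.ready.popleft()
        self.pending[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        # The message is only forgotten once the consumer confirms it.
        self.pending.pop(msg_id)

    def fail(self, msg_id):
        # Put the message back so it is delivered again later.
        self.ready.append((msg_id, self.pending.pop(msg_id)))


def run(queue, stages, db, fail_once_on=None):
    """Read, run every stage, write to the DB, and ack only after the write."""
    already_failed = set()
    while (item := queue.receive()) is not None:
        msg_id, value = item
        try:
            for stage in stages:          # stand-ins for the bolts
                value = stage(value)
            if msg_id == fail_once_on and msg_id not in already_failed:
                already_failed.add(msg_id)
                raise RuntimeError("simulated crash before commit")
            db[msg_id] = value            # idempotent write keyed by message id
            queue.ack(msg_id)             # ack strictly after the write
        except RuntimeError:
            queue.fail(msg_id)            # nothing is lost, it is replayed


queue = AckQueue(["a", "b", "c"])
db = {}
run(queue, [str.upper, lambda s: s + "!"], db, fail_once_on=1)
print(db)
```

In this sketch the intermediate results are never persisted. A failure simply 
causes the original message to be replayed from the queue, which is why the 
write is keyed by message id so that a replay overwrites instead of duplicating. 
Whether that is enough, or whether bolt state also needs to be persisted, is 
exactly what I am unsure about.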

 

Any suggestions are much appreciated.

 

Nima
