Hello everybody, I have a small question about the best way to implement HA in Flume.

I have seen several features in Flume that enhance HA and prevent data loss:
- durable channels (file or database backed);
- load-balancing and failover capabilities in sink groups;
- transactions in the Flume SDK, which guarantee that an event is removed from agent (n-1)'s channel only after it has been received into agent (n)'s channel.

I also read a post suggesting that you could duplicate your flows of data and use Hadoop to handle the duplicates afterwards.

I would like to know the recommended architecture to guarantee that an event handed to Flume does arrive in HDFS, even in the case of massive failures or machine crashes.

Thanks and best regards,
Pascal
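For what it's worth, the durable-channel plus failover-sink part of this can be sketched in a flume.conf fragment; the agent name, directories, hostnames, and ports below are illustrative assumptions, not a recommendation:

```properties
# Hypothetical agent "agent1" -- names and paths are assumptions
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1 sink2
agent1.sinkgroups = g1

# Durable file channel so queued events survive an agent restart
agent1.channels.ch1.type = file
agent1.channels.ch1.checkpointDir = /var/flume/checkpoint
agent1.channels.ch1.dataDirs = /var/flume/data

# Avro source receiving events from upstream agents
agent1.sources.src1.type = avro
agent1.sources.src1.bind = 0.0.0.0
agent1.sources.src1.port = 4141
agent1.sources.src1.channels = ch1

# Two HDFS sinks draining the same channel
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://namenode/flume/events
agent1.sinks.sink1.channel = ch1
agent1.sinks.sink2.type = hdfs
agent1.sinks.sink2.hdfs.path = hdfs://namenode/flume/events
agent1.sinks.sink2.channel = ch1

# Failover sink processor: tries the higher-priority sink first,
# falls back to the other if it fails
agent1.sinkgroups.g1.sinks = sink1 sink2
agent1.sinkgroups.g1.processor.type = failover
agent1.sinkgroups.g1.processor.priority.sink1 = 10
agent1.sinkgroups.g1.processor.priority.sink2 = 5
```

This only covers failover within one agent; surviving the loss of a whole machine still depends on upstream agents retrying against a backup collector.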
