Hello everybody,

I have a small question on the best way to implement HA (high availability) in Flume.

I have seen several features in Flume that enhance HA and prevent data loss:
- durable channels (file or database/JDBC channels)
- load balancing and failover capabilities in sink groups and in the Flume SDK (I sketched an example config after this list)
- transactions, which guarantee that an event is removed from agent (n-1)'s channel only after it has been committed to agent (n)'s channel
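
On the agent side, this is roughly what I have in mind, if I understand the docs correctly (hostnames, ports and directories below are just placeholders I made up): a durable file channel in front of a failover sink group, so events survive an agent restart and get retried against a second collector if the first one goes down.

  agent1.sources = src1
  agent1.channels = ch1
  agent1.sinks = sink1 sink2
  agent1.sinkgroups = g1

  # Avro source receiving events from upstream agents or the SDK
  agent1.sources.src1.type = avro
  agent1.sources.src1.bind = 0.0.0.0
  agent1.sources.src1.port = 41414
  agent1.sources.src1.channels = ch1

  # Durable file channel: events survive a crash/restart of this agent
  agent1.channels.ch1.type = file
  agent1.channels.ch1.checkpointDir = /var/flume/checkpoint
  agent1.channels.ch1.dataDirs = /var/flume/data

  # Failover sink group: sink1 is preferred, sink2 takes over if sink1 fails
  agent1.sinkgroups.g1.sinks = sink1 sink2
  agent1.sinkgroups.g1.processor.type = failover
  agent1.sinkgroups.g1.processor.priority.sink1 = 10
  agent1.sinkgroups.g1.processor.priority.sink2 = 5

  agent1.sinks.sink1.type = avro
  agent1.sinks.sink1.channel = ch1
  agent1.sinks.sink1.hostname = collector1.example.org
  agent1.sinks.sink1.port = 41414

  agent1.sinks.sink2.type = avro
  agent1.sinks.sink2.channel = ch1
  agent1.sinks.sink2.hostname = collector2.example.org
  agent1.sinks.sink2.port = 41414

The idea being that the file channel covers a crash of this agent, and the failover sink group covers a crash of the downstream collector.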

I also read a post saying that you could duplicate your flows of data and 
then use Hadoop to handle the duplicates afterwards.
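
If I understand that approach correctly, it means fanning the same events out to two independent flows with a replicating channel selector, something like the snippet below (channel names are placeholders), and then deduplicating later in Hadoop:

  # Replicating selector: every event is copied to both channels / flows
  agent1.sources.src1.selector.type = replicating
  agent1.sources.src1.channels = ch1 ch2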

I would like to know what the recommended architecture is to guarantee 
that an event handed to Flume does arrive in HDFS, even in the case of 
massive failures, machine crashes, etc.

Thanks and best regards

Pascal