Hi all. I'm in the middle of hardware provisioning for a Flume/HBase/Hadoop solution. The plan is for Flume agents to collect log data and pass it to collectors, which then write the data into HBase using a sink. My question is about sizing the Flume collector tier.

Flume agents: 250
Data receiving rate: 5.78 MB/second
Data writing rate: 17.9 MB/second
Number of data nodes: 12

This system will serve a real-time use case, so there shouldn't be any delay. How many collectors are required to handle this load?

Thanks in advance,
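
P.S. In case it helps anchor answers, here is a rough sketch of how the estimate could be framed. It assumes the rates above are totals across all agents. The per-collector throughput, the headroom factor, and the minimum of two collectors are placeholder assumptions, not measured or recommended values, so the result is only as good as whatever benchmark figure replaces them.

```python
import math


def collectors_needed(load_mb_s, per_collector_mb_s, headroom=2.0, min_collectors=2):
    """Estimate a collector count for a given sustained load.

    load_mb_s          -- sustained load the collector tier must absorb (MB/s)
    per_collector_mb_s -- throughput one collector sustains (MB/s); must be
                          measured on the real hardware, not guessed
    headroom           -- safety multiplier for bursts (assumed value)
    min_collectors     -- floor so one collector failure does not stop ingest
                          (assumed value)
    """
    if per_collector_mb_s <= 0:
        raise ValueError("per_collector_mb_s must be positive")
    required = math.ceil(load_mb_s * headroom / per_collector_mb_s)
    return max(required, min_collectors)


# Figures from the post, assumed to be cluster-wide totals.
RECEIVE_MB_S = 5.78
WRITE_MB_S = 17.9

# HYPOTHETICAL placeholder: replace with a benchmarked per-collector rate.
PER_COLLECTOR_MB_S = 10.0

# Size against the larger of the two rates.
peak_mb_s = max(RECEIVE_MB_S, WRITE_MB_S)
print(collectors_needed(peak_mb_s, PER_COLLECTOR_MB_S))
```

The count scales inversely with the per-collector figure, so that measurement is the number I am really after.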