Hi, that depends on the sink you want to use. Let's say you use E2E chains, the collectors run on actual hardware, and you use compression: I would put 10 agents per collector (180 MB/s * 60 s, for minute-based file closing, = 10.8 GB/min). To get closer to real time I would suggest a 10-second roll, but more than 10 could create a bottleneck at peak times. The collectors need fast hard disks.
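
The arithmetic above can be sketched roughly as follows. This is only a back-of-the-envelope calculation, not a Flume API: the function names are made up for illustration, the 250 agents come from the question below, and 180 MB/s is taken as the per-collector rate assumed in the estimate, with 1 GB counted as 1000 MB.

```python
import math


def collectors_needed(agents: int, agents_per_collector: int) -> int:
    """Round up so that no collector serves more than agents_per_collector agents."""
    return math.ceil(agents / agents_per_collector)


def volume_per_roll_mb(rate_mb_per_s: float, roll_seconds: int) -> float:
    """Data one collector accumulates between two file rolls, in MB."""
    return rate_mb_per_s * roll_seconds


if __name__ == "__main__":
    # 250 agents (from the question), 10 agents per collector (suggested above).
    print("collectors:", collectors_needed(250, 10))
    # Assumed 180 MB/s per collector, compared for a 60 s and a 10 s roll.
    for roll in (60, 10):
        gb = volume_per_roll_mb(180, roll) / 1000  # 1 GB = 1000 MB here
        print(f"{roll} s roll: {gb} GB per file")
```

With 10 agents per collector, 250 agents work out to 25 collectors. Shortening the roll from 60 to 10 seconds cuts the data held per file by a factor of six, which is why the shorter roll gets closer to real time.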

best,
Alex

--
Alexander Lorenz
http://mapredit.blogspot.com

On Feb 14, 2012, at 8:25 PM, Kim, Jongkook wrote:

> Hi all.
>
> I'm in the middle of hardware provisioning for flume-hbase-hadoop solution.
> The plan is that flume agents collect and pass log data to collectors and the
> collectors write data into hbase using sink.
> The question is a flume collector's scale.
>
> Flume agents:250
> Data receiving ratio: 5.78MB/second
> Data writing ratio: 17.9MB/second
> Number of data nodes: 12
>
> This system will be used to provide real-time use case, so there shouldn't be
> delay.
> How many collectors required to handle this request?
>
> Thanks in advance,