In flume-ng, are there any advantages of a 2-tier topology in a cluster of 30-40 nodes?

Jagadish Bihani Tue, 29 Jan 2013 22:06:11 -0800

Hi

In our scenario there are around 30 machines from which we want to put data into HDFS.


Now the approach we thought of initially was:

1. First tier: agents that collect data from the source and pass it to an Avro sink.
2. Second tier: let's call these agents 'collectors'; they collect data from the first-tier agents and then write it to HDFS.

(Second-tier agents are fewer in number, say a 4:1 ratio of first-tier agents to collectors.)
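
A minimal sketch of how the two-tier layout described above could look in Flume's properties format. The agent names (agent1, collector1), component names, hostnames, port, log path and HDFS path are all hypothetical placeholders, and the exec source and memory channel are only assumptions, since the actual data source is not stated here.

```properties
# ---- First tier: runs on each of the ~30 source machines ----
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = avroSink

# Assumed source: tail a local log file (placeholder path)
agent1.sources.src1.type = exec
agent1.sources.src1.command = tail -F /var/log/app/app.log
agent1.sources.src1.channels = ch1

# Assumed channel type; a file channel would be the durable alternative
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000

# Avro sink forwards events to a collector (placeholder host and port)
agent1.sinks.avroSink.type = avro
agent1.sinks.avroSink.hostname = collector1.example.com
agent1.sinks.avroSink.port = 4545
agent1.sinks.avroSink.channel = ch1

# ---- Second tier: runs on each collector machine ----
collector1.sources = avroSrc
collector1.channels = ch1
collector1.sinks = hdfsSink

# Avro source receives events from the first-tier agents
collector1.sources.avroSrc.type = avro
collector1.sources.avroSrc.bind = 0.0.0.0
collector1.sources.avroSrc.port = 4545
collector1.sources.avroSrc.channels = ch1

collector1.channels.ch1.type = memory
collector1.channels.ch1.capacity = 10000

# HDFS sink writes the aggregated events (placeholder path)
collector1.sinks.hdfsSink.type = hdfs
collector1.sinks.hdfsSink.hdfs.path = hdfs://namenode.example.com/flume/events
collector1.sinks.hdfsSink.hdfs.fileType = DataStream
collector1.sinks.hdfsSink.channel = ch1
```

With this layout only the collectors hold open HDFS connections; the first-tier agents need network access to the collectors alone.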

Instead of the above topology, I could simply use an HDFS sink in the first-tier agents, and that would serve the purpose. Also, the number of nodes is small (say 30), so it won't hurt the HDFS namenode too much, compared to a case where the number of nodes was, say, 1000.
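
For comparison, a sketch of the single-tier alternative, where each source machine writes to HDFS directly. As before, the agent name, component names, log path, HDFS path and file prefix are hypothetical placeholders, and the exec source and memory channel are assumptions.

```properties
# ---- Single tier: runs on each of the ~30 source machines ----
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = hdfsSink

# Assumed source: tail a local log file (placeholder path)
agent1.sources.src1.type = exec
agent1.sources.src1.command = tail -F /var/log/app/app.log
agent1.sources.src1.channels = ch1

# Assumed channel type
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000

# HDFS sink on the agent itself; every machine opens its own HDFS files
agent1.sinks.hdfsSink.type = hdfs
agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode.example.com/flume/events
# Placeholder per-host prefix so files from different machines are distinguishable
agent1.sinks.hdfsSink.hdfs.filePrefix = host01
agent1.sinks.hdfsSink.hdfs.fileType = DataStream
agent1.sinks.hdfsSink.channel = ch1
```

Here every one of the 30 machines is an HDFS client, which is the namenode-load concern mentioned above.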

But apart from that, I don't see any advantage in adding the second tier.

Is there any advantage I am missing in terms of failover, HDFS performance, or any other parameter?


Regards,
Jagadish
