It's a little unclear to me what actually transfers the chunks to the collectors. Does each adaptor have its own connection, or does the agent have a single connection to the collector? For example, if I am tailing 10 log files (one adaptor for each), do they all go to the same collector, or does the agent distribute them to any one of the collectors listed in my collectors file?
http://hadoop.apache.org/chukwa/docs/current/design.html#Collectors

"Rather than have each adaptor write directly to HDFS, data is sent across the network to a collector process, that does the HDFS writes. Each collector receives data from up to several hundred hosts, and writes all this data to a single sink file, which is a Hadoop sequence file of serialized Chunks. Periodically, collectors close their sink files, rename them to mark them available for processing, and resume writing a new file. Data is sent to collectors over HTTP."

Corbin Hoenes
cor...@tynt.com
skype: choenes