It's a little unclear to me which component transfers the chunks to the 
collectors. Does each adaptor have its own connection, or does the agent hold 
a single connection to a collector? For example, if I am tailing 10 log files 
(one adaptor for each), do they all go to the same collector, or are they 
distributed across the collectors listed in my collectors file?

http://hadoop.apache.org/chukwa/docs/current/design.html#Collectors

"Rather than have each adaptor write directly to HDFS, data is sent across the 
network to a collector process, that does the HDFS writes. Each collector 
receives data from up to several hundred hosts, and writes all this data to a 
single sink file, which is a Hadoop sequence file of serialized Chunks. 
Periodically, collectors close their sink files, rename them to mark them 
available for processing, and resume writing a new file. Data is sent to 
collectors over HTTP."
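
To make the question concrete, here is a rough sketch of the two behaviours I 
can imagine. It is only an illustration: the host names, file paths, and 
function names are all made up, and nothing here is Chukwa's actual API or a 
claim about which model Chukwa really uses.

```python
import random

# Hypothetical collectors list and tailed files; not taken from a real setup.
COLLECTORS = ["collector1:8080", "collector2:8080", "collector3:8080"]
ADAPTORS = [f"/var/log/app{i}.log" for i in range(10)]


def per_agent_connection(adaptors, collectors, rng):
    """Model A: the agent picks one collector and every adaptor shares it."""
    chosen = rng.choice(collectors)
    return {adaptor: chosen for adaptor in adaptors}


def per_adaptor_connection(adaptors, collectors, rng):
    """Model B: each adaptor independently picks its own collector."""
    return {adaptor: rng.choice(collectors) for adaptor in adaptors}


if __name__ == "__main__":
    rng = random.Random(0)
    for name, model in [("A", per_agent_connection), ("B", per_adaptor_connection)]:
        mapping = model(ADAPTORS, COLLECTORS, rng)
        # Show which collectors end up receiving data under each model.
        print(name, sorted(set(mapping.values())))
```

Under model A all 10 files land on one collector; under model B they may be 
spread over several. Which of these (if either) matches what the agent does?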




Corbin Hoenes
cor...@tynt.com
skype: choenes


