Hi Tp,

1) Yes, Chukwa agents communicate with collectors over HTTP. By default, the collector listens on port 8080.
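
For illustration only, an agent's collector list (conf/collectors in the 0.4 layout, as far as I recall) might look like the sketch below, assuming one collector URL per line. The hostnames are placeholders, not real machines; only the port reflects the default mentioned above.

```
http://collector1.example.com:8080/
http://collector2.example.com:8080/
```

With a single entry in that list, the agent has only that one URL to talk to.
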
2) If the agent has only one collector defined in its collector list, it will retry the same collector after a pause of a few seconds.

3) There are two additional features for improving end-to-end reliability. In the Chukwa collector, you can set httpConnector.asyncAcks=true. This ensures that the agent resends any data that has not been committed. A second method is to use localWriter to buffer the data on the collector's local disk and periodically upload it to HDFS. Both options can be configured in chukwa-collector-conf.xml.

Hope this helps.

regards,
Eric

On Jul 29, 2011, at 11:04 AM, T. A. Smooth wrote:

> Hello I am checking out Chukwa. I have a few questions I was hoping the mail
> list could answer :-)
>
> 1) Do Chukwa agents communicate with collectors over http? Or some other
> protocol?
>
> The agent configuration makes me believe that:
> http://incubator.apache.org/chukwa/docs/r0.4.0/admin.html#Configuration
>
> 2) From the docs it seems an Agent will pick a collector at random and then
> use that collector until there is a problem in communicating with it. How do
> you think the agent/collector would act if they have a load balancer between
> them? For example, the agent configuration would have just one url
> http://collector-loadbalancer.example.com:8080/
>
> The load balancer would have 1 or more collectors behind it saving the chunks
> they receive to disk or hadoop.
>
> 3) Does chukwa have any “end-to-end” reliability features for message
> delivery? For example, a collector may receive the chunk from the agent but
> it may have a problem writing it to the data store (i.e. disk space full,
> connection to hadoop down). Will the agent be notified that the chunk was
> not processed for a certain reason and the agent is told to cache to disk the
> missed message?
>
> Thanks for the info!
>
> -tp-
