Re: Agent and collector

Ariel Rabkin Fri, 29 Jul 2011 15:55:15 -0700

Yes, the agent-->collector path is HTTP.

This was done precisely to allow load balancers. I don't know how
well tested that configuration is, though. I think most sites had Chukwa
itself do the load balancing by specifying multiple collectors.
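
For illustration only, here is a minimal Python sketch of the agent-side
behavior described in the question below: pick a collector at random, keep
using it until a send fails, then move on to another. This is not Chukwa's
actual code (Chukwa is written in Java); the class name, the `send`
callback, and the example URLs are all hypothetical.

```python
import random
from typing import Callable, Optional, Sequence


class CollectorPicker:
    """Hypothetical helper: stick with one collector until a send fails."""

    def __init__(self, collectors: Sequence[str],
                 rng: Optional[random.Random] = None):
        if not collectors:
            raise ValueError("need at least one collector URL")
        self.collectors = list(collectors)
        self.rng = rng or random.Random()
        # Initial choice is random, so agents spread across collectors.
        self.current = self.rng.choice(self.collectors)

    def post(self, chunk: bytes, send: Callable[[str, bytes], bool]) -> str:
        """Try the current collector first, then the others in random order.

        `send(url, chunk)` is an assumed callback that performs the HTTP
        POST and returns True on success. Returns the URL that accepted
        the chunk, or raises ConnectionError if none did.
        """
        others = [c for c in self.collectors if c != self.current]
        self.rng.shuffle(others)
        for url in [self.current] + others:
            if send(url, chunk):
                self.current = url  # stay here until the next failure
                return url
        raise ConnectionError("no collector accepted the chunk")
```

With a load balancer in front, the list would hold just the one balancer
URL, so a failed send leaves the agent nothing else to fall back on; with
several collector URLs, the agent does the balancing and failover itself.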


There is a notion of end-to-end reliability: the so-called
asynchronous ack mechanism. It's off by default and hasn't been tried
much in production. See
http://www.usenix.org/events/lisa10/tech/full_papers/Rabkin.pdf for
the detailed design of it.

--Ari

On Fri, Jul 29, 2011 at 11:04 AM, T. A. Smooth <[email protected]> wrote:
> Hello I am checking out Chukwa. I have a few questions I was hoping the mail
> list could answer :-)
>
> 1) Do Chukwa agents communicate with collectors over HTTP, or some other
> protocol?
>
> The agent configuration makes me believe so:
> http://incubator.apache.org/chukwa/docs/r0.4.0/admin.html#Configuration
>
> 2) From the docs it seems an agent will pick a collector at random and then
> use that collector until there is a problem communicating with it. How do
> you think the agent/collector would act if they had a load balancer between
> them? For example, the agent configuration would have just one URL:
> http://collector-loadbalancer.example.com:8080/
>
> The load balancer would have 1 or more collectors behind it saving the
> chunks it receives to disk or hadoop.
>
> 3) Does chukwa have any “end-to-end” reliability features for message
> delivery? For example, a collector may receive the chunk from the agent but
> it may have a problem writing it to the data store (e.g. disk space full,
> connection to Hadoop down). Will the agent be notified that the chunk was
> not processed for a certain reason and the agent is told to cache to disk
> the missed message?
>
> Thanks for the info!
>
> -tp-



-- 
Ari Rabkin [email protected]
UC Berkeley Computer Science Department
