Hi Ari,
Thanks for the info.
So the recommended model is to have my app write out to a directory on
local storage and on the same machine have an adapter that tails files
from that same directory and handles the communication to the collector?
I can see your point about loss of network connectivity - my assumption
was that the Chukwa adapter buffered/persisted the data internally on
network failures.
If the above is correct, then my next question is - how would I know
when I can reclaim the files on the local storage? How do I know when
they've been sent to the collector and thus can be deleted from local
storage so as to prevent unbounded growth of the files?
Thanks!
Kirk
Ariel Rabkin wrote:
The benefit of the agent model is that it makes it much easier to do
failure tolerance. Networks go down, but writes to local disk are
about as reliable as anything can be. Also, it means that your app
won't fail, even if you hit a bug in Chukwa, and you won't lose the
collected data, even if your app crashes.
That said, it is certainly possible to embed the guts of the Chukwa
adaptors and agent inside your process. There are two plausible ways I
see to do it:
Approach 1)
- Instantiate a Connector, the piece that talks to the collector.
- The Connector reads from a static queue; you can get ahold of this
queue via DataFactory.
- Just push Chunks into the queue, and they'll get sent to the collector.
Approach 2)
Instantiate a ChukwaAgent, and then call its add(...) method to start adaptors.
Approach 1 is best if all the data you want is inside your
application; approach 2 is more flexible, but if you go for approach
2, you might just run Chukwa as a normal external process and be
done with it.
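For illustration only, the queue hand-off described in Approach 1 might look roughly like the sketch below. It does not use any Chukwa classes: QueueHandoffSketch, push and sendPending are hypothetical names, plain strings stand in for Chunks, and the Consumer stands in for whatever actually talks to the collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical stand-in for the shared queue between the application and
// the sending side. None of these names come from Chukwa.
final class QueueHandoffSketch {
    private final BlockingQueue<String> queue;

    // Bounded, so a stalled sender cannot grow memory without limit.
    QueueHandoffSketch(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    // Application side: never blocks. Returns false when the queue is
    // full, leaving the caller to decide whether to drop or retry.
    boolean push(String record) {
        return queue.offer(record);
    }

    // Sending side: hands everything currently queued to the sender, in
    // insertion order, and returns how many records were handed over.
    int sendPending(Consumer<String> sender) {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        batch.forEach(sender);
        return batch.size();
    }
}
```

In a real application sendPending would presumably be called in a loop from a background thread. Note that anything still sitting in an in-memory queue like this is gone if the process crashes, which is the trade-off against the local-disk agent model described above.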
The backfilling loader (in class BackfillingLoader) may be worth inspecting.
--Ari
On Thu, Apr 1, 2010 at 11:33 PM, Kirk True <k...@mustardgrain.com> wrote:
Hi all,
We'd like to push data to the collectors straight from our application as it
receives data from its clients. In a sense we'd like to embed a Chukwa agent
and our adapters inside our web application. Does this make sense or is this
approach flawed in some way? Otherwise we'd have to save the data to a file
in a shared directory that's then picked up by a stand-alone Chukwa agent
via a 'file tailer' or something. Or do we want to use the UDPAdaptor and
proxy our data to it, which in turn forwards it to the collector?
Thanks,
Kirk