What you describe is certainly doable. I'm not sure what the use case is, though.
The core goal for Chukwa is to facilitate MapReduce processing of logs. The idea of the SocketTeeWriter is to get a "sneak peek" at data before it gets stored to HDFS. If collectors crash or get overloaded, data can get processed more than once by collectors. So there's a real cost to the real-time path.

One of the main benefits of SocketTee is that the processing can happen in a separate process, or even on a separate machine. Integrating the pattern-matching in the pipeline is certainly doable, but it's not clear to me that that's an architecture we want to encourage or commit to.

If people want Swatch, they know where to find it. What's the argument for needing to emulate it, real-time, in Chukwa?

--Ari

On Tue, Nov 3, 2009 at 3:48 PM, Thushara Wijeratna <[email protected]> wrote:
> Would it be useful to provide something similar to the Swatch Log
> monitoring for Chukwa?
> http://www.linuxjournal.com/article/4776
>
> Currently, we can listen to port 9094 (after running a
> SocketTeeWriter), and handle each input line.
> I'm wondering whether there will be a value add in creating some more
> infra-structure code in Chukwa that will:
>
> 1. do some regular expression parsing and filter the lines with the
> alert condition(s)
> 2. perform some standard actions, like email etc
> 3. provide an interface to perform custom handling for the user
>
> The basic core will be something like this:
>
> interface AlertCallback {
>
>     boolean handle(String alertExp, String line);
>
> }
>
> class AlertWriter extends PipelinableWriter {
>     private String[] alertExps;
>     private AlertCallback alertCB;
>
>     public AlertWriter(String[] alertExps, AlertCallback alertCB);
> }
>
> It seems like most of the plumbing is already there, exposed in
> SocketTeeWriter class, for ex: Filter class.
> If you all think it is a good idea, I can help with this.
>
> thanks,
> thushara
>

--
Ari Rabkin [email protected]
UC Berkeley Computer Science Department
