Hi,

I agree with Ari: the post-processing should run in another process or on another machine, since we don't want to spend more time, CPU, or memory on the collector side.
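As a rough sketch of what that out-of-collector matching could look like: the class below is hypothetical and not part of Chukwa. It assumes some separate client has already read the lines from the SocketTeeWriter socket, and it only borrows the AlertCallback shape from the proposal quoted further down. The name AlertScanner and its scan method are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not Chukwa code: matches lines against alert
// patterns in a process separate from the collector.
class AlertScanner {

    // Callback shape borrowed from the proposal in this thread.
    interface AlertCallback {
        boolean handle(String alertExp, String line);
    }

    private final List<Pattern> patterns = new ArrayList<>();
    private final AlertCallback callback;

    AlertScanner(String[] alertExps, AlertCallback callback) {
        // Compile once up front so each line only pays for matching.
        for (String exp : alertExps) {
            patterns.add(Pattern.compile(exp));
        }
        this.callback = callback;
    }

    // Runs every pattern against one line, invokes the callback for each
    // pattern that matches, and returns how many patterns matched.
    int scan(String line) {
        int matches = 0;
        for (Pattern p : patterns) {
            if (p.matcher(line).find()) {
                callback.handle(p.pattern(), line);
                matches++;
            }
        }
        return matches;
    }
}
```

Since this runs in the consumer, a slow or failing callback (sending mail, for instance) would only stall the consumer and not the collector pipeline.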
Ari, could you give us some details on how you're using the SocketTeeWriter?

Thanks,
/Jerome.

On 11/3/09 5:27 PM, "Thushara Wijeratna" <[email protected]> wrote:

> yeah, that makes sense. i don't have a strong argument, except it
> might be a tad bit easier to integrate alerting to the system.
> swatch is pretty good, however, for custom processing, for each
> pattern matched, a separate process needs to be run. if alerts are
> rare, as is generally the case, that is not a big problem.
> one reason i'm considering Chukwa instead of swatch is that it
> centralizes the input logs at the collector - swatch AFAIK doesn't
> perform any centralization of logs.
>
> thanks,
> thushara
>
> On Tue, Nov 3, 2009 at 3:51 PM, Ariel Rabkin <[email protected]> wrote:
>> What you describe is certainly doable. I'm not sure what the use case
>> is, though.
>>
>> The core goal for Chukwa is to facilitate MapReduce processing of
>> logs. The idea of the SocketTeeWriter is to get a "sneak peek" at
>> data, before it gets stored to HDFS. If collectors crash or get
>> overloaded, data can get processed more than once by collectors. So
>> there's a real cost to the real-time path.
>>
>> One of the main benefits of SocketTee is that the processing can
>> happen in a separate process, or even on a separate machine.
>> Integrating the pattern-matching in the pipeline is certainly doable,
>> but it's not clear to me that that's an architecture we want to
>> encourage or commit to.
>>
>> If people want Swatch, they know where to find it. What's the argument
>> for needing to emulate it, real-time, in Chukwa?
>>
>> --Ari
>>
>> On Tue, Nov 3, 2009 at 3:48 PM, Thushara Wijeratna <[email protected]> wrote:
>>> Would it be useful to provide something similar to the Swatch Log
>>> monitoring for Chukwa?
>>> http://www.linuxjournal.com/article/4776
>>>
>>> Currently, we can listen to port 9094 (after running a
>>> SocketTeeWriter), and handle each input line.
>>> I'm wondering whether there will be a value add in creating some more
>>> infrastructure code in Chukwa that will:
>>>
>>> 1. do some regular expression parsing and filter the lines with the
>>> alert condition(s)
>>> 2. perform some standard actions, like email, etc.
>>> 3. provide an interface to perform custom handling for the user
>>>
>>> The basic core will be something like this:
>>>
>>> interface AlertCallback {
>>>
>>>     // called for each line that matches an alert expression
>>>     boolean handle(String alertExp, String line);
>>>
>>> }
>>>
>>> class AlertWriter extends PipelinableWriter {
>>>     private String[] alertExps;
>>>     private AlertCallback alertCB;
>>>
>>>     public AlertWriter(String[] alertExps, AlertCallback alertCB) {
>>>         this.alertExps = alertExps;
>>>         this.alertCB = alertCB;
>>>     }
>>> }
>>>
>>> It seems like most of the plumbing is already there, exposed in the
>>> SocketTeeWriter class, for example the Filter class.
>>> If you all think it is a good idea, I can help with this.
>>>
>>> thanks,
>>> thushara
>>>
>>
>>
>>
>> --
>> Ari Rabkin [email protected]
>> UC Berkeley Computer Science Department
>>
