https://issues.apache.org/jira/browse/CHUKWA-734
On Sat, Feb 14, 2015 at 12:13 PM, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com> wrote:

> Hi Eric,
> Thank you for the feedback.
> This is more than helpful.
> I am going to write a Gora module for Chukwa.
> I am going to proceed on the basis of implementing a log monitor for Nutch.
> Can Chukwa currently write to file and email a response?
> Thanks
> Lewis
>
> [0] http://gora.apache.org
>
> On Sat, Feb 14, 2015 at 9:30 AM, Eric Yang <eric...@gmail.com> wrote:
>
>> Hi Lewis,
>>
>> Parse errors can be captured and stored in another HDFS location.
>> In Chukwa 0.4 and earlier, we have a demux MapReduce job which does the
>> extraction and stores structured data in HDFS, and errors are channeled
>> to another HDFS folder called InError, along with the cause of the
>> parsing error. This is still a batch-oriented operation. In Chukwa 0.6,
>> we can set up multiple pipeline writers. The pipeline writers can be
>> configured to parse the data and channel errors somewhere else; if the
>> data parses properly, it is then written to HBase or HDFS. However, you
>> will need to write the pipeline writer class to extend this
>> functionality. We currently only have a couple of pipeline writers:
>> LocalWriter, HBaseWriter, and SeqFileWriter. SeqFileWriter needs to be
>> the last one in the pipeline if you choose to write data to HDFS. See
>> this page for how to configure pipeline writers to achieve part of what
>> you are looking for:
>>
>> http://chukwa.apache.org/docs/r0.6.0/pipeline.html
>>
>> Hope this helps.
>>
>> regards,
>> Eric
>>
>> On Thu, Feb 12, 2015 at 11:12 PM, Lewis John Mcgibbney <
>> lewis.mcgibb...@gmail.com> wrote:
>>
>>> Hi Folks,
>>> For some time I have been meaning to get in touch to get advice on
>>> developing a tool for log analysis of Apache Nutch [0] logs.
>>> What I am referring to particularly is monitoring of logs in a bid to
>>> identify particular errors which we may anticipate.
>>> Nutch jobs are batch-oriented, an architecture inherited from Hadoop.
>>> We typically see errors in the parse phase of a crawl, so it is events
>>> like this that I would like to anticipate, monitor and report on,
>>> possibly through email.
>>> So I am therefore thinking about building a Chukwa-powered tool for
>>> Nutch which would become part of our codebase.
>>> Is Chukwa the right tool for this? Any information about similar
>>> efforts would be very much appreciated.
>>> best
>>> Lewis
>>>
>>> [0] http://nutch.apache.org
>>>
>>> --
>>> *Lewis*
>>>
>>
>
> --
> *Lewis*
>

--
*Lewis*
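
For illustration only, the staged arrangement Eric describes (parse each record, channel failures elsewhere together with their cause, and let a final stage store whatever parsed cleanly) might be sketched roughly as below. Everything here is hypothetical: Stage, ParsingStage, StoreStage, run and the toy key=value record format are made up for the example and are not Chukwa's writer classes or its configuration. A real extension would implement Chukwa's own pipeline writer interface as described on the pipeline page linked in the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are Chukwa classes.
public class PipelineSketch {

    /** One stage in a writer pipeline (illustrative, not Chukwa's interface). */
    interface Stage {
        /** Returns the records to hand to the next stage. */
        List<String> process(List<String> records);
    }

    /** Parses records, diverts failures to an error sink, passes the rest on. */
    static class ParsingStage implements Stage {
        private final List<String> errorSink;

        ParsingStage(List<String> errorSink) {
            this.errorSink = errorSink;
        }

        @Override
        public List<String> process(List<String> records) {
            List<String> parsed = new ArrayList<>();
            for (String record : records) {
                try {
                    parsed.add(parse(record));
                } catch (IllegalArgumentException e) {
                    // Keep the cause next to the raw record, in the spirit of InError.
                    errorSink.add(record + " | cause: " + e.getMessage());
                }
            }
            return parsed;
        }

        // Toy parser (assumption): a valid record looks like "key=value".
        private static String parse(String record) {
            int eq = record.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("missing key before '='");
            }
            return record.substring(0, eq).trim() + "=" + record.substring(eq + 1).trim();
        }
    }

    /** Terminal stage: stores what it receives and passes nothing further. */
    static class StoreStage implements Stage {
        final List<String> stored = new ArrayList<>();

        @Override
        public List<String> process(List<String> records) {
            stored.addAll(records);
            return new ArrayList<>();
        }
    }

    /** Feeds the records through each stage in order. */
    static void run(List<Stage> pipeline, List<String> records) {
        List<String> current = records;
        for (Stage stage : pipeline) {
            current = stage.process(current);
        }
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        StoreStage store = new StoreStage();
        List<Stage> pipeline = new ArrayList<>();
        pipeline.add(new ParsingStage(errors));
        pipeline.add(store); // the storing stage goes last
        List<String> input = new ArrayList<>();
        input.add("status = ok");
        input.add("not a record");
        run(pipeline, input);
        System.out.println("stored: " + store.stored);
        System.out.println("errors: " + errors);
    }
}
```

The storing stage sits last for the same reason Eric gives for SeqFileWriter: once a stage has stored the records and passes nothing on, any stage placed after it would receive no data. An email notification step, which the thread asks about, could be one more hypothetical stage reading from the error sink.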