To add to Hari's recommendation: if the beginning of each log event is consistent in terms of timestamp and log level (for example, yyyy-MM-dd HH:mm:ss followed by DEBUG, INFO, WARNING or FATAL), you can write a regex that matches this pattern and use it as the event delimiter instead of line endings.

On Mar 11, 2013 8:08 PM, "Hari Shreedharan" <[email protected]> wrote:
> + user@
>
> Hi Ravi,
>
> I think the best thing to do would be to write your own deserializer that
> can read the file and understand the format. The reason the deserializer is
> pluggable in Spooling Directory Source is exactly for this reason (in fact,
> stack traces were one of the use-cases discussed on the mailing list).
> Since this is pluggable, you can use any logic to figure out when an event
> is complete.
>
> Hari
>
> --
> Hari Shreedharan
>
> On Sunday, March 10, 2013 at 11:45 PM, Ravi Kiran wrote:
>
> Hi Hari ,
> We are planning to work on Flume NG to stream all our application logs
> to Hadoop using Flume. Based on the recommendations at Flume 1.3.1
> documentation, we are planning with
> http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source . I
> would like to have the exception stacktrace that gets printed in the log of
> each application be treated as a single event rather than have each line of
> the exception as an event. To address this, should a change in the
> application logging be done to ensure the exception is written out to a
> single line in the log file or have a custom SpoolingFileLineReader that
> reads lines and treat "\\n\\d\\d\\d\\d" as a new line for a event.
>
> Can you kindly suggest.
>
> Regards
> Ravi.
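
For illustration, the regex-as-delimiter idea could look roughly like the sketch below. It is plain Java with no Flume dependency, and it is only an assumption about how the grouping logic might be written: the class name LogEventSplitter, the method splitEvents and the exact header pattern are hypothetical and would need adjusting to the real log layout. In a Flume setup this logic would sit inside the custom deserializer Hari describes; that wiring is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flume: groups raw log lines into events.
final class LogEventSplitter {

    // Assumed header layout: "yyyy-MM-dd HH:mm:ss LEVEL" at the start of a line.
    private static final Pattern EVENT_START = Pattern.compile(
        "^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2} (DEBUG|INFO|WARNING|FATAL)\\b");

    // A line matching EVENT_START opens a new event; any other line
    // (for example a stack trace frame) is appended to the current event.
    static List<String> splitEvents(List<String> lines) {
        List<String> events = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : lines) {
            if (EVENT_START.matcher(line).find() && !current.isEmpty()) {
                events.add(String.join("\n", current));
                current.clear();
            }
            current.add(line);
        }
        // Flush whatever is still buffered once the input ends.
        if (!current.isEmpty()) {
            events.add(String.join("\n", current));
        }
        return events;
    }
}
```

With this approach a stack trace stays attached to the log line that produced it, because its frames do not start with a timestamp and level. One caveat: the last event is only known to be complete once the next header line (or the end of the file) is seen, so a streaming reader has to buffer one event ahead.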
