Eric, thanks. We use logrotate on an hourly basis; I just wanted to know if there's anything different that we might be missing.
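Here is roughly what the piped-log publisher Eric describes below might look like: Apache feeds each access-log line to the program's stdin (e.g. CustomLog "|/usr/local/bin/log2kafka" combined, where the path and program name are just placeholders), and the program publishes each line to Kafka keyed by hostname so per-host ordering is preserved. This is only a minimal sketch using the Java producer client from later Kafka releases; the producer API in the 0.7.x releases current at the time of this thread differs, and the topic and broker names are placeholders.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.InetAddress;
    import java.util.Properties;

    // Reads log lines from stdin (fed by Apache's piped CustomLog) and
    // publishes each line to a Kafka topic, keyed by hostname so that all
    // lines from one web server land in the same partition, in order.
    public class Log2Kafka {
        public static void main(String[] args) throws Exception {
            String topic = args.length > 0 ? args[0] : "apache-logs"; // placeholder topic name
            String host = InetAddress.getLocalHost().getHostName();

            Properties props = new Properties();
            props.put("bootstrap.servers", "kafka1:9092"); // placeholder broker address
            props.put("key.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props);
                 BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
                String line;
                while ((line = in.readLine()) != null) {
                    // Same key => same partition => per-host ordering is preserved.
                    producer.send(new ProducerRecord<>(topic, host, line));
                }
            }
        }
    }

Since Apache keeps the pipe open for the life of the server process, there is no log file handle to lose on rotation; any rolling or archiving then happens on the consuming side (or rotatelogs can still be chained in for a local copy).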
-anurag

On Thu, Sep 29, 2011 at 11:47 AM, Eric Hauser <ewhau...@gmail.com> wrote:
> Anurag,
>
> I wouldn't tail the log files, but instead make use of Apache's
> facilities to pipe the logs to another program:
>
> http://httpd.apache.org/docs/2.2/mod/core.html#errorlog
> http://httpd.apache.org/docs/2.0/programs/rotatelogs.html
>
>
> On Thu, Sep 29, 2011 at 2:38 PM, Anurag <anurag.pha...@gmail.com> wrote:
>> Eric/Jun,
>> Can you throw some light on how to handle Apache log rotation? AFAIK,
>> even if we write custom code to tail a file, the file handle is lost
>> on rotation, which might result in some loss of data.
>>
>>
>> On Thu, Sep 29, 2011 at 11:35 AM, Jeremy Hanna
>> <jeremy.hanna1...@gmail.com> wrote:
>>> Thanks a lot for the comparison, Eric. Really good to hear a perspective
>>> from a user of both.
>>>
>>> On Sep 29, 2011, at 1:25 PM, Eric Hauser wrote:
>>>
>>>> Jeremy,
>>>>
>>>> I've used both Flume and Kafka, and I can provide some info for comparison:
>>>>
>>>> Flume
>>>> - The current Flume release, 0.9.4, has some pretty nasty bugs in it
>>>> (most have been fixed in trunk).
>>>> - Flume is more complex to maintain operations-wise (IMO) than Kafka,
>>>> since you have to set up masters and collectors (you don't necessarily
>>>> need collectors if you aren't writing to HDFS).
>>>> - Flume has a well-defined pattern for doing what you want:
>>>> http://www.cloudera.com/blog/2010/09/using-flume-to-collect-apache-2-web-server-logs/
>>>>
>>>> Kafka
>>>> - If you need multiple Kafka partitions for the logs, you will want to
>>>> partition by host so the messages arrive in order for the same host.
>>>> - You can use the same piped technique as Flume to publish to Kafka,
>>>> but you'll have to write a little code to publish and subscribe to the
>>>> stream.
>>>> - Kafka does not provide any of the file rolling, compression, etc.
>>>> that Flume provides.
>>>> - If you ever want to do anything more interesting with those log
>>>> files than just send them to one location, publishing them to Kafka
>>>> would allow you to add additional consumers later. Flume has a
>>>> concept of fanout sinks, but I don't care for the way it works.
>>>>
>>>>
>>>>
>>>> On Thu, Sep 29, 2011 at 1:48 PM, Jun Rao <jun...@gmail.com> wrote:
>>>>> Jeremy,
>>>>>
>>>>> Yes, Kafka will be a good fit for that.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jun
>>>>>
>>>>> On Thu, Sep 29, 2011 at 10:12 AM, Jeremy Hanna
>>>>> <jeremy.hanna1...@gmail.com> wrote:
>>>>>
>>>>>> We have a number of web servers in EC2, and periodically we just blow them
>>>>>> away and create new ones. That makes keeping logs problematic. We're
>>>>>> looking for a way to stream the logs from those various sources directly to
>>>>>> a central log server - either just a single server, or HDFS, or something
>>>>>> like that.
>>>>>>
>>>>>> My question is whether Kafka is a good fit for that, or should I be looking
>>>>>> more along the lines of Flume or Scribe?
>>>>>>
>>>>>> Many thanks.
>>>>>>
>>>>>> Jeremy
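The "subscribe" half Eric mentions above could be a small consumer that reads the same topic and writes each line to the central store. A rough sketch, again using the Java consumer client from later Kafka releases (the high-level consumer API in the 0.7.x releases current at the time differs), with placeholder broker, group, and topic names:

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    // Subscribes to the log topic and prints each line prefixed with the
    // originating host (the record key); in practice this is where you would
    // append to central files, roll/compress them, or write to HDFS.
    public class LogConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "kafka1:9092"); // placeholder broker address
            props.put("group.id", "central-log-writer");   // placeholder consumer group
            props.put("key.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("apache-logs"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.println(record.key() + " " + record.value());
                    }
                }
            }
        }
    }

Adding another consumer group later (for search indexing, metrics, etc.) reads the same stream without touching the publisher, which is the "additional consumers" point Eric makes above.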