Re: [BBLISA] Large scale log processing

Paul Beltrani Mon, 23 Nov 2009 10:23:38 -0800

Old thread, but this reply may be useful for the archives.

We're in the process of spinning up Scribe, the application developed
by Facebook to aggregate their logs.

We like this tool for several reasons (in no particular order):

* Scalability
It scales out horizontally.  FWIW, it can handle Facebook's log load,
and I doubt anyone on the list requires more than that.  At that level,
the problem is not aggregating the logs but accessing them after the
fact.  I believe Facebook aggregates down to a Hadoop infrastructure.
We're simply going to have multiple terminations based on "context".

* Built-in HA/Robustness
If the next hop in the aggregation path is down, Scribe logs locally
until that system becomes available.  Once it comes back, the local
logs are forwarded on.

* Flexibility
You tag messages by "context" and then direct where messages go based
on context, e.g., you can have all your syslog aggregation end up at
location A, your Apache logs at location B and the output from your
application at location C.
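
To make the last two points concrete, here is a minimal Python sketch of
the idea, not Scribe's actual implementation or API.  The Hop and Relay
classes and all of their names are hypothetical, made up for
illustration.  The sketch queues messages in memory, whereas Scribe
writes them to local files, and as far as I know Scribe's own name for
the "context" tag is "category".

```python
from collections import defaultdict, deque


class Hop:
    """Hypothetical next hop; `up` simulates whether it is reachable."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.received = []

    def send(self, message):
        if not self.up:
            raise ConnectionError(f"{self.name} is unreachable")
        self.received.append(message)


class Relay:
    """Routes messages by context and buffers locally while a hop is down."""

    def __init__(self, routes):
        self.routes = routes               # context -> Hop
        self.pending = defaultdict(deque)  # context -> undelivered messages

    def log(self, context, message):
        # Always queue first, then try to drain, so ordering is preserved.
        self.pending[context].append(message)
        return self.flush(context)

    def flush(self, context):
        hop = self.routes[context]
        queue = self.pending[context]
        while queue:
            try:
                hop.send(queue[0])
            except ConnectionError:
                return False  # hop still down; keep the backlog queued
            queue.popleft()
        return True


if __name__ == "__main__":
    site_a, site_b = Hop("A"), Hop("B")
    relay = Relay({"syslog": site_a, "apache": site_b})

    relay.log("syslog", "kernel: eth0 link up")  # delivered to A right away
    site_b.up = False
    relay.log("apache", "GET / 200")             # buffered locally
    site_b.up = True
    relay.flush("apache")                        # backlog forwarded to B
    print(site_a.received, site_b.received)
```

A real relay would retry on a timer rather than wait for an explicit
flush, and would spill the backlog to disk so it survives a restart.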

There are mechanisms to capture standard "flat file" logs from
applications, as well as APIs to log directly from custom
applications.

The following URLs may be useful if you're looking for more info:
  http://developers.facebook.com/scribe/
  http://www.facebook.com/note.php?note_id=32008268919
  http://www.silassewell.com/blog/2009/05/07/scribe-scalable-real-time-log-aggregation-for-centos-5-rhel-5/

  - Paul Beltrani

On Fri, May 15, 2009 at 8:25 AM, Mike Sprague <[email protected]> wrote:
> Hi folks,
>
> Long time listener, first time caller. :-)
>
> I work for a web hosting company with about a thousand linux servers.
> We're discussing options on how to process the logs mainly from our mail
> and web servers to make troubleshooting easier.  We're not really
> looking for long term storage; just a better way to be able to search
> the logs to diagnose either specific customer issues, broad system
> attacks, issues across a pool of servers or issues with a specific server.
>
> One obvious solution is syslog-ng and a central log server.  While I'm
> sure this will work, there will still be a lot of data to search through
> which could be time consuming.
>
> A colleague mentioned hadoop/MapReduce (http://hadoop.apache.org/).  On
> the surface, this seems like it might be a good fit, but I don't have
> any experience with it.
>
> I was hoping y'all could give some suggestions on what you use for this
> stuff and your opinion on how well it works.  I'm not looking for hard
> answers, just some suggestions on where we should research for a
> possible solution.  Any pointers are appreciated.
>
> I'd be happy to post a summary back here if y'all are interested.
>
> Thanks,
> mikeS
>
> --
> Michael F. Sprague
> [email protected]
>
> _______________________________________________
> bblisa mailing list
> [email protected]
> http://www.bblisa.org/mailman/listinfo/bblisa
>

