Nathan et al.,

Thanks for the feedback!

The only major thing the intern has remaining is that multi-line messages
(like Java stack traces) aren't obvious how to set up.
Logstash has lots of googleable examples, but heka doesn't (yet).

Once he (or I) figures it out, I'd be happy to write up an example/gist.
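
In the meantime, here's a rough sketch of the shape I expect it to take
(untested; the timestamp pattern, paths, and section names are just
examples): a RegexSplitter anchored on the start of each log record, so
continuation lines like stack traces stay attached to the preceding message.

```toml
# Records start with a timestamp; everything up to the next timestamp
# (including stack trace lines) becomes a single message.
[java_splitter]
type = "RegexSplitter"
delimiter = '\n(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})'
# prepend the captured timestamp to the *next* record
# rather than appending it to the previous one
delimiter_eol = false

[java_app_logs]
type = "LogstreamerInput"
log_directory = "/var/log/myapp"
file_match = 'app\.log'
splitter = "java_splitter"
```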

Thanks!

On Tue, Jun 2, 2015 at 07:18 Victor Castell <[email protected]> wrote:

> Wow that was a detailed narration, very useful thanks.
>
> My short story is that I picked Heka instead of Logstash at the beginning
> without even trying Logstash; I tried fluentd instead.
>
> Fluentd's performance was really disappointing.
>
> We're now using Heka for all our data shipment needs in production, with a
> reasonable amount of data being pushed to ES.
>
> We faced some challenges using Heka, but as Nathan said, the community and
> Heka devs are invaluably helpful.
>
> Never regretted that decision.
>
>
> On Sat, May 30, 2015 at 7:13 PM, Nathan Williams <[email protected]
> > wrote:
>
>> Hi Matt,
>>
>> We happen to have recently (a couple of months back) made the transition
>> from a typical ELK stack to what we now call our HEK stack (for reasons
>> both obvious and enduringly hilarious), largely due to the exact issues
>> you're concerned about. This is strictly anecdotal, as there seem to be
>> lots of other folks out there using logstash successfully; I've got nothing
>> against it, it just didn't end up working for us in the way we'd hoped. It
>> should also be said that Heka has a broader scope than Logstash, in that
>> it's made to address general stream processing needs, whereas logstash is
>> more focused on log processing specifically, so any effort to do a direct
>> comparison should bear that in mind.
>>
>> Our logstash pipeline was a pretty common one: we used beaver as an
>> agent, which slurped logs, wrapped them up as a logstash json_event, and
>> pushed them onto a redis queue as a buffer. Logstash was configured with
>> this redis queue as an input (along with a syslog input for dumber
>> devices). Logstash would then pass the event through some "filters" before
>> dumping to the Elasticsearch output.
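>>
>> For the curious, the config was roughly the stock setup (the key name,
>> port, and grok pattern here are placeholders, not our exact values):
>>
>> ```
>> input {
>>   redis {
>>     host      => "127.0.0.1"
>>     data_type => "list"
>>     key       => "logstash"
>>   }
>>   syslog { port => 5514 }
>> }
>> filter {
>>   # parse access-log-style payloads; pattern is just an example
>>   grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
>> }
>> output {
>>   elasticsearch { host => "localhost" }
>> }
>> ```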
>>
>> We ran into various problems with different pieces of this pipeline at
>> different times.
>>
>> We had minor problems getting beaver installed at one point due to
>> dependency conflicts (we used pip to install), but were able to work around
>> them by reverting to an older version of beaver.
>>
>> Whenever logstash was down for maintenance (and sometimes when it
>> wasn't), the redis queue would back up pretty quickly, which led to some
>> pretty awesome stuff like our production redis cluster going read-only when
>> the memory got big enough to no longer successfully bgsave (that was a fun
>> post-mortem). We mitigated by setting up a separate, logstash-specific
>> redis instance that we tuned to not care about being able to bgsave.
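>>
>> (For anyone wanting to do the same, the tuning amounted to disabling
>> persistence in that instance's redis.conf; the memory cap is an example
>> figure, size it for your own queue:)
>>
>> ```
>> # logstash-only redis instance: never snapshot, so bgsave can't fail
>> save ""
>> # cap the queue's footprint; refuse writes rather than evict
>> maxmemory 2gb
>> maxmemory-policy noeviction
>> ```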
>>
>> Whenever elasticsearch was down, logstash would happily drop messages on
>> encountering an error on the output; easy enough to deal with by stopping
>> logstash before ES maintenance (just let the redis queue pile up), but not
>> so great for unplanned outages (it looks like 1.5.0 just added retries, so
>> this may not be an issue any more).
>>
>> Most frustratingly for us, logstash would occasionally, and without any
>> error logs or reason we could ascertain, drastically slow down or (more
>> often) completely hang. Most of the time, at least at first, a simple
>> restart would get things going again. I ended up bandaiding the problem by
>> putting together a monit program check that would exit 1 if the logstash
>> redis queue was too high for too long (a couple hundred thousand messages
>> was a sure sign we had a problem) and exit 0 otherwise. It was around this
>> time that I started researching alternative solutions, and started reading
>> about heka. Unfortunately, even with the monit check, sometimes logstash
>> would stay hung after the restart. We tried various things to troubleshoot,
>> from clearing the logstash on-disk buffer while it was stopped to dumping
>> and flushing the logstash queue to give it a headstart and trash any
>> potentially poison messages. I had hopes that logstash 1.5 would address
>> these problems, but never got a chance to try it out, as it didn't get
>> released before we moved on. At one point we straced the process and found
>> it idle, polling the network socket (though any other process could reach
>> redis without issue), but didn't pursue the issue any further, as by this
>> point we'd already decided that heka was a better fit with our preferences
>> (distributed) and goals (general purpose, able to support more than just
>> log processing), and was something we wanted to adopt.
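>>
>> (The monit check mentioned above was along these lines; the queue key,
>> threshold, and paths are from memory and may not match exactly:)
>>
>> ```
>> # /etc/monit.d/logstash-queue: restart logstash when the redis queue
>> # stays too deep for several cycles in a row
>> check program logstash_queue with path "/usr/local/bin/check_ls_queue"
>>   if status != 0 for 5 cycles then exec "/usr/sbin/service logstash restart"
>> ```
>>
>> where check_ls_queue is just:
>>
>> ```
>> #!/bin/sh
>> # exit non-zero when the queue is deeper than ~200k messages
>> LEN=$(redis-cli llen logstash)
>> [ "${LEN:-0}" -lt 200000 ]
>> ```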
>>
>> One of the things that initially attracted us to heka was the ability to
>> run it as a distributed agent, with each instance able to handle the full
>> set of processing steps, or ship to a central system for further
>> aggregation/processing or storage. It's done very well in this role since
>> we switched, and we've yet to run into any resource utilization problems
>> doing so, which was our one concern; heka uses 50-60M of RAM on our
>> load-balancers and our web front-ends, which are the most active, and ~25M
>> elsewhere. This is actually comparable to what we saw with beaver, which
>> did a lot less (guessing this is down to Python vs Go). Heka's splitters
>> are also much nicer to work with than the less-flexible regex-based
>> multiline support in beaver.
>>
>> TOML is also a really nice way to manage configuration; there are some nice
>> libraries for ruby (we use Chef), and treating /etc/heka as an include.d
>> directory works nicely for keeping the relevant heka configuration close to
>> the service that generates the logs.
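>>
>> As an example of what lands in that directory (hekad is pointed at it with
>> -config=/etc/heka; the names and log_format below are illustrative), the
>> nginx cookbook drops off something like:
>>
>> ```toml
>> # /etc/heka/nginx.toml
>> [nginx_access_logs]
>> type = "LogstreamerInput"
>> log_directory = "/var/log/nginx"
>> file_match = 'access\.log'
>> decoder = "nginx_access_decoder"
>>
>> # uses the nginx access-log decoder that ships with heka
>> [nginx_access_decoder]
>> type = "SandboxDecoder"
>> filename = "lua_decoders/nginx_access.lua"
>>
>> [nginx_access_decoder.config]
>> log_format = '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent'
>> ```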
>>
>> Speaking of which, my *favorite* thing so far with heka has been being
>> able to run both a system-level heka agent for system logs, nginx logs,
>> etc, and also shipping a heka config directly in the repo with our various
>> apps, so our dev teams have better visibility into the log collection and
>> can change the log format and the log decoding in a single PR. We then just
>> run a user-level heka instance under our app process supervisor that points
>> to this config. It's been working nicely so far.
>>
>> As to downsides:
>>
>> - the learning curve is a bit steeper with heka, but not insurmountable,
>> and is largely owed to its greater flexibility. Happily, both tools have
>> amazing documentation.
>> - the logstash ecosystem is bigger, so there's more pre-built plugins
>> available that you can just drop in.
>> - Go and Lua are great languages, well suited to the domain, but not as
>> popular as Ruby, so there may be a bit of a learning curve there depending
>> on your background. Unless you need to write custom plugins, this may not
>> matter.
>> - having to recompile the binary to integrate the go plugins available
>> for heka is a bit of a bummer.
>>
>> Related to the last point, sandbox plugins (lua) don't require a
>> recompile and are actually pretty easy to throw together as needed (e.g.
>> https://gist.github.com/nathwill/d3f62d46d173b2456531). I suspect
>> that'll be the most popular method of adding plugins, with the Go stuff
>> reserved for core plugins and cases where the best possible performance is
>> needed. We've avoided the need to build a custom binary so far, and at this
>> point I don't anticipate that we'll need to.
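>>
>> To give a sense of scale, a complete (if toy) sandbox filter that counts
>> matched messages and reports the total on a timer is just (all names and
>> the matcher here are made up):
>>
>> ```lua
>> -- count.lua: runs inside hekad's Lua sandbox
>> local count = 0
>>
>> function process_message()
>>     count = count + 1
>>     return 0
>> end
>>
>> function timer_event(ns)
>>     inject_payload("txt", "count", tostring(count))
>> end
>> ```
>>
>> registered with a few lines of TOML:
>>
>> ```toml
>> [message_counter]
>> type = "SandboxFilter"
>> filename = "lua_filters/count.lua"
>> message_matcher = "Type == 'logfile'"
>> ticker_interval = 60
>> ```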
>>
>> We've also found the heka community to be incredibly accessible and
>> helpful (thanks, y'all!) when we got stuck on something, so if you do
>> decide to try it out, chances are good that support will be easy to come by.
>>
>> Anyways, this turned into more words than I'd intended, but that's the
>> rough outline of our experience with both. Hope it helps!
>>
>> Cheers,
>>
>> Nathan W
>>
>> On Fri, May 29, 2015 at 6:32 PM, Matthew Singletary <
>> [email protected]> wrote:
>>
>>> At work we currently have an intern setting up a prototype ELK stack
>>> (Elastic Search, Logstash, Kibana). While this seems to be fairly easy to
>>> set up and will have lots of nifty looking graphs, I worry about how
>>> logstash will be aggregating the inputs from potentially many machines.
>>>
>>> I figure that elasticsearch and kibana would still be usable with
>>> possibly logstash being replaced by heka, does this sound right?
>>>
>>> Any thoughts, comparisons or war stories?
>>>
>>> Thanks,
>>> Matt
>>>
>>> _______________________________________________
>>> Heka mailing list
>>> [email protected]
>>> https://mail.mozilla.org/listinfo/heka
>>>
>>>
>>
>>
>
>
> --
> V
>