Re: [heka] State and future of Heka

2016-05-07 Thread Michael Trinkala
 Please don't mistake the small size of the documentation for incompleteness.
What is there is a full reference to the current functionality.
Its small size (when compared to the Heka documentation) is due to the
reduction of complexity in its design (Hindsight is just a skeleton around
the Heka sandbox, which is documented separately). As for the configuration
examples, I actually only use a single standard HS configuration
everywhere (home and Mozilla); I will add it to the docs. The core Hindsight
configuration needs very little tweaking, but the options are fully
documented and provide flexibility down to tying/grouping plugins to a
specific thread of execution.

The bulk of the configuration work comes with the specific individual
plugins being used, each of which has its own embedded documentation. I
have a bug to turn this into something more browsing-friendly
(https://bugzilla.mozilla.org/show_bug.cgi?id=1261067), but the documentation
already exists. If something is missing, unclear, or confusing, please file
an issue and we will get it taken care of. This quarter will consist of a
lot of last-mile work: packaging, migration of some Lua code out of Heka
and into lua_sandbox, the final review and release of the 1.0 APIs (for the
lua_sandbox and all the modules we provide), etc.

As for the mailing list, Hindsight conversations have been happening here
as they are low volume and most relate to the Heka sandbox (we will
re-evaluate this as needed).
Thanks,
Trink

On Sat, May 7, 2016 at 5:38 AM, Mathieu Parent wrote:

> Hi Rob,
>
> Thanks for this info about the future of Heka.
>
> You're seeking help to keep Heka alive, but what about Hindsight?
>
> It currently has no mailing list, the docs are minimal, and there are
> no configuration examples (except the benchmarks dir). Any plan to
> improve this? Heka's doc is one of its "selling" points.
>
> Regards
>
>
> 2016-05-06 19:51 GMT+02:00 Rob Miller:
> > Hi everyone,
> >
> > I'm long overdue in sending out an update about the current state of and
> > plans for Heka. Unfortunately, what I have to share here will probably be
> > disappointing for many of you, and it might impact whether or not you want
> > to continue using it, as all signs point to Heka getting less support and
> > fewer updates moving forward.
> >
> > The short version is that Heka has some design flaws that make it hard to
> > incrementally improve it enough to meet the high throughput and reliability
> > goals that we were hoping to achieve. While it would be possible to do a
> > major overhaul of the code to resolve most of these issues, I don't have the
> > personal bandwidth to do that work, since most of my time is consumed
> > working on Mozilla's immediate data processing needs rather than general
> > purpose tools these days. Hindsight (https://github.com/trink/hindsight),
> > built around the same Lua sandbox technology as Heka, doesn't have these
> > issues, and internally we're using it more and more instead of Heka, so
> > there's no organizational imperative for me (or anyone else) to spend the
> > time required to overhaul the Go code base.
> >
> > Heka is still in use here, though, especially on our edge nodes, so it will
> > see a bit more improvement and at least a couple more releases. Most
> > notably, it's on my list to switch to using the most recent Lua sandbox
> > code, which will move most of the protobuf processing to custom C code, and
> > will likely improve performance as well as remove a lot of the problematic
> > cgo code, which is what's currently keeping us from being able to upgrade to
> > a recent Go version.
> >
> > Beyond that, however, Heka's future is uncertain. The code that's there will
> > still work, of course, but I may not be doing any further improvements, and
> > my ability to keep up with support requests and PRs, already on the decline,
> > will likely continue to wane.
> >
> > So what are the options? If you're using a significant amount of Lua based
> > functionality, you might consider transitioning to Hindsight. Any Lua code
> > that works in Heka will work in Hindsight. Hindsight is a much leaner and
> > more solid foundation. Hindsight has far fewer i/o plugins than Heka,
> > though, so for many it won't be a simple transition.
> >
> > Also, if there's someone out there (an organization, most likely) that has a
> > strong interest in keeping Heka's codebase alive, through funding or coding
> > contributions, I'd be happy to support that endeavor. Some restrictions
> > apply, however; the work that needs to be done to improve Heka's foundation
> > is not beginner level work, and my time to help is very limited, so I'm only
> > willing to support folks who demonstrate that they are up to the task.
> > Please contact me off-list if you or your organization is interested.
> >
> > Anyone casually following along can probably stop reading here. Those 

Re: [heka] State and future of Heka

2016-05-07 Thread Mathieu Parent
Hi Rob,

Thanks for this info about the future of Heka.

You're seeking help to keep Heka alive, but what about Hindsight?

It currently has no mailing list, the docs are minimal, and there are
no configuration examples (except the benchmarks dir). Any plan to
improve this? Heka's doc is one of its "selling" points.

Regards


2016-05-06 19:51 GMT+02:00 Rob Miller:
> Hi everyone,
>
> I'm long overdue in sending out an update about the current state of and
> plans for Heka. Unfortunately, what I have to share here will probably be
> disappointing for many of you, and it might impact whether or not you want
> to continue using it, as all signs point to Heka getting less support and
> fewer updates moving forward.
>
> The short version is that Heka has some design flaws that make it hard to
> incrementally improve it enough to meet the high throughput and reliability
> goals that we were hoping to achieve. While it would be possible to do a
> major overhaul of the code to resolve most of these issues, I don't have the
> personal bandwidth to do that work, since most of my time is consumed
> working on Mozilla's immediate data processing needs rather than general
> purpose tools these days. Hindsight (https://github.com/trink/hindsight),
> built around the same Lua sandbox technology as Heka, doesn't have these
> issues, and internally we're using it more and more instead of Heka, so
> there's no organizational imperative for me (or anyone else) to spend the
> time required to overhaul the Go code base.
>
> Heka is still in use here, though, especially on our edge nodes, so it will
> see a bit more improvement and at least a couple more releases. Most
> notably, it's on my list to switch to using the most recent Lua sandbox
> code, which will move most of the protobuf processing to custom C code, and
> will likely improve performance as well as remove a lot of the problematic
> cgo code, which is what's currently keeping us from being able to upgrade to
> a recent Go version.
>
> Beyond that, however, Heka's future is uncertain. The code that's there will
> still work, of course, but I may not be doing any further improvements, and
> my ability to keep up with support requests and PRs, already on the decline,
> will likely continue to wane.
>
> So what are the options? If you're using a significant amount of Lua based
> functionality, you might consider transitioning to Hindsight. Any Lua code
> that works in Heka will work in Hindsight. Hindsight is a much leaner and
> more solid foundation. Hindsight has far fewer i/o plugins than Heka,
> though, so for many it won't be a simple transition.
>
> Also, if there's someone out there (an organization, most likely) that has a
> strong interest in keeping Heka's codebase alive, through funding or coding
> contributions, I'd be happy to support that endeavor. Some restrictions
> apply, however; the work that needs to be done to improve Heka's foundation
> is not beginner level work, and my time to help is very limited, so I'm only
> willing to support folks who demonstrate that they are up to the task.
> Please contact me off-list if you or your organization is interested.
>
> Anyone casually following along can probably stop reading here. Those of you
> interested in the gory details can read on to hear more about what the
> issues are and how they might be resolved.
>
> First, I'll say that I think there's a lot that Heka got right. The basic
> composition of the pipeline (input -> split -> decode -> route -> process ->
> encode -> output) seems to hit a sweet spot for composability and reuse. The
> Lua sandbox, and especially the use of LPEG for text parsing and
> transformation, has proven to be extremely efficient and powerful; it's the
> most important and valuable part of the Heka stack. The routing
> infrastructure is efficient and solid. And, perhaps most importantly, Heka
> is useful; there are a lot of you out there using it to get work done.
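To make the pipeline shape described above concrete, here is a minimal Go sketch of the stage order (split -> decode -> route -> process -> encode). It is only an illustration under stated assumptions: the Message type, the "type:payload" record format, and every function name are hypothetical and are not Heka's actual plugin API.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical, simplified stand-in for a pipeline message.
type Message struct {
	Type    string
	Payload string
}

// split breaks a raw input stream into records (here: one record per line).
func split(raw string) []string {
	return strings.Split(strings.TrimSpace(raw), "\n")
}

// decode parses one record of the assumed form "type:payload".
func decode(record string) Message {
	parts := strings.SplitN(record, ":", 2)
	if len(parts) < 2 {
		return Message{Type: "unknown", Payload: record}
	}
	return Message{Type: parts[0], Payload: parts[1]}
}

// route reports whether a message should be handed to the processing stage.
func route(m Message) bool {
	return m.Type == "error"
}

// process transforms a routed message (here: upper-cases the payload).
func process(m Message) Message {
	m.Payload = strings.ToUpper(m.Payload)
	return m
}

// encode serializes a message for the output stage.
func encode(m Message) string {
	return fmt.Sprintf("[%s] %s", m.Type, m.Payload)
}

// runPipeline composes the stages in order and returns the encoded output.
func runPipeline(raw string) []string {
	var out []string
	for _, rec := range split(raw) {
		m := decode(rec)
		if !route(m) {
			continue // message did not match this route
		}
		out = append(out, encode(process(m)))
	}
	return out
}

func main() {
	for _, line := range runPipeline("info:started\nerror:disk full\n") {
		fmt.Println(line) // prints "[error] DISK FULL"
	}
}
```

Because each stage is a small function with a narrow contract, any one of them can be swapped out without touching the others, which is the composability and reuse the paragraph refers to.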
>
> There was one fundamental mistake made, however, which is that we shouldn't
> have used channels. There are many competing opinions about Go channels. I'm
> not going to get into whether or not they're *ever* a good idea, but I will
> say unequivocally that their use as the means of pushing messages through
> the Heka pipeline was a mistake, for a number of reasons.
>
> First, they don't perform well enough. While Heka performs many tasks faster
> than some other popular tools, we've consistently hit a throughput ceiling
> thanks to all of the synchronization that channels require. And this
> ceiling, sadly, is generally lower than is acceptable for the amount of data
> that we at Mozilla want to push through a single aggregator system.
>
> Second, they make it very hard to prevent message loss. If unbuffered
> channels are used everywhere, performance plummets unacceptably due to
> context-switching costs. But using buffered channels means that many
> messages are in flight at a