Re: [heka] Hindsight in pull mode

2016-09-21 Thread Michael Trinkala
Messages are sent when the external source requests them (there is no cache
in this plugin, so it is just a live stream).  From the external source's
perspective this is a pull.  It could just as easily have been an HTTP
request for the current circular buffer time series data, which is what I
believe you want for Prometheus.  If you wanted to go the other way, the
output plugin would be an HTTP client that POSTs the data to an HTTP server.
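
To make the pull idea concrete, here is a rough sketch in plain Lua of keeping
the latest samples in a small circular buffer and rendering them only when
something asks.  The names new_ring, ring_add and render_prometheus are
hypothetical and are not part of Hindsight or lua_sandbox; the HTTP side is
left out entirely.  The output line uses the Prometheus text format
(metric name, value, timestamp in milliseconds).

```lua
-- Hypothetical helpers for illustration only; not a Hindsight API.

-- Fixed-size ring buffer holding the most recent samples.
local function new_ring(size)
    return {size = size, count = 0, head = 0, rows = {}}
end

-- Overwrite the oldest slot once the buffer is full.
local function ring_add(ring, timestamp_ms, value)
    ring.head = ring.head % ring.size + 1
    ring.rows[ring.head] = {ts = timestamp_ms, value = value}
    if ring.count < ring.size then ring.count = ring.count + 1 end
end

-- Called only when the external source asks for data (the "pull").
-- Returns the newest sample as one line: metric value timestamp_ms
local function render_prometheus(metric, ring)
    if ring.count == 0 then return "" end
    local row = ring.rows[ring.head]
    return string.format("%s %g %.0f\n", metric, row.value, row.ts)
end

local ring = new_ring(2)
ring_add(ring, 1000, 1)
ring_add(ring, 2000, 2)
ring_add(ring, 3000, 3)   -- overwrites the oldest sample
local body = render_prometheus("hs_messages", ring)
```

Nothing is sent until render_prometheus is called, which is the difference
from the push case where every update would be POSTed as it happens.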

Trink

On Wed, Sep 21, 2016 at 7:06 AM, Mathieu Parent <math.par...@gmail.com>
wrote:

> 2016-09-21 15:57 GMT+02:00 Michael Trinkala <mtrink...@mozilla.com>:
> > We should probably be using the Hindsight list
> > https://mail.mozilla.org/listinfo/hindsight for this
>
> Didn't know about this list. Subscribing...
>
> > but I will provide a
> > short answer here: Hindsight is agnostic when it comes to push or pull in
> > an I/O plugin.  It is a function of the plugin and there are currently
> > examples of both.  However, as Eric commented there is no HTTP server
> > implementation available in a sandbox yet (although there are a few
> > options).  See
> > https://github.com/mozilla-services/lua_sandbox_extensions/blob/master/socket/sandboxes/heka/output/heka_tcp_matcher.lua
> > for an example of a TCP subscription based output plugin.
>
> OK. Still this is a push model, and the output plugin is listening to
> the messages (i.e the messages are not requested by the output plugin,
> but by ticker_interval or input process_message()).
>
> There is no hurry. Leaving this question open for now.
>
> --
> Mathieu
>
___
Heka mailing list
Heka@mozilla.org
https://mail.mozilla.org/listinfo/heka


Re: [heka] Lua coroutines: Does it scales?

2016-09-21 Thread Michael Trinkala
It depends on the application, but there can be benefits; see
http://www.lua.org/pil/9.4.html (it includes some performance comparisons).
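
As a rough illustration of the cooperative style that page describes, here is
a minimal plain-Lua sketch in which each "client" is a coroutine that yields
wherever it would otherwise block, and a dispatcher resumes them in turn.
make_client and dispatch are made-up names for this example and are not taken
from the heka_tcp input.

```lua
-- Each "client" is a coroutine that yields whenever it would block.
local function make_client(name, lines, out)
    return coroutine.create(function()
        for _, line in ipairs(lines) do
            out[#out + 1] = name .. ": " .. line
            coroutine.yield()  -- hand control back to the dispatcher
        end
    end)
end

-- Round-robin dispatcher: resume every live coroutine once per pass
-- and drop the ones that have finished.
local function dispatch(clients)
    while #clients > 0 do
        local alive = {}
        for _, co in ipairs(clients) do
            assert(coroutine.resume(co))
            if coroutine.status(co) ~= "dead" then
                alive[#alive + 1] = co
            end
        end
        clients = alive
    end
end

local out = {}
dispatch({
    make_client("a", {"1", "2"}, out),
    make_client("b", {"1"}, out),
})
```

Only one coroutine runs at a time, but no client holds up the others for
longer than the work between two yields, so how well it scales depends mostly
on how much work each client does per step.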

Trink


On Wed, Sep 21, 2016 at 1:20 AM, Mathieu Parent 
wrote:

> Hi,
>
> I'm currently trying to write a "syslog tcp" input. I've read heka_tcp
> input as inspiration which uses coroutines.
>
> I'm wondering, given that "Only one coroutine ever runs at a time"
> [1], does it scale well?
>
> How are you handling this at Mozilla?
>
> Cheers,
>
> [1] http://lua-users.org/wiki/CoroutinesTutorial
>
> --
> Mathieu Parent


Re: [heka] Hindsight in pull mode

2016-09-21 Thread Michael Trinkala
We should probably be using the Hindsight list
https://mail.mozilla.org/listinfo/hindsight for this, but I will provide a
short answer here: Hindsight is agnostic when it comes to push or pull in an
I/O plugin.  It is a function of the plugin and there are currently examples
of both.  However, as Eric commented there is no HTTP server implementation
available in a sandbox yet (although there are a few options).  See
https://github.com/mozilla-services/lua_sandbox_extensions/blob/master/socket/sandboxes/heka/output/heka_tcp_matcher.lua
for an example of a TCP subscription based output plugin.

Trink

On Wed, Sep 21, 2016 at 2:28 AM, Eric LEMOINE  wrote:

> On Wed, Sep 21, 2016 at 10:48 AM, Mathieu Parent 
> wrote:
> > Hello (again),
> >
> > Hindsight is using a push-model (i.e Nagios passive checks). This is
> > great, but I want to plug it with Prometheus which uses pull-model
> > [1].
> >
> > I see several ways to handle this:
> > - use the prometheus push-gateway [2]. This has several drawbacks listed
> >   below
> > - introduce pull model in hindsight
> > - add a new daemon, based on lua_sandbox too, but using pull model
> >
> > The drawbacks of prometheus push-gateway are:
> > - Unnecessary polling of data (data is grabbed even if not pulled by
> >   prometheus)
> > - time lag, between data grabbing and data pulling
> > - To sum up: to reduce time lag, you increase polling rate; when you
> >   decrease polling, you increase time lag.
> >
> > The push model may work like this:
>
>
> You mean "pull" here I guess.
>
>
> > - Adding pull_message_matcher config to inputs (defaults to FALSE)
> > - Adding process_pull_message() function to inputs, returning a table
> > of messages (or should it be inject_pull_message() + return 0?)
> > - Adding request_pull_message() function to outputs, which maps to
> > matching process_pull_message() and concatenates the results in a
> > table. This function is blocking.
> >
> > Opinions?
>
> If there was an "http listen" extension in lua_sandbox_extensions then
> I think it would be easy to write an output plugin compatible with
> Prometheus, without specific support for pull in Hindsight. The
> problem is that there's no "http listen" extension in
> lua_sandbox_extensions right now, see [*] for a bit more information
> on this.
>
> [*] 


Re: [heka] State and future of Heka

2016-05-07 Thread Michael Trinkala
Please don't mistake the small size of the documentation for it being
incomplete. What is there is a full reference to the current functionality.
Its small size (when compared to the Heka documentation) is due to the
reduction of complexity in its design: Hindsight is just a skeleton around
the Heka sandbox, which is documented separately. As for the configuration
examples, I actually only use a single standard HS configuration everywhere
(home and Mozilla); I will add it to the docs. The core Hindsight
configuration needs very little tweaking, but the options are fully
documented and provide flexibility down to tying/grouping plugins to a
specific thread of execution.

The bulk of the configuration work comes with the specific individual
plugins being used, each of which has its own embedded documentation. I
have a bug open to turn this into something more browsing friendly
(https://bugzilla.mozilla.org/show_bug.cgi?id=1261067), but the documentation
already exists. If something is missing, unclear, or confusing please file
an issue and we will get it taken care of. This quarter will consist of a
lot of last mile work: packaging, migration of some Lua code out of Heka
and into lua_sandbox, the final review and release of the 1.0 APIs (for the
lua_sandbox and all the modules we provide), etc.

As for the mailing list, Hindsight conversations have been happening here
as they are low volume and most relate to the Heka sandbox (we will
re-evaluate this as needed).
Thanks,
Trink

On Sat, May 7, 2016 at 5:38 AM, Mathieu Parent 
wrote:

> Hi Rob,
>
> Thanks for that info about the future of Heka.
>
> You're seeking help to keep heka alive, but what about Hindsight?
>
> It has currently no mailing list, the docs are minimal, and there are
> no configuration examples (except the benchmarks dir). Any plan to
> improve this? Heka's doc is one of its "selling" points.
>
> Regards
>
>
> 2016-05-06 19:51 GMT+02:00 Rob Miller :
> > Hi everyone,
> >
> > I'm long overdue in sending out an update about the current state of and
> > plans for Heka. Unfortunately, what I have to share here will probably be
> > disappointing for many of you, and it might impact whether or not you want
> > to continue using it, as all signs point to Heka getting less support and
> > fewer updates moving forward.
> >
> > The short version is that Heka has some design flaws that make it hard to
> > incrementally improve it enough to meet the high throughput and reliability
> > goals that we were hoping to achieve. While it would be possible to do a
> > major overhaul of the code to resolve most of these issues, I don't have the
> > personal bandwidth to do that work, since most of my time is consumed
> > working on Mozilla's immediate data processing needs rather than general
> > purpose tools these days. Hindsight (https://github.com/trink/hindsight),
> > built around the same Lua sandbox technology as Heka, doesn't have these
> > issues, and internally we're using it more and more instead of Heka, so
> > there's no organizational imperative for me (or anyone else) to spend the
> > time required to overhaul the Go code base.
> >
> > Heka is still in use here, though, especially on our edge nodes, so it will
> > see a bit more improvement and at least a couple more releases. Most
> > notably, it's on my list to switch to using the most recent Lua sandbox
> > code, which will move most of the protobuf processing to custom C code, and
> > will likely improve performance as well as remove a lot of the problematic
> > cgo code, which is what's currently keeping us from being able to upgrade
> > to a recent Go version.
> >
> > Beyond that, however, Heka's future is uncertain. The code that's there will
> > still work, of course, but I may not be doing any further improvements, and
> > my ability to keep up with support requests and PRs, already on the decline,
> > will likely continue to wane.
> >
> > So what are the options? If you're using a significant amount of Lua based
> > functionality, you might consider transitioning to Hindsight. Any Lua code
> > that works in Heka will work in Hindsight. Hindsight is a much leaner and
> > more solid foundation. Hindsight has far fewer i/o plugins than Heka,
> > though, so for many it won't be a simple transition.
> >
> > Also, if there's someone out there (an organization, most likely) that has a
> > strong interest in keeping Heka's codebase alive, through funding or coding
> > contributions, I'd be happy to support that endeavor. Some restrictions
> > apply, however; the work that needs to be done to improve Heka's foundation
> > is not beginner level work, and my time to help is very limited, so I'm only
> > willing to support folks who demonstrate that they are up to the task.
> > Please contact me off-list if you or your organization is interested.
> >
> > Anyone casually following along can probably stop reading here. Those 

Re: [heka] Writing a SandboxInput

2015-11-19 Thread Michael Trinkala
Coroutines can be enabled (3 lines above the link Rob sent, same type of
modification as the io tweak)

Trink

On Thu, Nov 19, 2015 at 1:50 AM, Kipras Mancevičius 
wrote:

> Hi, thanks for all the info. I'll look into the options again. Most
> probably going to recompile Heka and enable the "io" module.
>
> As for coroutines - i did try to use them as they seemed to be the way to
> go for implementing this, however they didn't work - it seems that
> "yield()" is not supported in the Heka Lua sandbox (it complains about an
> undefined global).
>
> On Thu, Nov 19, 2015 at 12:16 AM, Rob Miller  wrote:
>
>> Hrm. There's a *lot* of code and complexity in the LogstreamerInput. And
>> not because it's overengineered, IMHO, but because it's providing a lot of
>> functionality and solving a complicated set of problems. While I'd be
>> thrilled to have a SandboxInput replicating this functionality (mainly so
>> it could be used with Hindsight), I suspect that this is a much larger and
>> more difficult task than you might imagine. If you do end up going down
>> that road, I suggest you start with a tight, well-defined subset of
>> LogstreamerInput's features, and then work your way up from there.
>>
>> I can't speak to the issues that you're having with trying to use
>> `os.execute` or `io.popen` to run an external sleep command. It's very
>> possible that you're bumping up against Heka bugs with the behaviour you're
>> seeing. But that approach smells a bit off to me, I'm not surprised you're
>> having issues. You might consider one of the other suggested mechanisms
>> (see http://lua-users.org/wiki/SleepFunction), especially using
>> `socket.select`, or writing a trivial C extension that exposes a sleep
>> function. You also might try using Hindsight instead of Heka to test your
>> code, since that's a much lighter code base and is likely to have fewer (or
>> at least different) bugs.
>>
>> You might already have this on your radar, but if I were working on this
>> I'd almost certainly be using coroutines (
>> http://lua-users.org/wiki/CoroutinesTutorial).
>>
>> Finally, you say that the reason you're not using LogstreamerInput is
>> because you need to use Lua's `io` module in your decoding code. If you're
>> okay with building your own custom Heka binary, and you don't plan on using
>> a SandboxManagerFilter to support dynamic sandbox injection (i.e. you trust
>> all of the Lua code you'll be deploying) then you might consider rebuilding
>> Heka with support for the io module allowed in all sandboxes. The code that
>> causes the `io` module to not be available is here:
>>
>>
>> https://github.com/mozilla-services/heka/blob/v0.10.0b1/sandbox/lua/lua_sandbox.go.in#L63
>>
>> Hope this helps,
>>
>> -r
>>
>>
>>
>> On 11/18/2015 05:40 AM, Kipras Mancevičius wrote:
>>
>>> Hey guys,
>>>
>>> TL;DR any way to properly emulate LogStreamerInput behavior in Lua with
>>> a SandboxInput ?
>>>
>>> i'm having trouble trying to write a SandboxInput and since the docs
>>> aren't the most helpful regarding that (no examples) - i thought i will
>>> ask here.
>>>
>>> Any directions/pointers for making a SandboxInput that would
>>> continuously listen for new entries in a file? The reason why i'm not
>>> using a LogStreamerInput input with a SandboxDecoder is that i need to
>>> use the Lua 'io' module, and it's not available for SandboxDecoders only
>>> for SandboxInputs.
>>>
>>> The problems i'm having so far are:
>>> * if the input does not have a "ticker_interval" set and
>>> process_message() finishes - i get a "single run completed" entry in the
>>> log and the input never restarts
>>>
>>> * i tried using os.execute('sleep 1') / io.popen('sleep 1') after the
>>> input is exhausted to wait for a bit and then recheck if there is any
>>> new input, but:
>>> os.execute('sleep 1') works for sleeping and listening for new input,
>>> but stopping heka (Ctrl+C) does not work
>>> io.popen('sleep 1') also works for sleeping and listening for new input,
>>> but when trying to stop heka (Ctrl+C) on first Ctrl+C you get log
>>> entries "Shutdown initiated.", "Stop message sent to input
>>> ''" but it does not shutdown (is there some way to handle
>>> the "stop message" in the SandboxInput??) and on some subsequent Ctrl+C
>>> it enters the infinite restart loop (read below)
>>>
>>> * if the input has a "ticker_interval" - it either also never restarts
>>> or restarts when you try to exit heka and enters a weird infinite loop
>>> state, where it looks to be infinitely restarting, heka is at 100% cpu and
>>> you have to "kill -9" it to stop it. I think the infinite restart is
>>> caused by the input crashing for some reason and then my global input
>>> file position tracking is not updated and the input is restarted and so
>>> it is processing the same input again and again. This behavior depends
>>> on the set "ticker_interval", but i don't understand how those values
>>> work, e.g.:
>>> ticker_interval = 

Re: [heka] Lua Decoder Parse Error

2015-10-22 Thread Michael Trinkala
The code is failing on the inject_message line.  The message Fields table
must follow the schema described here:
https://hekad.readthedocs.org/en/latest/sandbox/index.html#lua-message-hash-based-field-structure
Also, you could capture the error returned by pcall and return it along with
the status code to avoid some of the confusion.
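
A minimal sketch of that second suggestion follows.  So that it runs outside
Heka, inject_message is replaced here by a hypothetical stand-in that rejects
one kind of bad Fields entry (a nested table with no value member); the real
sandbox validation is more involved and this stub is only an assumption for
the example.  safe_inject is also a made-up helper name.

```lua
-- Hypothetical stand-in for the sandbox's inject_message, for illustration
-- only. It rejects a Fields entry that is a table without a `value` member.
local function inject_message(msg)
    for name, field in pairs(msg.Fields or {}) do
        if type(field) == "table" and field.value == nil then
            error("field '" .. name .. "' is not a valid field table")
        end
    end
end

-- Keep the error text from pcall instead of discarding it, and return it
-- together with the status code.
local function safe_inject(msg)
    local ok, err = pcall(inject_message, msg)
    if not ok then return -1, err end
    return 0
end

-- A plain string field is accepted.
local good_code = safe_inject({Fields = {hostname = "node-85"}})

-- A nested table (for example, a capture that produced a table) is rejected,
-- and the reason comes back with the -1.
local bad_code, bad_err = safe_inject({Fields = {logtime = {hour = 2}}})
```

In the decoder this would amount to replacing the bare
"if not pcall(inject_message, msg) then return -1 end" with a version that
returns -1 together with the captured error string, so the log shows why the
injection failed rather than a generic parse failure.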

Trink

On Thu, Oct 22, 2015 at 11:59 AM, Vogt, Justin <
justin.v...@libertymutual.com> wrote:

> Here’s the whole lua file:
>
> local l = require "lpeg"
> local dt = require "date_time"
> local sp = l.space
>
> l.locale(l)
>
> local pri = l.P"<" * l.Cg(l.R"09"^0, "pri") * l.P">"
> local logtime = l.Cg(dt.build_strftime_grammar("%b %d %X"), "logtime") * sp
> local hostname = l.Cg((1 - sp)^1, "hostname") * sp
> local logname = l.Cg((1 - sp)^1, "logname") * sp
> local message = l.Cg(l.P(1)^0, "message")
> msg = pri * logtime * hostname * logname * message
>
> stack = l.Ct(msg)
>
> local msg_type = read_config("type")
>
> local msg = {
>     Timestamp   = nil,
>     Type        = msg_type,
>     Hostname    = nil,
>     Payload     = nil,
>     Fields      = nil
> }
>
> function process_message ()
>     local log = read_message("Payload")
>     local flds = stack:match(log)
>
>     if not flds then return -1 end
>
>     if flds.hostname then
>         msg.Hostname = flds.hostname
>         flds.hostname = nil
>     end
>
>     msg.Payload = log
>     msg.Fields = flds
>
>     if not pcall(inject_message, msg) then return -1 end
>     return 0
> end
>
> I tried a few more things, but still seems to be failing. I’ll take a look
> at the rsyslog decoder, but I’m trying to learn how to do, rather than
> borrow :-)
>
> - Justin
>
> From: Heka on behalf of Michael Trinkala
> Date: Thursday, October 22, 2015 at 11:37 AM
> To: "justin.vog...@gmail.com<mailto:justin.vog...@gmail.com>"
> Cc: heka
> Subject: Re: [heka] Lua Decoder Parse Error
>
> It is probably not the grammar failing, I am betting you return -1
> somewhere else in the decoder.  Can you share the code?
>
> Trink
>
> On Wed, Oct 21, 2015 at 11:23 AM, Justin Vogt <justin.vog...@gmail.com
> <mailto:justin.vog...@gmail.com>> wrote:
>
> Hello Heka Community,
>
> I’m getting a weird error… I’ve written a custom decoder in Lua, and it
> works fine when I test it on the LPEG grammar tester, but when I try to run
> it in Heka, I continually get a “Decoder error, failed parsing” message.
> I’ve tried just about everything I can think of and have been pulling my
> hair out to figure out the issue. Any help would be greatly appreciated!!
>
>
> Here's the grammar:
>
> local l = require "lpeg"
> local dt = require "date_time"
> local sp = l.space
>
> l.locale(l)
>
> local pri = l.P"<" * lpeg.Cg(lpeg.R"09"^0, "pri") * lpeg.P">"
> local logtime = l.Cg(dt.build_strftime_grammar("%b %d %X"), "logtime") * sp
> local hostname = l.Cg((1 - sp)^1, "hostname") * sp
> local logname = l.Cg((1 - sp)^1, "logname") * sp
> local message = l.Cg(l.P(1)^0, "message")
> msg = pri * logtime * hostname * logname * message
>
> grammar = l.Ct(msg)
>
>
> And a sample log line:
>
> <14>Oct 16 02:26:17 node-85 keystone-all 192.168.0.2 - - [16/Oct/2015
> 02:26:17] "GET /v3/auth/tokens HTTP/1.1" 200 10162 0.125928
>
>
> I can literally copy and paste the grammar out of the decoder and the log
> line from the failed parse message into the LPEG tester and it works.
>
>
> Thanks in advance!
>


Re: [heka] a question on heka message format

2015-10-07 Thread Michael Trinkala
Always, it is by design as described in the docs for decode_message:
https://hekad.readthedocs.org/en/latest/sandbox/index.html


   - message (table) The array based version of the message structure with
   the value member always being an array (even if there is only a single
   item). This format makes working with the output more consistent. The wide
   variation in the inject table format is to ease the construction of the
   message especially when using an LPeg grammar transformation.
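
As a small illustration, the sketch below uses a literal table shaped like the
array-based structure described above in place of a real decode_message
result, so it runs anywhere.  field_values is a made-up helper name.  Because
value is always an array, a reader can index it without checking its type
first.

```lua
-- Literal stand-in for the table decode_message would return; in a sandbox
-- this would come from decode_message(read_message("raw")).
local msg = {
    Fields = {
        {name = "foobar", value = {123}},
        {name = "tags",   value = {"a", "b"}},
    }
}

-- Return the value array of the named field; always a table, possibly empty.
local function field_values(m, name)
    for _, f in ipairs(m.Fields or {}) do
        if f.name == name then return f.value end
    end
    return {}
end

-- A single-valued field is read with [1]; no type check is needed.
local first = field_values(msg, "foobar")[1]
local tag_count = #field_values(msg, "tags")
```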

Trink

On Wed, Oct 7, 2015 at 7:04 AM, Timur Batyrshin  wrote:

> Hi,
>
> I’ve got a question about decoding of Heka messages.
> Suppose the following call:
> msg = decode_message(read_message("raw"))
>
> Here msg.Fields will hold a table of values like the following:
> {
>  “name”: “foobar”,
>  “type”: “string”,
>  “value”: 123
> }
>
> What is very much unclear to me is why *some* of the fields here are
> produced not as plain values but as a table holding a single value, like
> {
>  “name”: “foobar”,
>  “value”: [123]
> }
>
> This way I need to check every time whether the value is an array or not;
> an example from yourselves:
> https://github.com/mozilla-services/heka/blob/versions/0.10/sandbox/lua/modules/ts_line_protocol.lua#L161-L162
>
>
> When does this condition occur?
>
> Thanks,
> Timur
>


Re: [heka] apache logs not parsed

2015-09-23 Thread Michael Trinkala
Did you set the decoder = "apache" in the LogstreamerInput configuration?

Trink

On Wed, Sep 23, 2015 at 1:16 AM, Cristian Falcas 
wrote:

> Hello,
>
> I can't manage to make heka parse any apache logs.
>
> Here is my configuration:
>
> ## apache config
> LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""
> combined
> CustomLog "/var/log/httpd/pulp-http_access.log" combined
>
> ## heka config
> [apache]
> type = "SandboxDecoder"
> filename = "lua_decoders/apache_access.lua"
>
> [apache.config]
> type = "combined"
> user_agent_transform = "true"
> log_format='%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"'
>
> ## heka decoded message
> 2015/09/23 11:11:17
> :Timestamp: 2015-09-23 08:11:17.004602885 + UTC
> :Type: logfile
> :Hostname: v-so-repo-05.company.net
> :Pid: 0
> :Uuid: 3b9f482b-d67c-4d39-b443-a210fb66250c
> :Logger: apache.pulp-http_.access
> :Payload: 10.220.10.117 - - [14/Sep/2015:23:40:25 +0300] "GET
> /repos/ol_latest/6Server/repodata/repomd.xml HTTP/1.1" 200 2142 "-"
> "urlgrabber/3.9.1 yum/3.2.29"
>
> :EnvVersion:
> :Severity: 7
>
> The payload is the exact same string found in the logs file.
>
> Can someone help with any ideas?
>
> Thank you,
> Cristian Falcas
>
>


Re: [heka] Hindsight

2015-09-15 Thread Michael Trinkala
Generally I only use the prune for reporting in which case I don't want it
to clean up the results but it would be pretty easy to include the analysis
directory as a config option.

Yes, there is up to a second of latency on detecting/reading the rolled
logs.  It is there to prevent rapid polling when the data flow is pretty
low.  When using a reasonably sized roll configuration the potential of up
to a second of delay every few minutes hasn't been an issue for us.  In
fact most of the Hindsight backoffs/retries don't use anything smaller than
a one second resolution as Hindsight is designed to be near real time
(within a few seconds).

Cool, the numbers are still looking good on this end too and it is pretty
close to feature complete (with the recent addition of the dynamic loading).

This issue is on my radar and will be addressed soon, as well as the
documentation.

Trink

On Tue, Sep 15, 2015 at 8:04 AM, bruno binet <bruno.bi...@gmail.com> wrote:

> Hi Michael,
>
> Thanks again for your answer.
> I've also seen that you have checked in your "prune_input.lua plugin" on
> github:
> https://github.com/trink/hindsight/blob/master/sandboxes/input/prune_input.lua
> This is very useful, and I'm wondering if you also have a similar pruning
> plugin that could prune the output files produced by the analysis plugins?
>
> Also you said that we must be aware the roll transition can introduce up
> to a second of latency in the analysis and output plugins: why does the
> roll transition introduce such a delay? Will this latency be
> decreased if we use your "prune_input.lua" plugin to remove the files that
> would have been rolled?
>
> FYI, I've migrated my plugins and configuration from Heka to Hindsight,
> and the performance is actually largely improved (CPU usage decreased by
> ~10x, RAM usage decreased by ~80x !)
> So thanks a lot for that great piece of software!
> And that means that I can now use my Hindsight pipeline seamlessly on a
> Raspberry PI 1 device without facing any performance issues anymore. :-)
>
> The last thing that I would like to fix before pushing Hindsight in
> production is to be able to compile the master branch on Raspberry PI ARM
> platform (for now I monkey patch the code for the compilation to succeed),
> see github issue #7: https://github.com/trink/hindsight/pull/7
>
> Cheers,
> Bruno
>
> On 18 August 2015 at 06:18, Michael Trinkala <mtrink...@mozilla.com>
> wrote:
>
>> Yes, I will get the plugins checked in shortly I just need to add a
>> config option or two.
>>
>> run once - set the ticker_interval to 0 and return from process_message
>> when you are done
>> polling - set the ticker_interval > 0 and return from process_message, it
>> will be called again when the ticker interval expires.
>> continuous - don't return from process_message
>>
>> No, this feature does not need to be implemented in the infrastructure
>> the plugin extends the functionality nicely and can be easily customized as
>> needed.
>>
>> It totally depends on your throughput and message size, I don't expect
>> any problems even if you roll frequently (be aware the roll transition can
>> introduce up to a second of latency in the analysis and output plugins).
>>
>> Not at the moment since it is just two executables.
>>
>> Trink
>>
>>
>> On Thu, Aug 13, 2015 at 5:36 AM, bruno binet <bruno.bi...@gmail.com>
>> wrote:
>>
>>>
>>>
>>> On 12 August 2015 at 18:16, Michael Trinkala <mtrink...@mozilla.com>
>>> wrote:
>>>
>>>> Most looping requirements can be flattened out: i.e., alerting can be
>>>> handled in the output plugins in your example and
>>>> aggregation/sessionization etc can be handled in the inputs. As for
>>>> sharing: things fall into place when you start thinking about it from a
>>>> module level instead of an individual plugin level.
>>>>
>>>> The locations are configurable 'output_path' so you can put the output
>>>> files anywhere you want.
>>>>
>>>
>>> Good, I will set it somewhere on the /tmp/ partition, which is already
>>> stored in RAM.
>>>
>>>
>>>> I have some plugins (like a stdin, simple file, TCP, and a pruning
>>>> (cleans up the output files when everyone is done with them) inputs;
>>>>
>>>
>>> I'm interested in this pruning plugin: could you share it? Or don't you
>>> think it would make sense for Hindsight to provide an option to
>>> automatically clean up output files?
>>>
>>>
>>>> heka

Re: [heka] Hindsight

2015-08-12 Thread Michael Trinkala
Most looping requirements can be flattened out: i.e., alerting can be handled
in the output plugins in your example and aggregation/sessionization etc.
can be handled in the inputs. As for sharing: things fall into place when
you start thinking about it from a module level instead of an individual
plugin level.

The locations are configurable 'output_path' so you can put the output
files anywhere you want.

I have some plugins (stdin, simple file, TCP, and pruning inputs, the last of
which cleans up the output files when everyone is done with them; Heka
protobuf and payload outputs; etc.) but they haven't been committed yet.  As
for ProcessInput, the input sandbox has access to os.execute, so there won't
be a generic version: you can just call what you want and handle its output
directly.  (Hindsight already supports run once, polling, and continuous
input plugins.)
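
The three input styles can be sketched roughly as below, following the
description elsewhere in this thread: run once and polling both return from
process_message (with ticker_interval set to 0 or greater than 0
respectively), while a continuous input does not return.  The code is plain
Lua with hypothetical stand-ins (inject_message, keep_running, stop_after) so
that it runs outside Hindsight; in a real plugin each variant would be the
plugin's own process_message function.

```lua
-- Hypothetical stand-ins so the sketch runs outside Hindsight.
local injected = {}
local function inject_message(msg) injected[#injected + 1] = msg end
local stop_after = 3   -- pretend a shutdown is requested after 3 messages
local function keep_running() return #injected < stop_after end

-- Run once / polling: do the work and return. With ticker_interval = 0 the
-- function is not called again; with ticker_interval > 0 it is called again
-- each time the interval expires.
local function process_message_polling()
    inject_message({Type = "sample", Payload = "one reading"})
    return 0
end

-- Continuous: stay inside process_message for as long as the plugin
-- should keep running.
local function process_message_continuous()
    while keep_running() do
        inject_message({Type = "sample", Payload = "streamed reading"})
    end
    return 0
end

local polled = process_message_polling()
local after_poll = #injected
local streamed = process_message_continuous()
```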

The output files will grow until they reach 'output_size' (defaults to
64MiB), at which point they are rolled (they are not deleted by default).  I
would not make it too small unless you need to prune really quickly.
Generally I run with a config of 1GiB (on some of our systems that rolls
several times a minute).  Space permitting, I would just roll them after
several minutes of what would be average data flow on your system.

Trink





On Wed, Aug 12, 2015 at 1:55 AM, bruno binet bruno.bi...@gmail.com wrote:



 On 12 August 2015 at 10:19, bruno binet bruno.bi...@gmail.com wrote:

 Thanks for all this valuable information.

 On 11 August 2015 at 17:17, Michael Trinkala mtrink...@mozilla.com
 wrote:

 There are a few intentional changes between Heka and Hindsight.  Looping
 messages in Heka has always been a bad idea so it was removed.


 Personally I like the looping messages feature in Heka as it is very
 flexible and could be useful to share ready-to-use plugins. Also it
 supports processing messages through multiple ticker_interval which can be
 useful (alerting, aggregations).


 There are a few API enhancements such as a protobuf stream reader and
 writer.  Checkpoints are all managed by the Hindsight infrastructure (so
 much of the burden is removed from the plugin writer, this also alters the
 plugin API slightly).  The write_message hack for Go has been removed since
 messages are immutable.  read_config now has access to all related sandbox
 config options (standard and user defined). read_next_field is not
 supported (this will also be removed from Heka in 0.11).

 In most cases you will find the Hindsight IOPS lower than Heka due to
 the much more efficient check pointing  (btw Heka 0.11 is moving to a disk
 buffer everywhere).


 Great, that is good to know.


 output_hl/input/* - contains the output from all of the input plugins
 output_hl/analysis/* - contains the output from all of the analysis
 plugins


 Are the above files always growing?
 I suppose the output_limit configuration allows us to limit their size: what
 are the implications if I limit their size to a few KB? Will it reduce
 Hindsight performance?


 hindsight.cp - is the checkpoint file for all I/O (inputs, analysis, and
 output plugins)
 hindsight.tsv - is the self monitoring performance stats

 These files are all mandatory.  They are the reason Hindsight has an at
 least once delivery guarantee and they provide valuable insight on system
 operation and performance.


 If I don't need delivery guarantee, do you think it could make sense to
 move these files to a ramdisk (tmpfs) partition in order to preserve the
 flash sd/usb card?


 decode_message needs to be turned on for analysis and output plugins, I
 will enable it.


 Ok, thank you: this is now working as expected.

 Also, do you plan to implement some additional lua modules to help build
 input sandboxes similar to Heka input plugins (like the FilePollingInput,
 the ProcessInput, or the LogstreamerInput)?



 Trink


 On Tue, Aug 11, 2015 at 1:54 AM, bruno binet bruno.bi...@gmail.com
 wrote:

 I see, so I need to investigate how I can merge my multiple lua sandbox
 filters into a single one.

 This makes me wonder if there are other differences between
 Hindsight and Heka?
 Is the fact that one analysis plugin cannot consume the output of
 another analysis plugin the only difference between Hindsight analysis
 plugins and Heka filter sandbox plugins?

 Also I saw in another thread that Hindsight uses disk buffers at every
 stage, so there's only ever one
 message in memory at every step of the pipeline: does it mean Hindsight
 will write much more frequently to the disk than Heka? This may be an issue
 for me since we use a raspberry pi whose disk is an sdcard or usb flash key.

 I see that some data is written to the output_path (output_hl/
 directory in my case): can you explain what are all these files:
 $ tree output_hl/
 output_hl/
 |-- analysis
 |   `-- 0.log
 |-- hindsight.cp
 |-- hindsight.tsv
 `-- input
     `-- 0.log

 Can we avoid generating all these files?

 Last question: I don't manage to use the read_next_field

Re: [heka] Hindsight

2015-08-10 Thread Michael Trinkala
There is no message looping in Hindsight (one analysis plugin cannot
consume the output of another analysis plugin).  In your example the
decoding should happen in the input.  Heka has inputs, splitters, and
decoders; in Hindsight it is just an input, and common functionality can be
split into modules for code reuse.  In general this simplifies the
configuration, is easier to follow (since everything is in one place), and
has performance benefits as well.

Trink

On Mon, Aug 10, 2015 at 9:23 AM, bruno binet bruno.bi...@gmail.com wrote:

 Back from vacations, I'm now playing again with Hindsight on a raspberry
 pi.
 As reported on github
 https://github.com/trink/hindsight/issues/1#issuecomment-119593775 the
 compilation now succeeds.

 So getting inspiration from the examples in the benchmarks directory, I
 tried to create a Hindsight configuration to use my own lua sandboxes: I
 can successfully read data from udp and use a filter to decode data, then I
 would like to use another filter to handle generated messages, but I can't
 get any message in the second filter. Does Hindsight support more than one
 filter like Heka?

 Here is the Hindsight configuration, Lua sandboxes and output directory
 generated by Hindsight:
 https://github.com/bbinet/hindsight_hl_test

 Do you see anything wrong? Do I use hindsight correctly?

 Cheers,
 Bruno

 On 8 July 2015 at 09:44, bruno binet bruno.bi...@gmail.com wrote:

 Sure, I will try your branch and report possible new compilation issues
 in github.

 Cheers,
 Bruno


Re: [heka] stock JSON decoder/encoder?

2015-08-04 Thread Michael Trinkala
FYI: https://github.com/mozilla-services/heka/issues/826

On Mon, Aug 3, 2015 at 2:58 PM, Paul Bonser mister...@gmail.com wrote:

 I've written a Go JSON decoder which allows you to configure which JSON
 fields go into which Heka message struct fields. The rest of the fields get
 put in the Heka message Fields map. For complex values it leaves the values
 as JSON-encoded byte slices.

 The matching encoder is not yet done, so one of the Heka-included JSON
 encoders needs to be used, and complex values won't be re-encoded properly.

 https://github.com/OwnLocal/heka-plugins/blob/master/json_decoder.go

 On Mon, Aug 3, 2015 at 10:55 AM Nathan Williams nat...@teamtreehouse.com
 wrote:

 yeah, I probably should have spoken up sooner. we already had this one we
 were using internally, so i figured i'd clean it up a bit for upstreaming
 and do the PR :)

 Cheers,

 Nathan

 On Mon, Aug 3, 2015 at 10:37 AM, Rob Miller rmil...@mozilla.com wrote:

 On 08/03/2015 05:55 AM, Timur Batyrshin wrote:

 Hi,

 I've looked into these 2-3 implementations of JSON decoders/encoders and
 found you are right in every word.
 Sorry for the possible confusion I've caused.

 No problem, you're asking good questions.

 On the other hand I'm worried about complexities which newcomers like me
 will meet :-)

 Do you think that sample implementation in examples/docs/contrib will be
 good for dealing with this?
 At the same time you won't have to support requests such as mine for
 contrib stuff.

 As I said, I'm +1 to providing a generic JSON decoder that will be
 useful for folks trying to get started. And, luckily, it seems someone has
 been watching this thread and has already gotten started on an
 implementation: https://github.com/mozilla-services/heka/pull/1653

 -r



Re: [heka] Mirror of lpeg.trink.com for testing LPEG grammars?

2015-07-10 Thread Michael Trinkala
Yeah, that VM was in a bad state; I couldn't even ssh in.  The reboot worked
though, so it is back up.

Trink

On Fri, Jul 10, 2015 at 8:54 AM, Ali h...@alijnabavi.info wrote:

 Hey, all.

 lpeg.trink.com seems to be down.  Has anyone set up a publicly accessible
 instance of it?

 I'll see if I can set one up on my VPS for others to use.

 -Ali


Re: [heka] Hindsight

2015-07-07 Thread Michael Trinkala
I changed the checkpoint id to an unsigned long long. Can you test out the
branch and add any other compilation errors to the issue (closing out this
email thread).  I am also taking suggestions/recommendations for a CI build
system that supports multiple platforms.  TravisCI adds almost no value
since I am already building on a Debian based box.

https://github.com/trink/hindsight/tree/issue_1

Thanks,
Trink

On Tue, Jul 7, 2015 at 8:21 AM, bruno binet bruno.bi...@gmail.com wrote:

 Ok, thanks.
 And sorry, but I don't have a patch (don't know how to fix this kind of
 compilation issue).


Re: [heka] Hindsight

2015-07-07 Thread Michael Trinkala
Yeah, I have only been building on Ubuntu and haven't done any cross
platform clean-up.  Thanks for the build output I will fix those errors
(unless you already have a patch).

Trink

On Tue, Jul 7, 2015 at 5:57 AM, bruno binet bruno.bi...@gmail.com wrote:

 I now have some time to do a few tests with Hindsight, so I tried to
 compile it on our targeted arm platform (raspberry pi), but I get the
 following error:

 root@hl-mc--dev:~/hindsight/release# cmake -DCMAKE_BUILD_TYPE=release
 ..
 -- The C compiler identification is GNU 4.7.2
 -- The CXX compiler identification is GNU 4.7.2
 -- Check for working C compiler: /usr/bin/gcc
 -- Check for working C compiler: /usr/bin/gcc -- works
 -- Detecting C compiler ABI info
 -- Detecting C compiler ABI info - done
 -- Detecting C compile features
 -- Detecting C compile features - done
 -- Check for working CXX compiler: /usr/bin/g++
 -- Check for working CXX compiler: /usr/bin/g++ -- works
 -- Detecting CXX compiler ABI info
 -- Detecting CXX compiler ABI info - done
 -- Detecting CXX compile features
 -- Detecting CXX compile features - done
 -- Found LUASANDBOX: /usr/local/lib/libluasandbox.so
 -- Configuring done
 -- Generating done
 -- Build files have been written to: /root/hindsight/release

 root@hl-mc--dev:~/hindsight/release# make
 Scanning dependencies of target hindsight
 [  2%] Building C object src/CMakeFiles/hindsight.dir/hindsight.c.o
 [  4%] Building C object src/CMakeFiles/hindsight.dir/
 hs_analysis_plugins.c.o
 [  6%] Building C object src/CMakeFiles/hindsight.dir/
 hs_checkpoint_reader.c.o
 /root/hindsight/src/hs_checkpoint_reader.c: In function 'find_first_id':
 /root/hindsight/src/hs_checkpoint_reader.c:46:3: error: large integer
 implicitly truncated to unsigned type [-Werror=overflow]
 /root/hindsight/src/hs_checkpoint_reader.c:55:3: error: comparison is
 always false due to limited range of data type [-Werror=type-limits]
 cc1: all warnings being treated as errors
 src/CMakeFiles/hindsight.dir/build.make:100: recipe for target
 'src/CMakeFiles/hindsight.dir/hs_checkpoint_reader.c.o' failed
 make[2]: *** [src/CMakeFiles/hindsight.dir/hs_checkpoint_reader.c.o]
 Error 1
 CMakeFiles/Makefile2:947: recipe for target 'src/CMakeFiles/hindsight.dir/all'
 failed
 make[1]: *** [src/CMakeFiles/hindsight.dir/all] Error 2
 Makefile:146: recipe for target 'all' failed
 make: *** [all] Error 2

 Do you know what is going on here? I guess this is an issue with the arm
 platform only?

 Cheers,
 Bruno


 On 10 June 2015 at 18:41, bruno binet bruno.bi...@gmail.com wrote:

 Thanks a lot for your answers.

 And yes, I'm very interested in bootstrapping a first prototype of my own
 data pipeline based on Hindsight so that I can compare the performance on a
 raspberry pi.
 (here is the current state of our Heka-based data pipeline:
 https://bitbucket.org/helioslite/heka-hl-sandboxes)
 So it would be great if you can give me the first instructions on how to
 build and setup Hindsight.

 Thanks.
 Bruno

 On 10 June 2015 at 18:18, Michael Trinkala mtrink...@mozilla.com wrote:

 - It is usable and being actively developed with the intent to move it
 into production later this year.
 - We are currently running production data through it for testing but it
 is not deployed in an official capacity.  It has been very stable but until
 a more robust set of tests has been built out I will not consider it
 production ready.
 - Yes, it can decode/encode Heka protobuf format
 - Yes, the router/message matcher is complete.  The only difference is
 that it supports Lua string pattern matching instead of re2 regexp  (Heka
 'Hostname =~ /^foo/' vs Hindsight 'Hostname =~ ^foo')
 - Yes, but you would need a lua-socket input and output sandbox (see
 benchmarks/hsr_run for related examples)
 - No documentation yet, only examples in the benchmarks directory.  I
 could have you bootstrapped in about 30 minutes (and hopefully turn that
 into a getting started guide) if you are interested.
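
To illustrate the matcher difference noted in the list above, here is a minimal Lua sketch of how Lua string patterns behave compared with re2-style regular expressions. It uses the standard string.find function rather than Hindsight's matcher itself, and the matches and foo_or_bar helpers and the sample hostnames are hypothetical names chosen for illustration.

```lua
-- Hypothetical helper: true when the Lua pattern matches anywhere in s.
local function matches(s, pattern)
    return string.find(s, pattern) ~= nil
end

-- '^' anchors at the start of the string, as it does in re2.
print(matches("foo01.example.com", "^foo"))   -- true
print(matches("barfoo", "^foo"))              -- false

-- Lua patterns use % classes instead of backslash classes:
-- %d+ is the Lua spelling of the regex \d+.
print(matches("web42", "^web%d+$"))           -- true

-- Lua patterns have no alternation operator, so a regex such as
-- ^(foo|bar) has to be written as two separate tests.
local function foo_or_bar(s)
    return matches(s, "^foo") or matches(s, "^bar")
end
print(foo_or_bar("bar-7"))                    -- true
```

The practical consequence is that a matcher expression ported from Heka may need its character classes rewritten and any alternation split into separate conditions.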

 Implementation wise the only missing piece is support for dynamically
 loading plugins.  The actual code to accomplish it is very small (just
 detecting files in the load directory and moving them to the run directory)
 but ideally it would be fronted by a web server and a GUI with access
 control and validation (a much larger effort and actually a separate
 project).

 Trink

 On Wed, Jun 10, 2015 at 8:15 AM, bruno binet bruno.bi...@gmail.com
 wrote:

 Hi,

 I recently discovered the work pushed into the Hindsight repository (
 https://github.com/trink/hindsight) which seems to be a lightweight
 alternative to Heka, based on the lua sandbox.
 The Hindsight vs Heka benchmarks are quite impressive.

 I'm currently running Heka on the raspberry pi (not so powerful) device
 and the load average quickly increases and exceeds 1 when Heka is ingesting
 data, so Hindsight could be a good fit for us if it can perform better than
 Heka in terms of CPU cycles.

 What is the current status

Re: [heka] My custom Lua decoder

2015-07-07 Thread Michael Trinkala
Feedback added to the gist.

Trink

On Tue, Jul 7, 2015 at 12:06 PM, Rob Miller rmil...@mozilla.com wrote:

 On 07/07/2015 11:40 AM, Ali wrote:

 Hi, all!

 I'm finally done with my Lua decoder and thought I would post a link to
 it here, both for any constructive criticism anyone might have and for
 helping out anyone who was in the same situation I was in.  Here's the
 link:

 https://gist.github.com/hourback/56e93786df14a17b14da

 Looks quite reasonable to me. Nice work!

 This was basically born out of the need to parse fractional seconds in
 the WebSphere systemout.log files.  Go's time package doesn't recognize
 separators that aren't periods.  (There's an open issue for adding
 support for commas, at least.)  So I was unable to use
 PayloadRegexDecoder; I saw no way to do it other than writing a Lua
 decoder or a Go decoder.

 Using Lua and LPeg is the recommended way to do parsing in Heka.
 PayloadRegexDecoder is available for very simple cases, but it's not very
 fast nor is it very flexible, and there is very little benefit to writing a
 custom Go decoder.

 Thanks for sharing your code.

 -r


Re: [heka] Need help writing heka-s3 output plugin

2015-05-13 Thread Michael Trinkala
You may want to take a look at
https://github.com/mozilla-services/data-pipeline/blob/master/heka/plugins/s3splitfile/s3splitfile_output.go

Trink

On Wed, May 13, 2015 at 3:42 AM, Alex Jiao uohzx...@gmail.com wrote:

 Hi, I'm writing an S3 output plugin for heka. As I am new to golang and
 heka, I'm not too sure what's the best way to go about writing the plugin.
 Hence I wish to seek you guys' help to resolve some issues that are
 plaguing my code.

 heka-s3 plugin code: https://github.com/uohzxela/heka-s3/blob/master/s3.go

 My plugin needs to send message packs from the pipeline at regular
 intervals as specified in the .toml file. It writes to a buffer during each
 interval and uploads to an S3 bucket at the next tick. The interval is
 specified by the 'ticker_interval' option. However, I'm not sure whether
 I need to take care of the ticker logic in the for loop or whether the
 output Run function will be invoked at the specified ticker interval by
 the Heka service itself.

 I have implemented the first case as seen in my code and it led to a weird
 issue. The plugin can be loaded upon starting the Heka service, however the
 ticker can only run once after the elapsed time interval. It's only after I
 sent a TERM signal to the Heka process using kill (kill heka-pid), that
 the ticker starts to work.

 Could you guys shed light on resolving this issue? Why does the ticker not
 work upon starting the plugin? Does the programmer have to take care of the
 ticker logic or the output runner will automatically be invoked at regular
 intervals by the Heka service? I'd appreciate it as well if you can point
 me to some output plugins that have the ticker functionality.

 Thank you.

 --
 Alex


Re: [heka] Sandbox Decoder, not enough memory

2015-04-08 Thread Michael Trinkala
The default limit is 8MiB, it can be adjusted with the memory_limit
configuration parameter.
https://hekad.readthedocs.org/en/latest/config/common_sandbox_parameter.html#config-common-sandbox-parameters

I would recommend setting it to at least 10x the size of your largest
document.  The memory is not reserved so even if you make it 100MiB you
will only pay for what you use (you can look at the sandbox stats for the
current and peak usage).

The sandbox is a very good choice: its JSON parser is very efficient and
multiple times faster than the one in Go.

Trink

On Wed, Apr 8, 2015 at 3:07 PM, M GS mgs...@gmail.com wrote:

 I am trying to create stats based on a roughly 2 MB JSON response from
 Elasticsearch. Our goal is to be able to graph and alert on shard state
 changes.

 I have an HTTPInput that queries /_cluster/state, but when I use my
 decoder to try to extract the state of each shard, my plugin encounters a
 fatal error:

 Decoder 'ESClusterStatusInput-ESClusterStatusDecoder' error: FATAL:
 process_message() not enough memory

 Below is my code:

 local resp = read_message("Payload")
 local data = cjson.decode(resp)  -- it seems like here is where the error is thrown

 for _, index in pairs(data.routing_table.indices) do
     for _, shards in pairs(index.shards) do
         for _, shard in pairs(shards) do
             msg.Fields[shard.state] = msg.Fields[shard.state]+1
         end
     end
 end

 I tried with the same data outside of Heka and it works, so I am assuming
 it's a limit imposed on SandboxDecoders. My question is: since what I'm
 doing is probably pretty inefficient, is it even a good idea to be doing
 this kind of thing in general? Should I be doing this in Go instead?


 Thanks!
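
A minimal, standalone sketch of the shard loop in the question above, assuming the per-state counters are not initialised anywhere else: in Lua, adding 1 to a missing table entry raises an arithmetic-on-nil error, so the sketch defaults each counter to 0. The count_shard_states function and the sample table are hypothetical; in a real decoder the table would come from cjson.decode(read_message("Payload")).

```lua
-- Hypothetical helper: tally shards by state from a decoded
-- /_cluster/state document (already converted to a Lua table).
local function count_shard_states(data)
    local counts = {}
    for _, index in pairs(data.routing_table.indices) do
        for _, shards in pairs(index.shards) do
            for _, shard in pairs(shards) do
                -- Default to 0 so the first occurrence of a state works.
                counts[shard.state] = (counts[shard.state] or 0) + 1
            end
        end
    end
    return counts
end

-- Tiny made-up sample with the same nesting as the loop in the question.
local sample = {
    routing_table = {
        indices = {
            logs = {
                shards = {
                    ["0"] = { { state = "STARTED" }, { state = "UNASSIGNED" } },
                    ["1"] = { { state = "STARTED" } },
                },
            },
        },
    },
}

local counts = count_shard_states(sample)
print(counts.STARTED, counts.UNASSIGNED)  -- prints 2 and 1
```

Returning a fresh table of counts keeps the tally separate from the message table, so it can be copied into msg.Fields just before the message is injected.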


Re: [heka] Timestamp is set to UTC

2015-04-03 Thread Michael Trinkala
Timestamp is the number of nanoseconds since the Unix epoch. The
presentation is left to the consumer, e.g., the RstEncoder will present it
as UTC.

Trink
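
As a minimal sketch of that split between the stored value and its presentation, the Lua snippet below formats one nanosecond timestamp both as UTC and as local time. The format_ns helper and the sample value are hypothetical; only the standard os.date and math.floor functions are used.

```lua
-- Hypothetical helper: format a nanoseconds-since-epoch value.
-- A leading '!' in the os.date format selects UTC; without it the
-- local time zone of the machine is used.
local function format_ns(ns, utc)
    local seconds = math.floor(ns / 1e9)
    local fmt = "%Y-%m-%d %H:%M:%S"
    if utc then fmt = "!" .. fmt end
    return os.date(fmt, seconds)
end

local ns = 1426376565 * 1e9   -- sample value: 2015-03-14 23:42:45 UTC
print(format_ns(ns, true))    -- 2015-03-14 23:42:45
print(format_ns(ns, false))   -- same instant, shown in local time
```

The stored number never changes; only the format string decides whether a reader sees UTC or the server's local zone.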

On Fri, Apr 3, 2015 at 1:35 AM, Cristian Falcas cristi.fal...@gmail.com
wrote:

 Hello,

 I started playing with Heka and I'm seeing that the Timestamp field is
 set to UTC instead of the server's local time (EEST).

 Is this normal, or do I have something wrong in my deployment?

 Thank you,
 Cristian Falcas

Re: [heka] get ticker_interval value from a lua sandbox filter

2015-03-23 Thread Michael Trinkala
Currently there is no way to access it, you would have to duplicate the
entry in the config section of the plugin.  Please file a feature request
if desired, thanks.

Trink

On Mon, Mar 23, 2015 at 4:16 AM, bruno binet bruno.bi...@gmail.com wrote:

 Hi,

 I'd like to know the ticker_interval which is configured from the
 currently running lua sandbox filter (this is just to add it as a metadata
 to the output message fields).
 I tried to do `read_config('ticker_interval')` without success.
 Is there a way to get the 'ticker_interval' value from a lua sandbox?

 Cheers,
 Bruno


[heka] Fwd: Timestamp Conversion

2015-03-16 Thread Michael Trinkala
In the full context you will need the capture group (the original grammar
is correct).  If you can provide the specific input causing the error and
the associated output (i.e., something like this works fine for me: [Sat
Mar 14 23:42:45.990158 2015] [test:xyz] [pt 0:mnp 99] message) I will be
better able to assist you.

Trink
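
For reference, the result that the grammar and dt.time_to_ns are expected to produce for the sample line can be sketched without LPeg. The sketch below swaps in plain string.match and a hand-written days-from-civil calculation, so it illustrates the expected value rather than the date_time module's implementation. The parse_error_log_time name is hypothetical, and the log time is assumed to be UTC.

```lua
local MONTHS = { Jan = 1, Feb = 2, Mar = 3, Apr = 4, May = 5, Jun = 6,
                 Jul = 7, Aug = 8, Sep = 9, Oct = 10, Nov = 11, Dec = 12 }

-- Days between 1970-01-01 and the given proleptic Gregorian date.
local function days_from_civil(y, m, d)
    if m <= 2 then y = y - 1 end
    local era = math.floor(y / 400)
    local yoe = y - era * 400                      -- year of era, 0..399
    local mp = (m + 9) % 12                        -- March = 0 ... February = 11
    local doy = math.floor((153 * mp + 2) / 5) + d - 1
    local doe = yoe * 365 + math.floor(yoe / 4) - math.floor(yoe / 100) + doy
    return era * 146097 + doe - 719468
end

-- Hypothetical helper: returns whole seconds since the epoch and the
-- fractional part in microseconds, or nil when the line does not match.
local function parse_error_log_time(line)
    local mon, day, hh, mm, ss, frac, year = string.match(line,
        "^%[%a+ (%a+) +(%d+) (%d+):(%d+):(%d+)%.(%d+) (%d+)%]")
    if not mon or not MONTHS[mon] then return nil end
    local days = days_from_civil(tonumber(year), MONTHS[mon], tonumber(day))
    local sec = days * 86400 + tonumber(hh) * 3600 + tonumber(mm) * 60 + tonumber(ss)
    -- Pad or truncate the fraction to exactly six digits (microseconds).
    local usec = tonumber(string.sub(frac .. "000000", 1, 6))
    return sec, usec
end

local sec, usec = parse_error_log_time(
    "[Sat Mar 14 23:42:45.990158 2015] [test:xyz] [pt 0:mnp 99] message")
print(sec, usec)
-- Nanoseconds since the epoch would be sec * 1e9 + usec * 1e3.
```

A Timestamp that renders as a date in January 1970 suggests the value reaching msg.Timestamp is far smaller than a nanosecond count of this magnitude.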

-- Forwarded message --
From: Madhukar Thota madhukar.th...@gmail.com
Date: Sat, Mar 14, 2015 at 10:04 PM
Subject: Re: [heka] Timestamp Conversion
To: Michael Trinkala mtrink...@mozilla.com


Getting following error:

error: FATAL: process_message() invalid key to 'next'

Here is my lua script:

local dt = require "date_time"
local l = require 'lpeg'
l.locale(l)
local sp = l.space
--local timestamp = l.P"[" * l.Cg(dt.build_strftime_grammar("%a %b %d %H:%M:%S.%s %Y") / dt.time_to_ns, "timestamp") * "]"
local timestamp = l.P"[" * dt.build_strftime_grammar("%a %b %d %H:%M:%S.%s %Y") / dt.time_to_ns * "]"
local md = l.P"[" * l.Cg(l.R('AZ','az','__')^1, "module") * ":" * l.Cg(l.R('AZ','az')^1, "log_level") * "]"
local pt = l.P"[" * l.P(l.R('az')^1) * sp * l.Cg(l.R('09')^1, "processid") * ":" * l.P(l.R('az')^1) * sp * l.Cg(l.R('09')^1, "threadid") * "]"

local grammar = l.Ct(timestamp * sp * md * sp * pt * l.Cg(l.P(1)^0, "message"))
local msg_type = read_config("type")
local payload_keep = read_config("payload_keep")

local msg = {
    Timestamp = nil,
    Type = msg_type,
    Payload = nil,
    Fields = nil
}

function process_message ()
    local log = read_message("Payload")
    local fields = grammar:match(log)
    if not fields then return -1 end

    --if fields.timestamp then
    msg.Timestamp = fields.timestamp
    fields.timestamp = nil
    --end

    if timestamp then
        msg.Timestamp = timestamp
    end

    if payload_keep then
        msg.Payload = log
    end
    msg.Fields = fields
    inject_message(msg)
    return 0
end



On Sun, Mar 15, 2015 at 12:44 AM, Madhukar Thota madhukar.th...@gmail.com
wrote:

 Thanks for quick response. Let me try it.


Re: [heka] Timestamp Conversion

2015-03-14 Thread Michael Trinkala
Just remove the capture group and the timestamp variable will contain the
value you want.

local timestamp = l.P"[" * dt.build_strftime_grammar("%a %b %d %H:%M:%S.%s %Y") / dt.time_to_ns * "]"
if timestamp then msg.Timestamp = timestamp end

Trink

On Sat, Mar 14, 2015 at 8:58 PM, Madhukar Thota madhukar.th...@gmail.com
wrote:

 I am trying to decode Apache error logs using a Lua script. I was able to
 extract the fields I needed but am having a problem with time conversion.

 My log timestamp is in the following format: [Sat Mar 14 23:42:45.990158
 2015]

 In my Lua script, I am converting the timestamp as follows:

 local timestamp = l.P"[" * l.Cg(dt.build_strftime_grammar("%a %b %d %H:%M:%S.%s %Y") / dt.time_to_ns, "timestamp") * "]"

 and passing this timestamp field to msg.Timestamp as follows:

 if fields.timestamp then
     msg.Timestamp = fields.timestamp
     fields.timestamp = nil
 end

 But I am seeing the output as follows:
 2015/03/14 23:42:47
 :Timestamp: 1970-01-12 19:02:38 + UTC

 Please let me know how to convert the timestamp to the actual timestamp of
 the log entry.


Re: [heka] GeoIPdecoder with nginx access logs

2015-03-06 Thread Michael Trinkala
You could just add it to the decoder if desired. Using the Lua geoip module
it would look like this:

require geoip.city
local city_db = assert(geoip.city.open(read_config(geoip_city_db)))
.
.
.
fields.country_code = city_db:query_by_addr(fields.remote_addr,
country_code)



On Thu, Mar 5, 2015 at 9:09 AM, Madhukar Thota madhukar.th...@gmail.com
wrote:

 Hi there

 Is there a working example of how to use GeoIPdecoder with nginx or
 Apache access logs?





Re: [heka] Heka Release including Kafka Output

2015-02-10 Thread Michael Trinkala
The current release has Kafka support.

Trink

On Tue, Feb 10, 2015 at 7:49 AM, Adriano Santos adriano.san...@gmail.com
wrote:

 Hi All,

 Does anybody know when a new release of heka including Kafka will be
 out?

 Best,
 Adriano


Re: [heka] Heka Release including Kafka Output

2015-02-10 Thread Michael Trinkala
Ha, yeah sorry it made the change log for 0.8 but it isn't actually there.
0.9 will most likely be released next week.

Trink

On Tue, Feb 10, 2015 at 8:21 AM, Adriano Santos adriano.san...@gmail.com
wrote:

 Hi Michael,

  Thanks for your prompt reply, and correct me if I'm wrong.
  I downloaded version 0.8.3 from
 https://github.com/mozilla-services/heka/releases and it looks like it
 still doesn't support Kafka. I could see Kafka documentation for Heka
 version 0.9 but it is not out yet. Do you have an idea when the next release
 supporting Kafka will be out?

 Best,
 Adriano


Re: [heka] load a large number of sandboxes.

2015-02-09 Thread Michael Trinkala
It is only part of the build and not included in the
installation/distribution.  Since it just loads multiple versions of the
same test sandbox it is only really useful for debugging during development.

Trink

On Mon, Feb 9, 2015 at 10:13 AM, Djamel F. djamel...@gmail.com wrote:

 Recently I tried to load many sandboxes at once, but I couldn't find how to
 use the heka-sbmgrload command.

 After reading the doc
 https://github.com/mozilla-services/heka/blob/master/docs/source/sandbox/manager.rst#heka-sbmgrload
 I understood I need to run the following command: heka-sbmgrload [-config
 config_file] [-action load|unload] [-num number of sandbox instances]

 However, this command is not found.

 Any suggestions?

 Thanks.


Re: [heka] [Heka] Test Sandboxes

2015-01-22 Thread Michael Trinkala
Our sandbox unit tests can be found here:
https://github.com/mozilla-services/heka/tree/dev/sandbox/plugins
They should provide a reasonable example of how it can be done.

Trink

On Thu, Jan 22, 2015 at 10:42 AM, Djamel F. djamel...@gmail.com wrote:

 I wrote many sandboxes to process data, and I would like to do some testing
 in order to make sure of the reliability of my sandboxes. But I didn't find
 out how to write a test file.

 Please help me find out how to write tests.

 ---
 Djamel FELLAH





Re: [heka] Throttling email alerts

2014-11-19 Thread Michael Trinkala
We usually throttle them at the source, as in
https://github.com/mozilla-services/heka/blob/dev/sandbox/lua/filters/http_status.lua#L85
but there is nothing to prevent you from adding global throttling or
aggregation in the alert encoder (by host, plugin, application... whatever).

Trink 
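
One way to picture throttling "by host, plugin, application" is a per-key
rate limit. The sketch below is plain Python, not a Heka sandbox, and the
names AlertThrottle, should_send and take_suppressed are made up for
illustration; they are not part of Heka's API. It lets the first alert for a
key through, drops the rest until the interval has passed, and counts what
was dropped so the next alert can mention it.

```python
import time


class AlertThrottle:
    """Allow at most one alert per key within `interval` seconds.

    Hypothetical helper for illustration only; not Heka code.
    """

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock      # injectable so the logic can be tested
        self.last_sent = {}     # key -> time the last alert was let through
        self.suppressed = {}    # key -> alerts dropped since that time

    def should_send(self, key):
        """Return True if an alert for `key` may be sent now."""
        now = self.clock()
        last = self.last_sent.get(key)
        if last is not None and now - last < self.interval:
            # Still inside the quiet period: count the alert and drop it.
            self.suppressed[key] = self.suppressed.get(key, 0) + 1
            return False
        self.last_sent[key] = now
        return True

    def take_suppressed(self, key):
        """Return and reset the number of alerts dropped for `key`."""
        return self.suppressed.pop(key, 0)
```

The key can be anything that identifies the source of the alert, for example
a hostname or a plugin name. Using a single constant key gives one global
throttle for all alerts.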

- Original Message -

 From: Klaus Post klausp...@gmail.com
 To: heka@mozilla.org
 Sent: Wednesday, November 19, 2014 6:15:39 AM
 Subject: [heka] Throttling email alerts

 Hi!

 I have been looking all through the documentation and source code to see if I
 could find a way to throttle email sending.

 I have a simple setup, where I have a simple matcher for severity = 3 like
 this:

 [RstEncoder]

 [ErrorAlert]
 type = "SmtpOutput"
 message_matcher = "Severity == 3"
 send_from = "h...@xxx.com"
 send_to = ["klausp...@xxx.com"]
 auth = "none"
 host = "xxx.com:25"
 encoder = "RstEncoder"

 However, there is the issue that if something goes wrong, various subsystems
 like syslog, individual applications start spewing out errors at a steady
 rate, often resulting in thousands of mails being sent.

 Is there a way to limit the number of emails with configuration or does
 anyone have a lua script that can do this?

 Also, thanks for the great work on heka!

 Regards, Klaus Post

 http://www.klauspost.com
