Most looping requirements can be flattened out: in your example, alerting
can be handled in the output plugins, and aggregation, sessionization, etc.
can be handled in the inputs. As for sharing: things fall into place when
you start thinking about it at the module level instead of the individual
plugin level.

The locations are configurable via 'output_path', so you can put the output
files anywhere you want.

I have some plugins that haven't been committed yet: stdin, simple file,
TCP, and pruning inputs (the pruning input cleans up the output files when
everyone is done with them), plus Heka protobuf and payload outputs, etc.
As for ProcessInput, the input sandbox has access to os.execute, so there
won't be a generic version; you can just call what you want and handle its
output directly. (Hindsight already supports run-once, polling, and
continuous input plugins.)

The output files will grow until 'output_size' (defaults to 64MiB) before
they are rolled (they are not deleted by default). I would not make it too
small unless you need to prune really quickly. Generally I run with a
config of 1GiB (on some of our systems that rolls several times a minute).
Space permitting, I would just roll them after several minutes of what
would be the average data flow on your system.
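
As a rough sketch of that sizing rule (not anything Hindsight ships; the
function name and the throughput numbers below are made up for
illustration), the target size is just the average flow multiplied by the
roll interval:

```python
MIB = 1024 * 1024


def roll_size_bytes(avg_bytes_per_sec, roll_after_minutes):
    """Size at which a file fed at avg_bytes_per_sec fills in roll_after_minutes."""
    return int(avg_bytes_per_sec * roll_after_minutes * 60)


# Hypothetical example: 2 MiB/s average flow, roll about every 5 minutes.
size = roll_size_bytes(2 * MIB, 5)
print(size // MIB, "MiB")  # prints "600 MiB"
```

Plug in your own measured rate and round the result to whatever is
convenient for 'output_size'.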

Trink

On Wed, Aug 12, 2015 at 1:55 AM, bruno binet <[email protected]> wrote:

>
>
> On 12 August 2015 at 10:19, bruno binet <[email protected]> wrote:
>
>> Thanks for all this valuable information.
>>
>> On 11 August 2015 at 17:17, Michael Trinkala <[email protected]>
>> wrote:
>>
>>> There are a few intentional changes between Heka and Hindsight.  Looping
>>> messages in Heka has always been a bad idea so it was removed.
>>>
>>
>> Personally I like the looping messages feature in Heka as it is very
>> flexible and could be useful to share ready-to-use plugins. Also it
>> supports processing messages through multiple ticker_interval which can be
>> useful (alerting, aggregations).
>>
>>
>>> There are a few API enhancements such as a protobuf stream reader and
>>> writer.  Checkpoints are all managed by the Hindsight infrastructure (so
>>> much of the burden is removed from the plugin writer; this also alters the
>>> plugin API slightly).  The write_message hack for Go has been removed since
>>> messages are immutable.  read_config now has access to all related sandbox
>>> config options (standard and user defined). read_next_field is not
>>> supported (this will also be removed from Heka in 0.11).
>>>
>>> In most cases you will find the Hindsight IOPS lower than Heka due to
>>> the much more efficient checkpointing (btw Heka 0.11 is moving to a disk
>>> buffer everywhere).
>>>
>>
>> Great, that is good to know.
>>
>>
>>> output_hl/input/* - contains the output from all of the input plugins
>>> output_hl/analysis/* - contains the output from all of the analysis
>>> plugins
>>>
>>
> Are the above files always growing?
> I suppose the output_limit configuration allows us to limit their size: what
> are the implications if I limit their size to a few KB? Will it reduce
> Hindsight performance?
>
>
>>> hindsight.cp - is the checkpoint file for all I/O (input, analysis, and
>>> output plugins)
>>> hindsight.tsv - is the self-monitoring performance stats
>>>
>>> These files are all mandatory.  They are the reason Hindsight has an at
>>> least once delivery guarantee and they provide valuable insight on system
>>> operation and performance.
>>>
>>
>> If I don't need delivery guarantee, do you think it could make sense to
>> move these files to a ramdisk (tmpfs) partition in order to preserve the
>> flash sd/usb card?
>>
>>
>>> decode_message needs to be turned on for analysis and output plugins, I
>>> will enable it.
>>>
>>
>> Ok, thank you: this is now working as expected.
>>
>> Also, do you plan to implement some additional lua modules to help build
>> input sandboxes similar to Heka input plugins (like the FilePollingInput,
>> the ProcessInput, or the LogstreamerInput)?
>>
>>
>>>
>>> Trink
>>>
>>>
>>> On Tue, Aug 11, 2015 at 1:54 AM, bruno binet <[email protected]>
>>> wrote:
>>>
>>>> I see, so I need to investigate how I can merge my multiple lua sandbox
>>>> filters into a single one.
>>>>
>>>> This makes me wonder whether there are other differences between
>>>> Hindsight and Heka.
>>>> Is the fact that one analysis plugin cannot consume the output of
>>>> another analysis plugin the only difference between Hindsight analysis
>>>> plugins and Heka filter sandbox plugins?
>>>>
>>>> Also I saw in another thread that Hindsight uses disk buffers at every
>>>> stage, so there's only ever one message in memory at every step of the
>>>> pipeline: does it mean Hindsight will write much more frequently to the
>>>> disk than Heka? This may be an issue for me since we use a raspberry pi
>>>> whose disk is an SD card or USB flash key.
>>>>
>>>> I see that some data is written to the output_path (output_hl/
>>>> directory in my case): can you explain what are all these files:
>>>> $ tree output_hl/
>>>> output_hl/
>>>> |-- analysis
>>>> |   `-- 0.log
>>>> |-- hindsight.cp
>>>> |-- hindsight.tsv
>>>> `-- input
>>>>     `-- 0.log
>>>>
>>>> Can we avoid generating all these files?
>>>>
>>>> Last question: I can't manage to use the "read_next_field" or
>>>> "decode_message" API functions from the output plugin; are they available?
>>>>
>>>> The following error is returned:
>>>> 1439280624615780495 [error] output_plugins terminated: output/encode_metric.cfg msg: process_message() _hl/output/encode_metric.lua:16: attempt to call global 'read_next_field' (a nil value)
>>>>
>>>> or when I change my output plugin to use the decode_message api
>>>> function:
>>>> 1439282566555139226 [error] output_plugins terminated: output/encode_metric.cfg msg: process_message() _hl/output/encode_metric.lua:15: attempt to call global 'decode_message' (a nil value)
>>>>
>>>>
>>>> Thanks,
>>>> Bruno
>>>>
>>>>
>>>> On 10 August 2015 at 19:19, Michael Trinkala <[email protected]>
>>>> wrote:
>>>>
>>>>> There is no message looping in Hindsight (one analysis plugin cannot
>>>>> consume the output of another analysis plugin).  In your example the
>>>>> decoding should happen in the input.  Heka has inputs, splitters, and
>>>>> decoders; in Hindsight it is just an input, and common functionality can
>>>>> be split into modules for code reuse.  This in general simplifies the
>>>>> configuration, is easier to follow (since everything is in one place) and
>>>>> has performance benefits as well.
>>>>>
>>>>> Trink
>>>>>
>>>>> On Mon, Aug 10, 2015 at 9:23 AM, bruno binet <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Back from vacations, I'm now playing again with Hindsight on a
>>>>>> raspberry pi.
>>>>>> As reported on github
>>>>>> https://github.com/trink/hindsight/issues/1#issuecomment-119593775
>>>>>> the compilation now succeeds.
>>>>>>
>>>>>> So getting inspiration from the examples in the benchmarks directory,
>>>>>> I tried to create a Hindsight configuration to use my own lua sandboxes:
>>>>>> I can successfully read data from udp and use a filter to decode data,
>>>>>> then I would like to use another filter to handle generated messages,
>>>>>> but I can't get any message in the second filter. Does Hindsight support
>>>>>> more than one filter like Heka?
>>>>>>
>>>>>> Here is the Hindsight configuration, Lua sandboxes and output
>>>>>> directory generated by Hindsight:
>>>>>> https://github.com/bbinet/hindsight_hl_test
>>>>>>
>>>>>> Do you see anything wrong? Do I use hindsight correctly?
>>>>>>
>>>>>> Cheers,
>>>>>> Bruno
>>>>>>
>>>>>> On 8 July 2015 at 09:44, bruno binet <[email protected]> wrote:
>>>>>>
>>>>>>> Sure, I will try your branch and report possible new compilation
>>>>>>> issues in github.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Bruno
>>>>>>>
>>>>>>> On 7 July 2015 at 18:26, Michael Trinkala <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I changed the checkpoint id to an unsigned long long. Can you test
>>>>>>>> out the branch and add any other compilation errors to the issue
>>>>>>>> (closing out this email thread)?  I am also taking
>>>>>>>> suggestions/recommendations for a CI build system that supports
>>>>>>>> multiple platforms.  TravisCI adds almost no value since I am already
>>>>>>>> building on a Debian based box.
>>>>>>>>
>>>>>>>> https://github.com/trink/hindsight/tree/issue_1
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Trink
>>>>>>>>
>>>>>>>> On Tue, Jul 7, 2015 at 8:21 AM, bruno binet <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Ok, thanks.
>>>>>>>>> And sorry, but I don't have a patch (don't know how to fix this
>>>>>>>>> kind of compilation issue).
>>>>>>>>>
>>>>>>>>> On 7 July 2015 at 16:17, Michael Trinkala <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Yeah, I have only been building on Ubuntu and haven't done any
>>>>>>>>>> cross platform clean-up.  Thanks for the build output; I will fix
>>>>>>>>>> those errors (unless you already have a patch).
>>>>>>>>>>
>>>>>>>>>> Trink
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 7, 2015 at 5:57 AM, bruno binet <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I now have some time to do a few tests with Hindsight, so I
>>>>>>>>>>> tried to compile it on our targeted arm platform (raspberry pi),
>>>>>>>>>>> but I get the following error:
>>>>>>>>>>>
>>>>>>>>>>> root@hl-mc-9999-dev:~/hindsight/release# cmake -DCMAKE_BUILD_TYPE=release ..
>>>>>>>>>>> -- The C compiler identification is GNU 4.7.2
>>>>>>>>>>> -- The CXX compiler identification is GNU 4.7.2
>>>>>>>>>>> -- Check for working C compiler: /usr/bin/gcc
>>>>>>>>>>> -- Check for working C compiler: /usr/bin/gcc -- works
>>>>>>>>>>> -- Detecting C compiler ABI info
>>>>>>>>>>> -- Detecting C compiler ABI info - done
>>>>>>>>>>> -- Detecting C compile features
>>>>>>>>>>> -- Detecting C compile features - done
>>>>>>>>>>> -- Check for working CXX compiler: /usr/bin/g++
>>>>>>>>>>> -- Check for working CXX compiler: /usr/bin/g++ -- works
>>>>>>>>>>> -- Detecting CXX compiler ABI info
>>>>>>>>>>> -- Detecting CXX compiler ABI info - done
>>>>>>>>>>> -- Detecting CXX compile features
>>>>>>>>>>> -- Detecting CXX compile features - done
>>>>>>>>>>> -- Found LUASANDBOX: /usr/local/lib/libluasandbox.so
>>>>>>>>>>> -- Configuring done
>>>>>>>>>>> -- Generating done
>>>>>>>>>>> -- Build files have been written to: /root/hindsight/release
>>>>>>>>>>>
>>>>>>>>>>> root@hl-mc-9999-dev:~/hindsight/release# make
>>>>>>>>>>> Scanning dependencies of target hindsight
>>>>>>>>>>> [  2%] Building C object src/CMakeFiles/hindsight.dir/hindsight.c.o
>>>>>>>>>>> [  4%] Building C object src/CMakeFiles/hindsight.dir/hs_analysis_plugins.c.o
>>>>>>>>>>> [  6%] Building C object src/CMakeFiles/hindsight.dir/hs_checkpoint_reader.c.o
>>>>>>>>>>> /root/hindsight/src/hs_checkpoint_reader.c: In function 'find_first_id':
>>>>>>>>>>> /root/hindsight/src/hs_checkpoint_reader.c:46:3: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
>>>>>>>>>>> /root/hindsight/src/hs_checkpoint_reader.c:55:3: error: comparison is always false due to limited range of data type [-Werror=type-limits]
>>>>>>>>>>> cc1: all warnings being treated as errors
>>>>>>>>>>> src/CMakeFiles/hindsight.dir/build.make:100: recipe for target 'src/CMakeFiles/hindsight.dir/hs_checkpoint_reader.c.o' failed
>>>>>>>>>>> make[2]: *** [src/CMakeFiles/hindsight.dir/hs_checkpoint_reader.c.o] Error 1
>>>>>>>>>>> CMakeFiles/Makefile2:947: recipe for target 'src/CMakeFiles/hindsight.dir/all' failed
>>>>>>>>>>> make[1]: *** [src/CMakeFiles/hindsight.dir/all] Error 2
>>>>>>>>>>> Makefile:146: recipe for target 'all' failed
>>>>>>>>>>> make: *** [all] Error 2
>>>>>>>>>>>
>>>>>>>>>>> Do you know what is going on here? I guess this is an issue with
>>>>>>>>>>> the arm platform only?
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Bruno
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 10 June 2015 at 18:41, bruno binet <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks a lot for your answers.
>>>>>>>>>>>>
>>>>>>>>>>>> And yes, I'm very interested in bootstrapping a first prototype
>>>>>>>>>>>> of my own data pipeline based on Hindsight so that I can compare
>>>>>>>>>>>> the performance on a raspberry pi.
>>>>>>>>>>>> (here is the current state of our Heka-based data pipeline:
>>>>>>>>>>>> https://bitbucket.org/helioslite/heka-hl-sandboxes)
>>>>>>>>>>>> So it would be great if you can give me the first instructions
>>>>>>>>>>>> on how to build and setup Hindsight.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>> Bruno
>>>>>>>>>>>>
>>>>>>>>>>>> On 10 June 2015 at 18:18, Michael Trinkala <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> - It is usable and being actively developed with the intent to
>>>>>>>>>>>>> move it into production later this year.
>>>>>>>>>>>>> - We are currently running production data through it for
>>>>>>>>>>>>> testing but it is not deployed in an official capacity.  It has
>>>>>>>>>>>>> been very stable, but until a more robust set of tests has been
>>>>>>>>>>>>> built out I will not consider it production ready.
>>>>>>>>>>>>> - Yes, it can decode/encode Heka protobuf format
>>>>>>>>>>>>> - Yes, the router/message matcher is complete.  The only
>>>>>>>>>>>>> difference is that it supports Lua string pattern matching
>>>>>>>>>>>>> instead of re2 regexp (Heka 'Hostname =~ /^foo/' vs Hindsight
>>>>>>>>>>>>> 'Hostname =~ "^foo"')
>>>>>>>>>>>>> - Yes, but you would need a lua-socket input and output
>>>>>>>>>>>>> sandbox (see benchmarks/hsr_run for related examples)
>>>>>>>>>>>>> - No documentation yet, only examples in the benchmarks
>>>>>>>>>>>>> directory.  I could have you bootstrapped in about 30 minutes (and
>>>>>>>>>>>>> hopefully turn that into a getting started guide) if you are
>>>>>>>>>>>>> interested.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Implementation wise the only missing piece is support for
>>>>>>>>>>>>> dynamically loading plugins.  The actual code to accomplish it is
>>>>>>>>>>>>> very small (just detecting files in the load directory and moving
>>>>>>>>>>>>> them to the run directory) but ideally it would be fronted by a
>>>>>>>>>>>>> web server and a GUI with access control and validation (a much
>>>>>>>>>>>>> larger effort and actually a separate project).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Trink
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jun 10, 2015 at 8:15 AM, bruno binet <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I recently discovered the work pushed into the Hindsight
>>>>>>>>>>>>>> repository (https://github.com/trink/hindsight) which seems
>>>>>>>>>>>>>> to be a lightweight alternative to Heka, based on the lua sandbox.
>>>>>>>>>>>>>> The Hindsight vs Heka benchmarks are quite impressive.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm currently running Heka on a raspberry pi (not so powerful)
>>>>>>>>>>>>>> device and the load average quickly increases and exceeds 1 when
>>>>>>>>>>>>>> Heka is ingesting data, so Hindsight could be a good fit for us
>>>>>>>>>>>>>> if it can perform better than Heka in terms of CPU cycles.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What is the current status of Hindsight? Is it just a
>>>>>>>>>>>>>> temporary experiment or will it be maintained and actually used
>>>>>>>>>>>>>> in production?
>>>>>>>>>>>>>> Is it currently usable and stable?
>>>>>>>>>>>>>> Is Hindsight able to decode and encode Heka protobuf format?
>>>>>>>>>>>>>> Does Hindsight have a complete router implementation to
>>>>>>>>>>>>>> dispatch messages to sandboxes like in Heka?
>>>>>>>>>>>>>> My use case is basically to read raw text data from a UDP
>>>>>>>>>>>>>> socket, parse text data with lua patterns or lpeg, process data
>>>>>>>>>>>>>> through a few lua sandbox filters, then write output messages
>>>>>>>>>>>>>> both to a file (protobuf heka format) and an HTTP server (json
>>>>>>>>>>>>>> format): can this be easily accomplished with Hindsight?
>>>>>>>>>>>>>> Is there any documentation somewhere to get started with
>>>>>>>>>>>>>> Hindsight?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Bruno
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Heka mailing list
>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>> https://mail.mozilla.org/listinfo/heka
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>