Hi all,
As anyone who's asked about text parsing in the #heka IRC channel will
tell you, we strongly recommend using Lua Parsing Expression Grammars
(LPEG) in a SandboxDecoder to handle your parsing needs, rather
than regular expressions and the PayloadRegexDecoder. With good reason:
LPEG is more expressive, more flexible and composable, and considerably
faster.
Unfortunately, familiarity with writing and using regular expressions is
much more widespread than familiarity with parsing expression grammars,
and many find getting started with LPEG daunting. To help
bridge this gap, our resident LPEG expert Trink has written up a
detailed example of his process of converting a working
PayloadRegexDecoder / MultiDecoder setup to a functionally equivalent
one using a SandboxDecoder and LPEG:
https://github.com/mozilla-services/heka/wiki/How-to-covert-a-PayloadRegex-MultiDecoder-to-a-SandboxDecoder-using-an-LPeg-Grammar
Hopefully this will encourage some of you who may be using the
PayloadRegexDecoder to experiment with a similar process of your own,
using this page as a guide.
Good luck, and happy Heka-ing!
-r
p.s. It's worth mentioning that this is only relevant if you need to
parse a text format that Heka doesn't already support. If you're dealing
w/ log files that are being generated by nginx, apache, rsyslog, or
(alpha quality) haproxy, there are existing SandboxDecoders to use, so
you won't need to write your own.
_______________________________________________
Heka mailing list
https://mail.mozilla.org/listinfo/heka