That makes sense, thanks Rob!

On Thu, Feb 5, 2015 at 6:06 PM, Rob Miller <[email protected]> wrote:
> On 02/05/2015 04:39 AM, Victor Castell wrote:
>
>> Yeah, I know it's different, because my input is LogStreamer and the
>> example in the docs was for receiving protobuf over TCP.
>>
>> I want to understand.
>>
>> When my LogStreamer reads a message, it passes the decoder a Protocol
>> Buffer message with the log line in the message payload, is that right?
>>
> Nope, this is the misunderstanding. When Logstreamer reads a text file, it
> passes to the decoder an instantiated Message struct with the log line in
> the message payload. Protocol buffers aren't involved at all. The only time
> it makes sense to use a ProtobufDecoder with a LogstreamerInput is if the
> file(s) you're loading contain binary, protobuf-encoded Heka messages, such
> as those generated by Heka itself using a FileOutput with a
> ProtobufEncoder. This is a valid use case; in fact, at Mozilla we do this
> often. Heka even ships with a command line utility called `heka-cat`
> (http://hekad.readthedocs.org/en/dev/developing/testing.html#heka-cat),
> which lets you browse and query the contents of such files.
>
> If the files you're loading are plain text log files, however, a
> ProtobufDecoder will have no idea what to do with them. It will fail on
> every message. And it will slow things down considerably.
>
>> Using the following config as my input decoder (this is what I actually
>> tried):
>>
>> [syslog-decoder]
>> subs = ['nginx-access-decoder', 'ProtobufDecoder']
>> cascade_strategy = "first-wins"
>> log_sub_errors = true
>>
>> [ProtobufDecoder]
>>
>> This should capture my nginx log lines, remove them from the decoding
>> "cascade", and pass all the rest to the ProtobufDecoder, which in turn
>> does nothing.
>>
>> Is this correct?
>>
> The first part is correct: any successfully parsed nginx log lines won't
> make it through to the ProtobufDecoder. But any messages that fail the
> nginx parser will be given to the ProtobufDecoder, which will have no idea
> what to do with them.
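[For reference, the legitimate ProtobufDecoder use case Rob describes — Heka archiving its own messages in protobuf form and reading them back — could be sketched roughly like this. The section names and file paths are illustrative assumptions, not from the thread; only the plugin types and option names come from the Heka documentation.]

```toml
# Writing side: archive all messages as protobuf-encoded Heka messages.
[ProtobufEncoder]

[archive-output]
type = "FileOutput"
message_matcher = "TRUE"
path = "/var/log/heka/archive.log"   # assumed path
encoder = "ProtobufEncoder"

# Reading side: only here does ProtobufDecoder make sense with a
# LogstreamerInput, because the file really contains protobuf messages.
[archive-input]
type = "LogstreamerInput"
log_directory = "/var/log/heka"      # assumed
file_match = 'archive\.log'
decoder = "ProtobufDecoder"

[ProtobufDecoder]
```

Such archive files are also what `heka-cat` (linked above) is built to browse.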
>
>> And if it is, why is this so slow?
>>
> See above. :)
>
> -r
>
>
>> On Wed, Feb 4, 2015 at 8:08 PM, Rob Miller <[email protected]> wrote:
>>
>>     The config that you cargo-culted from the docs is meant for an
>>     entirely different use case. That's meant to handle cases where
>>     you're receiving protocol buffer encoded Heka messages, each of
>>     which contains an Nginx access log line as the message payload. This
>>     would be useful in a case where one Heka is loading the log files
>>     but, instead of parsing them, it's sending them along in protobuf
>>     format to another Heka that's doing the parsing. The config below
>>     would be used on the listener.
>>
>>     If you want to see the decoding errors, all you need to do is change
>>     your `log_sub_errors` setting from false to true.
>>
>>     -r
>>
>>
>>     On 02/04/2015 04:19 AM, Victor Castell wrote:
>>
>>         I managed to get a working config, but I want to understand
>>         what's going on:
>>
>>             [syslog-decoder]
>>             type = "MultiDecoder"
>>             subs = ['nginx-access-decoder', 'rsyslog-decoder']
>>             cascade_strategy = "first-wins"
>>             log_sub_errors = false
>>
>>         In the nginx-access decoder I'm decoding the corresponding
>>         access.log entries from my rsyslog, and in the rsyslog decoder
>>         I'm capturing any other rsyslog entries and discarding them.
>>
>>         That works well, but in my first attempt I tried the config
>>         extracted from the documentation:
>>
>>             [shipped-nginx-decoder]
>>             type = "MultiDecoder"
>>             subs = ['ProtobufDecoder', 'nginx-access-decoder']
>>             cascade_strategy = "all"
>>             log_sub_errors = true
>>
>>             [ProtobufDecoder]
>>
>>         I would rather use this config than the previous one, because it
>>         can log the errors of my nginx decoding.
>>
>>         The problem is that when using the ProtobufDecoder the decoding
>>         is really slow, and my nginx logs don't keep up with the current
>>         events; it's always behind the current time.
>>         This doesn't happen with the rsyslog-decoder config; it parses
>>         the logs really fast.
>>
>>         I thought it would be much faster using the internal
>>         ProtobufDecoder than a Lua one, but that's not the case.
>>
>>         What's the reason for this?
>>
>>
>>         On Fri, Jan 30, 2015 at 11:31 AM, Victor Castell
>>         <[email protected]> wrote:
>>
>>             Didn't know of that! Life saver.
>>
>>             Thanks!
>>
>>             On 30/1/2015 11:17, Krzysztof Krzyżaniak
>>             <[email protected]> wrote:
>>
>>                 On Fri, Jan 30, 2015 at 10:34, Victor Castell
>>                 <[email protected]> wrote:
>>                 > Hi!
>>                 >
>>                 > I have a centralized rsyslog-formatted logfile and I'm
>>                 > extracting nginx logs from there using Heka and the
>>                 > nginx access log decoder.
>>                 >
>>                 > The problem is that the parser also logs every other
>>                 > log message out to heka.log.
>>                 >
>>                 > The volume of non-nginx logs mixed into my rsyslog log
>>                 > is really huge, so the heka.log file is growing like
>>                 > crazy (I have log rotation, before you ask).
>>                 >
>>                 > Is there a way to conditionally/intentionally suppress
>>                 > the parsing errors of a given decoder?
>>
>>                 You probably want to use a MultiDecoder which splits
>>                 nginx logs from the rest, and use log_sub_errors = false
>>                 in the MultiDecoder section.
>>
>>                 eloy
>>
>>
>>         --
>>         V
>>
>>         _______________________________________________
>>         Heka mailing list
>>         [email protected]
>>         https://mail.mozilla.org/listinfo/heka
>>
>>
>> --
>> V
>
>
--
V
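[The setup the thread converges on — a first-wins MultiDecoder that parses nginx lines and silently discards the rest — could be sketched roughly as below. The decoder and input section names, paths, log_format, and rsyslog template are assumptions for illustration; the Lua decoders referenced are the stock ones shipped with Heka, and their values must match the actual nginx and rsyslog configuration.]

```toml
[nginx-access-decoder]
type = "SandboxDecoder"
filename = "lua_decoders/nginx_access.lua"

    [nginx-access-decoder.config]
    # Must match the access_log format directive in nginx.conf (assumed here).
    log_format = '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent'
    type = "nginx.access"

[rsyslog-decoder]
type = "SandboxDecoder"
filename = "lua_decoders/rsyslog.lua"

    [rsyslog-decoder.config]
    # Must match the template rsyslog writes with (assumed here).
    template = '%TIMESTAMP% %HOSTNAME% %syslogtag%%msg:::sp-if-no-1st-sp%%msg:::drop-last-lf%\n'

[syslog-decoder]
type = "MultiDecoder"
subs = ['nginx-access-decoder', 'rsyslog-decoder']
cascade_strategy = "first-wins"
# Keeps heka.log from filling up with errors for every non-nginx line.
log_sub_errors = false

[syslog-input]
type = "LogstreamerInput"
log_directory = "/var/log"     # assumed
file_match = 'syslog\.log'     # assumed
decoder = "syslog-decoder"
```

With "first-wins", nginx lines stop at the first decoder and everything else falls through to the rsyslog catch-all, so no ProtobufDecoder is involved anywhere in the plain-text path.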

