On 05/18/2015 02:10 AM, Kai Storbeck wrote:
Hello Heka,

I'm currently streaming a logfile containing large XML messages. They
are separated by a long line with dashes, so I'm making use of
RegexSplitter containing those dashes.

This works, the messages are getting thrown over to elasticsearch for
indexing.


Restarting heka will now give me an error:

> 2015/05/18 10:29:56 Decoder 'b2bsoap-b2bdecoder-1' error: No match: ..
> .....
> .....
> ...
> </closing xml tag>

I perceive that this is a problem in the bookkeeping of the seek
position, as that points to the middle of a multiline record.

Yes, I think that's correct.

Can I assist in curing this? Is it curable? Is it a good starting point
to help improve heka? Or are there smaller outstanding issues to
assist with...

Sure, your help resolving this would be welcome. I took a quick peek and I
think that the issue is related to the following code:

https://github.com/mozilla-services/heka/blob/dev/plugins/logstreamer/logstreamer_input.go#L359

That's the LogstreamInput (a pool of which are managed by each LogstreamerInput) 
telling the underlying stream to update the ring buffer with the latest read 
position. You'll notice that it's happening there whenever n > 0, i.e. whenever 
any data is successfully read from the input stream. What you're asking for is to 
instead only update the read position if len(record) > 0, which implies that a 
full record was retrieved.
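
To make the difference concrete, here is a minimal, self-contained sketch of
the two bookkeeping strategies. It is not Heka code: the Tracker type, its
fields, and the OnRecordBoundary toggle are all hypothetical names made up for
illustration, and it splits on a literal delimiter rather than a regex. It only
shows why advancing the saved offset on every successful read can leave it in
the middle of a record, while advancing it only past complete records cannot.

```go
package main

import (
	"bytes"
	"fmt"
)

// Tracker is a hypothetical stand-in for the stream's position bookkeeping.
// None of these names come from Heka itself.
type Tracker struct {
	Delim            []byte // record delimiter (a literal here, for simplicity)
	OnRecordBoundary bool   // hypothetical toggle: save only at record boundaries
	Saved            int64  // offset that would be persisted for the next restart
	buf              []byte // bytes read so far that do not yet form a full record
}

// Feed takes one chunk as returned by a read and yields the complete
// records found so far, updating Saved according to the chosen strategy.
func (t *Tracker) Feed(chunk []byte) [][]byte {
	n := len(chunk)
	t.buf = append(t.buf, chunk...)

	var records [][]byte
	consumed := 0
	for {
		i := bytes.Index(t.buf, t.Delim)
		if i < 0 {
			break // no complete record left in the buffer
		}
		end := i + len(t.Delim)
		// Copy the record so later buffer changes cannot alter it.
		records = append(records, append([]byte(nil), t.buf[:i]...))
		t.buf = t.buf[end:]
		consumed += end
	}

	if t.OnRecordBoundary {
		// Advance only past bytes that belong to complete records.
		t.Saved += int64(consumed)
	} else if n > 0 {
		// Advance on any successful read, even mid-record.
		t.Saved += int64(n)
	}
	return records
}

func main() {
	delim := []byte("\n---\n")
	naive := &Tracker{Delim: delim}
	safe := &Tracker{Delim: delim, OnRecordBoundary: true}

	// Two reads; the second one ends in the middle of the second record.
	for _, c := range []string{"<a>1</a>", "\n---\n<b>"} {
		naive.Feed([]byte(c))
		safe.Feed([]byte(c))
	}
	fmt.Println("saved offset, every read:      ", naive.Saved)
	fmt.Println("saved offset, record boundary: ", safe.Saved)
}
```

With these two reads the first strategy saves offset 16, which is just after
the opening `<b>` of the second record, while the second saves 13, the start of
that record. A restart from 16 would hand the decoder a truncated message,
which matches the "No match" symptom above.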

You'll want to test this out, though, rather than take my word for it. There's
a lot of code in there, and it may be that even if you change that code, the
location will still get flushed to disk at shutdown. Hopefully this is a good
starting point.

If you do tackle this, I think it would be nice to retain backwards 
compatibility by turning the new behavior on with a config flag, say if the 
user sets `update_cursor_on_record_boundary` to true, or something.

-r


Regards,
Kai



_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka

