Forgot to mention: in the sample code you included, your problem will go away
if, instead of iterating through the fields returned from the grammar, you just
set the msg.Fields value every time:


function process_message()
  local data = read_message("Payload")
  msg.Fields = grammar:match(data)
  inject_message(msg)
  return 0
end
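
To see why this works, here is a minimal standalone sketch of the leak and
the fix. It is meant to run outside Heka, so fake_match is a hypothetical
stand-in for grammar:match, and the functions return the Fields table instead
of calling inject_message. None of these names are part of the sandbox API.

```lua
-- Standalone illustration; fake_match, process_leaky and process_fixed are
-- hypothetical stand-ins, not part of the Heka sandbox API.
local function fake_match(data)
  -- Pretend grammar: collect "key=value" pairs into a fresh table.
  local fields = {}
  for k, v in string.gmatch(data, "(%w+)=(%w+)") do
    fields[k] = v
  end
  return fields
end

-- Defined once, outside the processing functions, as in the stock decoders.
local msg = { Type = "demo", Fields = {} }

-- Leaky version: copies new keys into the old table and never removes
-- keys that are absent from the current message.
local function process_leaky(data)
  for k, v in pairs(fake_match(data)) do
    msg.Fields[k] = v
  end
  return msg.Fields
end

-- Fixed version: replaces the whole Fields table on every call.
local function process_fixed(data)
  msg.Fields = fake_match(data)
  return msg.Fields
end

process_leaky("user=alice status=200")
local leaky = process_leaky("user=bob")
-- leaky.status is still "200", left over from the first message.

msg.Fields = {}
process_fixed("user=alice status=200")
local fixed = process_fixed("user=bob")
-- fixed.status is nil, because the old table was discarded.
```

Assigning msg.Fields wholesale means nothing from the previous call can
survive, while the outer msg table is still reused.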


-r

On 11/02/2015 10:03 AM, Rob Miller wrote:
If you're not careful to zero out the values, or to explicitly set each
value every time, then yes, you'll end up leaking data from one
process_message call to the next.

Even so, however, it's often a good idea to define the msg table outside
of the process_message call because then the same block of memory will
be reused each time. If you define the table inside of process_message,
then a new chunk of memory will be allocated with every call, which will
go out of scope when the call exits. This will cause a great deal of
garbage collection churn, likely impacting performance greatly.

So, yes, you should be careful to not let values leak through, but it's
generally worth taking the extra care.
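
One way to take that extra care, sketched here on the assumption that you
want to keep reusing both the msg table and its Fields table, is to empty
Fields in place at the top of each call. clear_table and handle are
hypothetical names, not Heka functions, and the parsed argument stands in for
whatever your grammar returns.

```lua
-- Hypothetical helper: remove every key from t without allocating a new table.
-- Lua allows assigning nil to existing fields while traversing with pairs.
local function clear_table(t)
  for k in pairs(t) do
    t[k] = nil
  end
end

-- Allocated once and reused across calls.
local msg = { Type = "demo", Payload = nil, Fields = {} }
local fields_ref = msg.Fields  -- kept only to show the table is reused

-- Stand-in for process_message: reset per-message state, then fill it in.
local function handle(parsed)
  clear_table(msg.Fields)
  msg.Payload = nil
  for k, v in pairs(parsed) do
    msg.Fields[k] = v
  end
  return msg
end

handle({ user = "alice", status = "200" })
local out = handle({ user = "bob" })
-- out.Fields now holds only user = "bob", in the same table as before.
```

Whether this is worth it over simply assigning a new Fields table depends on
how many fields you set per message; it is an option, not a requirement.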

-r


On 11/01/2015 06:10 AM, Timur Batyrshin wrote:
> Hi,
>
> In many stock decoders I see a construct like this:
>
> local msg = {
>     Timestamp = nil,
>     EnvVersion = nil,
>     Hostname = nil,
>     Type = msg_type,
>     Payload = nil,
>     Fields = nil,
>     Severity = nil
>     }
>
>     function process_message()
>
>
> Here the local variable is defined outside of the main functions.
>
> From the docs here
> (http://hekad.readthedocs.org/en/v0.10.0b1/sandbox/index.html#how-to-create-a-simple-sandbox-filter)
>
> I understand that this variable is initialized once at Heka start and
> after that it is reused.
> This would mean that previous decodes could affect the subsequent
> decodes.
>
> Does this sound like a bug, or am I missing something?
>
> I'm asking because I'm using a similar approach in my code
> and I've seen old data leaking into new messages (some irrelevant
> parts are omitted):
>
> local msg = {
>    Type = msg_type,
>    Payload = nil,
>    Hostname = read_config("Hostname"),
>    Fields = {},
> }
>
> function process_message()
>    local data = read_message("Payload")
>    fields = grammar:match(data)
>    for k,v in pairs(fields) do
>        msg.Fields[k] = v
>    end
>
>    inject_message(msg)
>    return 0
> end
>
> In this case the fields set in the first message appeared in the
> subsequent message.
> After I moved the local msg = {} definition inside process_message(),
> everything seemed to work fine.
>
> I'm writing about this here because this behaviour could be subtly
> affecting many other decoders in Heka.
>
>
> Thanks and regards,
> Timur
>
>
> _______________________________________________
> Heka mailing list
> [email protected]
> https://mail.mozilla.org/listinfo/heka
>

