On Fri, 11 Feb 2011, Rainer Gerhards wrote:
-----Original Message-----
From: [email protected] [mailto:rsyslog-
[email protected]] On Behalf Of [email protected]
On Fri, 11 Feb 2011, Rainer Gerhards wrote:
Have a look at ./runtime/parser.c, function SanitizeMsg. It builds a
new buffer and uses MsgSetRawMsg to set the new buffer. MsgSetRawMsg
handles the "dirty" internals of message object buffer manipulation.
Note that it may be quicker to manipulate the buffer pointers
yourself, but then you must be very careful. MsgSetRawMsg should
provide the necessary hints. The thing to keep in mind is that up
to a certain message length, a buffer inside the msg object itself
is used (thus saving one malloc/free call), whereas for larger
messages, memory is allocated. You need to keep that straight during
manipulation.
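To make the two cases concrete, here is a minimal sketch of the "small buffer inside the object, malloc for larger messages" pattern. It is not rsyslog's actual code: demo_msg_t, DEMO_INLINE_SIZE and the demo_* functions are hypothetical names invented for illustration, and the real msg object and MsgSetRawMsg differ in detail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_INLINE_SIZE 32   /* hypothetical threshold, not rsyslog's value */

/* Hypothetical stand-in for the message object; NOT rsyslog's msg struct. */
typedef struct {
    char   inlineBuf[DEMO_INLINE_SIZE]; /* holds short messages, no malloc */
    char  *raw;                         /* points at inlineBuf or heap memory */
    size_t len;                         /* message length, excluding '\0' */
} demo_msg_t;

void demo_init(demo_msg_t *m)
{
    m->inlineBuf[0] = '\0';
    m->raw = m->inlineBuf;
    m->len = 0;
}

/* Nonzero if the raw message currently lives in heap memory. */
int demo_uses_heap(const demo_msg_t *m)
{
    return m->raw != m->inlineBuf;
}

/* Replace the raw message with a copy of buf[0..len).
 * Returns 0 on success, -1 if memory could not be allocated. */
int demo_set_raw(demo_msg_t *m, const char *buf, size_t len)
{
    char *dst;

    assert(m != NULL && buf != NULL);
    if (len < DEMO_INLINE_SIZE) {
        dst = m->inlineBuf;            /* short message: reuse inline buffer */
    } else {
        dst = malloc(len + 1);         /* long message: allocate */
        if (dst == NULL)
            return -1;
    }
    memmove(dst, buf, len);            /* memmove: buf may alias inlineBuf */
    dst[len] = '\0';
    if (demo_uses_heap(m))
        free(m->raw);                  /* release the old heap buffer, if any */
    m->raw = dst;
    m->len = len;
    return 0;
}

void demo_destroy(demo_msg_t *m)
{
    if (demo_uses_heap(m))
        free(m->raw);
    demo_init(m);
}
```

The point to notice is that every place that frees or replaces the buffer has to check which of the two cases applies; manipulating the pointers by hand means repeating that check correctly each time.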
I'll look at it and see how hard it is to separate these two cases.
thanks for the pointer here.
Just let me add that I found it of questionable value to try to avoid the
malloc here. At least in the sanitization case, this would have resulted
in very complex code. And while saving memory writes and calls to the malloc
subsystem is useful, I thought that it would not have brought much benefit in
that case. Depending on what you intend to do (well-defined insert at a late
point) things may be different, though.
My initial thought is something along the following lines:
1. find out how much space is available in whatever buffer the message is
in (potentially 0 if the buffer is exactly the right size), and
document what needs to happen to adjust how much of the buffer is used
(I've already figured out some of this with the existing parser modules)
2. if there is not enough space, document what the process is to allocate
a new buffer and make the system use it.
at this point it should be fairly straightforward to write a routine to do
something along the lines of 'make sure I have enough space in the buffer
to add X characters' and have it either return immediately if there's
enough space or allocate the larger buffer if needed and return after
doing that.
there will be some things that will need to be documented as side effects
(pointers into the existing message may be invalid at that point,
including values in the msg structure)
this could be mis-used (running this routine for every control character
found could result in many malloc/free pairs for example), and so examples
will need to be given of doing a 2-pass routine, pass 1 to figure out what
you want to do, and then make sure there's enough space and do pass 2 to
modify the buffer as needed.
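A rough sketch of such a 'make sure I have enough space' routine might look like the following. Everything here is an assumption made for illustration: sketch_msg_t, SKETCH_INLINE_CAP, sketch_init and sketch_ensure_space are hypothetical names, and the real msg structure tracks its buffers differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_INLINE_CAP 32   /* hypothetical inline buffer size */

/* Hypothetical message buffer; NOT rsyslog's msg structure. */
typedef struct {
    char   inlineBuf[SKETCH_INLINE_CAP];
    char  *buf;   /* points at inlineBuf or at heap memory */
    size_t len;   /* bytes in use, excluding the trailing '\0' */
    size_t cap;   /* total bytes available in buf */
} sketch_msg_t;

/* Start with a short message stored in the inline buffer. */
void sketch_init(sketch_msg_t *m, const char *text)
{
    size_t n = strlen(text);

    assert(n < SKETCH_INLINE_CAP);
    memcpy(m->inlineBuf, text, n + 1);
    m->buf = m->inlineBuf;
    m->len = n;
    m->cap = SKETCH_INLINE_CAP;
}

/* Make sure `extra` more bytes (plus the '\0') fit into the buffer.
 * Returns 0 on success, -1 if memory could not be allocated.
 * Side effect: m->buf may move, so any pointer into the old buffer
 * must be treated as invalid after a successful call. */
int sketch_ensure_space(sketch_msg_t *m, size_t extra)
{
    size_t need = m->len + extra + 1;
    char *p;

    if (need <= m->cap)
        return 0;                        /* enough room: return immediately */
    if (m->buf == m->inlineBuf) {
        p = malloc(need);                /* move from inline buffer to heap */
        if (p == NULL)
            return -1;
        memcpy(p, m->buf, m->len + 1);
    } else {
        p = realloc(m->buf, need);       /* grow the existing heap buffer */
        if (p == NULL)
            return -1;
    }
    m->buf = p;
    m->cap = need;
    return 0;
}
```

Calling this once with the total computed in pass 1 costs at most one allocation per message, whereas calling it once per control character could reallocate many times.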
Using this for sanitizing would still be slightly less efficient than the
approach you probably use now (allocate a new buffer, copy things into it
as you go to construct a new message, then set the message into the
structure), but probably not by more than two copies of the text. As a
result, the code may end up enough cleaner to be worth the
cost. I'm thinking that the new routine would be to copy the text from the
old buffer to the new one, then copy everything after your first insert to
the end of the buffer. after that you are copying data from late in the
buffer to earlier in the buffer, which may even be faster than copying
small amounts of data from one buffer to another as it may result in
better cache behavior.
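As an illustration of the 2-pass idea applied to sanitizing, here is a small sketch that expands a buffer in place. Note that it is a variant rather than the exact sequence described above: instead of first moving the tail to the end of the buffer and then copying forward, it walks from back to front, which gives the same "never overwrite unread data" property. The function names and the escape format ('#' followed by three octal digits) are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Pass 1: count the bytes that need escaping (control characters). */
size_t count_ctrl(const char *buf, size_t len)
{
    size_t n = 0, i;

    for (i = 0; i < len; i++)
        if ((unsigned char)buf[i] < 32)
            n++;
    return n;
}

/* Pass 2: replace each control character with '#' plus three octal
 * digits, expanding the text in place. `cap` is the total size of buf
 * and must already be large enough for the result plus the '\0'.
 * Returns the new length. */
size_t escape_in_place(char *buf, size_t len, size_t cap)
{
    size_t newLen = len + 3 * count_ctrl(buf, len);
    size_t src = len;      /* one past the next byte to read */
    size_t dst = newLen;   /* one past the next byte to write */

    assert(newLen + 1 <= cap);
    buf[newLen] = '\0';
    /* Walk backwards: dst never drops below src, so a write can only
     * land on a byte that has already been read. */
    while (src > 0) {
        unsigned char c = (unsigned char)buf[--src];
        if (c < 32) {
            buf[--dst] = (char)('0' + (c & 7));
            buf[--dst] = (char)('0' + ((c >> 3) & 7));
            buf[--dst] = (char)('0' + ((c >> 6) & 7));
            buf[--dst] = '#';
        } else {
            buf[--dst] = (char)c;
        }
    }
    return newLen;
}
```

With a 32-byte buffer holding "a\tb\n" (length 4), the result is "a#011b#012" with length 10. The caller is expected to have made sure the buffer is large enough between pass 1 and pass 2.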
in fact, this pattern is probably common enough to make it a routine
itself
something like
int InsertIntoRawMsg(int offset, int count)
inserts at least count spaces into the message at position offset from
the beginning of the message, returns the number of spaces actually
inserted (may be more than the number requested)
or would it be better to return the number of extra characters available
in the buffer after the end of the string?
I figure error checking on the return is not needed because if it can't
allocate the space we need to bail out (with whatever rsyslog does when it
runs out of memory, probably aborting the message entirely)
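To show the shape of such a routine, here is a hedged sketch. The rawbuf_t type and the name insert_into_raw are hypothetical, the message is passed explicitly (the signature above leaves that out), and only a heap buffer is handled to keep it short. Unlike the suggestion above, the sketch still reports failure with a negative return so the caller can decide how to bail out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap-only message buffer; NOT rsyslog's msg structure. */
typedef struct {
    char  *buf;   /* heap-allocated, NUL-terminated */
    size_t len;   /* bytes in use, excluding the '\0' */
    size_t cap;   /* bytes allocated */
} rawbuf_t;

/* Open a gap of `count` bytes at `offset`, filled with spaces.
 * Returns the number of bytes inserted, or -1 on a bad offset or
 * allocation failure. m->buf may move, so pointers into the old
 * buffer are invalid after a successful call. */
int insert_into_raw(rawbuf_t *m, size_t offset, size_t count)
{
    size_t need;

    assert(m != NULL && m->buf != NULL);
    if (offset > m->len)
        return -1;
    need = m->len + count + 1;
    if (need > m->cap) {
        char *p = realloc(m->buf, need);
        if (p == NULL)
            return -1;
        m->buf = p;
        m->cap = need;
    }
    /* Shift the tail (including the '\0') up by count bytes. */
    memmove(m->buf + offset + count, m->buf + offset, m->len - offset + 1);
    memset(m->buf + offset, ' ', count);
    m->len += count;
    return (int)count;
}
```

A caller would then overwrite the gap with the data it wants to insert, for example with memcpy(m.buf + offset, "TAG:", 4) after asking for 4 bytes.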
David Lang
Rainer
As a side-note, it would probably be useful if you could take some
bullet points on how to modify things, so that others can find that
information in the case they want to do that themselves. Could go to
the wiki or I could include it in the doc set. Just a suggestion,
though...
I'll see what I can do.
David Lang
Rainer
-----Original Message-----
From: [email protected] [mailto:rsyslog-
[email protected]] On Behalf Of [email protected]
Sent: Friday, February 11, 2011 5:38 AM
To: rsyslog-users
Subject: [rsyslog] how can a parser insert data into a message
the various parser modules that I've submitted are all removing data from
the log message or overwriting the data in place.
But I've now run across a situation where I need to insert information
into the message. I know that this can be done because the sanitizing call
does exactly this. I am assuming that this is doing something like
allocating a new string and copying the data into the new string.
the concern is how to do this in a way that will survive the exit from the
module, not confuse any of the many pointers or sizes that are involved,
and make sure everything is properly freed afterwards.
should I just search for the sanitizing routine and copy what it does (and
can you point me at it?), or do you want me to wait until you have time to
write something up on this?
David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com