On 11-07-11 03:29 PM, [email protected] wrote:
>
> we do have some logic in the imfile to allow it to combine multiple line
> logs into one log entry,
Interesting.
> it has a problem right now that it doesn't
> sanitize the newlines so when you forward it things will probably get
> broken up again.
OK.
One thing I did notice when I sent a multiline message to rsyslogd just
using logger is that it converted the newline to a literal "#012". Is
that what rsyslogd does with a multiline message? Is it configurable?
Do most people really just have long syslog lines with a bunch of
"#012"s in them?
most people don't put multi-line messages in syslog :-) technically the
syslog spec says that a new line starts a new message.
rsyslog defaults to escaping all control characters, so in a case where
you do manage to get newlines inside a message, they should be escaped to
#012 so that everything you send the message to will not break up the
message.
> similar logic could probably be added to the kmesg input, but it is
> going to be tricky.
Yeah. I don't think I imagined it would always be perfect.
> the problem that you have is how to clearly identify the beginning of a
> new message.
Indeed. I was thinking of using time as a heuristic. Perhaps since for
a given spate of messages, like an OOM or other stack trace, the
messages all come in so tightly packed (in terms of time) I wonder if
can be reasonable to allow an amount of time to expire to mark the end
of a multiline message.
the problem is that you can get _lots_ of messages in a short amount of
time, and they may not all be related. also, it's actually pretty
expensive to lookup the time when you are receiving a message.
> in imfile it can do three things
>
> 1. each line is a separate log
>
> 2. if a line starts with whitespace, it's part of a earlier log entry
An interesting heuristic. I wonder if we can apply this to the kernel,
or have the kernel fixed to abide by some such indicator.
> 3. log entries are separated by a blank line
Another interesting mechanism. Although that does mean banning empty
lines elsewhere, not that I know that there are any used anywhere else.
> now, #2 and #3 both mean that the log entry cannot be sent until the
> next log entry arrives
Indeed. With time however you only have to wait long enough.
> with messages from the kernel, if they have timestamps on the front of
> them added by the kernel, it is going to be hard to tell if it is a
> continuation or not, but without those timestampes (or if you teach
> kmesg input to skip them), I think that approach #2 will work.
Yeah, I need to look more deeply into how the kernel is pushing messages
up to userspace and what they look like in that channel.
> I would suggest dumping dmesg output from a few systems and try running
> them through the imfile mode #2 and see how the results look.
Yeah, that's not going to work just given the first sample I looked at
(sorry for the wrapping -- stupid thunderbird):
[ 7680.300076] INFO: task dbus-daemon:3613 blocked for more than 120
seconds.
[ 7680.300082] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 7680.300085] dbus-daemon D f48035b0 0 3613 1 0x00000000
[ 7680.300091] eff19dc0 00000086 f4803584 f48035b0 d9ea3af8 000006cf
f081daec c183a8c0
[ 7680.300098] d9e2bc15 000006cf f081dae8 c183a8c0 c183a8c0 f48068c0
f081d860 f0b80ca0
[ 7680.300104] c105685c f8eaa4e0 eff19d80 c1509dad eff19dbc f8e99444
0000b4c0 00000028
[ 7680.300111] Call Trace:
[ 7680.300122] [<c105685c>] ? irq_exit+0x3c/0x80
[ 7680.300139] [<c1509dad>] ? _raw_spin_lock+0xd/0x10
[ 7680.300166] [<f8e99444>] ? rpcauth_lookup_credcache+0x134/0x1f0 [sunrpc]
[ 7680.300190] [<f8f2921d>] ? nfs_access_get_cached+0x5d/0x160 [nfs]
[ 7680.300195] [<c1508dc6>] __mutex_lock_slowpath+0xd6/0x140
[ 7680.300207] [<f8f29342>] ? nfs_do_access+0x22/0xc0 [nfs]
[ 7680.300211] [<c1508935>] mutex_lock+0x25/0x40
[ 7680.300216] [<c1131ee7>] do_lookup+0x157/0x260
[ 7680.300219] [<c11327bb>] link_path_walk+0x13b/0xb10
[ 7680.300224] [<c1249ec3>] ? apparmor_file_alloc_security+0x23/0x50
[ 7680.300227] [<c1249ec3>] ? apparmor_file_alloc_security+0x23/0x50
[ 7680.300231] [<c1133404>] do_path_lookup+0x44/0x120
[ 7680.300234] [<c113414f>] do_filp_open+0x18f/0x6e0
[ 7680.300238] [<c113384a>] ? user_path_at+0x4a/0x80
[ 7680.300242] [<c127bc12>] ? copy_to_user+0x42/0x60
[ 7680.300246] [<c1136d50>] ? filldir+0x0/0xe0
[ 7680.300250] [<c113fee9>] ? vfsmount_lock_local_unlock+0x19/0x20
[ 7680.300255] [<c1126736>] do_sys_open+0x56/0x120
[ 7680.300258] [<c112853d>] ? fput+0x1d/0x30
[ 7680.300261] [<c112682e>] sys_open+0x2e/0x40
[ 7680.300265] [<c150a1d4>] syscall_call+0x7/0xb
> I think that the kernel is pretty consistant about having every new
> message start at the beginning of a line, but I'm not sure that it
> indents everything (especially oops and stack trace messages)
Right. It seems to return to position "0" for the Call Trace: line
before the stack trace as well as other places in what I would consider
a single message.
even with this being 3-4 messages, it would still be far more coherant
than all of the lines arriving seprately (especially since separate lines
may get reordered in transit)
> even if it doesn't indent them, you could add logic to the kmesg input
> module to watch for,
Indeed, I had wondered about teaching the kmesg input what various
multiline messages look like too, combined with time, it might be fairly
effective.
> but that
> could become very messy (and change from kernel version to kernel
> version),
Agreed.
> if the simple indentation approach does not work, I would
> suggest asking on the kernel mailing list before starting any other work.
Yes, good idea. I'm not holding my breath, but worth a query. :-)
do you want to make this query, or do you want me to?
David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com