(changed subject since this diverges from the original point a bit)
On Oct 10, 2014, at 9:02 AM, Rainer Gerhards <[email protected]> wrote:
> And thinking about escaping, there is a subtle difference. Let's say you
> have a message that just says "a\nb". By default, this gets converted to
> "a\\nb", so in a proper receiver, the string will again be "a\nb" (4
> chars). No message modification happens.
>
> In the jsonr case, we try to find C escape sequences in the property in
> question and try to unescape them. So "a\nb" (4 chars) becomes "a<LF>b" (3
> chars), which is then JSON-encoded into "a\nb", and the proper receiver will
> decode it as "a<LF>b" (3 chars). As such, the receiver actually does see a
> DIFFERENT message than the one the original sender emitted. This may be the
> desired result, even in many cases, but in other cases it may just be
> plainly wrong. For example, if a message hash was taken, that hash would of
> course become invalid by changing "\n" to the single LF character.
>
> Does that explain why there are the different options?
Yes, thanks for the clarification!
I ran into this on v7.6 when we had UNIX and Windows logs hitting the
same action. Some of the UNIX messages wound up with a double-escaped quote
(\\") because the originating server had already escaped it (\"). That results
in invalid JSON, since the " is no longer escaped correctly. For example:
Source message with pre-escaped quotes:
Sep 24 20:00:20 someserver xenstored: A697399.1 write
/xapi/14/private/vbd/51744/vdi-id [[\"VDI\",
\"984de168-aabb-d3be-4357-33f62ee8a9a3\\/b1b99507-260d-423b-a1ef-5262d5443376\"]]
'json' format output (with newlines for clarity):
{"time_received":"2014-09-24T20:00:20.150207-05:00",
"receiver":"loghost",
"from":"10.11.12.13",
"time_reported":"Sep 24 20:00:20",
"host":"someserver",
"severity":"info",
"facility":"local3",
"app_name":"xenstored",
"proc_id":"-",
"message":"Sep 24 20:00:20 someserver/10.11.12.13 xenstored: A697399.1
write /xapi/14/private/vbd/51744/vdi-id [[\\"VDI\\",
\\"984de168-aabb-d3be-4357-33f62ee8a9a3\\/b1b99507-260d-423b-a1ef-5262d5443376\\"]]"
}
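To make the failure concrete, here is a minimal Python sketch. Python is used only for illustration, and the helper name is_valid_json is made up. It takes a shortened fragment of the message above, shows what a conforming encoder would emit for a literal backslash-quote, and checks that the \\" form from the v7.6 output does not parse.

```python
import json

# Shortened fragment of the raw message: each quote is preceded by a literal backslash.
raw = r'[[\"VDI\"]]'

# A conforming encoder escapes the backslash AND the quote: \" becomes \\\"
good = json.dumps({"message": raw})

# What the 'json' output above contains instead: \\" (an escaped backslash
# followed by a bare quote, which ends the JSON string too early).
broken = r'{"message":"[[\\"VDI\\"]]"}'


def is_valid_json(text):
    """Hypothetical helper: True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


print(good)
print(is_valid_json(good), is_valid_json(broken))
```

The first document round-trips back to the original message text; the second is rejected by the parser.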
I was hoping that switching to 'jsonr' (and v8.2+) would fix this, but I haven't
tested it yet. However, (non-escaped) tab-delimited Windows logs can contain
filesystem paths with invalid (or worse, valid) C escape sequences, so 'jsonr'
may mangle them.
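The Windows path concern can be sketched in Python. This is a simplified model of the "unescape C sequences, then JSON-encode" behavior described in the quoted text, not rsyslog's actual code; the function unescape_c, its escape table, and the sample path are assumptions made for illustration.

```python
import json

# Assumed subset of C escapes; the real list may differ.
C_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\", '"': '"'}


def unescape_c(text):
    """Replace recognized backslash sequences; copy everything else as-is."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in C_ESCAPES:
            out.append(C_ESCAPES[text[i + 1]])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


# The "a\nb" case from the quoted text: 4 chars in, 3 chars out.
print(len(r"a\nb"), len(unescape_c(r"a\nb")))

# A made-up Windows path whose components happen to start with 't' and 'n'.
path = r"C:\temp\new"
mangled = unescape_c(path)

# After JSON encoding and decoding, the receiver sees a tab and a line feed
# where the path separators used to be.
received = json.loads(json.dumps(mangled))
print(repr(received))
```

Under this model the receiver no longer sees the original path, which is the "valid C escape sequence" case mentioned above.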
So maybe I really need the 'json' format, plus mmnormalize to fix the "
handling, so that I could turn (\") into (\\\") and leave every other (\) alone
(making it Windows path-safe).
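As a plain string substitution, the rewrite I have in mind looks roughly like the Python below. This only illustrates the intended text transformation; it is not an mmnormalize rulebase, and the function name is made up. Whether the final output is valid JSON still depends on how the formatter treats the remaining backslashes.

```python
def escape_preescaped_quotes(message):
    """Turn each backslash-quote into three backslashes plus a quote.

    All other backslashes (for example in Windows paths) are left untouched.
    Caveat: a quote preceded by an already doubled backslash is rewritten
    too, which may not be what you want.
    """
    return message.replace('\\"', '\\\\\\"')


sample = r'C:\new\dir [[\"VDI\"]]'
print(escape_preescaped_quotes(sample))
```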
- Dave