Is it possible that rsyslog is not receiving the 4 ASCII characters <, 8, 0, > but rather a single byte (hex 0x80) that the JSON parser is trying to interpret as the start byte of a multi-byte character sequence, and something else is then displaying it as <80> in the log line?
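One way to test that theory outside of rsyslog is a small Python sketch like the one below. It assumes the field really contains the raw byte 0x80 (that is the hypothesis, not something confirmed in this thread), and the helper names `is_valid_utf8` and `show_escaped` are made up for illustration. `show_escaped` only imitates how some viewers print high bytes as <XX>; it is not what rsyslog itself does.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def show_escaped(raw: bytes) -> str:
    """Render every byte >= 0x80 as <XX> (hypothetical display helper)."""
    return "".join(chr(b) if b < 0x80 else f"<{b:02X}>" for b in raw)


# Assumed raw form of the sample from the original post: one 0x80 byte,
# not the four ASCII characters '<', '8', '0', '>'.
sample = b"KEDANOVA%20FA\x80ANES&sec=08&"

print(is_valid_utf8(sample))       # False: 0x80 cannot start a UTF-8 sequence
print(show_escaped(sample))        # KEDANOVA%20FA<80>ANES&sec=08&
print("\u0080".encode("utf-8"))    # b'\xc2\x80': U+0080 needs two bytes
print(b"\x92".decode("cp1252"))    # the CP-1252 apostrophe (byte 146)
```

If the first check comes back False for a captured message, the payload is not valid UTF-8 and the sender's character set is worth a look.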
https://en.wikipedia.org/wiki/UTF-8 indicates that a byte of 0x80 is a continuation byte (bit pattern 10xxxxxx), so it can never start a character; U+0080 is the first code point that needs a 2-byte encoding (0xC2 0x80). JSON expects all strings to be UTF-8, which matches the "Invalid UTF-8 start byte 0x80" error below.

To potentially resolve, I'd try adding action(type="mmutf8fix") to your rsyslog ruleset (the module has to be loaded first with module(load="mmutf8fix")).

As to how the 'invalid' character got into the log stream in the first place: I've seen similar situations where Windows hosts were sending the CP-1252 character set, which Wikipedia says:

> ... is a superset of ISO 8859-1, but differs from the IANA's ISO-8859-1 by
> using displayable characters rather than control characters in the 80 to 9F
> (hex) range.

It frequently occurs in Windows event descriptions that contain apostrophes that in ASCII would be the ['] character (decimal 39), but in CP-1252 are [’] (decimal 146) instead.

- Dave

> On Feb 23, 2016, at 9:09 AM, Joe Blow <[email protected]> wrote:
>
> Correct. I get things like this in my omelasticsearch error log:
>
> "error": "MapperParsingException[failed to parse [csuriquery]];
> nested: JsonParseException[Invalid UTF-8 start byte 0x80\n at [Source:
> [B@2210517d; line: 1, column: 450]]
>
> Then within the normalized JSON i see my <80> tags at that line.
>
> Any ideas?
>
> Cheers,
>
> JB
>
> On Tue, Feb 23, 2016 at 9:33 AM, Rainer Gerhards <[email protected]>
> wrote:
>
>> 2016-02-23 15:29 GMT+01:00 Joe Blow <[email protected]>:
>>> Hey all,
>>>
>>> I've got some logs which might have different languages in them, and it
>>> appears that things like this are tripping up when i try and send them to
>>> elasticsearch:
>>>
>>> KEDANOVA%20FA<80>ANES&sec=08&
>>>
>>> Specifically the <80>. What is the best way to escape both the < and the >
>>> in the normalized field? I'm already specifying the format as JSON, so
>>> backslashes are being escaped properly. Any ideas?
>>
>> I am not aware that <> need to be escaped. Maybe another ES JSON
>> incompatibility?
>>
>> Rainer
>>>
>>> Thanks in advance.
>>>
>>> Cheers,
>>>
>>> JB
>>> _______________________________________________
>>> rsyslog mailing list
>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>> http://www.rsyslog.com/professional-services/
>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>> DON'T LIKE THAT.

