Clay,

> I have been struggling with handling some pesky encoded characters in
> mail logs as of late.
>
> My issue involves passing the message on to a remote rsyslog server
> which then processes the messages into a database. From time to time I
> see messages with something similar to the following:
>
> "Subject:<RE: VIAGRA \256 Official Site ID031831740>"
>
> In my case, messages are passed to processing mechanisms via triggers
> prior to a final insertion but I am getting DB errors based on the
> handling of these "invalid" characters.

I'm not sure how you are getting an 8-bit character out of the log.
Amavisd logging encodes all nonprintable characters as \nnn or \{xxxx},
so syslog should not be seeing any. If your syslog daemon decodes that,
it must not assume the decoded string will be a valid string in some
encoding such as UTF-8, as the sender can put any junk in the Subject
or display name, and need not play by the rules.

> I don't think the use of mime2utf8 will help me in these types of
> instances.

I think mime2utf8 comes closest to what can be achieved with decoding
of text in Subject and From header fields. In any case, mime2utf8 should
always be able to produce valid UTF-8 encoded text (although not
necessarily printable), which is further protected by the log writing
code turning nonprintable characters into \nnn. If the log gets further
processing, you may prefer [:b64encode|[:mime2utf8... instead of
[:dquote|[:mime2utf8... in the log template.

> Also, after looking through previous mailing list messages
> I've seen references to somewhat similar types of issues with the use of
> setting the LANG= value with the invocation of amavisd-new. I may be not
> thinking this through clearly, but I don't think that would help much in
> this case since everything is happening after the fact on a completely
> different system.

That's unrelated. Setting the locale to "C" on a mailer is still a
sensible thing to do in principle.

> Would I be better off in the log template doing some type of find and
> replace regex on the fields to help escape these characters? Is that
> even possible? Typically I'd do this in the code that is called by the
> trigger but I'm not quite getting that far along in the processing yet
> to do that. My thinking at this point is to escape any backslashes
> before I write the log and then hopefully I can handle the escaped
> characters elsewhere.

Are you replacing write_log() with your own code?
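
To illustrate the point above about decoding on the receiving side, here
is a minimal Python sketch of turning \nnn escapes back into bytes without
assuming the result is valid UTF-8. It is only an illustration, not
anything amavisd or rsyslog provides: the function names are made up, it
handles only the three-digit octal form (not \{xxxx}), and it assumes a
literal backslash followed by three octal digits never occurs in the
original text. Replacing invalid bytes with U+FFFD is one possible policy;
rejecting or re-escaping them would be equally reasonable.

```python
import re

# Backslash followed by three octal digits in the range \000-\377, e.g. \256.
_OCTAL_ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")


def decode_octal_escapes(field: str) -> bytes:
    """Turn \\nnn escapes in a log field back into raw bytes (hypothetical helper)."""
    raw = field.encode("utf-8")
    return _OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), raw)


def to_safe_text(field: str) -> str:
    """Decode escapes, then force the bytes into valid Unicode text.

    Bytes that do not form valid UTF-8 become U+FFFD instead of raising,
    so a junk Subject cannot break a later database insert.
    """
    return decode_octal_escapes(field).decode("utf-8", errors="replace")


subject = r"RE: VIAGRA \256 Official Site ID031831740"
print(decode_octal_escapes(subject))  # raw bytes, including the lone 0xAE
print(to_safe_text(subject))          # same text with 0xAE replaced by U+FFFD
```

In the example from the log, \256 is the single byte 0xAE, which is not
valid UTF-8 on its own, so a strict decoder on the database side would
reject it.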

> I see the following noted in the README.customize file:
>
> "If assigning to variables, care must be taken to properly quote certain
> special characters (like backslash), as required by Perl quoting rules.
> Text read from amavisd file or from external files is not subject
> to Perl quoting rules."
>
> But what is the best practice to do so within the templates and the
> available macros? Or is it more raw perl within the template that I need
> to consider?

No interpretation/decoding of characters in templates is done by
amavisd - it uses the text as provided in a variable. How you put it
into a variable depends on your config file: assigning a "..." or '...'
enclosed text to a variable is subject to Perl's interpretation of qq()
and q() or a 'here-document'. Or you may be reading the text from a
file, in which case no interpretation occurs.

> After additional troubleshooting and trial and error of my issue, it is
> looking more like I need to take a more simple approach of a find and
> replace mentality to the content passed to amavisd-new.
>
> Looking through README.customize I don't see any predefined macros which
> do such a thing thus far, and am considering a custom macro to do just
> this.
>
> I wanted to throw this out to the list to make sure I'm not overlooking
> something obvious or already available.

I can adapt the macro mime2utf8 or similar if you can explain what
it is supposed to do in place of its current function.

  Mark
_______________________________________________
AMaViS-user mailing list
AMaViS-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user
Please visit http://www.ijs.si/software/amavisd/ regularly
For administrativa requests please send email to rainer at openantivirus dot org