Mark,

>> I have been struggling with handling some pesky encoded characters in
>> mail logs as of late.
>>
>> My issue involves passing the message on to a remote rsyslog server
>> which then processes the messages into a database. From time to time I
>> see messages with something similar to the following:
>>
>> "Subject:<RE: VIAGRA \256 Official Site ID031831740>"
>>
>> In my case, messages are passed to processing mechanisms via triggers
>> prior to a final insertion but I am getting DB errors based on the
>> handling of these "invalid" characters.
>
> I'm not sure how you are getting an 8-bit character out of the log.
> Amavisd logging encodes all nonprintable characters as \nnn or \{xxxx},
> so syslog should not be seeing any. If your syslog daemon decodes that,
> it must not assume the decoded string will be a valid string in some
> encoding such as UTF-8, as the sender can put any junk in his subject
> or display name, and need not play by the rules.
>

Good points. This is the stock syslog on CentOS 5; I will look into
this area as well.
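
To make the concern concrete, here is a minimal Python sketch of what a receiving-side decoder might do with the \nnn form. It is only an illustration under assumptions: the function names are hypothetical, only the three-digit octal form is handled (the \{xxxx} form and escaped literal backslashes are left alone), and nothing here is taken from amavisd or rsyslog. It shows that \256 decodes to a single byte that is not valid UTF-8 on its own, so a strict decode fails while a replacing decode does not.

```python
import re

# Matches a literal backslash followed by three octal digits (\000 to \377).
_OCTAL_ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")


def unescape_octal(field: str) -> bytes:
    """Turn literal \\nnn octal escapes back into raw bytes (hypothetical helper)."""
    raw = field.encode("utf-8")
    return _OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), raw)


def to_safe_text(raw: bytes) -> str:
    """Decode as UTF-8, replacing invalid bytes with U+FFFD instead of failing."""
    return raw.decode("utf-8", errors="replace")


subject = r"RE: VIAGRA \256 Official Site ID031831740"
raw = unescape_octal(subject)   # contains the single byte 0xAE
text = to_safe_text(raw)        # always a valid Python str

try:
    raw.decode("utf-8")         # strict decoding rejects the lone 0xAE byte
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
```

If the decoded bytes are handed to the database as-is, the strict failure above is the kind of thing I would expect to surface as a DB error; the replacing decode trades fidelity for a string that is always valid.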

>> I don't think the use of mime2utf8 will help me in these types of
>> instances.
>
> I think mime2utf8 comes closest to what can be achieved when
> decoding the text in Subject and From header fields. In any case,
> mime2utf8 should always produce valid UTF-8-encoded text
> (although not necessarily printable), which is further protected by the
> log-writing code turning nonprintable characters into \nnn.
>
> If the log gets further processing, you may prefer [:b64encode[:mime2utf8...
> instead of [:dquote|[:mime2utf8...  in the log template.
>

This idea came to mind last night too. I am going to test it against
live data and see what results it gives.
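
For the testing, something like the following Python sketch is what I have in mind on the receiving side. It assumes the b64encode macro emits standard base64 of the UTF-8 bytes, which I have not confirmed, and decode_b64_field is just a hypothetical helper name. The sample value is built locally rather than taken from a real log.

```python
import base64
import binascii


def decode_b64_field(value: str) -> str:
    """Decode a base64 log field to text; return the input unchanged if it is not base64."""
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return value  # not valid base64, so hand it back untouched
    # The decoded bytes may still be junk, so never decode strictly.
    return raw.decode("utf-8", errors="replace")


# Locally constructed sample (assumption: the field carries UTF-8 bytes).
sample = "RE: VIAGRA \u00ae Official Site"
encoded = base64.b64encode(sample.encode("utf-8")).decode("ascii")
decoded = decode_b64_field(encoded)
```

The appeal is that the field stays plain ASCII all the way through syslog, and any decoding decision is deferred to code I control.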

>> Also, after looking through previous mailing list messages,
>> I've seen references to somewhat similar issues involving
>> setting the LANG= value when invoking amavisd-new. I may not be
>> thinking this through clearly, but I don't think that would help much in
>> this case, since everything is happening after the fact on a completely
>> different system.
>
> That's unrelated. Setting locale to "C" on a mailer is still a sensible
> thing to do in principle.
>
>> Would I be better off in the log template doing some type of find and
>> replace regex on the fields to help escape these characters? Is that
>> even possible? Typically I'd do this in the code that is called by the
>> trigger but I'm not quite getting that far along in the processing yet
>> to do that. My thinking at this point is to escape any backslashes
>> before I write the log and then hopefully I can handle the escaped
>> characters elsewhere.
>
> Are you replacing the write_log() with your own code?
>

Not at this time, no.
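
For reference, the backslash escaping I described amounts to the transformation below. This is a Python stand-in for wherever the replacement would actually happen, not template syntax, and both function names are hypothetical. Doubling every backslash on the way out and halving the pairs on the way in round-trips cleanly.

```python
def escape_backslashes(text: str) -> str:
    """Double every backslash so later stages see it as a literal character."""
    return text.replace("\\", "\\\\")


def unescape_backslashes(text: str) -> str:
    """Undo escape_backslashes by collapsing each doubled backslash."""
    return text.replace("\\\\", "\\")


original = r"RE: VIAGRA \256 Official Site ID031831740"
escaped = escape_backslashes(original)    # \256 becomes \\256
restored = unescape_backslashes(escaped)  # back to the original text
```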

>> I see the following noted in the README.customize file:
>>
>> "If assigning to variables, care must be taken to properly quote certain
>> special characters (like backslash), as required by Perl quoting rules.
>> Text read from amavisd file or from external files is not subject
>> to Perl quoting rules."
>>
>> But what is the best practice to do so within the templates and the
>> available macros? Or is it more raw perl within the template that I need
>> to consider?
>
> No interpretation/decoding of characters in templates is done by
> amavisd - it uses the text as provided in a variable. How you put it
> into a variable depends on your config file: assigning a "..." or '...'
> enclosed text to a variable is subject to Perl's interpretation of
> qq() and q() or a 'here-document'. Or you may be reading the
> text from a file, in which no interpretation occurs.
>
>> After additional troubleshooting and trial and error, it is
>> looking more like I need to take a simpler find-and-replace
>> approach to the content passed to amavisd-new.
>>
>> Looking through README.customize I don't see any predefined macros which
>> do such a thing thus far, and am considering a custom macro to do just
>> this.
>>
>> I wanted to throw this out to the list to make sure I'm not overlooking
>> something obvious or already available.
>
> I can adapt the mime2utf8 macro or a similar one if you can explain
> what it is supposed to do in place of its current function.
>

I'm not sure that is necessary at this time, but it is something I was
toying with here as well. Again, I am going to allow more time to test
on my end. Unfortunately, I can't yet tell you exactly what I would want
it to do in addition. Hopefully I can answer that better as I get more data.

Thanks again for your response. After I do more troubleshooting/testing 
on my side I will report any findings that may be beneficial to the 
discussion.

Clay

