New submission from Ezio Melotti <ezio.melo...@gmail.com>:

def _decode_header_to_utf8(self, hdr):
    l = []
    for part, encoding in decode_header(hdr):
        if encoding:
            part = part.decode(encoding)
        l.append(part)
    return ''.join([s.encode('utf-8') for s in l])

If the encoding is specified, l becomes a list of unicode strings that can be
encoded in the listcomp, but if the encoding is not specified, l becomes a list
of byte strings that can't be encoded if they contain non-ascii characters.
The latter causes a lot of decoding errors that get reported to the admins,
because of all the spam messages (apparently with no encoding specified) that
get sent to b.p.o.
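
To illustrate the failure, here is a minimal sketch. It assumes Python 2
semantics, where calling .encode('utf-8') on a byte string first decodes it
implicitly as ASCII. The helper py2_style_encode and the sample bytes are
hypothetical and only spell out that implicit step; they are not part of
mailgw.py.

```python
# Hypothetical header bytes: Latin-1 encoded text with no charset declared.
raw = b'caf\xe9'


def py2_style_encode(byte_string):
    # Spells out what Python 2 does for str.encode('utf-8'):
    # an implicit ASCII decode, followed by the requested encode.
    return byte_string.decode('ascii').encode('utf-8')


try:
    py2_style_encode(raw)
    failed = False
except UnicodeDecodeError:
    # This is the kind of error that ends up reported to the admins.
    failed = True

# Pure-ASCII byte strings survive the round trip unchanged.
ascii_ok = py2_style_encode(b'plain')
```

Only byte strings with non-ascii bytes trigger the error, which is why
headers without a declared encoding are the problematic ones.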

I'm going to fix this by trying to decode the part as utf-8 and falling back
to iso-8859-1 in case of error.  This will ensure that l is a list of unicode
strings that can be encoded.  It will also stop the decoding errors in the
listcomp and let the spam messages through, hopefully to be blocked shortly
afterwards when Roundup figures out the user is not registered.
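
A rough sketch of what that fix could look like follows; it is not the
committed change. The names decode_part and decode_pairs_to_utf8 are
hypothetical, and the function takes the (part, encoding) pairs directly
instead of calling decode_header, so that it can be tried in isolation.

```python
def decode_part(part, encoding=None):
    """Return text for one (part, encoding) pair."""
    if not isinstance(part, bytes):
        return part  # already text, nothing to decode
    if encoding:
        return part.decode(encoding)
    try:
        # No encoding declared: try utf-8 first.
        return part.decode('utf-8')
    except UnicodeDecodeError:
        # iso-8859-1 maps every byte to a code point, so this cannot fail.
        return part.decode('iso-8859-1')


def decode_pairs_to_utf8(pairs):
    """Join the decoded parts and encode the result once as utf-8."""
    return ''.join(decode_part(p, e) for p, e in pairs).encode('utf-8')


# Sample pairs (made up for illustration).
utf8_input = decode_pairs_to_utf8([(b'caf\xc3\xa9', None)])
latin1_input = decode_pairs_to_utf8([(b'caf\xe9', None)])
mixed_input = decode_pairs_to_utf8([(b'caf\xe9', 'iso-8859-1'), (b' ok', None)])
```

With this approach every element is text before the final encode, so the
result is always valid utf-8, although bytes that were really in some other
8-bit charset will be decoded to the wrong characters.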

----------
assignedto: ezio.melotti
messages: 3549
nosy: ezio.melotti
priority: urgent
status: in-progress
title: Wrong header encoding handling in mailgw.py

_______________________________________________________
PSF Meta Tracker <metatrac...@psf.upfronthosting.co.za>
<http://psf.upfronthosting.co.za/roundup/meta/issue668>
_______________________________________________________
_______________________________________________
Tracker-discuss mailing list
Tracker-discuss@python.org
https://mail.python.org/mailman/listinfo/tracker-discuss
Code of Conduct: https://www.python.org/psf/codeofconduct/