Hi, Ping!
On Thu, Jun 12, 2014 at 01:04:46AM +0530, Siddhesh Poyarekar wrote: > Trying again after signing up to the mailing list (patch is slightly > modified from my first submission, which may either be in moderation > or may have gotten lost somehow): > > On Wed, Jun 11, 2014 at 04:09:16PM +0530, Siddhesh Poyarekar wrote: > > Hi, > > > > We recently encountered a case in our glibc patchwork instance on > > sourceware, where a patch was dropped because it had x-unknown > > charset. I used the following patch to fix this in our instance. The > > fix I used was to fall back on a set of encodings (instead of just > > utf-8) when the charset is not mentioned or if it is set as x-unknown. > > > > I hope this is useful. I'd love to know if you all think there is a > > better way to fix this so that I can implement that in our instance > > instead of my hack. > > > > Cheers, > > Siddhesh > > --- a/apps/patchwork/bin/parsemail.py 2014-06-11 15:53:12.685666812 +0530 > +++ b/apps/patchwork/bin/parsemail.py 2014-06-11 15:53:03.991667186 +0530 > @@ -147,6 +147,13 @@ > return match.group(1) > return None > > +def try_decode(payload, charset): > + try: > + payload = unicode(payload, charset) > + except UnicodeDecodeError: > + return None > + return payload > + > def find_content(project, mail): > patchbuf = None > commentbuf = '' > @@ -157,15 +164,27 @@ > continue > > payload = part.get_payload(decode=True) > - charset = part.get_content_charset() > subtype = part.get_content_subtype() > > - # if we don't have a charset, assume utf-8 > - if charset is None: > - charset = 'utf-8' > - > if not isinstance(payload, unicode): > - payload = unicode(payload, charset) > + charset = part.get_content_charset() > + > + # If there is no charset or if it is unknown, then try some > common > + # charsets before we fail. > + if charset is None or charset == 'x-unknown': > + try_charsets = ['utf-8', 'windows-1252', 'ascii', > 'iso-8859-1'] > + else: > + try_charsets = [charset] > + > + for cset in try_charsets: > + decoded_payload = try_decode(payload, cset) > + if decoded_payload is not None: > + break > + payload = decoded_payload > + > + # Could not find a valid decoded payload. Fail. > + if payload is None: > + return (None, None) > > if subtype in ['x-patch', 'x-diff']: > patchbuf = payload > _______________________________________________ > Patchwork mailing list > Patchwork@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/patchwork
pgpEl97XP4of9.pgp
Description: PGP signature
_______________________________________________ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork