Bug#320185: rss2email: non-ASCII long header is encoded incorrectly

Joey Hess Wed, 27 Jul 2005 23:06:07 -0700

Tatsuya Kinoshita wrote:
> Package: rss2email
> Version: 1:2.54-6
> Severity: normal
> 
> I've tried using rss2email and found a bug.
> 
> A Subject field is encoded incorrectly if the RSS feed contains
> non-ASCII characters in the title and the word is too long.
> 
> For instance,
> 
> <title>á12345678901234567890123456789012345678901234567890123456789012345678901234567890</title>
> 
> is converted to
> 
> Subject: 
> =?utf-8?Q?=C3=A112345678901234567890123456789012345678901234567890123456789012345678=
> 901234567890?=
> 
> It seems that "=\n" is inserted incorrectly.
> 
> This bug might be in Python's mimify.py.  Anyway, to prevent this
> problem, I've applied the following patch to rss2email.py.
> 
> ----
> --- rss2email.py.orig
> +++ rss2email.py
> @@ -137,7 +137,11 @@
>  
>  def header7bit(s):
>       """QP_CORRUPT headers."""
> -     return mimify.mime_encode_header(s + ' ')[:-1]
> +     #return mimify.mime_encode_header(s + ' ')[:-1]
> +     # XXX due to mime_encode_header bug
> +     import re
> +     p = re.compile('=\n([^ \t])');
> +     return p.sub(r'\1', mimify.mime_encode_header(s + ' ')[:-1])
>  
>  ### Parsing Utilities ###
>  
> ----
> 
> Typically, this problem appears in Japanese documents, because
> Japanese multibyte words are not separated by space characters.


Thanks, I've actually seen this once or twice with English feeds, but
never took the time to track it down.
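
For anyone wanting to check the workaround without a copy of mimify at
hand, here is a minimal sketch. It does not call
mimify.mime_encode_header at all; it applies the regex from the patch
to the encoded Subject value quoted in the report and then decodes the
result with the standard email.header module. The names SOFT_BREAK and
strip_soft_breaks are made up for this illustration and are not part of
rss2email.

```python
import re
from email.header import decode_header

# Encoded Subject value as quoted in the report: 68 digits, then the
# stray "=" + newline, then the remaining 12 digits.
broken = (
    "=?utf-8?Q?=C3=A1" + "1234567890" * 6 + "12345678=\n"
    "901234567890?="
)

# Same pattern as in the patch: "=" followed by a newline, but only when
# the next character is not a space or tab (so real header folding,
# which continues with whitespace, is left alone).
SOFT_BREAK = re.compile(r"=\n([^ \t])")


def strip_soft_breaks(encoded):
    """Remove '=\\n' sequences that split an encoded word (hypothetical helper)."""
    return SOFT_BREAK.sub(r"\1", encoded)


fixed = strip_soft_breaks(broken)

# With the break removed, the value is a single encoded word again and
# decodes back to the original title text.
raw, charset = decode_header(fixed)[0]
decoded = raw.decode(charset)
print(decoded)
```

Note that the repaired value is still one very long encoded word, so
this only removes the corruption; it does not fold the header to a
shorter line length.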

-- 
see shy jo
