On 18 Mar 2011 10:14, Lars Reimann wrote:
> Hi all,
>
> the following problem is very annoying:
>
> RT encodes Subject lines using the following concept:
>
> Original example header
>
> Subject:
> =?UTF-8?B?W3NlcnZpY2UubWV0YXdheXMubmV0ICM2NzAyOF0gU3BlaWNoZXJwbGF0eiBF?=
> =?UTF-8?B?cmjDtmh1bmcgd2FzbWFpbjogNTAwIEdC?=
>
> The header is split into 2 parts:
>
> 1st part decoded: "[Queue Name #Ticket number] First part of subject line"
> 2nd part decoded: "Second part of subject line"
>
> Completely decoded string: "[Queue Name #Ticket number] First part of
> subject line"_"Second part of subject line"
>
> The underscore (_) marks an additional space character which is
> introduced into ALL emails on decoding the two UTF parts.

I think this is actually a bug in Encode::MIME::Header's
parsing/generation of encoded header lines. I tracked it down when it
broke a test in other code. I believe it was introduced with the fix for
https://rt.cpan.org/Public/Bug/Display.html?id=40027. I've copied this
mail to the bug tracker for Encode.

> I double checked with decoding UTF in python. Results: When using 2 UTF
> parts, a decode introduces an additional space. When using only ONE
> UTF-string (the above subject w/o padding and UTF header) the decode is
> done correctly!
>
> I would be very glad to resolve this problem. If RT could use only one
> UTF string, the problem would go away.
> How can we do that?

If you're really, really annoyed by it, I believe you can downgrade to
an older Encode. But you'd regain other bugs that have since been fixed,
so I can't recommend it.

> And: does anyone have the same problem with email clients (we use
> evolution and thunderbird, but most likely other clients are also
> affected).
>
> p.s. It's unclear to me when UTF encoding is used. Sometimes the Subject
> line is not UTF encoded and uses ASCII. Perhaps it depends on non-ASCII
> characters within the subject.

It's used when a mail header contains non-ASCII characters.

Thomas
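
For anyone who wants to check a decoder against the header quoted above, here is a minimal sketch, assuming Python 3 and its standard email.header module. RFC 2047 says that whitespace separating two adjacent encoded-words is ignored when they are displayed, so the two parts should be joined with nothing in between. The helper names decode_subject and decode_naively are made up for this illustration; decode_naively is a deliberately faulty decoder that shows the symptom and is not a reconstruction of what Encode::MIME::Header does internally.

```python
import base64
from email.header import decode_header

# The two encoded-words from the Subject header quoted in this thread.
WORD_1 = "=?UTF-8?B?W3NlcnZpY2UubWV0YXdheXMubmV0ICM2NzAyOF0gU3BlaWNoZXJwbGF0eiBF?="
WORD_2 = "=?UTF-8?B?cmjDtmh1bmcgd2FzbWFpbjogNTAwIEdC?="

# Folded header value: the continuation line starts with one space.
RAW_SUBJECT = WORD_1 + "\n " + WORD_2


def decode_subject(raw):
    """Decode a header value; adjacent encoded-words are joined without a gap."""
    pieces = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            pieces.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            pieces.append(chunk)  # header contained no encoded-words at all
    return "".join(pieces)


def decode_naively(words):
    """Hypothetical faulty decoder: decodes each word alone, then joins with a space."""
    payloads = [w.split("?")[3] for w in words]  # =?charset?B?payload?=
    return " ".join(base64.b64decode(p).decode("utf-8") for p in payloads)


if __name__ == "__main__":
    print(decode_subject(RAW_SUBJECT))
    print(decode_naively([WORD_1, WORD_2]))
```

The split falls in the middle of the word "Erhöhung", so the faulty variant prints "Speicherplatz E rhöhung", which is the extra space described in the original report, while decode_subject keeps the word intact.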
