On Wed, 31 Aug 2016 23:12:28 +0200 Albert Shih wrote:
> So until known everything is correct. The problem is when the person who
> answer this ticket encode the subject like this
>
>   =?utf-8?q?Re=3A?==?utf-8?q?_=5BRTTAG =?utf-8?q? #NUMBER=5D?= Bonjour
>   =?utf-8?q?=C3=A0?= vous
>
> because in that case RT drop the space between the RTTAG and the #NUMBER.

What mail client is generating that? Whatever it is, it is violating the
RFC 2047 spec in _multiple_ ways.

First, https://tools.ietf.org/html/rfc2047#page-5

    unencoded white space characters (such as SPACE and HTAB) are
    FORBIDDEN within an 'encoded-word'

As such, "=?utf-8?q? #NUMBER=5D?=" is not a valid encoded-word.

Secondly, https://tools.ietf.org/html/rfc2047#page-7

    However, an 'encoded-word' that appears in a header field defined
    as '*text' MUST be separated from any adjacent 'encoded-word' or
    'text' by 'linear-white-space'.

As such, "=?utf-8?q?Re=3A?==?utf-8?" is not valid, as the two
"encoded-word"s are not separated by spaces.

Even ignoring those errors, the example you gave still isn't parsable.
My best attempt splits it into the following tokens:

    =?utf-8?q?Re=3A?=        # "Re:"
    =?utf-8?q?_=5BRTTAG      # " [RTTAG", but no closing "?=" ?!
    =?utf-8?q?#NUMBER=5D?=   # "#NUMBER]"
    Bonjour                  # "bonjour"
    =?utf-8?q?=C3=A0?=       # "à"
    vous                     # "vous"

Were it somehow parsed as the above, RT would _still_ be correct in
omitting the space before the number, because space between
encoded-words is removed, https://tools.ietf.org/html/rfc2047#page-10 :

    When displaying a particular header field that contains multiple
    'encoded-word's, any 'linear-white-space' that separates a pair of
    adjacent 'encoded-word's is ignored.

In short, fix the mail client. Failing that, set
$ExtractSubjectTagMatch, as this is not a bug in RT.

 - Alex
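
P.S. For anyone who wants to see the three RFC 2047 rules above in
action, here is a rough Python sketch. It is not how RT decodes
headers (RT is written in Perl); the names ENCODED_WORD, decode_word
and decode_subject are made up for this illustration, and the decoder
only covers the simple whitespace-separated case discussed in this
thread.

```python
import base64
import quopri
import re

# RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
# Neither the charset nor the encoded-text may contain "?" or whitespace.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def decode_word(match):
    """Decode one matched encoded-word into a str."""
    charset, encoding, text = match.groups()
    if encoding.lower() == "q":
        # header=True makes "_" decode to a space, as the Q encoding requires.
        raw = quopri.decodestring(text.encode("ascii"), header=True)
    else:
        raw = base64.b64decode(text)
    return raw.decode(charset)


def decode_subject(value):
    """Decode a header value, dropping whitespace between encoded-words."""
    out = []
    pending_ws = ""        # whitespace seen since the previous token
    prev_encoded = False   # was the previous token an encoded-word?
    for token in re.split(r"(\s+)", value):
        if not token:
            continue
        if token.isspace():
            pending_ws = token
            continue
        match = ENCODED_WORD.fullmatch(token)
        if match:
            # Whitespace separating two adjacent encoded-words is ignored.
            if not prev_encoded:
                out.append(pending_ws)
            out.append(decode_word(match))
            prev_encoded = True
        else:
            # Plain text (or a malformed encoded-word) is kept verbatim.
            out.append(pending_ws)
            out.append(token)
            prev_encoded = False
        pending_ws = ""
    return "".join(out)


if __name__ == "__main__":
    # A well-formed variant of the subject from this thread.
    print(decode_subject(
        "=?utf-8?q?Re=3A?= =?utf-8?q?_=5BRTTAG?= "
        "=?utf-8?q?#NUMBER=5D?= Bonjour =?utf-8?q?=C3=A0?= vous"
    ))
    # Encoding the space as "_" inside the word is how a client keeps it.
    print(decode_subject("=?utf-8?q?=5BRTTAG?= =?utf-8?q?_#NUMBER=5D?="))
    # A raw space inside an encoded-word does not match at all.
    print(ENCODED_WORD.fullmatch("=?utf-8?q? #NUMBER=5D?="))
```

With the well-formed variant, the space between "[RTTAG" and "#NUMBER]"
disappears, because it only ever existed as the separator between two
encoded-words. A client that wants the space to survive has to encode
it inside one of the words (as "_" or "=20"), as the second call does.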
