date:20071223

Re: [Evolution-hackers] [patch] fixed incorrect rfc2047 decode for CJK header

2007-12-23 Thread Peter Volkov


On Mon, 24/12/2007 at 13:21 +0800, jacky wrote:
> --- Jeff Stedfast <[EMAIL PROTECTED]> wrote:
> There are two kinds of email that need to be supported:
> 1) An encoded-word was divided across two lines. This was
> sent by dotProject v2.0.1.

Even more users are affected by this. I have already reported a
similar problem in bug 315513, so this affects more than just CJK users:

http://bugzilla.gnome.org/show_bug.cgi?id=315513

-- 
Peter.


signature.asc
Description: This part of the message is digitally signed
___
Evolution-hackers mailing list
Evolution-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/evolution-hackers

Re: [Evolution-hackers] [patch] fixed incorrect rfc2047 decode for CJK header

2007-12-23 Thread jacky


--- Jeff Stedfast <[EMAIL PROTECTED]> wrote:

> Hi Jacky,
> 
> I've looked over your patch, but unfortunately it is
> unusable. The patch
> is riddled with buffer overflows and incorrect
> logic.
> 

Yes, I use fixed-length strings to store some values, so
they may overflow. I wrote another version that uses the
heap instead of the stack. I thought the stack version was
simple and sufficient, so I sent only that one. Both versions
of rfc2047_decode_word() are in the attachment.
Can you explain the incorrect logic in my patch?

> What types of bugs are you actually trying to fix?
> What is it about CJK
> messages in particular that are not getting decoded
> properly? Your email
> was overly vague.
> 

Maybe I used the wrong word. I think I am really enhancing
the CJK header support. The patch improves three points:
1) As you know, encoded-words must be separated by CRLF
SPACE, but some email clients do not do that.
2) The encoded bytes of one CJK character must stay within
a single encoded-word, but some email clients split them
across two encoded-words.
3) Some CJK characters can only be encoded in the GBK
charset, but the charset name in the encoded-word is GB2312.

There are two kinds of email that need to be supported:
1) An encoded-word was divided across two lines. This was
sent by dotProject v2.0.1.
2) CJK characters are encoded directly with GB2312. Some
of these emails are supported by Evolution, but some are
not.
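
To illustrate points 1 and 2 above, here is a minimal sketch, not the
code from the patch. It collects the raw bytes of consecutive
encoded-words before any charset conversion, so a multibyte character
split across two words is whole again when it reaches iconv. The names
b64_value, b64_decode and join_b_words are hypothetical. The sketch
assumes that every word uses the 'B' encoding and the same charset, and
it tolerates words that are not separated by whitespace.

```c
#include <stddef.h>
#include <string.h>

/* Map one base64 character to its 6-bit value, or -1 if not in the alphabet. */
static int b64_value(char c)
{
	if (c >= 'A' && c <= 'Z') return c - 'A';
	if (c >= 'a' && c <= 'z') return c - 'a' + 26;
	if (c >= '0' && c <= '9') return c - '0' + 52;
	if (c == '+') return 62;
	if (c == '/') return 63;
	return -1;
}

/* Decode one base64 payload; returns bytes written, or (size_t)-1 if out is too small. */
static size_t b64_decode(const char *in, size_t len, unsigned char *out, size_t outmax)
{
	unsigned int acc = 0;
	int bits = 0;
	size_t n = 0, i;

	for (i = 0; i < len; i++) {
		int v = b64_value(in[i]);
		if (v < 0)
			continue;	/* skip '=' padding and stray characters */
		acc = (acc << 6) | (unsigned int) v;
		bits += 6;
		if (bits >= 8) {
			bits -= 8;
			if (n >= outmax)
				return (size_t) -1;
			out[n++] = (unsigned char) ((acc >> bits) & 0xff);
		}
	}
	return n;
}

/*
 * Walk consecutive "=?charset?B?payload?=" words in hdr and append the
 * decoded bytes of each payload to out. Returns the total number of
 * bytes, or (size_t)-1 on malformed input or if out is too small.
 */
static size_t join_b_words(const char *hdr, unsigned char *out, size_t outmax)
{
	const char *p = hdr;
	size_t total = 0;

	while ((p = strstr(p, "=?")) != NULL) {
		const char *q1 = strchr(p + 2, '?');			/* end of charset */
		const char *q2 = q1 ? strchr(q1 + 1, '?') : NULL;	/* end of encoding */
		const char *end = q2 ? strstr(q2 + 1, "?=") : NULL;	/* end of payload */
		size_t n;

		if (!end || q2 - q1 != 2 || (q1[1] != 'B' && q1[1] != 'b'))
			return (size_t) -1;
		n = b64_decode(q2 + 1, (size_t) (end - (q2 + 1)), out + total, outmax - total);
		if (n == (size_t) -1)
			return (size_t) -1;
		total += n;
		p = end + 2;
	}
	return total;
}
```

For example, the header value "=?UTF-8?B?5Lit5g==?==?UTF-8?B?loc=?="
splits the second character of a two-character UTF-8 string across two
words. Decoding each word on its own leaves an incomplete byte sequence
at the end of the first word, while joining the bytes first yields six
bytes that convert cleanly.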

> Your changes to e-iconv can probably be taken if I
> understand correctly
> that GBK is a superset of gb2312 (
> http://en.wikipedia.org/wiki/GBK ),
> altho it would have been nice to have gotten some
> sort of link
> explaining that with your original email (or via a
> ChangeLog entry) :)
> 
> Thanks,
> 
> Jeff
> 



/* decode rfc 2047 encoded string segment */
#define DECWORD_LEN 1024
#define UTF8_DECWORD_LEN 2048

#if 1 //USE_STACK
static char *
rfc2047_decode_word(const char *in, size_t len)
{
	char prev_charset[32], curr_charset[32];
	char encode;
	char *start, *inptr, *inend;
	char decword[DECWORD_LEN], utf8_decword[UTF8_DECWORD_LEN];
	char *decword_ptr, *utf8_decword_ptr;
	size_t inlen, outlen, ret;

	prev_charset[0] = curr_charset[0] = '\0';

	decword_ptr = decword;
	utf8_decword_ptr = utf8_decword;

	/* quick check to see if this could possibly be a real encoded word */
	if (len < 8
	|| !(in[0] == '=' && in[1] == '?'
		 && in[len-1] == '=' && in[len-2] == '?')) {
		return NULL;
	}

	inptr = in;
	inend = in + len;
	outlen = sizeof(utf8_decword);

	while (inptr < inend) {
		/* begin */
		inptr = memchr (inptr, '?', inend-inptr);
		if (!inptr || *(inptr-1) != '=') {
			return NULL;
		}
		inptr++;

		/* charset */
		start = inptr;
		inptr = memchr (inptr, '?', inend-inptr);
		if (!inptr) {
			return NULL;
		}
		strncpy (curr_charset, start, inptr-start); /* maybe overflow */
		curr_charset[inptr-start] = '\0';
		if (prev_charset[0] == '\0') { /* first charset in multi encode words */
			strcpy (prev_charset, curr_charset);
		}
		d(printf ("curr_charset = %s\n", curr_charset));

		/* if (charset.perv != charset.curr) iconv perv to utf8 */
		if (prev_charset[0] != '\0' && strcmp(prev_charset, curr_charset)) {
			inlen = decword_ptr - decword;
			ret = conv_to_utf8 (prev_charset, decword, inlen, utf8_decword_ptr, outlen);
			if (ret == (size_t)-1) {
				printf ("conv_to_utf8() error!\n");
				return NULL;
			}

			utf8_decword_ptr += ret;
			outlen = outlen - ret;

			decword_ptr = decword; /* reset decword_ptr */

Re: [Evolution-hackers] [patch] fixed incorrect rfc2047 decode for CJK header

2007-12-23 Thread jacky


--- Philip Van Hoof <[EMAIL PROTECTED]> wrote:

> Hey Jacky,
> 
> This is a port of your patch to Tinymail's
> camel-lite
> 

Thank you.



Re: [Evolution-hackers] [patch] fixed incorrect rfc2047 decode for CJK header

2007-12-23 Thread Jeff Stedfast

>>> Philip Van Hoof <[EMAIL PROTECTED]> 12/23/07 5:09 PM >>>
> On Sun, 2007-12-23 at 14:51 -0700, Jeff Stedfast wrote:
> > What types of bugs are you actually trying to fix? What is it about CJK
> > messages in particular that are not getting decoded properly? Your email
> > was overly vague.
> 
> Looks like he wants to support both 'B' and 'b' and 'Q' and 'q' instead
> of just 'B' and 'Q' as the encoding character for Base64 or Quoted
> strings, for example. 

This is already supported by the current code:

switch(toupper(inptr[0])) {
case 'Q':
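
A self-contained sketch of the same idea, for readers without the
source at hand. RFC 2047 treats the encoding character as
case-independent, so folding it to upper case before the switch accepts
both forms. The enum and the function name rfc2047_encoding_kind are
hypothetical and do not appear in camel-mime-utils.c.

```c
#include <ctype.h>

enum rfc2047_encoding { ENC_UNKNOWN, ENC_BASE64, ENC_QP };

/* Classify the encoding character of an encoded-word, ignoring case. */
static enum rfc2047_encoding rfc2047_encoding_kind(char c)
{
	/* the cast avoids undefined behaviour for negative char values */
	switch (toupper((unsigned char) c)) {
	case 'B':
		return ENC_BASE64;
	case 'Q':
		return ENC_QP;
	default:
		return ENC_UNKNOWN;
	}
}
```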

> The rfc2047_decode_word implementation indeed has a few serious
> potential buffer overflows. For example the charset buffer will be
> overflowed if larger than 32 bytes.

Correct. The output buffer is a hard-coded size as well, which it shouldn't be.

Jeff



Re: [Evolution-hackers] [patch] fixed incorrect rfc2047 decode for CJK header

2007-12-23 Thread Philip Van Hoof


On Sun, 2007-12-23 at 14:51 -0700, Jeff Stedfast wrote:

> What types of bugs are you actually trying to fix? What is it about CJK
> messages in particular that are not getting decoded properly? Your email
> was overly vague.

Looks like he wants to support both 'B' and 'b' and 'Q' and 'q' instead
of just 'B' and 'Q' as the encoding character for Base64 or Quoted
strings, for example. 

The rfc2047_decode_word implementation indeed has a few serious
potential buffer overflows. For example, the charset buffer will
overflow if the charset name is larger than 32 bytes.
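
One way to close the charset overflow is to check the length before
copying, instead of calling strncpy with the source length. This is a
minimal sketch only. The helper name copy_charset is hypothetical and
is not part of the patch or of libcamel.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the len bytes at start into dst as a NUL-terminated string.
 * Returns 0 on success, or -1 if the name plus its terminator would
 * not fit in dstsize bytes; dst is left untouched in that case.
 */
static int copy_charset(char *dst, size_t dstsize, const char *start, size_t len)
{
	if (dstsize == 0 || len >= dstsize)
		return -1;
	memcpy(dst, start, len);
	dst[len] = '\0';
	return 0;
}
```

In the decoder, a call such as copy_charset (curr_charset,
sizeof (curr_charset), start, inptr - start) could replace the
strncpy pair, with a -1 result treated like any other malformed
encoded-word.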



-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
http://pvanhoof.be/blog
http://codeminded.be





Re: [Evolution-hackers] [patch] fixed incorrect rfc2047 decode for CJK header

2007-12-23 Thread Jeff Stedfast

Hi Jacky,

I've looked over your patch, but unfortunately it is unusable. The patch
is riddled with buffer overflows and incorrect logic.

What types of bugs are you actually trying to fix? What is it about CJK
messages in particular that are not getting decoded properly? Your email
was overly vague.

Your changes to e-iconv can probably be taken if I understand correctly
that GBK is a superset of gb2312 ( http://en.wikipedia.org/wiki/GBK ),
altho it would have been nice to have gotten some sort of link
explaining that with your original email (or via a ChangeLog entry) :)
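
The superset relationship is what makes such an alias safe for
decoding: text that really is GB2312 decodes the same way under GBK,
and text that is mislabelled GBK then decodes as well. The sketch below
only illustrates that idea. It is not the e-iconv change from the patch,
and the names ascii_ieq and decode_charset_for are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* ASCII case-insensitive string equality. */
static int ascii_ieq(const char *a, const char *b)
{
	while (*a && *b) {
		if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
			return 0;
		a++;
		b++;
	}
	return *a == *b;
}

/*
 * Return the charset name to hand to the converter when decoding a
 * header labelled with the given name. Only the gb2312 label is
 * widened; every other name is returned unchanged.
 */
static const char *decode_charset_for(const char *name)
{
	if (ascii_ieq(name, "gb2312"))
		return "GBK";
	return name;
}
```

The alias should apply to decoding only. When encoding outgoing mail,
labelling GBK output as GB2312 would reproduce the problem described
in this thread.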

Thanks,

Jeff


Re: [Evolution-hackers] [patch] fixed incorrect rfc2047 decode for CJK header

2007-12-23 Thread Philip Van Hoof

Hey Jacky,

This is a port of your patch to Tinymail's camel-lite

-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
http://pvanhoof.be/blog
http://codeminded.be



Index: libtinymail-camel/camel-lite/camel/camel-mime-utils.c
===
--- libtinymail-camel/camel-lite/camel/camel-mime-utils.c	(revision 3190)
+++ libtinymail-camel/camel-lite/camel/camel-mime-utils.c	(working copy)
@@ -821,125 +821,207 @@
 	*in = inptr;
 }
 
+static void
+print_hex (unsigned char *data, size_t len)
+{
+	size_t i, x;
+	unsigned char *p = data;
+	char high, low;
+
+	x = 0;
+	printf ("%04u", x);
+	for (i = 0; i < len; i++) {
+		high = *p >> 4;
+		high = (high<10) ? high + '0' : high + 'a' - 10;
+
+		low = *p & 0x0f;
+		low = (low<10) ? low + '0' : low + 'a' - 10;
+
+		printf ("0x%c%c  ", high, low);
+
+		p++;
+		x++;
+		if (i % 8 == 7) {
+			printf ("\n%04u", x);
+		}
+	}
+	printf ("\n");
+}
+
+static size_t
+conv_to_utf8 (const char *encname, char *in, size_t inlen, char *out, size_t outlen)
+{
+	char *charset, *inbuf, *outbuf;
+	iconv_t ic;
+	size_t inbuf_len, outbuf_len, ret;
+
+	charset = (char *) e_iconv_charset_name (encname);
+
+	ic = e_iconv_open ("UTF-8", charset);
+	if (ic == (iconv_t) -1) {
+		printf ("e_iconv_open() error\n");
+		return (size_t)-1;
+	}
+
+	inbuf = in;
+	inbuf_len = inlen;
+
+	outbuf = out;
+	outbuf_len = outlen;
+
+	ret = e_iconv (ic, (const char **) &inbuf, &inbuf_len, &outbuf, &outbuf_len);
+	if (ret == (size_t)-1) {
+		printf ("e_iconv() error! source charset is %s, target charset is %s\n", charset, "UTF-8");
+		printf ("converted %u bytes, but last %u bytes can't convert!!\n", inlen - inbuf_len, inbuf_len);
+		printf ("source data:\n");
+		print_hex (in, inlen);
+
+		*outbuf = '\0';
+		printf ("target string is \"%s\"\n", out);
+
+		return (size_t)-1;
+	}
+
+	ret = outlen - outbuf_len;
+	out[ret] = '\0';
+
+	e_iconv_close (ic);
+
+	return ret;
+}
+
 /* decode rfc 2047 encoded string segment */
+#define DECWORD_LEN 1024
+#define UTF8_DECWORD_LEN 2048
+
 static char *
 rfc2047_decode_word(const char *in, size_t len)
 {
-	const char *inptr = in+2;
-	const char *inend = in+len-2;
-	const char *inbuf;
-	const char *charset;
-	char *encname, *p;
-	int tmplen;
-	size_t ret;
-	char *decword = NULL;
-	char *decoded = NULL;
-	char *outbase = NULL;
-	char *outbuf;
-	size_t inlen, outlen;
-	gboolean retried = FALSE;
-	iconv_t ic;
-	int idx = 0;
+	char prev_charset[32], curr_charset[32];
+	char encode;
+	char *start, *inptr, *inend;
+	char decword[DECWORD_LEN], utf8_decword[UTF8_DECWORD_LEN];
+	char *decword_ptr, *utf8_decword_ptr;
+	size_t inlen, outlen, ret;
 
 	d(printf("rfc2047: decoding '%.*s'\n", len, in));
 
+	prev_charset[0] = curr_charset[0] = '\0';
+
+	decword_ptr = decword;
+	utf8_decword_ptr = utf8_decword;
+
 	/* quick check to see if this could possibly be a real encoded word */
-
-	if (len < 8 || !(in[0] == '=' && in[1] == '?')) {
+	if (len < 8
+	|| !(in[0] == '=' && in[1] == '?'
+		 && in[len-1] == '=' && in[len-2] == '?')) {
 		d(printf("invalid\n"));
 		return NULL;
 	}
 
-	/* skip past the charset to the encoding type */
-	inptr = memchr (inp

[Evolution-hackers] [patch] fixed incorrect rfc2047 decode for CJK header

2007-12-23 Thread jacky

Hi, all.

The rfc2047 decoder in libcamel cannot decode some
CJK headers correctly. Although some of them do not
conform to the RFC, I need to decode them correctly,
and I think more people would like Evolution if it
displayed these emails correctly.

So I wrote a new rfc2047 decoder, and it is in the
patch. With the patch, libcamel decodes CJK headers
correctly and Evolution now displays them correctly.
I tested it on my mailbox, which has 2000 emails sent
by Evolution, Thunderbird, Outlook, Outlook Express,
Foxmail, Open WebMail, Yahoo, Gmail, Lotus Notes, etc.
Without this patch, almost 20% of these emails cannot
be decoded and displayed correctly; with this patch,
99% of them can.

I also found that attachments with CJK names cannot be
recognised and displayed by Outlook / Outlook Express
/ Foxmail. This is because these email clients do not
support RFC2184. Evolution always uses the RFC2184
encoding method for attachment names, so emails with
CJK-named attachments cannot be displayed in Outlook /
Outlook Express / Foxmail. In Thunderbird, you can set
the option "mail.strictly_mime.parm_folding" to 0 or 1
to use the RFC2047 encoding method for attachment
names. Can we add a similar option?
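
For readers unfamiliar with the two forms: RFC2184 (since replaced by
RFC 2231) writes a non-ASCII parameter as name*=charset''percent-encoded
bytes, while the RFC2047 style puts an encoded-word inside an ordinary
quoted parameter. The sketch below builds only the first form. The
function name rfc2231_encode_value is hypothetical, and the set of
characters left unencoded is deliberately conservative (ASCII letters,
digits, '.', '-' and '_').

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Write "charset''percent-encoded-name" into out, that is, an extended
 * parameter value with an empty language tag. Returns 0 on success,
 * or -1 if out is too small.
 */
static int rfc2231_encode_value(const char *charset, const char *name,
				char *out, size_t outmax)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t cslen = strlen(charset);
	size_t n;
	const unsigned char *p;

	if (cslen + 2 >= outmax)
		return -1;
	memcpy(out, charset, cslen);
	n = cslen;
	out[n++] = '\'';
	out[n++] = '\'';

	for (p = (const unsigned char *) name; *p; p++) {
		if (*p < 0x80 && (isalnum(*p) || *p == '.' || *p == '-' || *p == '_')) {
			if (n + 1 >= outmax)
				return -1;
			out[n++] = (char) *p;
		} else {
			/* everything else becomes %XX */
			if (n + 3 >= outmax)
				return -1;
			out[n++] = '%';
			out[n++] = hex[*p >> 4];
			out[n++] = hex[*p & 0x0f];
		}
	}
	out[n] = '\0';
	return 0;
}
```

For a UTF-8 file name made of the bytes E4 B8 AD E6 96 87 followed by
".txt", this gives the parameter

filename*=UTF-8''%E4%B8%AD%E6%96%87.txt

whereas the RFC2047-style form that the option would select looks like

filename="=?UTF-8?B?5Lit5paHLnR4dA==?="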

Best regards.


diff -ru evolution-data-server-2.21.4/camel/camel-mime-utils.c evolution-data-server-liuzhy/camel/camel-mime-utils.c
--- evolution-data-server-2.21.4/camel/camel-mime-utils.c	2007-12-22 16:50:44.0 +0800
+++ evolution-data-server-liuzhy/camel/camel-mime-utils.c	2007-12-23 14:55:07.0 +0800
@@ -821,116 +821,207 @@
 	*in = inptr;
 }
 
+static void
+print_hex (unsigned char *data, size_t len)
+{
+	size_t i, x;
+	unsigned char *p = data;
+	char high, low;
+
+	x = 0;
+	printf ("%04u", x);
+	for (i = 0; i < len; i++) {
+		high = *p >> 4;
+		high = (high<10) ? high + '0' : high + 'a' - 10;
+
+		low = *p & 0x0f;
+		low = (low<10) ? low + '0' : low + 'a' - 10;
+
+		printf ("0x%c%c  ", high, low);
+
+		p++;
+		x++;
+		if (i % 8 == 7) {
+			printf ("\n%04u", x);
+		}
+	}
+	printf ("\n");
+}
+
+static size_t
+conv_to_utf8 (const char *encname, char *in, size_t inlen, char *out, size_t outlen)
+{
+	char *charset, *inbuf, *outbuf;
+	iconv_t ic;
+	size_t inbuf_len, outbuf_len, ret;
+
+	charset = e_iconv_charset_name (encname);
+
+	ic = e_iconv_open ("UTF-8", charset);
+	if (ic == (iconv_t) -1) {
+		printf ("e_iconv_open() error\n");
+		return (size_t)-1;
+	}
+
+	inbuf = in;
+	inbuf_len = inlen;
+
+	outbuf = out;
+	outbuf_len = outlen;
+
+	ret = e_iconv (ic, &inbuf, &inbuf_len, &outbuf, &outbuf_len);
+	if (ret == (size_t)-1) {
+		printf ("e_iconv() error! source charset is %s, target charset is %s\n", charset, "UTF-8");
+		printf ("converted %u bytes, but last %u bytes can't convert!!\n", inlen - inbuf_len, inbuf_len);
+		printf ("source data:\n");
+		print_hex (in, inlen);
+
+		*outbuf = '\0';
+		printf ("target string is \"%s\"\n", out);
+
+		return (size_t)-1;
+	}
+
+	ret = outlen - outbuf_len;
+	out[ret] = '\0';
+
+	e_iconv_close (ic);
+
+	return ret;
+}
+
 /* decode rfc 2047 encoded string segment */
+#define DECWORD_LEN 1024
+#define UTF8_DECWORD_LEN 2048
+
 static char *
 rfc2047_decode_word(const char *in, size_t len)
 {
-	const char *inptr = in+2;
-	const char *inend = in+len-2;
-	const char *inbuf;
-	const char *charset;
-	char *encname, *p;
-	int tmplen;
-	size_t ret;
-	char *decword = NULL;
-	char *decoded = NULL;
-	char *outbase = NULL;
-	char *outbuf;
-	size_t inlen, outlen;
-	gboolean retried = FALSE;
-	iconv_t ic;
+	char prev_charset[32], curr_charset[32];
+	char encode;
+	char *start, *inptr, *inend;
+	char decword[DECWORD_LEN], utf8_decword[UTF8_DECWORD_LEN];
+	char *decword_ptr, *utf8_decword_ptr;
+	size_t inlen, outlen, ret;
 
 	d(printf("rfc2047: decoding '%.*s'\n", len, in));
 
+	prev_charset[0] = curr_charset[0] = '\0';
+
+	decword_ptr = decword;
+	utf8_decword_ptr = utf8_decword;
+
 	/* quick check to see if this could possibly be a real encoded word */
-	if (len < 8 || !(in[0] == '=' && in[1] == '?' && in[len-1] == '=' && in[len-2] == '?')) {
+	if (len < 8
+	|| !(in[0] == '=' && in[1] == '?'
+		 && in[len-1] == '=' && in[len-2] == '?')) {
 		d(printf("invalid\n"));
 		return NULL;
 	}
 
-	/* skip past the charset to the encoding type */
-	inptr = memchr (inptr, '?', inend-inptr);
-	if (inptr != NULL && inptr < inend + 2 && inptr[2] == '?') {
-		d(printf("found ?, encoding is '%c'\n", inptr[0]));
+	inptr = in;
+	inend = in + len;
+	outlen = sizeof(utf8_decword);
+
+	while (inptr < inend) {
+		/* begin */
+		inptr = memchr (inptr, '?', inend-inptr);
+		if (!inptr || *(inptr-1) != '=') {
+			return NULL;
+		}
+
+		inptr++;
+
+		/* charset */
+		start = inptr;
+		inptr = memchr (inptr, '?', inend-inptr);
+		if (!inptr) {
+			return NULL;
+		}
+
