Failure to decode Korean characters when not base64-encoded

Yoni Gibbs Wed, 06 Mar 2024 04:19:44 -0800

Hello! We’ve found a small issue that we think might be a bug in mime4j (or 
maybe we’re just missing some setting somewhere)…


We are trying to parse an eml file where one of the recipients has a name 
containing Korean characters. If the `To:` header is base64-encoded, it’s fine; 
if it’s not base64-encoded, the name is not read correctly.

Say we have an eml file with this header:


To: =?UTF-8?B?7Iuc7ZeY?= <koreant...@example.com>

This parses fine.

But with this header the name is parsed incorrectly:


To: "시험" <koreant...@example.com>
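
To make the difference between the two headers concrete, here is a minimal 
sketch using only the JDK (no mime4j). It shows that the base64 payload in the 
first header is just the UTF-8 bytes of the name, and what the name would look 
like if a parser read the raw UTF-8 bytes of the second header as ISO-8859-1. 
The ISO-8859-1 misreading is only an assumption about one possible failure 
mode, not confirmed mime4j behaviour, and the class and method names are 
made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class EncodedWordDemo {

    // Decodes the payload of a "B" (base64) encoded-word whose charset is UTF-8,
    // i.e. the part between "=?UTF-8?B?" and "?=".
    static String decodeBase64Utf8(String payload) {
        byte[] bytes = Base64.getDecoder().decode(payload);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // ASSUMPTION: simulates a parser that treats raw UTF-8 header bytes as
    // ISO-8859-1. Each UTF-8 byte then becomes one (wrong) character.
    static String misreadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The two Korean characters from the header, written as Unicode
        // escapes so the source file encoding does not matter.
        String name = "\uC2DC\uD5D8";

        // The base64 payload round-trips to the expected name.
        System.out.println(decodeBase64Utf8("7Iuc7ZeY").equals(name));

        // The misread name has one character per UTF-8 byte instead of two.
        System.out.println(misreadAsLatin1(name).length());
    }
}
```

If the name we get back from the parser has six characters rather than two, 
that would point towards this kind of charset mismatch.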

I am using org.apache.james:apache-mime4j:0.8.10, and this is some sample Java 
code that reproduces the issue:


var is = Main.class.getClassLoader().getResourceAsStream("not-base64-encoded.eml");
var parsed = new DefaultMessageBuilder().parseMessage(is);
var to = ((Mailbox) parsed.getTo().get(0)).getName();
assertEquals("시험", to);

Note that to generate an eml with the two different possible headers above we 
are using org.eclipse.angus:angus-mail:2.0.2. The default behaviour there is to 
generate the base64-encoded header. However if you set the mail.mime.allowutf8 
property in the SMTP session to true then it creates the not-base64-encoded 
header, which causes the problem for mime4j.
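
For reference, the property mentioned above can be set roughly like this. 
This is a sketch using only java.util.Properties; the class and method names 
are made up for illustration, and the resulting properties would then be 
passed to the mail session (for example via Session.getInstance(props)) in the 
usual way.

```java
import java.util.Properties;

public class AllowUtf8Props {

    // Builds session properties that ask the mail library to write
    // raw UTF-8 headers instead of base64 encoded-words.
    static Properties smtpSessionProperties() {
        Properties props = new Properties();
        props.setProperty("mail.mime.allowutf8", "true");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(smtpSessionProperties().getProperty("mail.mime.allowutf8"));
    }
}
```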

Does this look like a bug in mime4j? Or is there something we’re not setting 
correctly?

Thanks in advance,

Yoni Gibbs.
