Hi Markus,
Thanks for identifying the commit that introduced the regression. I have
added it as a comment on the JIRA ticket; hopefully it will be a great
help to whoever tries to fix it.
Regards,
Rene.
On 9/25/24 3:12 AM, Markus Wiederkehr wrote:
Hi,
A quick bisect seems to indicate that the regression was introduced in
commit 7e23d5e1cc2bd77f3e5129622472b995cd3f034e : MIME4J-316 Parts missing
in case of a specific combination of boundaries
--- a/core/src/main/java/org/apache/james/mime4j/io/MimeBoundaryInputStream.java
+++ b/core/src/main/java/org/apache/james/mime4j/io/MimeBoundaryInputStream.java
@@ -244,11 +244,14 @@ public class MimeBoundaryInputStream extends LineReaderInputStream {
                 // Make sure the boundary is terminated with EOS
                 break;
             } else {
-                // or with a whitespace or '-' char
+                // or with a whitespace or '--'
                 char ch = (char)(buffer.byteAt(pos));
-                if (CharsetUtil.isWhitespace(ch) || ch == '-') {
+                if (CharsetUtil.isWhitespace(ch)) {
                     break;
                 }
+                if (ch == '-' && remaining > 1 && (char)(buffer.byteAt(pos+1)) == '-') {
+                    break;
+                }
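
To make the effect of that hunk easier to see, here is a minimal standalone sketch of the two terminator checks side by side. It is not Mime4j code: the class and method names are hypothetical, isWhitespace is a stand-in for CharsetUtil.isWhitespace (assumed here to accept space, tab, CR and LF), and "remaining" is assumed to mean the number of bytes available from pos onward. It only shows which inputs the two conditions treat differently; it does not reproduce the regression itself.

```java
import java.nio.charset.StandardCharsets;

class BoundaryTerminatorSketch {

    // Stand-in for CharsetUtil.isWhitespace (assumption: space, tab, CR, LF).
    static boolean isWhitespace(char ch) {
        return ch == ' ' || ch == '\t' || ch == '\r' || ch == '\n';
    }

    // Condition before the commit: whitespace or a single '-' ends the boundary.
    static boolean terminatesBefore(byte[] buffer, int pos) {
        char ch = (char) buffer[pos];
        return isWhitespace(ch) || ch == '-';
    }

    // Condition after the commit: whitespace, or "--" if two bytes are available.
    static boolean terminatesAfter(byte[] buffer, int pos) {
        int remaining = buffer.length - pos; // assumption about what 'remaining' means
        char ch = (char) buffer[pos];
        if (isWhitespace(ch)) {
            return true;
        }
        return ch == '-' && remaining > 1 && (char) buffer[pos + 1] == '-';
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Bytes that could follow a candidate boundary string.
        String[] samples = {"--\r\n", "-", "-x", "\r\n"};
        for (String s : samples) {
            byte[] b = ascii(s);
            String shown = s.replace("\r", "\\r").replace("\n", "\\n");
            System.out.println("\"" + shown + "\" before=" + terminatesBefore(b, 0)
                    + " after=" + terminatesAfter(b, 0));
        }
    }
}
```

The two checks agree on whitespace and on a complete "--", and differ when a '-' is followed by something else or is the last byte available.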
Regards,
Markus
On Tue, Sep 24, 2024 at 10:32 AM Rene Cordier <rcord...@apache.org> wrote:
Hi Madis,
Sorry for the delayed reply.
I created a task on the JIRA regarding your report:
https://issues.apache.org/jira/browse/MIME4J-330
By the way, would you like to try contributing a fix for this?
If not, we may take a look, but I can't guarantee when.
Thanks again for the report!
Rene.
On 9/17/24 2:28 PM, Madis Loitmaa wrote:
Hello,
It looks like the attached unit test got lost in transit, although it is
present in my sent folder. I'll try to inline it at the end of this
email instead.
Best regards,
Madis Loitmaa
// file src/test/java/org/apache/james/mime4j/parser/PartLengthTest.java
package org.apache.james.mime4j.parser;

import org.apache.commons.io.IOUtils;
import org.apache.james.mime4j.MimeException;
import org.apache.james.mime4j.stream.BodyDescriptor;
import org.junit.Assert;
import org.junit.Test;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class PartLengthTest {

    @Test
    public void testExtractPartWithDifferentLengths() throws Exception {
        StringBuilder partBuilder = new StringBuilder();
        // Grow the body one character at a time and re-parse at every length.
        for (int i = 1; i <= 5000; i++) {
            partBuilder.append(i % 80 == 0 ? "\n" : "A");
            String part = partBuilder.toString();
            String mimeMessage = createMimeMultipart(part);
            String extracted = extractPart(mimeMessage);
            if (!part.equals(extracted)) {
                System.out.println("Extracted part comparison failed for part length " + i);
            }
            Assert.assertEquals(part, extracted);
        }
    }

    // Wraps the body in a single-part multipart message that uses CRLF line endings.
    private String createMimeMultipart(String part) {
        return "Content-type: multipart/mixed; boundary=QvEgqhjEnYxz\r\n"
                + "\r\n"
                + "--QvEgqhjEnYxz\r\n"
                + "Content-Type: text/plain\r\n"
                + "\r\n"
                + part
                + "\r\n"
                + "--QvEgqhjEnYxz--\r\n";
    }

    // Parses the message and returns the body of its (only) part.
    private String extractPart(String mimeMessage) throws MimeException, IOException {
        String[] resultWrapper = new String[1];
        MimeStreamParser parser = new MimeStreamParser();
        parser.setContentHandler(new AbstractContentHandler() {
            @Override
            public void body(BodyDescriptor bd, InputStream is) throws MimeException, IOException {
                resultWrapper[0] = IOUtils.toString(is, StandardCharsets.UTF_8);
            }
        });
        parser.parse(new ByteArrayInputStream(mimeMessage.getBytes(StandardCharsets.UTF_8)));
        return resultWrapper[0];
    }
}
//end-of-file
On Tue, 17 September 2024 at 05:55, Rene Cordier (<rcord...@apache.org>)
wrote:
Hello Madis,
First of all, thank you for your report. If there is indeed a regression,
we should take a look and fix it.
However, I don't see any file attached to your email; perhaps it was
forgotten. Could you please resend your unit test as an attachment to
your reply? That would help the community check, understand, and take
proper action on a fix.
Thank you and best regards,
Rene.
On 9/11/24 1:32 PM, Madis Loitmaa wrote:
Hello Mime4j Developers,
I am reporting a regression in the MimeStreamParser when extracting
parts of a multipart message.
Specifically, when processing messages with certain body lengths, the
CR character (from the CRLF sequence preceding the boundary marker) is
incorrectly included as the last character of the part body.
Environment:
This issue was identified after upgrading Mime4j in our project from
version 0.8.7 to 0.8.11. It appears to affect all versions starting
from 0.8.8.
Reproduction:
- I have attached a unit test to this email, which demonstrates the
problem. The test fails for a part body length of 4051 bytes on the
current master branch (commit
85995590ad6700cc8bf7a3b8462ce87843dab5bd), but passes when tested with
version 0.8.7 (commit ed5a50c8071080b4eaedd6ab13baf25843d691a3).
- The bug appears when CRLF is used as the line separator. The issue
does not occur when LF is used.
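
For anyone who wants to build the failing input without the full test, here is a minimal sketch that constructs the 4051-byte body and the surrounding multipart message with a configurable line separator, so the CRLF and LF variants can be compared. The class and method names are hypothetical; the body pattern and boundary string are taken from the unit test, and the line-separator parameter is an addition for illustration. It uses no Mime4j classes; the resulting bytes would still need to be fed to MimeStreamParser to observe the problem.

```java
import java.nio.charset.StandardCharsets;

class ReproMessageSketch {

    // Same body pattern as the unit test: 'A' characters, with '\n' at every 80th position.
    static String buildPart(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 1; i <= length; i++) {
            sb.append(i % 80 == 0 ? '\n' : 'A');
        }
        return sb.toString();
    }

    // Single-part multipart message; eol is "\r\n" (fails per the report) or "\n" (passes).
    static String buildMessage(String part, String eol) {
        return "Content-type: multipart/mixed; boundary=QvEgqhjEnYxz" + eol
                + eol
                + "--QvEgqhjEnYxz" + eol
                + "Content-Type: text/plain" + eol
                + eol
                + part
                + eol
                + "--QvEgqhjEnYxz--" + eol;
    }

    // The reported symptom: the extracted body equals the expected body plus a trailing CR.
    static boolean hasStrayTrailingCr(String expected, String extracted) {
        return extracted.equals(expected + "\r");
    }

    public static void main(String[] args) {
        String part = buildPart(4051);
        byte[] crlf = buildMessage(part, "\r\n").getBytes(StandardCharsets.US_ASCII);
        byte[] lf = buildMessage(part, "\n").getBytes(StandardCharsets.US_ASCII);
        System.out.println("body length: " + part.length());
        System.out.println("CRLF message bytes: " + crlf.length);
        System.out.println("LF message bytes: " + lf.length);
    }
}
```

hasStrayTrailingCr can be applied to the parser output to distinguish this specific symptom from other mismatches.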
Attachments:
A unit test demonstrating the issue.
src/test/java/org/apache/james/mime4j/parser/PartLengthTest.java
Please let me know if you need any further information or
clarification.
Best regards,
Madis Loitmaa