Hi,
A quick bisect seems to indicate that the regression was introduced in
commit 7e23d5e1cc2bd77f3e5129622472b995cd3f034e ("MIME4J-316 Parts missing
in case of a specific combination of boundaries"):
--- a/core/src/main/java/org/apache/james/mime4j/io/MimeBoundaryInputStream.java
+++ b/core/src/main/java/org/apache/james/mime4j/io/MimeBoundaryInputStream.java
@@ -244,11 +244,14 @@ public class MimeBoundaryInputStream extends LineReaderInputStream {
// Make sure the boundary is terminated with EOS
break;
} else {
- // or with a whitespace or '-' char
+ // or with a whitespace or '--'
char ch = (char)(buffer.byteAt(pos));
- if (CharsetUtil.isWhitespace(ch) || ch == '-') {
+ if (CharsetUtil.isWhitespace(ch)) {
break;
}
+ if (ch == '-' && remaining > 1 && (char)(buffer.byteAt(pos+1)) == '-') {
+ break;
+ }
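
For anyone who wants to poke at the changed condition in isolation, here is a minimal sketch of the two checks side by side. It is not the mime4j code: the class and method names are made up, isWhitespace is a stand-in that I assume behaves like CharsetUtil.isWhitespace (space, tab, CR, LF), and I assume "remaining" is the number of bytes left in the buffer starting at pos.

```java
import java.nio.charset.StandardCharsets;

public class BoundaryTerminatorSketch {
    // Hypothetical stand-in for CharsetUtil.isWhitespace (assumed: SP, HT, CR, LF).
    static boolean isWhitespace(char ch) {
        return ch == ' ' || ch == '\t' || ch == '\r' || ch == '\n';
    }

    // Condition before the commit: whitespace or a single '-' is accepted.
    static boolean terminatesOld(byte[] buf, int pos) {
        char ch = (char) buf[pos];
        return isWhitespace(ch) || ch == '-';
    }

    // Condition after the commit: whitespace, or "--" when at least two bytes
    // are available in the buffer (assumed meaning of 'remaining').
    static boolean terminatesNew(byte[] buf, int pos, int remaining) {
        char ch = (char) buf[pos];
        if (isWhitespace(ch)) {
            return true;
        }
        return ch == '-' && remaining > 1 && (char) buf[pos + 1] == '-';
    }

    static void show(String afterBoundary) {
        byte[] buf = afterBoundary.getBytes(StandardCharsets.US_ASCII);
        System.out.println(terminatesOld(buf, 0) + " " + terminatesNew(buf, 0, buf.length));
    }

    public static void main(String[] args) {
        show("--\r\n"); // prints "true true"
        show("\r\n");   // prints "true true"
        show("-x");     // prints "true false"
        show("-");      // prints "true false" (only one byte buffered)
    }
}
```

The last case is the one where the two conditions disagree purely because of how much data happens to be buffered. Whether that is what actually triggers the reported failure is not something the bisect alone establishes.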
Regards,
Markus
On Tue, Sep 24, 2024 at 10:32 AM Rene Cordier <[email protected]> wrote:
> Hi Madis,
>
> Sorry for the delayed answer.
>
> I created a task on the JIRA regarding your report:
> https://issues.apache.org/jira/browse/MIME4J-330
>
> Would you like to try contributing a fix for this, by the way?
> If not, we might take a look, but I can't guarantee when.
>
> Thanks again for the report!
>
> Rene.
>
> On 9/17/24 2:28 PM, Madis Loitmaa wrote:
> > Hello,
> > Looks like the attached unit test got lost in transit, it's present in
> > my sent folder. I'll try to inline it at the end of the email this
> > time.
> >
> > Best regards,
> > Madis Loitmaa
> >
> > // file src/test/java/org/apache/james/mime4j/parser/PartLengthTest.java
> > package org.apache.james.mime4j.parser;
> >
> > import org.apache.commons.io.IOUtils;
> > import org.apache.james.mime4j.MimeException;
> > import org.apache.james.mime4j.stream.BodyDescriptor;
> > import org.junit.Assert;
> > import org.junit.Test;
> >
> > import java.io.ByteArrayInputStream;
> > import java.io.IOException;
> > import java.io.InputStream;
> > import java.nio.charset.StandardCharsets;
> >
> > public class PartLengthTest {
> >     @Test
> >     public void testExtractPartWithDifferentLengths() throws Exception {
> >         StringBuilder partBuilder = new StringBuilder();
> >         // Grow the part one character at a time and re-parse at every length.
> >         for (int i = 1; i <= 5000; i++) {
> >             partBuilder.append(i % 80 == 0 ? "\n" : "A");
> >             String part = partBuilder.toString();
> >             String mimeMessage = createMimeMultipart(part);
> >
> >             String extracted = extractPart(mimeMessage);
> >
> >             if (!part.equals(extracted)) {
> >                 System.out.println("Extracted part comparison failed for part length " + i);
> >             }
> >             Assert.assertEquals(part, extracted);
> >         }
> >     }
> >
> >     // Wraps the part in a single-part multipart/mixed message with CRLF line endings.
> >     private String createMimeMultipart(String part) {
> >         return "Content-type: multipart/mixed; boundary=QvEgqhjEnYxz\r\n"
> >                 + "\r\n"
> >                 + "--QvEgqhjEnYxz\r\n"
> >                 + "Content-Type: text/plain\r\n"
> >                 + "\r\n"
> >                 + part
> >                 + "\r\n"
> >                 + "--QvEgqhjEnYxz--\r\n";
> >     }
> >
> >     // Parses the message and returns the body of the (only) part.
> >     private String extractPart(String mimeMessage) throws MimeException, IOException {
> >         String[] resultWrapper = new String[1];
> >
> >         MimeStreamParser parser = new MimeStreamParser();
> >         parser.setContentHandler(new AbstractContentHandler() {
> >             @Override
> >             public void body(BodyDescriptor bd, InputStream is)
> >                     throws MimeException, IOException {
> >                 resultWrapper[0] = IOUtils.toString(is, StandardCharsets.UTF_8);
> >             }
> >         });
> >         // Use an explicit charset so the test does not depend on the platform default.
> >         parser.parse(new ByteArrayInputStream(mimeMessage.getBytes(StandardCharsets.UTF_8)));
> >         return resultWrapper[0];
> >     }
> > }
> > //end-of-file
> >
> >
> > On Tue, 17 September 2024 at 05:55, Rene Cordier (<[email protected]>) wrote:
> >> Hello Madis,
> >>
> >> First of all thank you for your report. If there is indeed a regression
> >> we should take a look and fix it.
> >>
> >> However, I don't see any file attached to your email. Maybe you forgot it?
> >> Could you please resend your unit test as an attachment to your reply?
> >> That would help the community check, understand, and take proper action
> >> on a fix.
> >>
> >> Thank you and best regards,
> >>
> >> Rene.
> >>
> >> On 9/11/24 1:32 PM, Madis Loitmaa wrote:
> >>> Hello Mime4j Developers,
> >>>
> >>> I am reporting a regression in the MimeStreamParser when extracting
> >>> parts of a multipart message.
> >>>
> >>> Specifically, when processing messages with certain body lengths, the
> >>> CR character (from the CRLF sequence preceding the boundary marker) is
> >>> incorrectly included as the last character of the part body.
> >>>
> >>> Environment:
> >>> This issue was identified after upgrading Mime4j in our project from
> >>> version 0.8.7 to 0.8.11. It appears to affect all versions starting
> >>> from 0.8.8.
> >>>
> >>> Reproduction:
> >>> - I have attached a unit test to this email, which demonstrates the
> >>> problem. The test fails for a part body length of 4051 bytes on the
> >>> current master branch (commit
> >>> 85995590ad6700cc8bf7a3b8462ce87843dab5bd), but passes when tested with
> >>> version 0.8.7 (commit ed5a50c8071080b4eaedd6ab13baf25843d691a3).
> >>>
> >>> - The bug appears when CRLF is used as the line separator. The issue
> >>> does not occur when LF is used.
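
For reference, the expected behaviour can be sketched without mime4j at all. Per RFC 2046, the CRLF preceding a boundary delimiter line belongs to the delimiter, so it must not show up in the part body. The class and method below are hypothetical and deliberately naive (plain string search, no error handling, assumes the part has headers followed by a blank line); they only illustrate what the parser should return for the test message.

```java
public class DelimiterSketch {
    // Returns the body of the first part, treating CRLF + "--" + boundary
    // as the closing delimiter so the CRLF is excluded from the body.
    static String firstPartBody(String message, String boundary) {
        String open = "--" + boundary + "\r\n";
        int start = message.indexOf(open);
        // The body starts after the blank line that ends the part headers.
        int bodyStart = message.indexOf("\r\n\r\n", start + open.length()) + 4;
        int bodyEnd = message.indexOf("\r\n--" + boundary, bodyStart);
        return message.substring(bodyStart, bodyEnd);
    }

    public static void main(String[] args) {
        String message = "Content-type: multipart/mixed; boundary=QvEgqhjEnYxz\r\n"
                + "\r\n"
                + "--QvEgqhjEnYxz\r\n"
                + "Content-Type: text/plain\r\n"
                + "\r\n"
                + "AAA"
                + "\r\n"
                + "--QvEgqhjEnYxz--\r\n";
        String body = firstPartBody(message, "QvEgqhjEnYxz");
        System.out.println(body);                // prints "AAA"
        System.out.println(body.endsWith("\r")); // prints "false"
    }
}
```

In the failing case described above, the extracted body would instead end with the stray CR.
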
> >>>
> >>> Attachments:
> >>> A unit test demonstrating the issue.
> >>> src/test/java/org/apache/james/mime4j/parser/PartLengthTest.java
> >>>
> >>> Please let me know if you need any further information or
> >>> clarification.
> >>>
> >>> Best regards,
> >>> Madis Loitmaa
>