Re: Regression in MimeStreamParser: part body stream ends with CR

Markus Wiederkehr Tue, 24 Sep 2024 13:18:37 -0700

Hi,

A quick bisect suggests that the regression was introduced by commit
7e23d5e1cc2bd77f3e5129622472b995cd3f034e ("MIME4J-316 Parts missing in
case of a specific combination of boundaries"):


--- a/core/src/main/java/org/apache/james/mime4j/io/MimeBoundaryInputStream.java
+++ b/core/src/main/java/org/apache/james/mime4j/io/MimeBoundaryInputStream.java
@@ -244,11 +244,14 @@ public class MimeBoundaryInputStream extends LineReaderInputStream {
                     // Make sure the boundary is terminated with EOS
                     break;
                 } else {
-                    // or with a whitespace or '-' char
+                    // or with a whitespace or '--'
                     char ch = (char)(buffer.byteAt(pos));
-                    if (CharsetUtil.isWhitespace(ch) || ch == '-') {
+                    if (CharsetUtil.isWhitespace(ch)) {
                         break;
                     }
+                    if (ch == '-' && remaining > 1 && (char)(buffer.byteAt(pos+1)) == '-') {
+                        break;
+                    }
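
To make the effect of this hunk concrete, here is a minimal standalone
sketch of the two terminator checks. It is not Mime4j code: the class
BoundaryTerminatorSketch, its method names, and the local isWhitespace
helper are hypothetical stand-ins, and "remaining" is assumed to be the
number of buffered bytes after the boundary match. It shows one input on
which the two checks disagree, namely a buffer that ends right after the
first '-' of the closing "--". Whether that case is what actually triggers
the trailing CR has not been confirmed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Mime4j.
final class BoundaryTerminatorSketch {

    // Stand-in for CharsetUtil.isWhitespace (assumed: space, tab, CR, LF).
    static boolean isWhitespace(char ch) {
        return ch == ' ' || ch == '\t' || ch == '\r' || ch == '\n';
    }

    // Check as it read before the commit: nothing buffered after the
    // boundary, or a whitespace char, or a single '-'.
    static boolean terminatedBefore(byte[] buffer, int pos) {
        int remaining = buffer.length - pos;
        if (remaining <= 0) {
            return true; // the original comment treats this as EOS
        }
        char ch = (char) buffer[pos];
        return isWhitespace(ch) || ch == '-';
    }

    // Check as it reads after the commit: a '-' only counts when a second
    // '-' is already available in the buffer.
    static boolean terminatedAfter(byte[] buffer, int pos) {
        int remaining = buffer.length - pos;
        if (remaining <= 0) {
            return true;
        }
        char ch = (char) buffer[pos];
        if (isWhitespace(ch)) {
            return true;
        }
        return ch == '-' && remaining > 1 && (char) buffer[pos + 1] == '-';
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String boundary = "--QvEgqhjEnYxz";
        int pos = boundary.length(); // index of the first byte after the boundary

        byte[] full = ascii(boundary + "--\r\n"); // closing delimiter fully buffered
        byte[] cut = ascii(boundary + "-");       // buffer ends after the first '-'

        System.out.println("full: before=" + terminatedBefore(full, pos)
                + ", after=" + terminatedAfter(full, pos));
        System.out.println("cut:  before=" + terminatedBefore(cut, pos)
                + ", after=" + terminatedAfter(cut, pos));
    }
}
```

Both checks accept the fully buffered closing delimiter. For the cut-off
buffer the old check accepts and the new one rejects, because remaining is
1 there.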

Regards,
Markus


On Tue, Sep 24, 2024 at 10:32 AM Rene Cordier <rcord...@apache.org> wrote:

> Hi Madis,
>
> Sorry for the delayed reply.
>
> I created a task on the JIRA regarding your report:
> https://issues.apache.org/jira/browse/MIME4J-330
>
> By the way, would you be interested in contributing a fix for this?
> If not, we can take a look, but I can't guarantee when.
>
> Thanks again for the report!
>
> Rene.
>
> On 9/17/24 2:28 PM, Madis Loitmaa wrote:
> > Hello,
> > It looks like the attached unit test got lost in transit, although it
> > is present in my sent folder. I'll try to inline it at the end of this
> > email instead.
> >
> > Best regards,
> > Madis Loitmaa
> >
> > // file src/test/java/org/apache/james/mime4j/parser/PartLengthTest.java
> > package org.apache.james.mime4j.parser;
> >
> > import org.apache.commons.io.IOUtils;
> > import org.apache.james.mime4j.MimeException;
> > import org.apache.james.mime4j.stream.BodyDescriptor;
> > import org.junit.Assert;
> > import org.junit.Test;
> >
> > import java.io.ByteArrayInputStream;
> > import java.io.IOException;
> > import java.io.InputStream;
> > import java.nio.charset.StandardCharsets;
> >
> > public class PartLengthTest {
> >      @Test
> >      public void testExtractPartWithDifferentLengths() throws Exception {
> >          StringBuilder partBuilder = new StringBuilder();
> >          for (int i = 1; i <= 5000; i++) {
> >              partBuilder.append(i % 80 == 0 ? "\n" : "A");
> >              String part = partBuilder.toString();
> >              String mimeMessage = createMimeMultipart(part);
> >
> >              String extracted = extractPart(mimeMessage);
> >
> >              if (!part.equals(extracted)) {
> >                  System.out.println("Extracted part comparison failed for part length " + i);
> >              }
> >              Assert.assertEquals(part, extracted);
> >          }
> >      }
> >
> >      private String createMimeMultipart(String part) {
> >          return "Content-type: multipart/mixed; boundary=QvEgqhjEnYxz\r\n"
> >                  + "\r\n"
> >                  + "--QvEgqhjEnYxz\r\n"
> >                  + "Content-Type: text/plain\r\n"
> >                  + "\r\n"
> >                  + part
> >                  + "\r\n"
> >                  + "--QvEgqhjEnYxz--\r\n";
> >      }
> >
> >      private String extractPart(String mimeMessage) throws MimeException, IOException {
> >          String[] resultWrapper = new String[1];
> >
> >          MimeStreamParser parser = new MimeStreamParser();
> >          parser.setContentHandler(new AbstractContentHandler() {
> >              @Override
> >              public void body(BodyDescriptor bd, InputStream is) throws MimeException, IOException {
> >                  resultWrapper[0] = new String(IOUtils.toString(is, StandardCharsets.UTF_8).getBytes());
> >              }
> >          });
> >          parser.parse(new ByteArrayInputStream(mimeMessage.getBytes()));
> >          return resultWrapper[0];
> >      }
> > }
> > //end-of-file
> >
> >
> > Rene Cordier (<rcord...@apache.org>) wrote on Tue, 17 September 2024
> > at 05:55:
> >> Hello Madis,
> >>
> >> First of all thank you for your report. If there is indeed a regression
> >> we should take a look and fix it.
> >>
> >> However, I don't see any file attached to your email. Maybe you forgot
> >> it? Could you please resend your unit test as an attachment to your
> >> reply? That would help the community to check, understand and
> >> take proper action for a fix.
> >>
> >> Thank you and best regards,
> >>
> >> Rene.
> >>
> >> On 9/11/24 1:32 PM, Madis Loitmaa wrote:
> >>> Hello Mime4j Developers,
> >>>
> >>> I am reporting a regression in the MimeStreamParser when extracting
> >>> parts of a multipart message.
> >>>
> >>> Specifically, when processing messages with certain body lengths, the
> >>> CR character (from the CRLF sequence preceding the boundary marker) is
> >>> incorrectly included as the last character of the part body.
> >>>
> >>> Environment:
> >>> This issue was identified after upgrading Mime4j in our project from
> >>> version 0.8.7 to 0.8.11. It appears to affect all versions starting
> >>> from 0.8.8.
> >>>
> >>> Reproduction:
> >>> - I have attached a unit test to this email, which demonstrates the
> >>> problem. The test fails for a part body length of 4051 bytes on the
> >>> current master branch (commit
> >>> 85995590ad6700cc8bf7a3b8462ce87843dab5bd), but passes when tested with
> >>> version 0.8.7 (commit ed5a50c8071080b4eaedd6ab13baf25843d691a3).
> >>>
> >>> - The bug appears when CRLF is used as the line separator. The issue
> >>> does not occur when LF is used.
> >>>
> >>> Attachments:
> >>> A unit test demonstrating the issue.
> >>> src/test/java/org/apache/james/mime4j/parser/PartLengthTest.java
> >>>
> >>> Please let me know if you need any further information or clarification.
> >>>
> >>> Best regards,
> >>> Madis Loitmaa
>
