Thomas Fricker created MIME4J-316:
-------------------------------------

             Summary: Parts missing in case of a specific combination of boundaries
                 Key: MIME4J-316
                 URL: https://issues.apache.org/jira/browse/MIME4J-316
             Project: James Mime4j
          Issue Type: Bug
          Components: parser (core)
    Affects Versions: 0.7.2
            Reporter: Thomas Fricker

The problem can be reproduced by parsing a very specific email structure, in which an inner boundary consists of the name of an outer boundary followed by a "-" character and further text (here, the outer boundary plus "-1"). In the following example, the attached PDF file is ignored by the parser.

{code:java}
Content-Type: multipart/mixed; boundary="--boundary.1652331600846930886"

----boundary.1652331600846930886
Content-Type: multipart/alternative; boundary="--boundary.1652331600846930886-1"

----boundary.1652331600846930886-1
Content-Type: text/plain; charset=utf-8

sometext
----boundary.1652331600846930886-1
Content-Type: text/html; charset=utf-8

<html lang="en">
<body>
</body>
</html>
----boundary.1652331600846930886-1--
----boundary.1652331600846930886
Content-Type: application/pdf; name="test.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: Attachment; filename="test.pdf"

JVBERi0xLj4Kc3RhcnR4cmVmCjUzNjEwCiUlRU9GCgshortened==
----boundary.1652331600846930886--
{code}

Dumping the EntityState during parsing produces:

{code:java}
State: T_START_MULTIPART
State: T_PREAMBLE
State: T_END_MULTIPART
State: T_END_BODYPART
State: T_START_BODYPART
State: T_START_HEADER
Header field detected: Content-Type: text/plain; charset=utf-8
State: T_END_HEADER
Body detected, contents = [LineReaderInputStreamAdaptor: [pos: 43][limit: 103]....]], header data = [mimeType=text/plain, mediaType=text, subType=plain, boundary=null, charset=utf-8]
State: T_END_BODYPART
State: T_START_BODYPART
State: T_START_HEADER
Header field detected: Content-Type: text/html; charset=utf-8
State: T_END_HEADER
Body detected, contents = [LineReaderInputStreamAdaptor: [pos: 42][limit: 313]], header data = [mimeType=text/html, mediaType=text, subType=html, boundary=null, charset=utf-8]
State: T_END_BODYPART
State: T_EPILOGUE
State: T_END_MULTIPART
State: T_END_MESSAGE
{code}

The PDF attachment is missing: no body part with Content-Type application/pdf is ever reported.

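To illustrate why this combination of boundaries is delicate, here is a minimal, self-contained sketch. It is not the Mime4j implementation and makes no claim about where the actual bug lies. It only contrasts a naive prefix test with a whole-line test in the spirit of RFC 2046, section 5.1.1, where a delimiter line is "--" plus the boundary, optionally followed by "--", then only trailing whitespace. The class name BoundaryMatch and its methods are hypothetical names introduced for this example.

```java
import java.util.List;

public class BoundaryMatch {

    /** Result of testing one line against one boundary value. */
    public enum Kind { NONE, DELIMITER, CLOSE_DELIMITER }

    /** Naive test (hypothetical): any line starting with "--" + boundary counts. */
    public static boolean naiveIsDelimiter(String line, String boundary) {
        return line.startsWith("--" + boundary);
    }

    /**
     * Whole-line test: "--" + boundary, an optional "--" marking the close
     * delimiter, then nothing but spaces or tabs up to the end of the line.
     */
    public static Kind classify(String line, String boundary) {
        String delimiter = "--" + boundary;
        if (!line.startsWith(delimiter)) {
            return Kind.NONE;
        }
        String rest = line.substring(delimiter.length());
        boolean close = rest.startsWith("--");
        if (close) {
            rest = rest.substring(2);
        }
        // Anything other than trailing whitespace means the line belongs
        // to a different (longer) boundary or to ordinary body content.
        for (int i = 0; i < rest.length(); i++) {
            char c = rest.charAt(i);
            if (c != ' ' && c != '\t') {
                return Kind.NONE;
            }
        }
        return close ? Kind.CLOSE_DELIMITER : Kind.DELIMITER;
    }

    public static void main(String[] args) {
        // Outer boundary value taken from the example message above.
        String outer = "--boundary.1652331600846930886";
        List<String> lines = List.of(
                "----boundary.1652331600846930886",
                "----boundary.1652331600846930886-1",
                "----boundary.1652331600846930886-1--",
                "----boundary.1652331600846930886--");
        for (String line : lines) {
            System.out.println(line
                    + "  naive=" + naiveIsDelimiter(line, outer)
                    + "  strict=" + classify(line, outer));
        }
    }
}
```

Tested against the outer boundary, the naive check accepts all four lines, including the two that belong to the inner "-1" boundary. The whole-line check classifies those two as NONE and recognizes only the real outer delimiter and close delimiter.
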
I proposed the following fix: [https://github.com/apache/james-mime4j/pull/71], which produces the following structure:

{code:java}
State: T_START_MULTIPART
State: T_START_BODYPART
State: T_START_HEADER
Header field detected: Content-Type: multipart/alternative; boundary="--boundary.1652331600846930886-1"
State: T_END_HEADER
Multipart message detexted, header data = [mimeType=multipart/alternative, mediaType=multipart, subType=alternative, boundary=--boundary.1652331600846930886-1, charset=null]
State: T_START_MULTIPART
State: T_START_BODYPART
State: T_START_HEADER
Header field detected: Content-Type: text/plain; charset=utf-8
State: T_END_HEADER
Body detected, contents = [LineReaderInputStreamAdaptor: [pos: 43][limit: 103]], header data = [mimeType=text/plain, mediaType=text, subType=plain, boundary=null, charset=utf-8]
State: T_END_BODYPART
State: T_START_BODYPART
State: T_START_HEADER
Header field detected: Content-Type: text/html; charset=utf-8
State: T_END_HEADER
Body detected, contents = [LineReaderInputStreamAdaptor: [pos: 42][limit: 313]]], header data = [mimeType=text/html, mediaType=text, subType=html, boundary=null, charset=utf-8]
State: T_END_BODYPART
State: T_END_MULTIPART
State: T_END_BODYPART
State: T_START_BODYPART
State: T_START_HEADER
Header field detected: Content-Type: application/pdf; name="Daily_Stats-2022-05-12-0700.pdf"
Header field detected: Content-Transfer-Encoding: base64
Header field detected: Content-Disposition: Attachment; filename="Daily_Stats-2022-05-12-0700.pdf"
State: T_END_HEADER
Body detected, contents = [LineReaderInputStreamAdaptor: [pos: 189][limit: 235][JVBERi0xLj4Kc3RhcnR4cmVmCjUzNjEwCiUlRU9GCg== ]], header data = [mimeType=application/pdf, mediaType=application, subType=pdf, boundary=null, charset=null]
State: T_END_BODYPART
State: T_END_MULTIPART
State: T_END_MESSAGE
{code}

I shortened the output of the body parts.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)