[ https://issues.apache.org/jira/browse/TIKA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390028#comment-16390028 ]
Luis Filipe Nassif commented on TIKA-2594:
------------------------------------------

We have used that magic restricted to 0:1000 for a long time, with very few false positives, along with:
{code}
<match value="\nReturn-Path:" type="stringignorecase" offset="0:1000"/>
<match value="\nX-Originating-IP:" type="stringignorecase" offset="0:1000"/>
<match value="\nReceived:" type="stringignorecase" offset="0:1000"/>
<match value="\nMessage-ID:" type="stringignorecase" offset="0:1000"/>
{code}

> Mail detected as application/xhtml+xml
> --------------------------------------
>
>                 Key: TIKA-2594
>                 URL: https://issues.apache.org/jira/browse/TIKA-2594
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 2.0, 1.16, 1.17
>            Reporter: Andreas Meier
>            Priority: Major
>         Attachments: TestMail_inline_xhtml_plus_image.eml
>
>
> The attached mail (message/rfc822) with inline xhtml is recognized as application/xhtml+xml
> Regards
> Andreas

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)