[jira] [Commented] (TIKA-2594) Mail detected as application/xhtml+xml

2018-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390350#comment-16390350
 ] 

Hudson commented on TIKA-2594:
--

SUCCESS: Integrated in Jenkins build tika-branch-1x #6 (See 
[https://builds.apache.org/job/tika-branch-1x/6/])
TIKA-2594 improve eml detection via Luis Filipe Nassif (tallison: 
[https://github.com/apache/tika/commit/e12117c0e4792404eca825df0d2ae9925f0d5d18])
* (edit) tika-server/pom.xml
* (edit) tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml


> Mail detected as application/xhtml+xml
> --
>
> Key: TIKA-2594
> URL: https://issues.apache.org/jira/browse/TIKA-2594
> Project: Tika
> Issue Type: Bug
> Affects Versions: 2.0, 1.16, 1.17
> Reporter: Andreas Meier
> Priority: Major
> Fix For: 1.18, 2.0.0
>
> Attachments: TestMail_inline_xhtml_plus_image.eml
>
>
> The attached mail (message/rfc822) with inline XHTML is detected as
> application/xhtml+xml.
> Regards
> Andreas



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (TIKA-2594) Mail detected as application/xhtml+xml

2018-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390334#comment-16390334
 ] 

Hudson commented on TIKA-2594:
--

SUCCESS: Integrated in Jenkins build Tika-trunk #1454 (See 
[https://builds.apache.org/job/Tika-trunk/1454/])
TIKA-2594 improve eml detection via Luis Filipe Nassif (tallison: 
[https://github.com/apache/tika/commit/9c0a822419797f20a09388ccd235c7e70db9])
* (edit) tika-server/pom.xml
* (edit) tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml




[jira] [Commented] (TIKA-2594) Mail detected as application/xhtml+xml

2018-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390308#comment-16390308
 ] 

Hudson commented on TIKA-2594:
--

UNSTABLE: Integrated in Jenkins build tika-2.x-windows #215 (See 
[https://builds.apache.org/job/tika-2.x-windows/215/])
TIKA-2594 improve eml detection via Luis Filipe Nassif (tallison: rev 
9c0a822419797f20a09388ccd235c7e70db9)
* (edit) tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
* (edit) tika-server/pom.xml




[jira] [Commented] (TIKA-2594) Mail detected as application/xhtml+xml

2018-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390251#comment-16390251
 ] 

Hudson commented on TIKA-2594:
--

SUCCESS: Integrated in Jenkins build tika-branch-1x #5 (See 
[https://builds.apache.org/job/tika-branch-1x/5/])
TIKA-2594 -- improve eml detection for those starting with Subject: and 
(tallison: 
[https://github.com/apache/tika/commit/b9e9e5b150aca851465e99017da6328c202ba127])
* (add) 
tika-parsers/src/test/resources/test-documents/testEML_embedded_xhtml_and_img.eml
* (edit) tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
* (edit) tika-parsers/src/test/java/org/apache/tika/mime/TestMimeTypes.java




[jira] [Commented] (TIKA-2594) Mail detected as application/xhtml+xml

2018-03-07 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390197#comment-16390197
 ] 

Tim Allison commented on TIKA-2594:
---

[~lfcnassif], I added the mime defs you suggested above just now to both 2.0.0 
and 1.18.  Thank you!



[jira] [Commented] (TIKA-2594) Mail detected as application/xhtml+xml

2018-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390059#comment-16390059
 ] 

Hudson commented on TIKA-2594:
--

SUCCESS: Integrated in Jenkins build Tika-trunk #1451 (See 
[https://builds.apache.org/job/Tika-trunk/1451/])
TIKA-2594 -- improve eml detection for those starting with Subject: and 
(tallison: 
[https://github.com/apache/tika/commit/09031046e5bece75ed22d9ee9b184ec49a14f99a])
* (add) 
tika-parsers/src/test/resources/test-documents/testEML_embedded_xhtml_and_img.eml
* (edit) tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
* (edit) tika-parsers/src/test/java/org/apache/tika/mime/TestMimeTypes.java




[jira] [Commented] (TIKA-2594) Mail detected as application/xhtml+xml

2018-03-07 Thread Luis Filipe Nassif (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390028#comment-16390028
 ] 

Luis Filipe Nassif commented on TIKA-2594:
--

We have used that magic, restricted to offsets 0:1000, for a long time with 
very few false positives, along with:

{code}

 
 
 
{code}
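
To make the idea of a magic restricted to 0:1000 concrete, here is a minimal
sketch in plain Java. It does not use Tika and is not the definition from this
comment. It assumes that an offset range means "the pattern may begin at any
offset from start to end, inclusive". The class name, the method names and the
header tokens other than a leading "Subject:" are hypothetical, chosen only
for illustration.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: not Tika code, and not the mime definitions
// discussed in this thread.
class MagicWindowSketch {

    // Hypothetical header tokens (assumption, for illustration only).
    static final String[] HEADER_TOKENS = {"\nMessage-ID:", "\nReceived:", "\nDate:"};

    // Returns true if token begins at some offset in [start, end], inclusive.
    static boolean matchesWithin(byte[] data, String token, int start, int end) {
        byte[] needle = token.getBytes(StandardCharsets.US_ASCII);
        // Never read past the end of the buffer.
        int last = Math.min(end, data.length - needle.length);
        for (int i = Math.max(start, 0); i <= last; i++) {
            int j = 0;
            while (j < needle.length && data[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return true;
            }
        }
        return false;
    }

    // A file counts as mail here if it starts with "Subject:" and another
    // header token begins within the first 1000 bytes.
    static boolean looksLikeEmail(byte[] data) {
        if (!matchesWithin(data, "Subject:", 0, 0)) {
            return false;
        }
        for (String token : HEADER_TOKENS) {
            if (matchesWithin(data, token, 0, 1000)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] eml = "Subject: test\r\nDate: today\r\n\r\n<html/>"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikeEmail(eml));
    }
}
```

Bounding the search window is what keeps false positives low in this sketch: a
document that merely mentions a header name deep in its body is not matched,
because only the first 1000 bytes are examined.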



[jira] [Commented] (TIKA-2594) Mail detected as application/xhtml+xml

2018-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389944#comment-16389944
 ] 

Hudson commented on TIKA-2594:
--

FAILURE: Integrated in Jenkins build tika-2.x-windows #212 (See 
[https://builds.apache.org/job/tika-2.x-windows/212/])
TIKA-2594 -- improve eml detection for those starting with Subject: and 
(tallison: rev 09031046e5bece75ed22d9ee9b184ec49a14f99a)
* (edit) tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
* (edit) tika-parsers/src/test/java/org/apache/tika/mime/TestMimeTypes.java
* (add) 
tika-parsers/src/test/resources/test-documents/testEML_embedded_xhtml_and_img.eml




[jira] [Commented] (TIKA-2594) Mail detected as application/xhtml+xml

2018-03-07 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389661#comment-16389661
 ] 

Tim Allison commented on TIKA-2594:
---

Should we add the following, or is this too lenient?

{noformat}
  
{noformat}
