[ 
https://issues.apache.org/jira/browse/TIKA-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216319#comment-16216319
 ] 

Hudson commented on TIKA-2455:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1381 (See 
[https://builds.apache.org/job/Tika-trunk/1381/])
TIKA-2455: flag the containing multipart type (github: 
[https://github.com/apache/tika/commit/4de0c66ad4b28402597fd9cb03978ba00bdc2e9f])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/mail/MailContentHandler.java
TIKA-2455: test for feature; only store multipart subtype in metadata (mattcg: 
[https://github.com/apache/tika/commit/33da38ebb209250680aee3ab8565c89846f3a865])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/mail/MailContentHandler.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/mail/RFC822ParserTest.java
* (edit) tika-core/src/main/java/org/apache/tika/metadata/Message.java


> Flag in metadata for alternative email bodies
> ---------------------------------------------
>
>                 Key: TIKA-2455
>                 URL: https://issues.apache.org/jira/browse/TIKA-2455
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.16
>            Reporter: Matthew Caruana Galizia
>            Priority: Minor
>              Labels: attachments, multipart, rfc822, rfc822parser
>             Fix For: 1.17
>
>
> When multipart RFC822 emails are being parsed, there's no way to distinguish 
> between alternative versions of the body and attachments.
> It would be ideal if some kind of flag were set in the metadata passed to the 
> {{EmbeddedDocumentExtractor}} that indicates that the stream is an 
> alternative.
> GUIs that present the data extracted from the email could then distinguish 
> alternative bodies from attachments and present them separately.
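
A minimal sketch of how a consumer might act on such a flag once it is available in the metadata. It is not Tika code: a plain `Map` stands in for the metadata object so the example is self-contained, and both the class `MultipartFlagSketch` and the key name `Multipart-Subtype` are assumptions made for illustration. The property actually added is defined in `Message.java` in the commits listed above.

```java
import java.util.Map;

// Hypothetical helper, not part of Tika: classifies an embedded part by a
// metadata flag describing the containing multipart type.
class MultipartFlagSketch {

    // Assumed key name, for illustration only; check Message.java in the
    // commits above for the property that was actually added.
    static final String MULTIPART_SUBTYPE_KEY = "Multipart-Subtype";

    // Returns true when the containing part was multipart/alternative,
    // i.e. the stream is one of several renderings of the same body.
    static boolean isAlternativeBody(Map<String, String> metadata) {
        String subtype = metadata.get(MULTIPART_SUBTYPE_KEY);
        return subtype != null && subtype.trim().equalsIgnoreCase("alternative");
    }

    public static void main(String[] args) {
        // Stand-ins for the metadata passed along with each embedded stream.
        Map<String, String> htmlBody = Map.of(MULTIPART_SUBTYPE_KEY, "alternative");
        Map<String, String> attachment = Map.of(MULTIPART_SUBTYPE_KEY, "mixed");

        System.out.println(isAlternativeBody(htmlBody));
        System.out.println(isAlternativeBody(attachment));
    }
}
```

A GUI could call a check like this for each embedded stream and route alternative bodies to the message view while listing everything else as attachments.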



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
