[
https://issues.apache.org/jira/browse/TIKA-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279612#comment-14279612
]
Nick Burch commented on TIKA-1222:
----------------------------------
I've done something similar in r1652321. It's heavily inspired by Luis's patch,
but maintains a bit more backwards compatibility, so no existing unit tests
needed to change! I have also added a new unit test, along the lines of the PDF
and Office embedded tests, to verify that we correctly find all the parts as
embedded resources when in extraction mode. The previous behaviour (inline
content) remains when no extraction is requested, for compatibility.
Could people please check that this is working correctly and as expected?
> Tika does not extract attachments from RFC822 files
> ---------------------------------------------------
>
> Key: TIKA-1222
> URL: https://issues.apache.org/jira/browse/TIKA-1222
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.4, 1.5, 1.6
> Reporter: Luis Filipe Nassif
> Attachments: Tika-1222.patch
>
>
> The TikaApp --extract option does not extract attachments from RFC822 files.
> The issue happens because the MailContentHandler.body(...) method gets a
> Parser.class object from the context and calls parser.parse(). It should
> instead get an EmbeddedDocumentExtractor.class object from the ParseContext and
> call embeddedDocumentExtractor.parseEmbedded(), similar to other container
> parsers.
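
For anyone following along, the lookup-with-fallback pattern described in the
issue can be sketched roughly as below. This is only an illustration, not the
code committed to Tika: Context, EmbeddedExtractor, InlineExtractor,
CollectingExtractor and handlePart are hypothetical, simplified stand-ins (the
real ParseContext and EmbeddedDocumentExtractor have richer signatures), and
the sketch assumes the caller opts into extraction by registering an extractor
in the context.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins for illustration; not Tika's real classes.
class ExtractionSketch {

    /** Stand-in for a type-keyed context (plays the role of a parse context). */
    static final class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        /** Returns the registered value for the key, or defaultValue if none is set. */
        <T> T get(Class<T> key, T defaultValue) {
            Object value = entries.get(key);
            return value == null ? defaultValue : key.cast(value);
        }
    }

    /** Stand-in for an embedded-document extractor (signature simplified). */
    interface EmbeddedExtractor {
        void parseEmbedded(byte[] part, String name);
    }

    /** Default behaviour: fold the part's text into the parent output (inline content). */
    static final class InlineExtractor implements EmbeddedExtractor {
        final StringBuilder inlineText = new StringBuilder();

        @Override
        public void parseEmbedded(byte[] part, String name) {
            inlineText.append(new String(part, StandardCharsets.UTF_8));
        }
    }

    /** Extraction mode: record each part as a separate embedded resource. */
    static final class CollectingExtractor implements EmbeddedExtractor {
        final List<String> names = new ArrayList<>();

        @Override
        public void parseEmbedded(byte[] part, String name) {
            names.add(name);
        }
    }

    /**
     * What a mail body handler might do for each MIME part: ask the context
     * for an extractor and fall back to inline handling when none is registered.
     */
    static void handlePart(Context context, EmbeddedExtractor fallback,
                           byte[] part, String name) {
        EmbeddedExtractor extractor = context.get(EmbeddedExtractor.class, fallback);
        extractor.parseEmbedded(part, name);
    }
}
```

With nothing registered in the context, handlePart uses the inline fallback, so
existing callers see the same output as before. Once a caller registers its own
extractor, each part is handed to it instead.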