[
https://issues.apache.org/jira/browse/TIKA-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Celpan Valeria updated TIKA-2694:
---------------------------------
Description:
For some emails, the "From" field contains a value such as
`/O=SONY/OU=EXCHANGE ADMINISTRATIVE GROUP
(FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=EBERGER` instead of the sender's email
address.
The issue seems to be connected to the library
`org.apache.poi:poi-scratchpad:3.17`: when running
`org.apache.tika.parser.microsoft.OutlookExtractor::OutlookExtractor(DirectoryNode,
ParserContext)`, we get `this.msg.mainChunks.allChunks.SenderEmailAddress =
"/O=SONY/OU=EXCHANGE ADMINISTRATIVE GROUP
(FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=EBERGER"`.
See the attachment to reproduce this defect.
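The reported value appears to be an Exchange-style legacy distinguished name
rather than an SMTP address. As a rough illustration of how the two forms can
be told apart, here is a minimal sketch. The class `SenderAddressCheck` and
both method names are hypothetical and are not part of Tika or POI; the
patterns are deliberately loose heuristics, not a full address validator.

```java
/** Hypothetical helper for illustration only; not part of Tika or POI. */
final class SenderAddressCheck {

    private SenderAddressCheck() {
    }

    /**
     * Returns true if the value has the slash-separated attribute=value shape
     * seen in this report: a leading /O= part followed by one or more
     * /OU= or /CN= parts.
     */
    public static boolean looksLikeExchangeDn(String value) {
        return value != null
                && value.trim().matches("(?i)/O=[^/]+(/(OU|CN)=[^/]+)+");
    }

    /**
     * Very loose check for a local@domain shape: exactly one '@', with no
     * whitespace or '/' on either side. An assumption for this sketch only.
     */
    public static boolean looksLikeSmtpAddress(String value) {
        return value != null
                && value.trim().matches("[^@\\s/]+@[^@\\s/]+");
    }

    public static void main(String[] args) {
        // Value taken from the report, joined onto a single line.
        String reported = "/O=SONY/OU=EXCHANGE ADMINISTRATIVE GROUP "
                + "(FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=EBERGER";
        System.out.println("DN-shaped:   " + looksLikeExchangeDn(reported));
        System.out.println("SMTP-shaped: " + looksLikeSmtpAddress(reported));
    }
}
```

A check like this only detects the problem; it does not recover the real
address, which would have to come from another property of the message.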
> "From" headers is not always extracted correctly on msg mails
> -------------------------------------------------------------
>
> Key: TIKA-2694
> URL: https://issues.apache.org/jira/browse/TIKA-2694
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.17
> Environment: CentOS 7
> Windows 10
> Reporter: Celpan Valeria
> Priority: Major
> Attachments: Fw Anime User Analysis.msg
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)