Chris Knott commented on TIKA-2122:

Sorry, I'm not particularly familiar with Tika or POI; I just needed this 
feature for a current project. What do you mean by HMEF?

My use case is extracting custom headers that start with "x-". I presume 
there is never going to be a way to do this properly, because the headers 
could be anything.

How about extracting just the headers that start with "x-" and prefixing 
their names with "custom-email-header:" or something similar?
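
Roughly what I have in mind, as an untested sketch rather than a patch. It 
assumes the headers arrive as an array of unfolded "Name: value" strings (my 
guess at what msg.getHeaders() returns), and it uses a plain Map as a stand-in 
for Tika's Metadata. The class and method names are made up for illustration 
and are not part of Tika or POI.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Tika or POI.
class CustomHeaderSketch {

    // Prefix suggested above; the final name is still open.
    static final String PREFIX = "custom-email-header:";

    /**
     * Copies every header whose name starts with "x-" into a map keyed by
     * PREFIX + header name. Assumes each array element is a single
     * "Name: value" line.
     */
    static Map<String, String> extractCustomHeaders(String[] headerLines) {
        Map<String, String> result = new LinkedHashMap<>();
        if (headerLines == null) {
            return result;
        }
        for (String line : headerLines) {
            if (line == null) {
                continue;
            }
            int colon = line.indexOf(':');
            // Skip lines without a header name and folded continuation
            // lines, which begin with whitespace.
            if (colon <= 0 || Character.isWhitespace(line.charAt(0))) {
                continue;
            }
            String name = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            // Header names are case-insensitive, so compare in lower case.
            if (name.toLowerCase(Locale.ROOT).startsWith("x-")) {
                // A header may repeat; join the values instead of overwriting.
                result.merge(PREFIX + name, value, (a, b) -> a + ", " + b);
            }
        }
        return result;
    }
}
```

In the real extractor the map entries would presumably become Metadata 
values instead, but I have not checked how multi-valued fields should be 
handled there.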


On another note, what's the easiest workaround for this at the moment?

> Extract all email headers from Outlook .msg files into Metadata
> ---------------------------------------------------------------
>                 Key: TIKA-2122
>                 URL: https://issues.apache.org/jira/browse/TIKA-2122
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.13
>            Reporter: Chris Knott
>            Priority: Minor
>             Fix For: 2.0, 1.14
>   Original Estimate: 24h
>  Remaining Estimate: 24h
> Currently most email headers are not added to the Metadata when extracting 
> Outlook .msg files.
> http://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/OutlookExtractor.java
> The headers - {{msg.getHeaders()}} - are already being looped through as a 
> way to estimate the date.
> All headers should be added to Metadata, using the name of the header with a 
> prefix such as {{"raw-header:"}}

