RFC822 messages not parsed
--------------------------

                 Key: TIKA-461
                 URL: https://issues.apache.org/jira/browse/TIKA-461
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.7
            Reporter: Joshua Turner


Given an RFC822 message exported from Thunderbird, AutoDetectParser 
produces an empty body and a Metadata object containing only one key-value 
pair: "Content-Type=message/rfc822". Calling MboxParser directly likewise 
gives an empty body, but with two metadata pairs: 
"Content-Encoding=us-ascii" and "Content-Type=application/mbox".

A quick look at the source of MboxParser shows that the implementation is 
fairly naive. If the integration can be worked out, a dedicated MIME library 
such as Apache James' mime4j might be a better choice.
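
To make the expected result concrete: at a minimum, parsing an RFC822 
message should yield its header fields and a non-empty body. The sketch 
below shows that split using only the JDK. It is an illustration, not Tika 
or mime4j code; the class name Rfc822Sketch, its Message type, and the 
sample message are all made up for this example, and it ignores MIME 
multiparts, encodings, and repeated header fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not Tika or mime4j code. */
public class Rfc822Sketch {

    /** Header fields in order of appearance, plus the raw body text. */
    public static final class Message {
        public final Map<String, String> headers = new LinkedHashMap<>();
        public String body = "";
    }

    public static Message parse(String raw) {
        Message msg = new Message();
        // Normalize line endings so CRLF and bare LF input behave the same.
        String text = raw.replace("\r\n", "\n");
        // The first empty line separates the header section from the body.
        int split = text.indexOf("\n\n");
        String headerPart = split < 0 ? text : text.substring(0, split);
        msg.body = split < 0 ? "" : text.substring(split + 2);

        String name = null;
        for (String line : headerPart.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            boolean continuation = line.charAt(0) == ' ' || line.charAt(0) == '\t';
            if (continuation && name != null) {
                // Folded header: append the line to the previous field's value.
                msg.headers.put(name, msg.headers.get(name) + " " + line.trim());
            } else {
                int colon = line.indexOf(':');
                if (colon <= 0) {
                    continue; // no field name before a colon; skip the line
                }
                name = line.substring(0, colon).trim();
                // Simplification: a repeated field overwrites the earlier value.
                msg.headers.put(name, line.substring(colon + 1).trim());
            }
        }
        return msg;
    }

    public static void main(String[] args) {
        String sample = "From: a@example.org\r\n"
                + "Subject: a folded\r\n"
                + " subject line\r\n"
                + "\r\n"
                + "Hello.\r\n";
        Message m = parse(sample);
        System.out.println(m.headers);
        System.out.println(m.body);
    }
}
```

Even this much would give callers the header fields as metadata and the 
message text as the body, which is more than either parser returns for the 
Thunderbird export described above.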

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
