Thanks for the info, Nick. I'll have a look at that.

Best Regards,
Swapna
-----Original Message-----
From: Nick Burch [mailto:[email protected]]
Sent: Wednesday, September 28, 2011 4:29 PM
To: [email protected]
Subject: Re: Metadata extracted by OutlookExtractor

On Wed, 28 Sep 2011, Swapna Vuppala wrote:
> Am new to using Solr and Tika. Am trying to index .msg files (Outlook
> mails) into Solr. For this, I need a list of metadata extracted by Tika
> from emails. I would like to know what all fields from a .msg file are
> extracted by Tika's outlookextractor.

Your best bet is probably just to look at the code:
http://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/OutlookExtractor.java

> how I can customize existing parser to get more metadata (like number of
> attachments, count of embedded and non-embedded etc )from emails ?

If you want to know about attachments, you'll need to register a recursing
Parser onto the ParserContext. This'll then be called once per attachment,
and you can do whatever you want with the information at that point.

Nick
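
For anyone finding this thread later, here is a minimal sketch of what registering a recursing Parser might look like. It is not taken from Nick's message, so treat it as an illustration under stated assumptions: the class in the Tika API is named ParseContext, AttachmentCounter is a hypothetical name made up for this example, and the code assumes tika-core and tika-parsers are on the classpath. The wrapper counts every embedded document it is handed and then passes it on to the real parser, so nested attachments are counted too.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Set;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.AbstractParser;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;

// Hypothetical helper (not part of Tika): counts embedded documents,
// then hands each one to the wrapped parser.
public class AttachmentCounter extends AbstractParser {
    private static final long serialVersionUID = 1L;

    private final Parser delegate;
    private int count = 0;

    public AttachmentCounter(Parser delegate) {
        this.delegate = delegate;
    }

    public int getCount() {
        return count;
    }

    @Override
    public Set<MediaType> getSupportedTypes(ParseContext context) {
        return delegate.getSupportedTypes(context);
    }

    @Override
    public void parse(InputStream stream, ContentHandler handler,
                      Metadata metadata, ParseContext context)
            throws IOException, SAXException, TikaException {
        count++;  // called once per embedded document
        delegate.parse(stream, handler, metadata, context);
    }

    public static void main(String[] args) throws Exception {
        Parser auto = new AutoDetectParser();
        AttachmentCounter counter = new AttachmentCounter(auto);

        // Register the counting parser so embedded documents are routed to it.
        ParseContext context = new ParseContext();
        context.set(Parser.class, counter);

        // The top-level message is parsed by 'auto' directly, so it is not counted.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            auto.parse(in, new BodyContentHandler(-1), new Metadata(), context);
        }
        System.out.println("Embedded documents seen: " + counter.getCount());
    }
}
```

Run it with the path to a .msg file as the only argument. Whether inline images and similar items are reported as embedded documents depends on the Tika version, so check the count against a message whose attachments you know.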
