[
https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920692#comment-13920692
]
Hong-Thai Nguyen commented on TIKA-623:
---------------------------------------
java-libpst-0.7 has been uploaded to the OSS Sonatype Nexus repository. If
there are no objections, I'll refactor the attached parser so that it produces
output like this:
{code}
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="Content-Length" content="271360" />
<meta name="isValid" content="true" />
<meta name="Content-Type" content="application/vnd.ms-outlook" />
<title></title>
</head>
<body>
<div class="email-folder">
<h1>Début du fichier de données Outlook</h1>
<div class="email-entry">
<h1><[email protected]></h1>
<meta subject="Re: Feature Generators" />
<meta internetMessageId="<[email protected]>" />
<meta descriptorNodeId="2097188" />
<meta lastModificationTime="1393418263291" />
<meta senderName="Jörn Kottmann" />
<meta senderEmailAddress="[email protected]" />
<meta recipients="No recipients table!" />
<p>mail content</p>
</div>
<div class="email-folder">
<h1>Éléments supprimés</h1>
</div>
</div>
<div class="email-folder">
<h1>Racine (pour la recherche)</h1>
</div>
<div class="email-folder">
<h1>SPAM Search Folder 2</h1>
</div>
</body>
</html>
{code}
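To make the nesting in the output above concrete, here is a minimal, self-contained sketch of how a folder tree could be rendered into that layout. It is not the attached OutlookPSTParser and it does not call java-libpst: `Folder`, `Message`, `escape` and `render` are hypothetical stand-ins invented for this illustration, and only a few of the metadata fields are emitted. Unlike the sample, the sketch escapes angle brackets and ampersands in text so the result stays well-formed.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical stand-in types, NOT java-libpst or Tika classes. */
class PstXhtmlSketch {

    /** Hypothetical message holder with a few of the fields shown in the sample output. */
    static final class Message {
        final String id;
        final String subject;
        final String senderName;
        final String body;

        Message(String id, String subject, String senderName, String body) {
            this.id = id;
            this.subject = subject;
            this.senderName = senderName;
            this.body = body;
        }
    }

    /** Hypothetical folder holder: a name, its messages, and its sub-folders. */
    static final class Folder {
        final String name;
        final List<Message> messages = new ArrayList<>();
        final List<Folder> subFolders = new ArrayList<>();

        Folder(String name) {
            this.name = name;
        }
    }

    /** Escapes the characters that would otherwise break the markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Writes one email-folder div: heading, then entries, then nested folders. */
    static void render(Folder folder, StringBuilder out) {
        out.append("<div class=\"email-folder\">");
        out.append("<h1>").append(escape(folder.name)).append("</h1>");
        for (Message m : folder.messages) {
            out.append("<div class=\"email-entry\">");
            out.append("<h1>").append(escape(m.id)).append("</h1>");
            out.append("<meta subject=\"").append(escape(m.subject)).append("\" />");
            out.append("<meta senderName=\"").append(escape(m.senderName)).append("\" />");
            out.append("<p>").append(escape(m.body)).append("</p>");
            out.append("</div>");
        }
        // Sub-folders are nested inside the parent div, as in the sample output.
        for (Folder sub : folder.subFolders) {
            render(sub, out);
        }
        out.append("</div>");
    }

    public static void main(String[] args) {
        Folder root = new Folder("Top of Outlook data file");
        root.messages.add(new Message("<id@example.org>", "Re: Feature Generators",
                "Example Sender", "mail content"));
        root.subFolders.add(new Folder("Deleted Items"));

        StringBuilder out = new StringBuilder();
        render(root, out);
        System.out.println(out);
    }
}
```

In a real Tika parser the markup would presumably be emitted as SAX events rather than appended to a string, but the recursion over folders would have the same shape.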
> Add support for Outlook PST
> ---------------------------
>
> Key: TIKA-623
> URL: https://issues.apache.org/jira/browse/TIKA-623
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Reporter: Tran Nam Quang
> Fix For: 1.6
>
> Attachments: OutlookPSTParser.java
>
>
> Hello everyone,
> As you might know, Outlook stores its mails and other stuff in a single PST
> file. There's a relatively new Java library called java-libpst for reading
> Outlook PST files. It is licensed under the LGPL and available over here:
> http://code.google.com/p/java-libpst/
> I have tested the library on Outlook 2000 and Outlook 2003, with good
> results. It would be great if the library could be integrated into Tika.
> Best regards
> Tran Nam Quang
--
This message was sent by Atlassian JIRA
(v6.2#6252)