[ https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843359#comment-17843359 ]
Luís Filipe Nassif edited comment on TIKA-4250 at 5/3/24 8:52 PM:
------------------------------------------------------------------

One drawback of our libpff usage approach is that it exports the internal PST/OST tree as a file system tree, and it sometimes causes issues with forbidden NTFS chars and long paths in the temp folder hard to delete after parsing finishes...


was (Author: lfcnassif):
One drawback of our libpff usage approach is that it exports the internal PST/OST tree as a file system tree, and it sometimes causes issues with forbidden NTFS chars and long paths in the temp folder hard to delete and parsing finishes...


> Add a libpst-based parser
> -------------------------
>
>                 Key: TIKA-4250
>                 URL: https://issues.apache.org/jira/browse/TIKA-4250
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> We currently use the com.pff Java-based PST parser for PST files. It would be
> useful to add a wrapper for libpst as an optional parser.
> One of the benefits of libpst is that it creates .eml or .msg files from the
> PST records. This is critical for those who want the original bytes from
> embedded files. Obv, PST doesn't store eml or msg, but some users want the
> "original" emails even if they are constructed from PST records.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)