Thanks for the information!

Much appreciated!

Anthony

-----Original Message-----
From: Nick Burch [mailto:[email protected]] 
Sent: 27 March 2018 15:50
To: [email protected]
Subject: Re: Subfile Extraction

On Sun, 25 Mar 2018, McGreevy, Anthony wrote:
> I am currently playing with Tika to see how it works with regards to 
> extraction of subfiles.

Do you mean files or resources embedded within another file?

If so... With the Tika App, you want -z to have these extracted. With the Tika 
java classes, you want to pop something like a 
https://tika.apache.org/1.17/api/org/apache/tika/parser/RecursiveParserWrapper.html
or a
https://tika.apache.org/1.17/api/org/apache/tika/extractor/ContainerExtractor.html
on your ParseContext to get called for embedded resources. See 
https://wiki.apache.org/tika/RecursiveMetadata for more on how it works and how 
to have Tika parse + return all the embedded files and resources.
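
As a rough illustration of the RecursiveParserWrapper route, here is a minimal sketch written against the Tika 1.17 API linked above. It assumes tika-core and tika-parsers 1.17 are on the classpath; the class name RecursiveExtractSketch and the helper parseRecursively are made up for this example and are not part of Tika. Later releases changed this API, so check the Javadoc for the version you are using.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.RecursiveParserWrapper;
import org.apache.tika.sax.BasicContentHandlerFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of Tika.
final class RecursiveExtractSketch {

    // Parses the stream and returns one Metadata object for the container
    // document (first in the list) plus one for each embedded resource found.
    static List<Metadata> parseRecursively(InputStream in)
            throws IOException, SAXException, TikaException {
        RecursiveParserWrapper wrapper = new RecursiveParserWrapper(
                new AutoDetectParser(),
                // Collect plain text per document; -1 means no write limit.
                new BasicContentHandlerFactory(
                        BasicContentHandlerFactory.HANDLER_TYPE.TEXT, -1));
        // The handler passed here is ignored by the 1.17 wrapper; content is
        // gathered through the ContentHandlerFactory given above.
        wrapper.parse(in, new DefaultHandler(), new Metadata(), new ParseContext());
        return wrapper.getMetadata();
    }

    public static void main(String[] args) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (Metadata m : parseRecursively(in)) {
                // The path is null for the container document itself.
                System.out.println(
                        m.get(RecursiveParserWrapper.EMBEDDED_RESOURCE_PATH));
            }
        }
    }
}
```

Run it with the path of a container file (for example a .docx or .zip) as the only argument. The extracted text of each entry should be available under the RecursiveParserWrapper.TIKA_CONTENT key of its Metadata object.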

Nick
