Thanks for the information! Much appreciated!
Anthony

-----Original Message-----
From: Nick Burch [mailto:[email protected]]
Sent: 27 March 2018 15:50
To: [email protected]
Subject: Re: Subfile Extraction

On Sun, 25 Mar 2018, McGreevy, Anthony wrote:
> I am currently playing with Tika to see how it works with regards to
> extraction of subfiles.

Do you mean files or resources embedded within another file? If so...

With the Tika App, you want -z to have these extracted.

With the Tika java classes, you want to pop something like a
https://tika.apache.org/1.17/api/org/apache/tika/parser/RecursiveParserWrapper.html
or a
https://tika.apache.org/1.17/api/org/apache/tika/extractor/ContainerExtractor.html
on your ParseContext to get called for embedded resources.

See https://wiki.apache.org/tika/RecursiveMetadata for more on how it
works and how to have Tika parse + return all the embedded files and
resources

Nick
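
For readers following the thread, here is a minimal sketch of the RecursiveParserWrapper route mentioned above. It assumes Tika 1.17 (tika-core plus tika-parsers) on the classpath and follows the 1.17 javadoc linked in the message, where the wrapper is placed around the parser and collects one Metadata object per document; newer Tika releases may expose this differently. The class name EmbeddedDump and the helper parseRecursively are hypothetical names chosen for illustration, not part of Tika.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.RecursiveParserWrapper;
import org.apache.tika.sax.BasicContentHandlerFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; assumes the Tika 1.17 API.
public class EmbeddedDump {

    // Parses the file and every resource embedded in it. The returned list
    // holds one Metadata per document: the container first, then each
    // embedded resource.
    static List<Metadata> parseRecursively(Path file) throws Exception {
        RecursiveParserWrapper wrapper = new RecursiveParserWrapper(
                new AutoDetectParser(),
                // Plain-text content per document; -1 means no write limit.
                new BasicContentHandlerFactory(
                        BasicContentHandlerFactory.HANDLER_TYPE.TEXT, -1));
        try (InputStream in = Files.newInputStream(file)) {
            // The handler passed here is ignored by the wrapper; extracted
            // text is stored on each Metadata object instead.
            wrapper.parse(in, new DefaultHandler(), new Metadata(), new ParseContext());
        }
        return wrapper.getMetadata();
    }

    public static void main(String[] args) throws Exception {
        for (Metadata m : parseRecursively(Paths.get(args[0]))) {
            // The embedded path may be null for the container document.
            System.out.println("path: " + m.get(RecursiveParserWrapper.EMBEDDED_RESOURCE_PATH));
            System.out.println("type: " + m.get(Metadata.CONTENT_TYPE));
            System.out.println("text: " + m.get(RecursiveParserWrapper.TIKA_CONTENT));
        }
    }
}
```

Run it with the path of a container file (for example a document with attachments) as the only argument. For the command-line route, the -z option from the message is passed to the Tika App jar together with the input file.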
