Hey,

I am currently experimenting with Tika to see how it handles the extraction of
subfiles (embedded documents and attachments).

My requirement is for Tika to take in a parent document, a .docx or .eml for
example, and extract the text content, the metadata and all subfiles so that I
can save them to disk.

So far I have worked out the metadata and content extraction, but I haven't been
able to find any tutorials on subfile extraction.

If you could point me at resources I could use to work this out, or at sample
code that already does this, it would be much appreciated.

Thanks,

Anthony
