On Mon, 20 Dec 2010, [email protected] wrote:
Just starting to use Tika 0.8 in conjunction with DokuWiki, and I noticed the dependency on Bouncy Castle through PDFBox. Is it possible to remove this dependency, given that we're not using Tika for any encryption purposes?
You may be surprised to discover quite how many PDFs are actually encrypted, if only with the default key. Pretty much any protection you place on a PDF (e.g. no copying, no printing) will result in it being encrypted, and without Bouncy Castle PDFBox wouldn't be able to read those files.
If you don't need PDFs, then you're fine to exclude the PDF parser and its dependencies (e.g. PDFBox, Bouncy Castle). However, if you do want to handle PDFs, then you almost certainly do need Bouncy Castle.
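
If you want a rough idea of how many of your own PDFs are affected before deciding, something like the sketch below may help. It is only a heuristic and uses neither Tika nor PDFBox: it relies on the fact that an encrypted PDF carries an /Encrypt entry in its trailer, and simply scans the raw bytes for that name. The class and method names (PdfEncryptCheck, looksEncrypted) are made up for illustration, and a file that merely mentions "/Encrypt" somewhere in its content would be reported as a false positive.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tika or PDFBox.
public class PdfEncryptCheck {

    // Heuristic: encrypted PDFs reference an /Encrypt dictionary in the trailer,
    // so we look for that name anywhere in the raw bytes.
    static boolean looksEncrypted(byte[] pdf) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data
        // survives the conversion and the search stays byte-accurate.
        String raw = new String(pdf, StandardCharsets.ISO_8859_1);
        return raw.contains("/Encrypt");
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PdfEncryptCheck file1.pdf file2.pdf ...
        for (String name : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(name));
            String verdict = looksEncrypted(bytes)
                    ? "probably encrypted"
                    : "no /Encrypt entry found";
            System.out.println(name + ": " + verdict);
        }
    }
}
```

Run it over a sample of the documents you expect to index. If a meaningful share comes back as "probably encrypted", dropping Bouncy Castle is likely to cost you those files.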
Nick
