All,

I have been taking a look at OODT and noticed that it still uses older versions of some other projects. I highly recommend updating Tika and PDFBox to their latest versions, 0.7 and 1.2.1 respectively. The older versions of PDFBox won't parse PDFs that were OCR'd with whatever the latest version of Acrobat is. I know this from experience...
Adam
