Hi Folks,

I have a problematic PDF which keeps crashing my Nutch crawl. I am trying to get all of the data from the PDF, so the content is not truncated at all.

http://www.who.int/about/who_reform/who-internal-control-framework.pdf

Can someone please check whether they have any issues parsing this document with Tika 1.6? I have tried it locally, and it seems OK. If some other folks can confirm this, then I can isolate the problem to my Nutch crawl.

Thank you
Lewis
-- *Lewis*
