Hi,

On Fri, Jan 4, 2013 at 10:00 PM, Jack Park <[email protected]> wrote:
> The paper itself is found by following the link from here:
> http://openagricola.nal.usda.gov/Record/IND23271089
>
> (I will send the file offlist if needed; it's 64k)

I can take a closer look at the file if you send it to me (I didn't find
a link to download it).

A good test of whether Tika can (or should be able to) extract text from
a PDF is to try copying and pasting the text from a normal PDF viewer. If
you can copy the text, then Tika should be able to extract it (it's a bug
if it doesn't). If you can't (for example, if it's a scanned image), then
there's little we can do.

BR,

Jukka Zitting
