Hi,

On Thu, Jun 3, 2010 at 9:22 AM, Ehsan <[email protected]> wrote:
> I'm trying to parse a pdf file. I first tried this code
> [...]
> both of these codes do the exact same thing, they read some of the text in the
> PDF file, but leave the rest of the file out?? I tested it with a 1m file and a
> 100k file.

It may be that the PDFBox library Tika uses for handling PDF documents is
having a problem with parsing your files. Do you have an example file that
you can share?

BR,

Jukka Zitting
