Hi, I am new to this list, although I have been using PDFBox for some time now.
I have found some strange behaviour. Text extraction seems to work on some PDFs and not on others, even though they all use standard ASCII characters and the text can be selected in Acrobat (so it is not an image). In other cases only part of a page can be extracted. Can someone explain why?

I have a big PDF (5 MB) which doesn't work. Even extracting text from excel.pdf, provided in the PDFBox source package, doesn't work.

I have neither the resources (time) nor the knowledge to help debug PDFBox, but I need to extract text from PDF files, so I am doing a lot of tests.

Thanks

-----------------
Bernard Segonnes
http://bsegonnes.free.fr
[email protected]

