PDFBox 0.8.0 : text extraction

Bernard Segonnes Mon, 28 Dec 2009 06:31:33 -0800

Hi,

I am new to this list, although I have been using PDFBox for some time now.


I have found some strange behaviour.  Text extraction seems to work on some
PDFs and not on others, even though they all use standard ASCII characters
and the text can be selected in Acrobat (so it is not an image).
In other cases, only part of a page can be extracted.

Can someone explain why?

I have a big PDF (5 MB) for which extraction doesn't work.  Even extracting
text from excel.pdf, provided in the PDFBox source package, doesn't work.

I have neither the resources (time) nor the knowledge to help debug PDFBox,
but I need to extract text from PDF files, so I am doing a lot of tests.


Thanks

-----------------
Bernard Segonnes

http://bsegonnes.free.fr
[email protected]
