It seems that Ghostscript (GS) converted the text in the file to graphical
elements. You can see this in Acrobat if you open the Contents panel, and you
can also see that the text in the file is not selectable, and therefore can't
be extracted.
You'll need to look for a solution on the Ghostscript side. This has nothing
to do with how PDFBox works: there is simply no text to read in that file.
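
To illustrate what "no text to read" means at the PDF level: a page only
carries extractable text if its content stream uses one of the text-showing
operators from the PDF specification (Tj, TJ, ' and "). The sketch below is a
rough heuristic for that check, not anything PDFBox itself does. It uses only
the Java standard library and assumes you already have the page's content
stream as a decoded (uncompressed) string; the class and method names are
hypothetical, and inline image data (BI ... EI) is not handled and could cause
false positives.

```java
final class TextOperatorCheck {

    /** Returns true if the decoded content stream uses Tj, TJ, ' or ". */
    static boolean hasTextShowingOperator(String content) {
        int n = content.length();
        int i = 0;
        while (i < n) {
            char c = content.charAt(i);
            if (c == '(') {
                // Literal string operand: skip it so its contents are not read as operators.
                i = skipLiteralString(content, i);
            } else if (c == '<') {
                if (i + 1 < n && content.charAt(i + 1) == '<') {
                    i += 2; // "<<" opens a dictionary, not a hex string
                } else {
                    int end = content.indexOf('>', i); // hex string operand
                    i = (end < 0) ? n : end + 1;
                }
            } else if (c == '%') {
                // Comment: skip to end of line.
                while (i < n && content.charAt(i) != '\n' && content.charAt(i) != '\r') {
                    i++;
                }
            } else if (c == '/') {
                // Name object such as /F1: skip it so a name like /Tj is not mistaken for an operator.
                i++;
                while (i < n && !isDelimiter(content.charAt(i))) {
                    i++;
                }
            } else if (isDelimiter(c)) {
                i++;
            } else {
                // Regular token: a number or an operator.
                int start = i;
                while (i < n && !isDelimiter(content.charAt(i))) {
                    i++;
                }
                if (isTextOp(content.substring(start, i))) {
                    return true;
                }
            }
        }
        return false;
    }

    private static boolean isTextOp(String token) {
        return token.equals("Tj") || token.equals("TJ")
                || token.equals("'") || token.equals("\"");
    }

    private static boolean isDelimiter(char c) {
        return c == 0 || Character.isWhitespace(c) || "()<>[]{}/%".indexOf(c) >= 0;
    }

    /** Skips a literal string starting at '(' and returns the index just past its closing ')'. */
    private static int skipLiteralString(String s, int open) {
        int depth = 0;
        int i = open;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\') {
                i += 2; // escaped character, e.g. \( or \)
                continue;
            }
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    return i + 1;
                }
            }
            i++;
        }
        return s.length(); // unterminated string: treat the rest as string data
    }

    public static void main(String[] args) {
        // Hand-written sample streams, not taken from the file in question.
        String withText = "BT /F1 12 Tf 72 700 Td (Hello) Tj ET";
        String outlinesOnly = "q 0 0 m 10 0 l 10 10 l h f Q";
        System.out.println(hasTextShowingOperator(withText));
        System.out.println(hasTextShowingOperator(outlinesOnly));
    }
}
```

A stream that only contains path operators such as m, l and f draws the glyph
shapes but gives a text extractor nothing to work with, which matches what the
file looks like in Acrobat.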


On Thu, Nov 14, 2013 at 4:29 PM, James Green <[email protected]> wrote:

> This was created via fairly obtuse means, but suffice it to say it should
> still work.
>
> https://www.dropbox.com/s/uaq5sqmlf88108p/sample-from-pdf.pdf
>
> This was me creating a document in LibreOffice Writer, exporting that as a
> PDF, then loading the PDF into Document Viewer (Evince, although Adobe
> Reader could also be used). This is then printed to a Java application via
> the Windows PScript DLL, where the Java app runs the received PostScript
> through Ghostscript to get a PDF, which is finally imported into PDFBox.
>
> This used to work a few weeks ago, and we are unsure why it does not now.
> Printing an odt directly from Writer into the Java app works fine.
>
> This is using PDFBox 1.8.2.
>
> Thanks,
>
> James
>
