Hi,

Christopher Mason wrote:

I'm investigating libraries for rendering and extracting text from PDF. Across the half dozen I've looked at, both commercial and open source, I think pdfbox is the cleanest.
Oh, interesting. :-)

However, I've run across a number of pdfs that pdfbox does not render properly. One I'm particularly concerned about is:

http://www.cmason.com/tmp/Sowa.pdf

It appears to have encoding or char -> glyph mapping issues in pdfbox, but looks okay in every other reader/library I've tried. I've tried with both pdfbox-1.1.0 and the trunk. Here's how it looks in pdfbox trunk versus Preview:

http://www.cmason.com/tmp/Sowa.png

Any help or suggestions would be most appreciated.
I've had a quick look at the PDF. It uses embedded subsets of TrueType fonts,
which is a known problem; see PDFBOX-490 [1] for further details.
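
For what it's worth, subset fonts can usually be recognised by name: the PDF specification prefixes the BaseFont name of a font subset with a tag of six uppercase letters followed by a plus sign. Below is a minimal sketch of that check in plain Java. The class and method names are made up for illustration and are not part of the PDFBox API, and how the font name is read from the document is deliberately left out.

```java
import java.util.regex.Pattern;

public class SubsetFontCheck {
    // A subset tag is exactly six uppercase ASCII letters followed by '+',
    // then the actual font name (e.g. "ABCDEF+SomeFont").
    private static final Pattern SUBSET_TAG = Pattern.compile("^[A-Z]{6}\\+.+");

    // Hypothetical helper: returns true if the given BaseFont name
    // carries a subset tag.
    static boolean isSubsetFontName(String baseFontName) {
        return baseFontName != null && SUBSET_TAG.matcher(baseFontName).matches();
    }

    public static void main(String[] args) {
        // The font names below are invented examples, not taken from Sowa.pdf.
        System.out.println(isSubsetFontName("ABCDEF+TimesNewRoman"));
        System.out.println(isSubsetFontName("TimesNewRoman"));
    }
}
```

If the fonts in a problem document carry such a tag, it is probably affected by the same issue.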

BR
Andreas Lehmkühler

[1] https://issues.apache.org/jira/browse/PDFBOX-490
