Hi,
while comparing the rendering capabilities of PDFBox 1.8 with the current
trunk I came across a journal which shows severe problems in both, though
different ones: the problems of 1.8 are gone, but new ones have shown up.
The journal (Chemical & Engineering News, C&EN) does not provide free PDF
editions, but a sample edition can be downloaded via 'View a sample
issue' at http://cen.acs.org/static/about/digital.html (or directly via
http://www.cendigital.org/cendigital/sample/). I'm referring to volume
92, no. 27 from 2014-07-07, which I downloaded yesterday; the same
problems also showed up in other issues of the journal.
The problems (all on Linux, Java 1.6):
- PDFBox 1.8 (svn 1620380)
  - first letters of words in headlines are sometimes missing, e.g. on
    page 2 "Getting ..." reads " et ing ...", "Overview" -> " verview"
  - bad character spacing because of a substituted font
- PDFBox trunk (svn 1620415)
  - no missing letters, but heavily distorted and displaced
    letters in headlines (e.g. page 2)
  - unlike 1.8, the correct font is used
  - picture colors are completely wrong;
    logged warning: org.apache.pdfbox.filter.DCTFilter decode
    WARNUNG: Inconsistent metadata read from JPEG stream
  - transparent background instead of white
- PDFBox trunk, no-awt (svn 1620487)
  - font rendering ok
  - picture/background problems as in trunk
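
For anyone who wants to narrow down the color problem: wrong JPEG colors
in Java decoders are often related to CMYK/YCCK images carrying an Adobe
APP14 marker. Whether that applies to the images in this PDF is only an
assumption; I have not verified it. The sketch below uses only the JDK
and assumes the JPEG stream has already been extracted from the PDF into
a file. The class name JpegMarkerScan and its Info type are hypothetical
and not part of PDFBox. It reports the number of color components from
the frame header and the Adobe color transform flag, if present.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class JpegMarkerScan {

    /** Scan result; -1 means the value was not found in the stream. */
    public static final class Info {
        public int components = -1;     // from the SOFn frame header
        public int adobeTransform = -1; // from the APP14 "Adobe" segment
    }

    /** Walks the JPEG segments up to the first scan (SOS) or end of image (EOI). */
    public static Info scan(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        Info info = new Info();
        if (data.readUnsignedShort() != 0xFFD8) {
            throw new IOException("Not a JPEG stream (missing SOI marker)");
        }
        while (true) {
            int prefix = data.read();
            if (prefix < 0) {
                return info; // end of stream
            }
            if (prefix != 0xFF) {
                throw new IOException("Expected marker prefix 0xFF");
            }
            int marker = data.read();
            while (marker == 0xFF) {
                marker = data.read(); // skip fill bytes before the marker code
            }
            if (marker < 0 || marker == 0xD9 || marker == 0xDA) {
                return info; // EOF, EOI or SOS: headers are finished
            }
            if (marker == 0x01 || (marker >= 0xD0 && marker <= 0xD7)) {
                continue; // stand-alone markers carry no length field
            }
            int length = data.readUnsignedShort() - 2; // length includes its own 2 bytes
            if (length < 0) {
                throw new IOException("Invalid segment length");
            }
            byte[] payload = new byte[length];
            data.readFully(payload);
            if (isFrameHeader(marker) && length >= 6) {
                // precision(1) height(2) width(2) components(1)
                info.components = payload[5] & 0xFF;
            } else if (marker == 0xEE && length >= 12
                    && "Adobe".equals(new String(payload, 0, 5, "US-ASCII"))) {
                // "Adobe"(5) version(2) flags0(2) flags1(2) transform(1)
                info.adobeTransform = payload[11] & 0xFF;
            }
        }
    }

    /** SOF0..SOF15, excluding DHT (0xC4), JPG (0xC8) and DAC (0xCC). */
    private static boolean isFrameHeader(int marker) {
        return marker >= 0xC0 && marker <= 0xCF
                && marker != 0xC4 && marker != 0xC8 && marker != 0xCC;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new FileInputStream(args[0]);
        try {
            Info info = scan(in);
            System.out.println("components=" + info.components
                    + ", adobeTransform=" + info.adobeTransform);
        } finally {
            in.close();
        }
    }
}
```

Four components together with an Adobe transform of 0 (CMYK) or 2 (YCCK)
would point to a CMYK-type image; three components with transform 1, or
with no Adobe segment at all, is the usual YCbCr case.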
Since these are multiple problems on different versions and the PDF is
not freely distributable, I did not create a JIRA issue. Nevertheless, it
is a widely distributed journal and a good test case for rendering
quality. At the very least, the JPEG rendering problem in the current
trunk should be fixed.
Best,
Timo
--
Timo Boehme
OntoChem GmbH
H.-Damerow-Str. 4
06120 Halle/Saale
T: +49 345 4780474
F: +49 345 4780471
[email protected]