Petr Slaby created PDFBOX-2108:
----------------------------------
Summary: Type0 CFF Font with identity encoding rendered incorrectly
Key: PDFBOX-2108
URL: https://issues.apache.org/jira/browse/PDFBOX-2108
Project: PDFBox
Issue Type: Bug
Components: FontBox, Rendering
Affects Versions: 2.0.0
Reporter: Petr Slaby
The attached pdf files were created in iText. Both print two lines containing
the text "Hello World!". The second line is printed using a Type0 CFF font with
encoding Identity-H. One of the documents uses a western languages font
(font0.pdf). The other one (font0-arabic.pdf) uses an Arabic font, but this
font contains latin letters, too. Expected rendering can be seen in Acrobat,
the wrong rendering from PDFBox created via PDFToImage is attached.
I have experimented with the problem and tried to fix it, proposed changes are
attached in Type0FontEncoding.patch. However, I came to the result just by
experimenting, without really reading the PDF specification. Also, I am getting
lost in the many mappings internally used in PDFBox, so an expert view is for
sure required.
Note: I have seen several issues where font encoding is already discussed, but
I was not able to decide whether this exact problem is already described in one
of them. Sorry, if this is a duplicate.
Note: Even after fixing the encoding problem, the font size of the second line
in the "Arabic" example is wrong for some reason. This is probably a different
issue and I did not try to analyze it yet.
--
This message was sent by Atlassian JIRA
(v6.2#6252)