On 10.10.2016 at 12:36, Ivan Pavlyukovets wrote:
Hello,

I have a small problem generating PDF files with PDFBox 2.0.3 and I can't 
find a way to solve it.

I have a string generated from random Unicode symbols (it contains the symbol 
U+22F2, for example),
and I get the following exception when I try to show this string:
Caused by: java.lang.IllegalArgumentException: No glyph for U+22F2 in font ArialUnicodeMS
                at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.encode(PDCIDFontType2.java:401)
                at org.apache.pdfbox.pdmodel.font.PDType0Font.encode(PDType0Font.java:351)
                at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:316)
                at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:414)

It happens when I try to embed the "Arial Unicode MS" font.
Arial Unicode MS doesn't contain the mentioned character U+22F2.

I tried to find such "wrong" symbols using 
org.apache.pdfbox.pdmodel.font.PDType0Font#hasGlyph, but it reports that this symbol has a glyph 
(it is allowed in the Identity-H encoding, which is used for embedded fonts).
The encoding is used to encode the added text, but there is no direct connection to the characters available in the font. Instead, you have to look at the font itself to find out whether a given character is supported.
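
One way to look at the font before writing any text is to test every code point of the string up front and collect the ones that cannot be encoded. The following is only a minimal sketch: the class GlyphCheck, the method findUnsupported and the canEncode predicate are hypothetical names introduced for illustration and are not part of PDFBox.

```java
import java.util.List;
import java.util.function.IntPredicate;
import java.util.stream.Collectors;

// Hypothetical helper, not part of PDFBox.
final class GlyphCheck {

    /**
     * Returns the distinct code points of text for which canEncode
     * answers false, in the order they first appear.
     */
    static List<Integer> findUnsupported(String text, IntPredicate canEncode) {
        return text.codePoints()                    // iterate by code point, not by char
                .filter(cp -> !canEncode.test(cp))  // keep only the unsupported ones
                .distinct()
                .boxed()
                .collect(Collectors.toList());
    }
}
```

With PDFBox, the predicate could presumably call font.encode(new String(Character.toChars(cp))) and treat the IllegalArgumentException from the stack trace above as "not supported". That wiring is an assumption and is left out of the sketch so that it stays independent of the library.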

I see that a lot of methods use PDCIDFontType2.encode, and its behavior seems 
strange to me.
It contains the following block, which throws an exception if cid is 0:
        if (cid == 0)
        {
            throw new IllegalArgumentException(
                    String.format("No glyph for U+%04X in font %s", unicode, getName()));
        }
I read at https://www.microsoft.com/typography/otspec/cmap.htm that "Character 
codes that do not correspond to any glyph in the font should be mapped to glyph index 0. 
The glyph at this location must be a special glyph representing a missing character, 
commonly known as .notdef."
When I deleted this block, everything worked fine and I saw the special glyphs in 
the generated PDF.
IMHO, it doesn't make sense to add unsupported characters and replace them with ".notdef".


Steps to reproduce:
1. Create a document:
                PDDocument document = new PDDocument();
2. Load the Arial Unicode MS font:
                PDType0Font pdfFont = PDType0Font.load(document,
                        document.getClass().getResourceAsStream("/ttf/arialuni.ttf"));
3. Make sure that the symbol has a glyph:
                int codePoint = 0x22F2;
                boolean hasGlyph = pdfFont.hasGlyph(codePoint);
4. Get the strange exception:
                PDCIDFontType2 pdcidFontType2 =
                        (PDCIDFontType2) pdfFont.getDescendantFont();
                pdcidFontType2.encode(codePoint);

Do you have any suggestions for solving this problem, or should I create a new issue?
There is no issue with PDFBox. You have to choose a font which supports all the characters you want to use.
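
If choosing a different font is not an option, one possible workaround is to replace every character the font cannot encode with a visible fallback before the text is shown. The following is only a sketch under that assumption: TextSanitizer, replaceUnsupported and the canEncode predicate are hypothetical names, not PDFBox API, and how the predicate queries the font is left open.

```java
import java.util.function.IntPredicate;

// Hypothetical helper, not part of PDFBox.
final class TextSanitizer {

    /**
     * Returns a copy of text in which every code point rejected by
     * canEncode is replaced with the given replacement code point.
     */
    static String replaceUnsupported(String text, IntPredicate canEncode, int replacement) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints()
                .map(cp -> canEncode.test(cp) ? cp : replacement)
                .forEach(sb::appendCodePoint);
        return sb.toString();
    }
}
```

For example, replaceUnsupported("a\u22F2b", cp -> cp != 0x22F2, '?') yields "a?b". The replacement character itself has to be supported by the font, otherwise the same exception would presumably be thrown for it.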


Example is attached.


Ivan Pavlyukovets

BR
Andreas
