Is there a font that contains only Cyrillic glyphs? If you embed it, does it work with PDFBox?

Tilman

On 22.03.2017 at 12:52, [email protected] wrote:
Hello,
I am using PDFBox 2.0.5 to fill out the form fields of a PDF document using this code:
         doc = PDDocument.load(inputStream);
         PDDocumentCatalog catalog = doc.getDocumentCatalog();
         PDAcroForm form = catalog.getAcroForm();
         for (PDField field : form.getFieldTree()){
             field.setValue("должен");
         }
I get this error: U+0434 ('afii10069') is not available in this font Times-Roman (generic: TimesNewRomanPSMT) encoding: StandardEncoding with differences

I can create the PDF documents any way I like. I have tried exporting from MS Office as
Adobe PDF and creating them directly with Acrobat Pro DC. When creating the fields in
Acrobat I can select a font. I tried all kinds of fonts; for "Arial Unicode MS" it wants
to download a 50MB "Adobe Acrobat Reader DC Font Pack". The final PDF file with the
filled-out form fields should be viewable and printable by anyone without first
installing a font pack.

The PDF document itself contains Cyrillic text, which is displayed just fine.
Filling out the form in Acrobat Reader works flawlessly; the only problem is in
PDFBox.

According to https://issues.apache.org/jira/browse/PDFBOX-3138: The embedded font used by
the field does indeed contain Hebrew glyphs, and a valid "cmap" table which can
be used to look up those glyphs. The mentioned character, U+05D7, is indeed present in
the font. The embedded font file is in OpenType format, however the PDF Font dictionary
is Type1 and specifies WinAnsiEncoding, which does not include Hebrew characters. So,
strictly speaking, the field cannot be filled using any non-ANSI characters and so
PDFBox's behaviour is correct.
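
The encoding limit described above can be checked before calling setValue(). Below is a minimal sketch, assuming Java's windows-1252 charset is a close enough stand-in for the PDF WinAnsiEncoding (the two are similar but not identical, and PDFBox does its own lookup through the font's encoding). The class and method names are hypothetical and only serve the illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

final class WinAnsiCheck {
    // Assumption: windows-1252 approximates the PDF WinAnsiEncoding.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** Returns true if every character of text can be encoded in windows-1252. */
    static boolean fitsWinAnsi(String text) {
        CharsetEncoder encoder = CP1252.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // Plain Latin text.
        System.out.println(fitsWinAnsi("should"));
        // The Cyrillic value from the message, written with Unicode escapes.
        System.out.println(fitsWinAnsi("\u0434\u043E\u043B\u0436\u0435\u043D"));
        // U+05D7, the Hebrew character mentioned in PDFBOX-3138.
        System.out.println(fitsWinAnsi("\u05D7"));
    }
}
```

A value that fails this check would need a field font whose PDF font dictionary can actually address the characters, which is what the question about embedding a different font is getting at.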

I tried another approach: instead of setValue() I called
((PDTextField)field).setDefaultValue(); It does not throw an exception, but
unfortunately in the resulting PDF I still see the previous default value in the
document. The new default value only appears in the properties of the field.

Using this code I see that the font is a PDTrueTypeFont:
String  da   = field.getCOSObject().getString(COSName.DA.getName());
Matcher m    = Pattern.compile("/?(.*) [\\d]+ Tf.*", Pattern.CASE_INSENSITIVE).matcher(da);
String  name = m.find() ? m.group(1) : null;
PDFont  font = field.getAcroForm().getDefaultResources().getFont(COSName.getPDFName(name));
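
The pattern above relies on a greedy `(.*)`, so it may capture more than the font name when other operators precede the name in the DA string. A stricter variant could look like the sketch below. It uses only java.util.regex; the class name, method name and sample DA strings are hypothetical, not taken from the actual document.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DaFontName {
    // Matches "/Name size Tf"; the size may be an integer or a decimal.
    private static final Pattern TF =
            Pattern.compile("/(\\S+)\\s+(\\d*\\.?\\d+)\\s+Tf");

    /** Returns the font resource name from a DA string, or null if no Tf operator is found. */
    static String fontName(String da) {
        if (da == null) {
            return null;
        }
        Matcher m = TF.matcher(da);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Font operator first, then a colour operator.
        System.out.println(fontName("/Helv 12 Tf 0 g"));
        // Colour operator first; a size of 0 means auto-size.
        System.out.println(fontName("0 g /TiRo 0 Tf"));
    }
}
```

The returned name can then be passed to COSName.getPDFName(name) exactly as in the snippet above.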

How can I create the PDF document and use PDFBox to fill out the form with
non-ANSI characters?

Thanks,
Roland

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]