[ https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034142#comment-15034142 ]

John Hewson commented on PDFBOX-3138:
-------------------------------------

The embedded font used by the field does indeed contain Hebrew glyphs, and a 
valid "cmap" table which can be used to look up those glyphs. The mentioned 
character, U+05D7, is indeed present in the font. 

The embedded font file is in OpenType format; however, the PDF Font dictionary 
is Type1 and specifies WinAnsiEncoding, which does not include Hebrew 
characters. So, strictly speaking, the field cannot be filled using any 
non-ANSI characters, and PDFBox's behaviour is correct.
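
As a rough illustration of that limitation, the sketch below checks whether a 
string fits a single-byte Western encoding. It is not PDFBox code: it assumes 
the JDK's "windows-1252" charset is a close enough stand-in for the PDF 
WinAnsiEncoding (the two are similar but not identical), and the class and 
method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper, for illustration only; not part of PDFBox.
class WinAnsiCheck {

    // Returns true if every character in the text can be encoded in
    // windows-1252, used here as an approximation of WinAnsiEncoding.
    static boolean isWinAnsiEncodable(String text) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // Plain ASCII text fits the encoding.
        System.out.println(isWinAnsiEncodable("Hello"));   // prints "true"
        // U+05D7 (Hebrew letter het) has no slot in the encoding.
        System.out.println(isWinAnsiEncodable("\u05D7"));  // prints "false"
    }
}
```

A check like this could be run before calling setValue() to detect text the 
field's font encoding cannot represent, rather than catching the 
IllegalArgumentException afterwards.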

It would seem that PDFBox could do something more helpful in this instance. 
Filling the form with Acrobat results in the font from the form's DR being 
overridden in the Field itself with a new CIDFontType0 which has been created 
from the DR font. Ideally we would do that.

Do you have any control over the software producing these fields? I might be 
able to offer a workaround.

> PDTextField doesn't accept any Hebrew characters as new value
> -------------------------------------------------------------
>
>                 Key: PDFBOX-3138
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3138
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm, FontBox
>    Affects Versions: 2.0.0
>         Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>            Reporter: Gilad Denneboom
>            Priority: Minor
>             Fix For: 2.1.0
>
>         Attachments: SetHebrewFieldValueTest.java, Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for U+05D7 in font AdobeHebrew-Regular
>       at org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>       at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>       at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>       at org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>       at org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>       at org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>       at org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>       at org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>       at org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>       at org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>       at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
