[ 
https://issues.apache.org/jira/browse/PDFBOX-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766463#comment-13766463
 ] 

Maruan Sahyoun commented on PDFBOX-283:
---------------------------------------

I added a quick fix for how a new field value is written into the appearance 
stream. The current implementation only works for single-byte character sets. 
Because the field's value and the string representation of that value in the 
appearance stream are handled differently, the displayed text and the stored 
content differ.
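
To illustrate the single-byte limitation, here is a minimal sketch of a guard 
that checks whether a value fits a single-byte charset before it is written. 
This is not the committed fix: the class SingleByteCheck and the helper 
fitsLatin1 are hypothetical names, and ISO-8859-1 is only assumed as the 
target charset for the sake of the example.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class SingleByteCheck {

    // Hypothetical helper (not a PDFBox method): returns true when every
    // character of the value can be encoded in ISO-8859-1, which is the
    // charset assumed for this sketch.
    static boolean fitsLatin1(String value) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(value);
    }

    public static void main(String[] args) {
        // "Skyttä" contains only ISO-8859-1 characters.
        System.out.println(fitsLatin1("Skytt\u00e4"));
        // The euro sign (U+20AC) is not part of ISO-8859-1.
        System.out.println(fitsLatin1("\u20ac"));
    }
}
```

A check like this would let the caller detect values the current approach 
cannot represent instead of silently writing garbage.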

There are still some issues with calculating the appearance stream for fields 
that already have one, though; these should be addressed separately.

Form filling now works with German umlauts as well as the characters 
presented above.
                
> Character encoding/appearance issues when filling forms
> -------------------------------------------------------
>
>                 Key: PDFBOX-283
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-283
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel.AcroForm
>         Attachments: PDAppearance.patch
>
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1735902
> Originally submitted by scop on 2007-06-12 10:23.
> When filling a text field with non-ASCII characters such as in my surname 
> "Skyttä" and saving the document in a UTF-8 environment, something goes 
> wrong with the appearance of the text.
> The value itself seems to be stored correctly, but when opening the doc, the 
> appearance of "ä" is not that, but rather something which happens when UTF-8 
> is mistakenly treated as ISO-8859-1 (two garbage characters).
> PDAppearance uses the platform default encoding in quite a few places which 
> apparently has potential to mess things up.  In particular, 
> insertGeneratedAppearance() generates a PrintWriter from an OutputStream 
> without specifying the encoding.  In fact, if I hack that to use ISO-8859-1, 
> the appearance of my "ä" case is correct, but that obviously won't work for 
> any characters outside ISO-8859-1.
> In which char encoding should the value be written to the appearance stream 
> (at end of insertGeneratedAppearance())?
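
The garbling described in the quoted report can be reproduced outside PDFBox. 
The following is a rough sketch using only the JDK, not PDFBox code: the class 
AppearanceEncodingSketch and its methods are hypothetical names. It shows how 
UTF-8 bytes read as ISO-8859-1 turn "ä" into two characters, and how a 
PrintWriter can be given an explicit charset instead of the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

public class AppearanceEncodingSketch {

    // Encodes the text as UTF-8 and decodes those bytes as ISO-8859-1,
    // reproducing the mix-up described in the report.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Writes the text through a PrintWriter whose charset is set explicitly,
    // so the result does not depend on the platform default encoding.
    static byte[] writeLatin1(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(
                new OutputStreamWriter(out, StandardCharsets.ISO_8859_1));
        writer.print(text);
        writer.flush();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String name = "Skytt\u00e4"; // "Skyttä", 6 characters
        // 7 characters: the two UTF-8 bytes of "ä" become two characters.
        System.out.println(misdecode(name).length());
        // 6 bytes: one byte per character in ISO-8859-1.
        System.out.println(writeLatin1(name).length);
    }
}
```

As the report notes, fixing the charset to ISO-8859-1 only helps for 
characters that exist in that charset.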
