[jira] [Comment Edited] (PDFBOX-3255) Reasonable way to handle missing characters in font

John Hewson (JIRA) Thu, 10 Mar 2016 14:15:12 -0800

    [ 
https://issues.apache.org/jira/browse/PDFBOX-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190056#comment-15190056
 ]


John Hewson edited comment on PDFBOX-3255 at 3/10/16 10:14 PM:
---------------------------------------------------------------

1. Sounds good to me. The only reason it's not public is that the API was 
unstable, but that's no longer the case.

2. Your code has some problems. The encoding.getName(..) method expects a _PDF 
character code_, not a Unicode character. For simple cases, these are often the 
same, but in general they're not. What you need to do is look up the glyphs by 
name, which you do by first mapping the Unicode character to a glyph name:

{code}
String name = font.getGlyphList().codePointToName(c);
if (!encoding.contains(name))
{
  ...
}
{code}

Here c is your char (more properly called a Unicode "code point"), font is a 
PDSimpleFont, and encoding is that font's Encoding.
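
To make the two-step lookup concrete, here is a minimal, self-contained sketch 
of the same idea combined with the replacement behaviour requested in the 
issue. It does not use PDFBox: GLYPH_NAMES and ENCODED_NAMES are hypothetical 
stand-ins for font.getGlyphList() and the font's encoding, and the class and 
method names are made up for illustration only.

```java
import java.util.Map;
import java.util.Set;

// Illustrative stand-in only; none of these names come from PDFBox.
final class MissingGlyphSketch {
    // Hypothetical slice of a glyph list: Unicode code point -> glyph name.
    static final Map<Integer, String> GLYPH_NAMES = Map.of(
        0x0020, "space",
        0x0041, "A",
        0x0061, "a",
        0x00E4, "adieresis",
        0x0308, "dieresiscmb");

    // Hypothetical set of glyph names that the font's encoding contains.
    static final Set<String> ENCODED_NAMES =
        Set.of("space", "A", "a", "adieresis");

    // Step 1: code point -> glyph name. Step 2: is that name in the encoding?
    static boolean isEncodable(int codePoint) {
        String name = GLYPH_NAMES.get(codePoint);
        return name != null && ENCODED_NAMES.contains(name);
    }

    // Replaces every code point that cannot be encoded with a fallback.
    static String replaceUnencodable(String text, int replacement) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp ->
            out.appendCodePoint(isEncodable(cp) ? cp : replacement));
        return out.toString();
    }
}
```

With these assumed tables, replaceUnencodable("Aa\u0308", ' ') returns "Aa " 
because U+0308 maps to 'dieresiscmb', which the stand-in encoding lacks. The 
fallback itself has to be encodable, otherwise the problem just moves.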


was (Author: jahewson):
1. Sounds good to me, the only reason it's not public is because the API was 
unstable, but that's no longer the case.

2. Your code has some problems. The encoding.getName(..) method expects a _PDF 
character code_, not a Unicode character. For simple cases, these are often the 
same, but in general they're not. What you need to do is look up the glyphs by 
name, which you do by first mapping the Unicode character to a glyph name:

{code}
String name = getGlyphList().codePointToName(c);
if (!encoding.contains(name))
{
  ...
}
{code}

Where c is your char (more properly called a Unicode "code point"), and font is 
a PDSimpleFont.

> Reasonable way to handle missing characters in font
> ---------------------------------------------------
>
>                 Key: PDFBOX-3255
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3255
>             Project: PDFBox
>          Issue Type: Wish
>          Components: AcroForm
>    Affects Versions: 2.0.0
>            Reporter: Christian Brandt
>              Labels: newbie
>         Attachments: TEST.pdf
>
>
> Hello,
> We have an issue with setting form field values if the input contains 
> characters that cannot be rendered with the associated font. The system 
> throws an exception similar to:
> java.lang.IllegalArgumentException: U+0308 ('dieresiscmb') is not available 
> in this font's encoding: MacRomanEncoding with differences
> Currently this is hard to handle outside the framework because, as far as I 
> understand (please correct me if I'm wrong), the caller has no way to find 
> out which font will eventually be used and therefore which characters are 
> not renderable.
> What we would ultimately like is for the library to optionally replace 
> unrenderable characters with some other existing character (e.g. space) 
> instead of failing the call, or to provide a way to recover from this 
> error so that the user can call the method again with altered input.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
