[
https://issues.apache.org/jira/browse/PDFBOX-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tilman Hausherr updated PDFBOX-4115:
------------------------------------
Description:
When creating a PDF and adding text using a PostScript Type1 font (e.g. the
attached n019003l.pfb but also others), an error occurs when the text contains
German characters.
The error occurs with e.g. the character "ä" (adieresis) and other similar
umlaut characters; it does not occur with "ß" (germandbls).
Using an embedded TTF seems to work fine, but when I load the PFB like this:
{code:java}
InputStream pfb = new FileInputStream(fontFile);
font = new PDType1Font(document, pfb);
{code}
I get an encoding error whenever I try to print an "ä" to the page:
{code:java}
java.lang.IllegalArgumentException: U+00E4 ('adieresis') is not available in
this font NimbusSanL-Regu (generic: NimbusSanL-Regu) encoding: built-in (Type 1)
{code}
If I specify a different encoding (WinANSI) when loading the font:
{code:java}
InputStream pfb = new FileInputStream(fontFile);
font = new PDType1Font(document, pfb, new WinAnsiEncoding());
{code}
then no exception is thrown, but an empty space appears in place of
the "ä".
I looked into the code, in particular the class PDType1FontEmbedder.
After the parser creates the FontBox Type1Font object in the following
line of code:
{code:java}
type1 = Type1Font.createWithPFB(pfbBytes);
{code}
I inspected the charstring dictionary:
{code:java}
type1.getCharStringsDict()
{code}
and, by iterating through its key set, can see that "adieresis" is present.
However, when using the font's default encoding (i.e. by passing "null"
to the PDType1FontEmbedder), the resulting encoding, which is obtained by the
following line of code:
{code:java}
fontEncoding = Type1Encoding.fromFontBox(type1.getEncoding());
{code}
does not contain "adieresis" (or other "compound" characters), but just
"dieresis".
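To make the mismatch concrete, here is a minimal, self-contained sketch in plain Java. It is not PDFBox code: the class EncodingGapSketch, its methods and the sample data are hypothetical. It assumes the font's built-in encoding behaves like a code-to-glyph-name table, and it borrows three slots from Adobe StandardEncoding (97 = a, 200 = dieresis, 251 = germandbls) as a stand-in for the real table in n019003l.pfb, which may differ. It shows how a glyph can have an outline in the charstring dictionary and still have no code in the encoding.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: hypothetical names and sample data, not the PDFBox/FontBox API.
public class EncodingGapSketch {

    /** Glyph names that have an outline but no code in the encoding table. */
    static Set<String> unreachableGlyphs(Set<String> charStringNames,
                                         Map<Integer, String> encoding) {
        Set<String> unreachable = new TreeSet<>(charStringNames);
        unreachable.removeAll(encoding.values());
        return unreachable;
    }

    /** Returns the code assigned to a glyph name, or -1 if it is not encoded. */
    static int codeFor(String glyphName, Map<Integer, String> encoding) {
        for (Map.Entry<Integer, String> entry : encoding.entrySet()) {
            if (entry.getValue().equals(glyphName)) {
                return entry.getKey();
            }
        }
        return -1;
    }

    /** Assumed excerpt of the charstring dictionary keys (glyph names only). */
    static Set<String> sampleCharStrings() {
        return new TreeSet<>(Arrays.asList("a", "adieresis", "dieresis", "germandbls"));
    }

    /** Assumed excerpt of the built-in encoding, using StandardEncoding slots. */
    static Map<Integer, String> sampleBuiltInEncoding() {
        Map<Integer, String> encoding = new LinkedHashMap<>();
        encoding.put(97, "a");
        encoding.put(200, "dieresis");
        encoding.put(251, "germandbls");
        return encoding;
    }

    public static void main(String[] args) {
        Map<Integer, String> encoding = sampleBuiltInEncoding();
        System.out.println("germandbls code: " + codeFor("germandbls", encoding));
        System.out.println("adieresis code:  " + codeFor("adieresis", encoding));
        System.out.println("unreachable:     "
                + unreachableGlyphs(sampleCharStrings(), encoding));
    }
}
```

In this model "germandbls" resolves to a code while "adieresis" does not, which matches the behaviour reported above: the outline exists, but nothing in the built-in encoding points to it.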
> Problem creating PDF with German text using embedded Type1 (PFB) font
> ---------------------------------------------------------------------
>
> Key: PDFBOX-4115
> URL: https://issues.apache.org/jira/browse/PDFBOX-4115
> Project: PDFBox
> Issue Type: Bug
> Components: FontBox
> Affects Versions: 2.0.8
> Reporter: Tamir Hassan
> Priority: Major
> Labels: type1, type1font
> Attachments: n019003l.pfb