[ 
https://issues.apache.org/jira/browse/PDFBOX-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215705#comment-15215705
 ] 

Filip Bellander commented on PDFBOX-922:
----------------------------------------

I tried to update to 2.0.0 today. My tests now fail with the following 
error: {noformat}U+00A0 ('nbspace') is not available in this 
font's encoding: WinAnsiEncoding{noformat}

Worth noting is that I don't have any of the Type1 fonts installed on my 
machine (I'm on a Linux box and just haven't installed them). As a result, 
the following is printed before the tests run (i.e., as soon as I start 
using PDFBox):

{noformat}
10:41:30.174 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Times-Roman
10:41:30.178 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Times-Bold
10:41:30.178 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Times-Italic
10:41:30.179 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Times-BoldItalic
10:41:30.179 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Helvetica
10:41:30.180 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Helvetica-Bold
10:41:30.180 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Helvetica-Oblique
10:41:30.181 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Helvetica-BoldOblique
10:41:30.181 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Courier
10:41:30.182 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Courier-Bold
10:41:30.183 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Courier-Oblique
10:41:30.184 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for base font Courier-BoldOblique
10:41:30.199 [main] DEBUG o.a.p.p.font.FileSystemFontProvider - Loaded StandardSymL from /usr/share/fonts/Type1/s050000l.pfb
10:41:30.215 [main] DEBUG o.a.p.p.font.FileSystemFontProvider - Loaded Dingbats from /usr/share/fonts/Type1/d050000l.pfb
10:41:30.422 [main] WARN  o.a.pdfbox.pdmodel.font.PDType1Font - Using fallback font LiberationSans for Helvetica
Tests run: 10, Failures: 0, Errors: 6, Skipped: 0, Time elapsed: 0.968 sec <<< FAILURE!
{noformat}

This problem was not present in 1.8.11, so I'm wondering what is really going 
on here.
From what I can tell, it is triggered when you do something like:
{code:java}
pdFont.getStringWidth(StringEscapeUtils.unescapeHtml4("&nbsp;"));
{code}
That is at least what it fails on for me.
A dirty workaround would be to replace all non-breaking spaces with breaking 
spaces, but that defeats the purpose of having non-breaking ones.
Any suggestions on how this might be solved?
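
A less lossy variant of that workaround, as a minimal sketch: keep U+00A0 in the text for line-breaking decisions and swap it for U+0020 only at the moment a string is handed to the font. `WidthFunction`, `NbspWorkaround`, and the toy font below are hypothetical stand-ins and not PDFBox API; in real code the lambda would wrap a call such as `pdFont.getStringWidth`. The sketch also assumes the space and non-breaking space glyphs have the same advance width, which is common but not guaranteed for every font.

```java
import java.util.ArrayList;
import java.util.List;

public class NbspWorkaround {
    static final char NBSP = '\u00A0';

    /** Hypothetical stand-in for a font width lookup (e.g. a wrapper around pdFont.getStringWidth). */
    interface WidthFunction {
        float widthOf(String text);
    }

    /** Replace NBSP with a plain space only where text is handed to the font. */
    static String toEncodable(String text) {
        return text.replace(NBSP, ' ');
    }

    /** Measure text without ever passing U+00A0 to the font. */
    static float safeWidth(WidthFunction font, String text) {
        return font.widthOf(toEncodable(text));
    }

    /** Split on ordinary spaces only, so NBSP-joined words stay in one unit. */
    static List<String> breakableUnits(String text) {
        List<String> units = new ArrayList<>();
        for (String unit : text.split(" ")) {
            if (!unit.isEmpty()) {
                units.add(unit);
            }
        }
        return units;
    }

    public static void main(String[] args) {
        // Toy font (assumption): rejects NBSP like the reported error, 500 units per character otherwise.
        WidthFunction toyFont = s -> {
            if (s.indexOf(NBSP) >= 0) {
                throw new IllegalArgumentException("U+00A0 is not available in this font's encoding");
            }
            return s.length() * 500f;
        };

        String text = "10\u00A0km from here";
        for (String unit : breakableUnits(text)) {
            System.out.println("[" + toEncodable(unit) + "] width=" + safeWidth(toyFont, unit));
        }
    }
}
```

The non-breaking semantics survive because the wrapping logic only ever splits on U+0020; the substitution happens after the break points have been chosen.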

> True type PDFont subclass only supports WinAnsiEncoding (hardcoded!)
> --------------------------------------------------------------------
>
>                 Key: PDFBOX-922
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-922
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: Writing
>    Affects Versions: 1.3.1
>         Environment: JDK 1.6 / OS irrelevant, tried against 1.3.1 and 1.2.0
>            Reporter: Thanos Agelatos
>            Priority: Blocker
>             Fix For: 2.0.0
>
>         Attachments: pdfbox-unicode.diff, pdfbox-unicode2.diff
>
>
> PDFBox cannot embed Identity-H or Identity-V type TTF fonts in the PDF it 
> creates, making it impossible to create PDFs in any language apart from 
> English and ones supported in WinAnsiEncoding. This behaviour occurs 
> because the method PDTrueTypeFont.loadTTF has WinAnsiEncoding hardcoded, 
> and there are no Identity-H or Identity-V Encoding classes provided (to set 
> afterwards via PDFont.setFont() )
> This excludes the following languages plus many others:
> - Greek
> - Bulgarian
> - Swedish
> - Baltic languages
> - Maltese
> The PDF created contains garbled characters and/or squares.
> Simple test case:
> {code}
> PDDocument doc = null;
> try {
>     doc = new PDDocument();
>     PDPage page = new PDPage();
>     doc.addPage(page);
>     // extract fonts for fields
>     byte[] arialNorm = extractFont("arial.ttf");
>     //byte[] arialBold = extractFont("arialbd.ttf");
>     //PDFont font = PDType1Font.HELVETICA;
>     PDFont font = PDTrueTypeFont.loadTTF(doc, new ByteArrayInputStream(arialNorm));
>
>     PDPageContentStream contentStream = new PDPageContentStream(doc, page);
>     contentStream.beginText();
>     contentStream.setFont(font, 12);
>     contentStream.moveTextPositionByAmount(100, 700);
>     // text here may appear garbled; insert any text in Greek or
>     // Bulgarian or Maltese
>     contentStream.drawString("Hello world from PDFBox ελληνικά");
>     contentStream.endText();
>     contentStream.close();
>     doc.save("pdfbox.pdf");
>     System.out.println(" created!");
> } catch (Exception ioe) {
>     ioe.printStackTrace();
> } finally {
>     if (doc != null) {
>         try { doc.close(); } catch (Exception e) {}
>     }
> }
> {code}
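
The limitation described in the quoted issue comes down to a single-byte encoding having no code for characters outside its table. As a rough illustration only, the sketch below uses the JDK's `windows-1252` charset as an approximation of PDF's WinAnsiEncoding (the two are similar but not identical, so this is an assumption and does not reproduce PDFBox's own glyph lookup). The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class SingleByteCoverage {

    /** Returns the first code point the encoder cannot represent, or -1 if the whole text fits. */
    static int firstUnencodable(CharsetEncoder encoder, String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            String single = new String(Character.toChars(cp));
            if (!encoder.canEncode(single)) {
                return cp;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 stands in for WinAnsiEncoding here.
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();

        // Same text as the test case; the Greek word is written with escapes
        // so the source file encoding does not matter.
        String text = "Hello world from PDFBox "
                + "\u03B5\u03BB\u03BB\u03B7\u03BD\u03B9\u03BA\u03AC";

        int cp = firstUnencodable(encoder, text);
        if (cp < 0) {
            System.out.println("All characters fit the single-byte encoding");
        } else {
            System.out.println(String.format("First unencodable code point: U+%04X", cp));
        }
    }
}
```

A check like this can be run on text before it is drawn, so that unsupported characters are reported up front rather than showing up as garbled output.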


