[ 
https://issues.apache.org/jira/browse/PDFBOX-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-5920.
-------------------------------------
    Resolution: Fixed

There's now a snapshot at
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.4-SNAPSHOT/

I looked at the other files... Sadly, these are files where the fonts themselves 
return bad data. I won't change the code that retrieves the width for code 32 
this time, even though I don't trust it, because every alternative I tried 
also caused some quality loss in some files. Maybe the text extraction 
algorithm itself should be improved.

> PDType0Font return invalid space width
> --------------------------------------
>
>                 Key: PDFBOX-5920
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5920
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 2.0.32, 3.0.3 PDFBox
>            Reporter: Miroslav Holubec
>            Assignee: Tilman Hausherr
>            Priority: Major
>              Labels: fontwidth, truetype
>             Fix For: 2.0.33, 3.0.4 PDFBox, 4.0.0
>
>         Attachments: texgyreheros-regular.ttf
>
>
> WinAnsiEncoding does not support all of the characters available in the font. 
> That is why we moved to the workaround proposed in the FAQ, which is to use 
> PDType0Font. We have now realized that font.getSpaceWidth() returns an 
> invalid value.
> {noformat}
> class FontWidthTest {
>     @Test
>     void pdType0FontTest() throws IOException {
>         try (InputStream fontStream = FontWidthTest.class.getResourceAsStream("/texgyreheros-regular.ttf");
>              PDDocument document = new PDDocument()) {
>             PDFont font = PDType0Font.load(document, fontStream, false);
>             assertEquals(20064.0, font.getStringWidth("The quick brown fox jumps over the lazy dog."));
>             assertEquals(278.0, font.getSpaceWidth()); // FAIL: returns 584.0
>         }
>     }
>
>     @Test
>     void pdTrueTypeFontTest() throws IOException {
>         try (InputStream fontStream = FontWidthTest.class.getResourceAsStream("/texgyreheros-regular.ttf");
>              PDDocument document = new PDDocument()) {
>             PDFont font = PDTrueTypeFont.load(document, fontStream, WinAnsiEncoding.INSTANCE);
>             assertEquals(20064.0, font.getStringWidth("The quick brown fox jumps over the lazy dog."));
>             assertEquals(278.0, font.getSpaceWidth());
>         }
>     }
> }
> {noformat}
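
For a sense of scale, the two space widths in the test above can be converted 
to points. This is only a minimal sketch, not PDFBox code: it assumes both 
values are expressed in 1/1000 of a text-space unit (the usual convention for 
PDF glyph widths), and the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper, not part of PDFBox: converts a glyph width given in
// 1/1000 text-space units (an assumption, see the note above) to points.
class SpaceWidthUnits {

    // Width in points at the given font size, before any text matrix,
    // character spacing, or horizontal scaling is applied.
    static double toPoints(double glyphUnits, double fontSizePt) {
        return glyphUnits / 1000.0 * fontSizePt;
    }

    public static void main(String[] args) {
        double fontSizePt = 12.0;   // arbitrary size chosen for illustration
        double expected = 278.0;    // value the reporter's test expects
        double reported = 584.0;    // value PDType0Font returned in the report

        System.out.println(String.format(Locale.ROOT,
                "expected space: %.3f pt, reported space: %.3f pt",
                toPoints(expected, fontSizePt),
                toPoints(reported, fontSizePt)));
    }
}
```

At 12 pt the reported width comes out at roughly twice the expected one (about 
7.0 pt against about 3.3 pt), so any layout or word-spacing logic that relies 
on getSpaceWidth() would see a space about twice as wide as the glyph in the 
font.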



