[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116149#comment-14116149 ]
John Hewson commented on PDFBOX-2262:
-------------------------------------

The relatively few tests which we have for text extraction are pretty good. Figuring out which glyph to draw can involve the ToUnicode map for certain fonts, so that aspect has been well tested.

One strategy for doing a comprehensive regression test of the ToUnicode mapping could be to run your rendering regression test but switch out the code in PageDrawer which draws glyphs for some AWT code to just render the "unicode" string at a suitable point size.

> Remove usage of AWT fonts
> -------------------------
>
>                 Key: PDFBOX-2262
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2262
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: PDModel, Rendering
>    Affects Versions: 2.0.0
>            Reporter: John Hewson
>            Assignee: John Hewson
>         Attachments: Basiswissen-Vorschriften.pdf,
> Basiswissen-Vorschriften.pdf-1.png,
> Basiswissen-Vorschriften.pdf-1.png-diff.png,
> Basiswissen-Vorschriften.pdf-9.png,
> Basiswissen-Vorschriften.pdf-9.png-diff.png,
> ELVIA-Reiserucktritt-Vollschutz.pdf-1.png, FreeSansTest.pdf,
> PDFBOX-1094-094730.pdf-1.png, PDFBOX-1770.pdf-1.png,
> bugzilla867751.pdf-2.png, bugzilla867751.pdf-2.png-diff.png,
> bugzilla886049.pdf, bugzilla886049.pdf-1.png, test_1fd9a_test.pdf
>
>
> We're still using AWT fonts to render the "standard 14" built-in fonts, which
> causes rendering problems and encoding issues (see PDFBOX-2140). We're also
> using AWT for some fallback fonts.
> Removal of these AWT fonts isn't too difficult, we need to load the fonts
> using the existing PDFFontManager mechanism which has recently been added.
> All missing TrueType fonts loaded from disk have been using SystemFontManager
> for a number of weeks now.
> We should ship some sensible default fonts with PDFBox, such as the
> Liberation fonts (see PDFBOX-2169, PDFBOX-2263), in case PDFFontManager can't
> find anything suitable, rather than falling back to the default TTF font, but
> by default we'll probe the system for suitable fonts.
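
The lookup order described in the issue (probe the system first, then fall back to bundled defaults) could be sketched roughly as below. This is only an illustration and not PDFBox code: the class `FontFallbackSketch`, the method `resolve`, the substitute table and the choice of last-resort font are all hypothetical. The issue itself only says that system fonts are probed by default and that something like the Liberation fonts should ship as a fallback.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a font lookup order; not PDFBox's actual API. */
final class FontFallbackSketch {

    // Assumed substitute table: which bundled Liberation family stands in
    // for a requested base font name. The mapping is illustrative only.
    private static final Map<String, String> BUNDLED = new HashMap<>();
    static {
        BUNDLED.put("Helvetica", "LiberationSans");
        BUNDLED.put("Times-Roman", "LiberationSerif");
        BUNDLED.put("Courier", "LiberationMono");
    }

    // Assumed last resort when neither the system nor the table has a match.
    static final String LAST_RESORT = "LiberationSans";

    /**
     * Returns the name of the font to load for the requested font.
     *
     * @param requested   font name asked for by the document
     * @param systemFonts names found by probing the system (assumed input)
     */
    static String resolve(String requested, Set<String> systemFonts) {
        // 1. Prefer a font that is actually installed on the system.
        if (systemFonts.contains(requested)) {
            return requested;
        }
        // 2. Otherwise use a bundled substitute, if one is listed.
        String bundled = BUNDLED.get(requested);
        if (bundled != null) {
            return bundled;
        }
        // 3. Nothing suitable: use the bundled last-resort font.
        return LAST_RESORT;
    }

    public static void main(String[] args) {
        Set<String> installed = Collections.singleton("Courier");
        System.out.println(resolve("Courier", installed));
        System.out.println(resolve("Helvetica", installed));
        System.out.println(resolve("SomeUnknownFont", installed));
    }
}
```

Keeping the system probe as the first step matches the "by default we'll probe the system" wording; the bundled table only comes into play when that probe finds nothing.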