Dear Andreas,

Thanks for the fast reply!

--hai

On Tue, Jul 2, 2013 at 1:34 PM, Andreas Lehmkuehler <[email protected]> wrote:

> Hi,
>
> On 02.07.2013 11:54, Hai Nguyen FUB wrote:
>
>> Dear Andreas,
>>
>> I have another question. For some documents, when converting them into
>> images, I received warnings like the following:
>>
>> <snapshot>
>> ...
>> 11:42:14,895 WARN [PDSimpleFont] Changing font on <e> from <Arial> to the default font
>> 11:42:14,900 WARN [PDSimpleFont] Changing font on <u> from <Arial> to the default font
>> 11:42:14,901 WARN [PDSimpleFont] Changing font on <t> from <Arial> to the default font
>> 11:42:14,901 WARN [PDSimpleFont] Changing font on <s> from <Arial> to the default font
>> 11:42:14,902 WARN [PDSimpleFont] Changing font on <c> from <Arial> to the default font
>> 11:42:14,903 WARN [PDSimpleFont] Changing font on <h> from <Arial> to the default font
>> 11:42:14,903 WARN [PDSimpleFont] Changing font on <e> from <Arial> to the default font
>> 11:42:14,906 WARN [PDSimpleFont] Changing font on <o> from <Arial> to the default font
>> 11:42:14,907 WARN [PDSimpleFont] Changing font on <r> from <Arial> to the default font
>> ...
>> </snapshot>
>>
>> Those warnings could be deactivated in the logging.property file, I guess.
>> The images were still created, but they display the wrong characters;
>> please see the comparison in the attached image file.
>>
>> How can I solve this? I have looked around in the documentation and
>> googled a lot, but could not find any solution.
>
> This is a known behaviour of PDFBox. As the embedded font doesn't work
> for some reason, an alternative font is used. In some cases it works,
> but in most cases it doesn't. There is no solution yet. Most likely the
> issue is related to PDFBOX-490 [1].
>
>> Is there a way to omit the character parsing, since my application only
>> converts the file to an image and does no OCR or the like? I have used
>> the loadNonSeq() method, but still received those poor characters in
>> the images.
>
> No, it is needed to render the text, and it has nothing to do with the
> parser itself.
>
>> thanks in advance!
>>
>> --hai
>
> BR
> Andreas Lehmkühler
>
> [1] https://issues.apache.org/jira/browse/PDFBOX-490
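
On the logging side only: the warnings quoted above can probably be silenced per logger. The following is a minimal sketch, assuming the logger is named after the class org.apache.pdfbox.pdmodel.font.PDSimpleFont and that java.util.logging is the active backend. The timestamp format in the snapshot suggests a different backend may be in use, in which case the same logger name would go into that backend's own configuration instead. The class name QuietFontWarnings is hypothetical. This hides the messages only; it does not change the font substitution or the rendered characters.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of PDFBox.
final class QuietFontWarnings {

    // Assumption: the logger name matches the fully qualified class name.
    static final String FONT_LOGGER_NAME =
            "org.apache.pdfbox.pdmodel.font.PDSimpleFont";

    // A strong reference is kept so the configured level is not lost
    // if the logger would otherwise be garbage-collected.
    private static final Logger FONT_LOGGER = Logger.getLogger(FONT_LOGGER_NAME);

    private QuietFontWarnings() {
    }

    // Raise the threshold so WARNING-level messages are dropped.
    static void silence() {
        FONT_LOGGER.setLevel(Level.SEVERE);
    }

    public static void main(String[] args) {
        silence();
        // Call this once at startup, before converting pages to images.
        System.out.println("Level for " + FONT_LOGGER_NAME + ": " + FONT_LOGGER.getLevel());
    }
}
```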
