Sorry, I forgot to attach the image file.

Thanks,
--hai

On Tue, Jul 2, 2013 at 11:54 AM, Hai Nguyen FUB <[email protected]> wrote:

> Dear Andreas,
>
> I have another question. For some documents, when converting them into
> images, I receive warnings like the following:
>
> <snapshot>
> ...
> 11:42:14,895 WARN [PDSimpleFont] Changing font on <e> from <Arial> to the default font
> 11:42:14,900 WARN [PDSimpleFont] Changing font on <u> from <Arial> to the default font
> 11:42:14,901 WARN [PDSimpleFont] Changing font on <t> from <Arial> to the default font
> 11:42:14,901 WARN [PDSimpleFont] Changing font on <s> from <Arial> to the default font
> 11:42:14,902 WARN [PDSimpleFont] Changing font on <c> from <Arial> to the default font
> 11:42:14,903 WARN [PDSimpleFont] Changing font on <h> from <Arial> to the default font
> 11:42:14,903 WARN [PDSimpleFont] Changing font on <e> from <Arial> to the default font
> 11:42:14,906 WARN [PDSimpleFont] Changing font on <o> from <Arial> to the default font
> 11:42:14,907 WARN [PDSimpleFont] Changing font on <r> from <Arial> to the default font
> ...
> </snapshot>
>
> Those warnings could be deactivated in the logging.property file, I guess.
> The images were still created, but they display the wrong characters;
> please see the comparison in the attached image file.
>
> How can I solve this? I have looked around in the documentation and googled
> a lot, but could not find a solution.
>
> Is there a way to omit the character parsing, since my application only
> converts the file to an image, with no OCR or the like? I have used the
> loadNonSeq() method, but still got those poor characters in the images.
>
> Thanks in advance!
>
> --hai
>
> On Mon, Jul 1, 2013 at 6:57 PM, Hai Nguyen FUB <[email protected]> wrote:
>
>> Alright, thank you very much for the fast reply!!!
>>
>> --hai
>>
>> On Mon, Jul 1, 2013 at 6:52 PM, Andreas Lehmkuehler <[email protected]> wrote:
>>
>>> On 01.07.2013 18:30, Hai Nguyen FUB wrote:
>>>
>>>> Hi Andreas,
>>>>
>>>> thank you very much, it works!!!
>>>>
>>>> Though I still get warning notifications like the following:
>>>>
>>>>> 18:26:54,687 WARN [NonSequentialPDFParser] PDF file
>>>>> 'src\test\resources\pdf\**249scan.pdf' does not allow extracting
>>>>> content.
>>>>
>>>> Does this mean that the fonts or characters within the document
>>>> are not extractable?
>>>
>>> It is possible to define user access permissions for a PDF, such as
>>>
>>> - disallow/allow printing
>>> - disallow/allow text extraction
>>> - disallow/allow modifying the PDF
>>> - ....
>>>
>>> In your case, it is not allowed to extract the content of the PDF as text.
>>>
>>>> thanks,
>>>>
>>>> --hai
>>>> SNIP
>>>
>>> BR
>>> Andreas Lehmkühler
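
For reference, the user access permissions Andreas lists in the quoted reply are stored in an encrypted PDF as a single integer of bit flags. The following is only a minimal sketch of how such flags can be decoded, not PDFBox code: the class name `PdfPermissionFlags`, its helper methods, and the sample value are hypothetical, and the bit positions (3 = print, 4 = modify, 5 = extract) are taken from the PDF specification's description of the permissions entry. PDFBox exposes the same information through its `AccessPermission` class, so in practice you would query that rather than decode the bits yourself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: decodes a few PDF user access permission bits.
class PdfPermissionFlags {
    // The PDF specification numbers bits from 1, so bit n has value 1 << (n - 1).
    static final int PRINT = 1 << 2;   // bit 3: printing
    static final int MODIFY = 1 << 3;  // bit 4: modifying the document
    static final int EXTRACT = 1 << 4; // bit 5: extracting text and graphics

    // A permission is granted when its bit is set in the permissions value.
    static boolean isAllowed(int permissions, int flag) {
        return (permissions & flag) != 0;
    }

    // Builds a human-readable summary of the three flags above.
    static List<String> describe(int permissions) {
        List<String> lines = new ArrayList<>();
        lines.add("print: " + (isAllowed(permissions, PRINT) ? "allowed" : "denied"));
        lines.add("modify: " + (isAllowed(permissions, MODIFY) ? "allowed" : "denied"));
        lines.add("extract: " + (isAllowed(permissions, EXTRACT) ? "allowed" : "denied"));
        return lines;
    }

    public static void main(String[] args) {
        // Assumed sample value: every bit set except the extraction bit,
        // which mirrors a file that "does not allow extracting content".
        int permissions = -1 & ~EXTRACT;
        for (String line : describe(permissions)) {
            System.out.println(line);
        }
    }
}
```

Note that this flag concerns extracting content as text; it is a separate matter from the font substitution warnings discussed in the later message.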
