On Sun Jan 23 10:02:08 PST 2022 rc...@pobox.com said:
>I am using PDFBox's PDFTextStripper.getText() for a particular kind of
>PDF file generated by a government agency, and the text I'm getting does
>not match that displayed by Acrobat Reader for the same files. The
>getText() calls occasionally get characters Reader does not display, and
>in one case getText() gets an "O" instead of the "U" displayed by
>Reader. I would like to know if there's some way I can get the same text
>as Reader displays.

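Have you checked for embedded fonts in the PDF?  It's quite possible to have
fonts where the code for "A" is NOT the same as the ASCII "A".  If such a font
lacks a correct ToUnicode mapping, text extraction can return different
characters than the glyphs Reader draws.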
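
To make that concrete, here is a small stand-alone Java sketch. It does not
use PDFBox at all; the class name, the helper methods, and the three-entry
code table are all made up for illustration, and a real embedded font will
have its own (different) mapping. It only shows how the same bytes can read
as "OAB" when taken as ASCII but as "USA" when looked up in the font's own
table.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class GlyphCodeDemo {
    // Hypothetical code-to-Unicode table for an imaginary embedded font
    // subset. In this made-up font, code 0x4F (ASCII "O") draws a "U".
    static final Map<Integer, String> TO_UNICODE = Map.of(
            0x4F, "U",
            0x41, "S",
            0x42, "A");

    // Naive reading: assume every character code is plain ASCII.
    static String decodeAsAscii(byte[] codes) {
        return new String(codes, StandardCharsets.US_ASCII);
    }

    // Table-driven reading: look up each code in the font's own mapping,
    // substituting U+FFFD for codes the table does not cover.
    static String decodeWithTable(byte[] codes, Map<Integer, String> table) {
        StringBuilder sb = new StringBuilder();
        for (byte b : codes) {
            int code = b & 0xFF;
            sb.append(table.getOrDefault(code, "\uFFFD"));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] codes = {0x4F, 0x41, 0x42};
        System.out.println(decodeAsAscii(codes));               // OAB
        System.out.println(decodeWithTable(codes, TO_UNICODE)); // USA
    }
}
```

If the file behaves like this, the "O" you get where Reader shows "U" may
come from a character code that the extraction maps one way and the
embedded glyph draws another, so the fonts used on that page are the first
thing to check.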


