Hi there, I'm new to PDFBox. I'm extracting text from some PDFs and storing it in a String variable. My problem is that Latin characters such as accented letters do not come out as they should.
How can I set the charset, or how can I see which charset the TextStripper from PDFBox returns? I read that it was UTF-16BE, but when I get the bytes with this charset and convert them to ISO-8859-1, I get letters separated by a space and no luck with accented letters. So what's wrong, and can you help me correct this? I'm using PDFBox 0.7.3. Thanks in advance.
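To make the "letters separated by a space" symptom concrete, here is a minimal sketch in plain Java with no PDFBox calls at all. The class name `CharsetRoundTrip`, its two method names, and the sample string are made up for illustration; the sample string only stands in for whatever the text stripper returns. It assumes Java 7 or later for `StandardCharsets`.

```java
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Reproduces the symptom: encode the String as UTF-16BE,
    // then decode those same bytes as if they were ISO-8859-1.
    // Every character below U+0100 becomes two bytes (0x00, value),
    // so a NUL character ends up in front of each letter.
    static String mismatched(String text) {
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16BE);
        return new String(utf16, StandardCharsets.ISO_8859_1);
    }

    // Encodes the String directly into the target charset,
    // with no intermediate UTF-16BE byte array.
    static byte[] toLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Hypothetical sample; stands in for the extracted PDF text.
        String text = "caf\u00e9";

        String broken = mismatched(text);
        // Twice as many characters as the input, half of them NUL.
        System.out.println("mismatched length: " + broken.length());

        byte[] latin1 = toLatin1(text);
        // One byte per character for text that fits in ISO-8859-1.
        System.out.println("latin1 length: " + latin1.length);
    }
}
```

A Java String has no charset of its own; a charset only comes into play when converting to or from bytes. So this sketch only explains the extra "spaces" from the mismatched conversion. It does not show whether the accented letters are already wrong in the String that comes back from the extraction.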