Hi,

using the latest version of pdfbox (1.7.1), this is what I got:
MLIPHOAP6 AE0TE
03D4 DR DVGWEWNER5L STLERC
60CO L4PU7L

Please give it a try.

Maruan Sahyoun

On 20.03.2013 at 11:45, Sébastien Dailly <[email protected]> wrote:

> Hello,
>
> I've got a problem while reading the attached document. (It has been
> deflated, anonymised, text has been removed, and characters shuffled.)
>
> Text extraction works fine with some PDF readers (I tried Acrobat and
> Evince), but the text read by pdfbox is not the expected one, as if pdfbox
> were using a wrong font description to read the text. Instead of
>
>> 60CO L4PU7L
>> 03D4 DR DVGWEWNER5L STLERC
>> MLIPHOAP6 AE0TE
>
> I get
>
>> UvIKGMuK6RuN0TN
>> 0 E4RREDRRRElPéNéOND5vRRrTvNDp
>> 60pMRRRv4KS7v
>
> I'm using pdfbox 1.6.0 for that.
>
> Is the document invalid? What can I do to read the document correctly?
>
> Thanks!
>
> --
> Sébastien Dailly
> +33 1 56 29 78 67
> ELETTERMAIL
> <document.pdf>
