Tilman,
I reported this because the PDF looked normal to me. If there is a way to read the text in the PDF correctly, I hope you can help me with that.

Best regards,
Hesham

----------------------------------------------------------------------------
Included Message:

The font has some extremely high values that we use for our heuristics, and these are misleading the software.

I'll see if something can be done... but I suspect that it requires a change that would break other text extractions, so we can't commit it to the repository.

Tilman

On 25.01.2018 at 15:20, Hesham Gneady wrote:

Hello,

While reading a PDF using PDFBox v2.0.8, many spaces are ignored, so words are merged together in the extracted text.

You can test a 1-page sample PDF from here:
https://www.dropbox.com/s/9i9ofl3tje4iy3k/wrong_space_parsed_sample.pdf?dl=1

You can see wrongly read words such as: aboutmidnight, andbefore, CountyDonegal, ...

I have tried to use PDFTextStripper.setAverageCharTolerance(...) to control space sensitivity, but it didn't make any difference.

Any idea why this happens and how to fix it?

Best regards,
Hesham
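
For readers following the thread, here is a minimal sketch of how a gap-based space heuristic can be misled by an inflated font metric. It is not PDFBox's actual code. The class `SpaceHeuristicSketch`, the `Glyph` type, and the parameters `spaceWidth` and `tolerance` are hypothetical names chosen for illustration, and the assumption that the threshold scales with a font-reported space width is a simplification.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: a simplified gap-based word-separation heuristic.
class SpaceHeuristicSketch {

    // One positioned glyph: its text, left edge, and width (same units).
    static final class Glyph {
        final String text;
        final float x;
        final float width;

        Glyph(String text, float x, float width) {
            this.text = text;
            this.x = x;
            this.width = width;
        }
    }

    // Emits a space whenever the horizontal gap between two consecutive
    // glyphs exceeds spaceWidth * tolerance (hypothetical threshold rule).
    static String extract(List<Glyph> glyphs, float spaceWidth, float tolerance) {
        float threshold = spaceWidth * tolerance;
        StringBuilder out = new StringBuilder();
        Glyph prev = null;
        for (Glyph g : glyphs) {
            if (prev != null) {
                float gap = g.x - (prev.x + prev.width);
                if (gap > threshold) {
                    out.append(' ');
                }
            }
            out.append(g.text);
            prev = g;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "a" and "b" touch; "c" starts 3 units after "b" ends.
        List<Glyph> line = Arrays.asList(
                new Glyph("a", 0f, 5f),
                new Glyph("b", 5f, 5f),
                new Glyph("c", 13f, 5f));

        // Plausible space width: threshold is 2, so the 3-unit gap is a space.
        System.out.println(extract(line, 4f, 0.5f));
        // Inflated space width: threshold is 20, so the gap is swallowed.
        System.out.println(extract(line, 40f, 0.5f));
    }
}
```

With a plausible space width the words are separated; with a tenfold larger value the same gap falls below the threshold and the words merge, which matches the symptom described above. In such a model a tolerance setting only helps if it can offset the inflated value, which may be why adjusting it made no visible difference here.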

