Ah, ok, nothing we can do about it then. Sorry.

>One more thing…

That sounds like a newline issue. Notepad doesn't understand \n, whereas WordPad and MS Word do.
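
If the line endings are indeed the cause, one workaround is to convert them to \r\n before saving the extracted text. A minimal sketch in plain Java, not a Tika feature: the class name NewlineFix, the method toWindowsNewlines, and the sample string are made up for illustration, and the sample string stands in for whatever text the parser returned.

```java
class NewlineFix {
    // Convert any line ending (\r\n, lone \r, or lone \n) to Windows-style \r\n.
    // "\r\n" is listed first in the alternation so existing pairs are not doubled.
    static String toWindowsNewlines(String text) {
        return text.replaceAll("\\r\\n|\\r|\\n", "\r\n");
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the text extracted from the PDF.
        String extracted = "first line\nsecond line\n";
        System.out.print(toWindowsNewlines(extracted));
    }
}
```

Writing the converted string to the output file instead of the raw one should let Notepad show the line breaks. If the text still looks wrong after that, the encoding would be the next thing to check.
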

From: Allison A. [mailto:alliso...@gmail.com]
Sent: Friday, July 1, 2016 1:07 AM
To: user@tika.apache.org
Subject: Re: RE: PDFPaser generates gibberish

Many thanks. Yes, PDFBox generates the gibberish.

One more thing: when I open the extracted text with Notepad, it does not display properly, but it appears clearly in WordPad, MS Word, etc. Is this an encoding issue?