Hi,

Please try to submit a test case.

My guess is that this is related to bad /ToUnicode streams.

Tilman

On 05.03.2020 at 03:09, Joel Hirsh wrote:
I just started testing with version 2.0.19.

I am using PDFTextStripper, and some files that gave fine results in
2.0.18 are completely useless with 2.0.19.  As an example, I have one file
that yields about 600 phrases in 2.0.18.  In 2.0.19 it yields over 16,000
phrases, the majority of which are zero-length strings, and most of the
rest are the single characters making up a phrase rather than the phrase
itself.

The file is confidential, so I cannot just post it.

Am I telling you something that you already know about, or should I try to
submit a test case? Or is there some new option I am unaware of?

Thanks



