Hi,

There isn't. PDFBox extracts text as Java strings, which you can then save in whatever encoding you want.
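
To illustrate the point about strings and encodings, here is a minimal sketch using only the JDK. The class name SaveExtractedText, the method save and the sample text are made up for the example; in a real program the string would come from the stripper (in 2.0.x, for instance, from getTextForRegion on a PDFTextStripperByArea), which is not shown here.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: the names are illustrative, not part of PDFBox.
public class SaveExtractedText {

    // A Java String carries no byte encoding of its own.
    // The encoding is chosen only here, when the text is turned into bytes.
    static void save(String text, Path target, Charset charset) throws IOException {
        Files.write(target, text.getBytes(charset));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for text extracted from the PDF: "PDF " followed by
        // the Greek letters alpha, beta, gamma (written as Unicode escapes).
        String extracted = "PDF \u03b1\u03b2\u03b3";

        Path utf8File = Files.createTempFile("extracted-utf8-", ".txt");
        Path utf16File = Files.createTempFile("extracted-utf16-", ".txt");

        // Same string, two different encodings on disk.
        save(extracted, utf8File, StandardCharsets.UTF_8);
        save(extracted, utf16File, StandardCharsets.UTF_16BE);

        System.out.println(utf8File + ": " + Files.size(utf8File) + " bytes");
        System.out.println(utf16File + ": " + Files.size(utf16File) + " bytes");
    }
}
```

Any other charset the JRE supports can be passed the same way, for example one obtained from Charset.forName.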

To look inside the PDF itself, for example at its fonts and their encodings, use PDFDebugger.

Tilman

On 02.02.2020 at 19:01, Athanasios Viennas wrote:
Hello! I am looking for the encoding functionality of
<code>org.apache.pdfbox.util.PDFTextStripperByArea</code>. The
constructor that accepted an encoding was removed after release 1.8.13,
and I now want to use the class in release 2.0.18.
I want control over parsing a PDF document: a research paper with a
two-column layout, written in Greek. I am unable to determine its exact
encoding. If there is an alternative setting in a different
configuration class, I would like to see how it works, but I can't find
the class that this functionality has moved to.

With kind regards,
Athanasios Viennas


