[
https://issues.apache.org/jira/browse/PDFBOX-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377091#comment-17377091
]
yoonho commented on PDFBOX-5234:
--------------------------------
[~lehmi]
I am using PDFBox 3.0, and the result is no different from version 2.x. The
file is named 600mb, but its actual size is 22 MB.
> When extracting text from pdf, spaces are replaced with other characters
> ------------------------------------------------------------------------
>
> Key: PDFBOX-5234
> URL: https://issues.apache.org/jira/browse/PDFBOX-5234
> Project: PDFBox
> Issue Type: Wish
> Reporter: yoonho
> Priority: Major
>
> Hello,
> I am trying to extract text from a PDF. However, when I extract the text
> using PDFTextStripper, the spaces in the PDF's text are replaced with
> strange characters. Is there any way to prevent this?
>
> http://gofile.me/4hSqO/CF1mRfmyD
> (The file is attached, and the language is Korean.)
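
To narrow down a report like the one quoted above, it can help to see exactly
which code points appear where spaces were expected. The following is a minimal
sketch using only the Java standard library. It does not call PDFBox itself;
it assumes the string passed in is whatever PDFTextStripper.getText(...)
returned. The class name SpaceDiagnostics and its methods are hypothetical
helpers, and the assumption that the "strange characters" are non-standard
space, control, format, private-use, or unassigned code points is only a
guess, since the actual characters in the attached file are not stated here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox.
public class SpaceDiagnostics {

    // True for code points that are not ordinary text and not a plain
    // space, tab, or line break. ASSUMPTION: the substitutes for spaces
    // fall into one of these Unicode categories.
    static boolean isSuspicious(int cp) {
        if (cp == ' ' || cp == '\n' || cp == '\r' || cp == '\t') {
            return false;
        }
        int type = Character.getType(cp);
        return Character.isWhitespace(cp)
                || Character.isSpaceChar(cp)
                || type == Character.CONTROL
                || type == Character.FORMAT
                || type == Character.PRIVATE_USE
                || type == Character.UNASSIGNED;
    }

    // Counts each suspicious code point, keyed as "U+XXXX", in order of
    // first appearance.
    static Map<String, Integer> suspiciousCodePoints(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        text.codePoints().forEach(cp -> {
            if (isSuspicious(cp)) {
                counts.merge(String.format("U+%04X", cp), 1, Integer::sum);
            }
        });
        return counts;
    }

    // Replaces every suspicious code point with a plain U+0020 space.
    static String normalizeSpaces(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (isSuspicious(cp)) {
                sb.append(' ');
            } else {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample: a no-break space and a private-use code point
        // standing in for the spaces.
        String sample = "안녕\u00A0하세요\uF020PDF";
        System.out.println(suspiciousCodePoints(sample));
        System.out.println(normalizeSpaces(sample));
    }
}
```

If the reported characters turn out to be private-use code points, that would
suggest the font in the PDF maps its space glyph to a non-standard Unicode
value, and post-processing the extracted text as above is only a workaround,
not a fix for the mapping itself.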
--
This message was sent by Atlassian Jira
(v8.3.4#803005)