[ https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vitalie Bureanu updated PDFBOX-1858:
------------------------------------
Description:
Extracted text is missing spaces between some words.
To reproduce, extract the string on line 74a... in the attached test.pdf.
It is extracted as: "74a Amount of line73youwant refunded toyou . If
Form8888 isattached , checkhere"
The result is wrong: the words are "glued" together.
I tried using the PDFTextStripper class, but the result remains the same.
Can this be resolved, please?
With respect,
Vitalie
was:
Extracted text is missing spaces between some words.
To reproduce, extract the string on line 74a... in the attached test.pdf.
It is extracted as: "74a Amount of line73youwant refunded toyou . If
Form8888 isattached , checkhere"
The result is wrong: the words are "glued" together.
I tried using the PDFTextStripper class, but the result remains the same.
Can this be solved, please?
With respect,
Vitalie
> Extracted text does not have spaces
> -----------------------------------
>
> Key: PDFBOX-1858
> URL: https://issues.apache.org/jira/browse/PDFBOX-1858
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, Text extraction
> Affects Versions: 1.8.3
> Environment: Linux 64bit, Java
> Reporter: Vitalie Bureanu
> Attachments: Screenshot.jpg, test.pdf
>
> Original Estimate: 3h
> Remaining Estimate: 3h
>
> Extracted text is missing spaces between some words.
> To reproduce, extract the string on line 74a... in the attached test.pdf.
> It is extracted as: "74a Amount of line73youwant refunded toyou . If
> Form8888 isattached , checkhere"
> The result is wrong: the words are "glued" together.
> I tried using the PDFTextStripper class, but the result remains the same.
> Can this be resolved, please?
> With respect,
> Vitalie
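
For illustration only, here is a minimal sketch of how a gap-threshold heuristic can produce "glued" words like the ones reported above. It is not PDFBox's actual implementation: the class GapSpacingSketch, the Glyph type, the join method, and all coordinates are hypothetical. The idea is that a space is emitted only when the horizontal gap between two text runs exceeds a fraction (the tolerance) of the font's space width, so tightly set words fall under the threshold and get merged. PDFTextStripper exposes setSpacingTolerance and setAverageCharTolerance, which look related; whether adjusting them helps with the attached test.pdf has not been verified.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: gap-based word separation, NOT the PDFBox algorithm.
final class GapSpacingSketch {

    /** One positioned text run: its text, left edge, and advance width (made-up units). */
    static final class Glyph {
        final String text;
        final float x;
        final float width;

        Glyph(String text, float x, float width) {
            this.text = text;
            this.x = x;
            this.width = width;
        }
    }

    /**
     * Joins runs left to right, inserting a space only when the gap between
     * the end of the previous run and the start of the next one is larger
     * than spaceWidth * tolerance.
     */
    static String join(List<Glyph> glyphs, float spaceWidth, float tolerance) {
        StringBuilder out = new StringBuilder();
        Glyph prev = null;
        for (Glyph g : glyphs) {
            if (prev != null) {
                float gap = g.x - (prev.x + prev.width);
                if (gap > spaceWidth * tolerance) {
                    out.append(' ');
                }
            }
            out.append(g.text);
            prev = g;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Glyph> glyphs = new ArrayList<>();
        glyphs.add(new Glyph("you", 0f, 15f));     // ends at x = 15
        glyphs.add(new Glyph("want", 16.5f, 20f)); // starts at x = 16.5, so the gap is 1.5

        // Threshold 4 * 0.5 = 2.0: the gap of 1.5 is too small, words are glued.
        System.out.println(join(glyphs, 4f, 0.5f));
        // Threshold 4 * 0.3 = 1.2: the gap of 1.5 now counts as a space.
        System.out.println(join(glyphs, 4f, 0.3f));
    }
}
```

With the made-up numbers above, the first call merges the two words and the second separates them, which suggests that a lower tolerance might be worth trying as a workaround.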