Hi,

Please upload your file to a file-sharing site. Also mention which PDFBox version you are using.

If the PDF doesn't contain space characters (most PDFs don't), then you won't get any positions for them.

High-level PDFBox text extraction (i.e. just getting the text) inserts spaces by using heuristics.
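
To illustrate what such a heuristic can look like, here is a minimal sketch. It is not PDFBox's actual implementation. The Glyph class, the joinWithSpaces method and the gapFactor parameter are hypothetical names made up for this example; in PDFBox the equivalent values would come from TextPosition (e.g. getXDirAdj() and getWidthDirAdj()). The idea is simply to insert a space whenever the horizontal gap between two consecutive glyphs is large compared to the glyph width.

```java
import java.util.List;

// Sketch only: a gap-based space heuristic, NOT the PDFBox implementation.
class SpaceHeuristicSketch {

    // Hypothetical stand-in for a positioned character
    // (in PDFBox this information lives in TextPosition).
    static final class Glyph {
        final String text;  // the character(s) this glyph represents
        final float x;      // left edge of the glyph
        final float width;  // advance width, same units as x

        Glyph(String text, float x, float width) {
            this.text = text;
            this.x = x;
            this.width = width;
        }
    }

    // Joins the glyphs of one line (assumed sorted left to right) and inserts
    // a space when the gap to the previous glyph exceeds gapFactor times the
    // previous glyph's width. gapFactor is an assumed tuning parameter.
    static String joinWithSpaces(List<Glyph> line, float gapFactor) {
        StringBuilder sb = new StringBuilder();
        Glyph prev = null;
        for (Glyph g : line) {
            if (prev != null) {
                float gap = g.x - (prev.x + prev.width);
                if (gap > gapFactor * prev.width) {
                    sb.append(' ');
                }
            }
            sb.append(g.text);
            prev = g;
        }
        return sb.toString();
    }
}
```

You would call joinWithSpaces once per line with the positions you collected, and tune gapFactor against your own files. A threshold that works for one font or layout may merge or split words in another, which is why this is a heuristic and not an exact method.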

Tilman

On 20.12.2017 at 03:46, Dan Liu wrote:
Hello all:
I extract the text using the code from https://www.tutorialkart.com/pdfbox/how-to-extract-coordinates-or-position-of-characters-in-pdf/ , but all spaces between English words are lost.

Such as:
"severe acute respiratory syndrome"

becomes:
severeacuterespiratorysyndrome

The original text is attached.


------------------

With best regards
Daniel

