Aw: Re: Extract Text of Document with coordinates

Felix Hermann Mon, 04 Apr 2016 08:37:21 -0700

Thanks for the answer.

The problem is: how can I get the extracted characters and their coordinates
with PDFBox version 2.0.0?


The example refers to an older version of PDFBox.
 
 

Sent: Thursday, 31 March 2016 at 19:58
From: "Tilman Hausherr" <[email protected]>
To: [email protected]
Subject: Re: Extract Text of Document with coordinates
On 31.03.2016 at 12:51, Felix Hermann wrote:
> Hello,
>
> how can I extract the text + coordinates of a PDF document?
>
> To be more precise: I would like to extract all words of the document, and
> for each word I need its coordinates.
>
> If PDFBox does not support this: How can I get the coordinates of each 
> character?
>
> I tried to adapt the code of this example: 
> https://gist.github.com/DavidYKay/82f20ba67c50c499ebb3

Yes, the PrintTextLocations (or DrawPrintTextLocations) example is a
good start. Look for the blanks and build words from there.

Tilman
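
The "look for the blanks and build words" step can be sketched in plain Java
as below. This is only an illustration, not code from the PDFBox examples:
Glyph, Word and groupIntoWords are hypothetical names, and the sketch assumes
the glyphs arrive in reading order. With PDFBox 2.0 the glyph data would
typically come from a PDFTextStripper subclass that overrides
writeString(String, List<TextPosition>) and reads getUnicode(), getXDirAdj(),
getYDirAdj() and getWidthDirAdj() from each TextPosition.

```java
import java.util.ArrayList;
import java.util.List;

final class WordGrouper {
    /** Hypothetical stand-in for one extracted character and its position. */
    static final class Glyph {
        final String text;
        final float x, y, width;
        Glyph(String text, float x, float y, float width) {
            this.text = text; this.x = x; this.y = y; this.width = width;
        }
    }

    /** A word with the x/y of its first glyph and its total width. */
    static final class Word {
        final String text;
        final float x, y, width;
        Word(String text, float x, float y, float width) {
            this.text = text; this.x = x; this.y = y; this.width = width;
        }
    }

    /** Splits a glyph sequence into words at blanks and at changes of y. */
    static List<Word> groupIntoWords(List<Glyph> glyphs) {
        List<Word> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        float startX = 0f, startY = 0f, endX = 0f;
        for (Glyph g : glyphs) {
            boolean blank = g.text.trim().isEmpty();
            // Assumption: a different y value means a new line of text.
            boolean newLine = current.length() > 0
                    && Math.abs(g.y - startY) > 0.01f;
            if ((blank || newLine) && current.length() > 0) {
                words.add(new Word(current.toString(), startX, startY, endX - startX));
                current.setLength(0);
            }
            if (blank) {
                continue; // blanks only separate words, they are not stored
            }
            if (current.length() == 0) {
                startX = g.x;
                startY = g.y;
            }
            current.append(g.text);
            endX = g.x + g.width;
        }
        if (current.length() > 0) {
            words.add(new Word(current.toString(), startX, startY, endX - startX));
        }
        return words;
    }

    public static void main(String[] args) {
        List<Glyph> glyphs = new ArrayList<>();
        glyphs.add(new Glyph("H", 10f, 100f, 5f));
        glyphs.add(new Glyph("i", 15f, 100f, 5f));
        glyphs.add(new Glyph(" ", 20f, 100f, 5f));
        glyphs.add(new Glyph("y", 25f, 100f, 5f));
        glyphs.add(new Glyph("o", 30f, 100f, 5f));
        glyphs.add(new Glyph("u", 35f, 100f, 5f));
        for (Word w : groupIntoWords(glyphs)) {
            System.out.println(w.text + " x=" + w.x + " y=" + w.y + " width=" + w.width);
        }
    }
}
```

A real document may need more than this: PDFs do not always contain explicit
blank characters, so a gap threshold between neighbouring glyphs might be
needed in addition to the blank check.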

> However, I was not successful, as I use the new PDFBox version (2.0.0).
>
> Regards
>
> Felix
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>

