> Hmm, because it's you, I'll try it myself :-)
Thank you, Tilman!
> You can't really know for sure with the classic text extraction, but you
> could use the extractTextByArea example with the rect coordinates.
Based on your example, though, I think this should work. If I cache the
Here's code that works. For some reason, I can't take the rectangle as
it is; I have to flip the coordinates. I wonder if this is documented.
The coordinates in the PDF are PDF coordinates (bottom is y = 0), but
the coordinates I had to use have the top at y = 0.
Tilman
package
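The coordinate flip described above can be sketched as follows. This is a minimal illustration, not the code from the original message: the class and method names are hypothetical, and it assumes an unrotated page whose lower-left corner sits at (0, 0), so crop-box offsets and page rotation are ignored.

```java
import java.awt.geom.Rectangle2D;

// Hypothetical helper, for illustration only.
final class RectFlip {
    /**
     * Converts a rectangle given in PDF user space (origin at the bottom-left,
     * y grows upward) into top-left-origin coordinates (y grows downward).
     * Assumes an unrotated page whose lower-left corner is at (0, 0).
     */
    static Rectangle2D.Float toTopLeftOrigin(float llx, float lly,
                                             float urx, float ury,
                                             float pageHeight) {
        float width = urx - llx;
        float height = ury - lly;
        // The upper edge in PDF space becomes the distance from the page top.
        float yFromTop = pageHeight - ury;
        return new Rectangle2D.Float(llx, yFromTop, width, height);
    }
}
```

The resulting Rectangle2D could then be handed to a region-based extractor such as PDFTextStripperByArea.addRegion, which takes a Rectangle2D.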
You can't really know for sure with the classic text extraction, but you
could use the extractTextByArea example with the rect coordinates.
Additionally, please try the DrawPrintTextLocations example. That one
paints three colors on the page... blue is the bounding box, cyan is the
real
Sorry, duh, I switched to overriding writeString(String text, List<TextPosition>
positions) instead of writeString(String text).
I can calculate x overlap, but I can't figure out how to get overlap on y.
The file is here:
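For the y-overlap question, one possible approach is to treat each box as a vertical interval and intersect the intervals, exactly as for x. The sketch below is only an illustration with hypothetical names; it assumes both the glyph box and the link rectangle are already expressed in the same top-left-origin coordinates (top edge plus height). How to derive a glyph's top edge and height from a TextPosition is not covered here.

```java
// Hypothetical helper, for illustration only.
final class Overlap {
    /** Length of the intersection of [aMin, aMax] and [bMin, bMax]; 0 if disjoint. */
    static float overlap1D(float aMin, float aMax, float bMin, float bMax) {
        return Math.max(0f, Math.min(aMax, bMax) - Math.max(aMin, bMin));
    }

    /**
     * True if the two boxes share any vertical extent. Both boxes are given
     * as top edge plus height in the same top-left-origin coordinate system.
     */
    static boolean overlapsY(float glyphTop, float glyphHeight,
                             float rectTop, float rectHeight) {
        return overlap1D(glyphTop, glyphTop + glyphHeight,
                         rectTop, rectTop + rectHeight) > 0f;
    }
}
```

The same overlap1D call works for the x direction, so a glyph can be considered inside a link rectangle when both the x and y overlaps are positive.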
All,
Is there a recipe for associating a hyperlink to text on the page? Over on
Tika, we're dumping these as <a> elements at the end of each page. If it isn't
too hard, it would be great to associate these links with text, e.g. <a href="http://tika.apache.org">tika</a>.
This is related to PDFBOX-1143 and TIKA-2029.