Hi,
There is no direct API for that. What you could do is to collect the red
rectangles and the cyan rectangles. Use the bottom side of the red
rectangles to decide what's in a common line, and then use the cyan
shapes to build a common rectangle.
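A rough sketch of that grouping, using only java.awt.geom and no PDFBox calls. LineBoxes, lineBounds and the tolerance parameter are made-up names for illustration. It assumes you have already collected the red and the cyan rectangle of each glyph into two parallel lists in reading order, and that y grows downwards, so that getMaxY() is the bottom side (in plain PDF user space you would use getMinY() instead).

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of PDFBox. */
final class LineBoxes
{
    /**
     * Builds one rectangle per text line.
     * redBoxes.get(i) and cyanBoxes.get(i) must belong to the same glyph.
     * Assumption: y grows downwards, so getMaxY() is the bottom side.
     */
    static List<Rectangle2D> lineBounds(List<Rectangle2D> redBoxes,
                                        List<Rectangle2D> cyanBoxes,
                                        double tolerance)
    {
        if (redBoxes.size() != cyanBoxes.size())
        {
            throw new IllegalArgumentException("lists must have the same size");
        }
        List<Rectangle2D> lines = new ArrayList<>();
        Rectangle2D current = null;
        double currentBottom = 0;
        for (int i = 0; i < redBoxes.size(); i++)
        {
            double bottom = redBoxes.get(i).getMaxY();
            if (current == null || Math.abs(bottom - currentBottom) > tolerance)
            {
                // The bottom side moved: start a new line with a copy of the cyan box.
                current = new Rectangle2D.Double();
                current.setRect(cyanBoxes.get(i));
                lines.add(current);
                currentBottom = bottom;
            }
            else
            {
                // Same line: grow the line rectangle to include this cyan box.
                current.add(cyanBoxes.get(i));
            }
        }
        return lines;
    }

    public static void main(String[] args)
    {
        // Made-up sample data: two glyphs on one line, one glyph on the next.
        List<Rectangle2D> red = Arrays.asList(
                new Rectangle2D.Double(10, 100, 5, 12),
                new Rectangle2D.Double(16, 100, 5, 12),
                new Rectangle2D.Double(10, 120, 5, 12));
        List<Rectangle2D> cyan = Arrays.asList(
                new Rectangle2D.Double(10, 103, 5, 8),
                new Rectangle2D.Double(16, 102, 5, 10),
                new Rectangle2D.Double(10, 122, 6, 9));
        for (Rectangle2D line : lineBounds(red, cyan, 1.0))
        {
            System.out.println(line);
        }
    }
}
```

The tolerance is there because the bottom sides of glyphs in one line may differ slightly; a suitable value depends on the document, and superscripts or mixed font sizes may need a smarter comparison.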
Yes, that method is private. If you really think that overriding it would
help, then copy the source of PDFTextStripper from the source download,
rename the class and adjust that method.
Tilman
PS: Your image didn't get through. I assume it is an output of
DrawPrintTextLocations.
On 05.08.2020 at 13:19, Ahmad Al-Mughrabi wrote:
Hi PDFBox team,
Thanks for the great framework. I'm looking for a way to get the
coordinates (x, y, width, height) of each line on a given page.
In this example
<https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/util/DrawPrintTextLocations.java?view=markup&sortby=date>,
the output is a rectangle drawn around each word. See the snapshot below:
03054985.2018.1409975-marked-1.png
Looking at *org.apache.pdfbox.text.PDFTextStripper#writeLine*, we see
that the method cannot be overridden: it is *private*, as is
*org.apache.pdfbox.text.PDFTextStripper.WordWithTextPositions*.
    /**
     * Write a list of string containing a whole line of a document.
     *
     * @param line a list with the words of the given line
     * @throws IOException if something went wrong
     */
    private void writeLine(List<WordWithTextPositions> line)
            throws IOException
    {
        int numberOfStrings = line.size();
        for (int i = 0; i < numberOfStrings; i++)
        {
            WordWithTextPositions word = line.get(i);
            writeString(word.getText(), word.getTextPositions());
            if (i < numberOfStrings - 1)
            {
                writeWordSeparator();
            }
        }
    }
Could you please point me to a way to obtain the line coordinates for a
given page?
Thanks a million,
--
Ahmad Al Mughrabi | Principal Software Engineer