Hi,

Do you have some code and a file to reproduce this effect?

protected void writeString(String text, List<TextPosition> textPositions) throws IOException

is not overridden in PDFTextStripperByArea, so this is weird.

Tilman

On 03.03.2020 at 20:31, PDF Developer wrote:
Hello,

I am trying to understand these two classes, PDFTextStripper and PDFTextStripperByArea. I am using them to obtain the properties of the text in a PDF.

For what it is worth, I have some PDFs that are marked up with "regions" which I can reliably detect. Since I know the area in question, I thought it would be enough to use PDFTextStripperByArea and get the text within the bounded area. That works quite well.

However, there is now a requirement to get the rotation of the text, as there are use cases where the text has been rotated as part of an upstream process. So I tried to get the TextPosition properties via an override of writeString. I thought it was all working, but a colleague pointed out that the rotation was always "0".

Going back to basics, for test purposes I used PDFTextStripper (again with an override of the writeString method) to dump the properties of the TextPositions. That appears to give me the results I am looking for. However, if I use a similar override for PDFTextStripperByArea, I never see a rotation other than 0.

There can be a lot of text on a page and the pages are very large, so I would prefer to use PDFTextStripperByArea, mainly because I know exactly where the text will be and the overhead will be lower.

Have I misunderstood something along the way? Made a naive assumption? Any suggestions on how to get PDFTextStripperByArea to return the string contained within an area/region together with the rotation (or other properties) of the text?

PDFDev
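As a cross-check on whatever rotation value the stripper reports, the angle can also be derived by hand from the first two components of a glyph's text matrix. The sketch below is stand-alone Java and does not call PDFBox: the class name `TextRotation` and its method are hypothetical. It assumes the matrix has the usual form where a = scale * cos(t) and b = scale * sin(t). In PDFBox those two values would presumably be read from the matrix returned by `TextPosition.getTextMatrix()`, but that mapping, and the sign convention of the page coordinate system, should be verified against the API documentation.

```java
/**
 * Stand-alone sketch (not part of PDFBox): derive a rotation angle from the
 * a and b components of a text matrix, assuming a = s*cos(t) and b = s*sin(t).
 */
public final class TextRotation {

    private TextRotation() {
        // static helper only
    }

    /**
     * Returns the rotation in degrees, normalised to the range [0, 360).
     * Uniform scaling does not matter because atan2 only uses the ratio b/a.
     */
    public static double degrees(double a, double b) {
        double deg = Math.toDegrees(Math.atan2(b, a));
        if (deg < 0) {
            deg += 360.0; // map (-180, 0) onto (180, 360)
        }
        if (deg >= 360.0) {
            deg = 0.0; // guard against rounding up to a full turn
        }
        return deg;
    }

    public static void main(String[] args) {
        // Hypothetical matrix components for text scaled to 12 units.
        System.out.println(degrees(12, 0));   // unrotated text
        System.out.println(degrees(0, 12));   // quarter turn
        System.out.println(degrees(-12, 0));  // half turn
        System.out.println(degrees(0, -12));  // three-quarter turn
    }
}
```

Logging this value next to the stripper's own rotation properties inside a writeString override, once for PDFTextStripper and once for PDFTextStripperByArea, would show whether the two classes really hand over different TextPosition data for the same glyphs.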