On 31.10.2018 at 22:07, Luca Loiodice wrote:
I am using 2.0.x and have to support arbitrary input PDFs (including
90, 180, and 270 degree orientations, multi-column text, etc.).
Everything works fine except for text on an angle.
I came up with a two-pass call to PDFTextStripper: getting standard-oriented
text using SortByPosition=false and getting 90/180/270-oriented text using
SortByPosition=true,
which I am not sure is correct, but it seems to work.
Are there any overrides I could try on the 2.0.x PDFTextStripper to make it
work?
No, nothing out of the box. The reason is that PDFBox sees each glyph
by itself. You, a human, are smarter than PDFBox and notice that
these glyphs, seemingly on different "lines", are part of a single skewed line.
Tilman
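
To illustrate the point about glyphs being seen one by one, here is a minimal sketch of how individually positioned glyphs could be regrouped into skewed lines. It is not a PDFBox API and not something PDFBox does for you: the class SkewedLines, the Glyph type, and the lineTolerance parameter are all hypothetical names made up for this example, and it assumes you have already obtained each glyph's text, baseline origin, and baseline angle yourself (for instance from a PDFTextStripper subclass; that wiring is not shown). The idea is to bucket glyphs by angle, rotate their coordinates so the baseline becomes horizontal, and only then split into lines and sort left to right.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of PDFBox.
class SkewedLines {

    /** One glyph: its text, baseline origin in page space, and baseline angle in degrees. */
    static final class Glyph {
        final String text;
        final double x, y, angleDeg;

        Glyph(String text, double x, double y, double angleDeg) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.angleDeg = angleDeg;
        }
    }

    /** A glyph position measured along (u) and across (v) its baseline direction. */
    private static final class Rotated {
        final String text;
        final double u, v;

        Rotated(String text, double u, double v) {
            this.text = text;
            this.u = u;
            this.v = v;
        }
    }

    /**
     * Groups glyphs into lines. Glyphs are bucketed by angle (rounded to whole
     * degrees); within a bucket, a new line starts whenever the across-baseline
     * coordinate jumps by more than lineTolerance (an assumed tuning value).
     */
    static List<String> extractLines(List<Glyph> glyphs, double lineTolerance) {
        Map<Long, List<Rotated>> byAngle = new TreeMap<>();
        for (Glyph g : glyphs) {
            long key = Math.round(g.angleDeg);
            double a = Math.toRadians(key);
            // Rotate by -angle so that the baseline of this bucket is horizontal.
            double u = g.x * Math.cos(a) + g.y * Math.sin(a);
            double v = -g.x * Math.sin(a) + g.y * Math.cos(a);
            byAngle.computeIfAbsent(key, k -> new ArrayList<>()).add(new Rotated(g.text, u, v));
        }

        List<String> lines = new ArrayList<>();
        for (List<Rotated> bucket : byAngle.values()) {
            // Larger v first: in PDF user space y grows upward, so this is top to bottom.
            bucket.sort(Comparator.comparingDouble((Rotated r) -> r.v).reversed());
            List<Rotated> current = new ArrayList<>();
            for (Rotated r : bucket) {
                if (!current.isEmpty()
                        && Math.abs(current.get(current.size() - 1).v - r.v) > lineTolerance) {
                    lines.add(join(current));
                    current = new ArrayList<>();
                }
                current.add(r);
            }
            if (!current.isEmpty()) {
                lines.add(join(current));
            }
        }
        return lines;
    }

    /** Orders the glyphs of one line along the baseline and concatenates their text. */
    private static String join(List<Rotated> line) {
        line.sort(Comparator.comparingDouble((Rotated r) -> r.u));
        StringBuilder sb = new StringBuilder();
        for (Rotated r : line) {
            sb.append(r.text);
        }
        return sb.toString();
    }
}
```

Calling SkewedLines.extractLines(glyphs, 2.0) returns one string per detected line, ordered by angle bucket and then top to bottom within each bucket. The sketch does not insert spaces between words or handle glyphs whose angles differ by less than a degree across buckets; a real implementation would need both.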
On Wed, Oct 31, 2018 at 3:34 PM Tilman Hausherr <thaush...@t-online.de>
wrote:
It might work with 1.8. However, that version has other weaknesses.
Tilman
On 31.10.2018 at 21:19, Luca Loiodice wrote:
Is it possible to extract the 2 lines of text from this page?
https://www.dropbox.com/s/2uh3p464i7iwjwv/textonanangle.pdf?dl=0
These are the text lines I get using the standard PDFTextStripper:
Tex
t on
e o
n a
n a
ngl
e
Text two on an angle
Thanks a lot,
Luca
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org