> no page parameter that specifies that the text has been rotated?
>
> Back to language modeling... Thank you, again!
>
-Original Message-
From: Tilman Hausherr [mailto:thaush...@t-online.de]
Sent: Monday, September 25, 2017 1:39 PM
To: users@pdfbox.apache.org
Subject: Re: Extracting rotated text
No good idea except call setRotate() on the page and then do text extraction.
A possible strategy might be to do all rotations and see which one brings most
known words.
Tilman
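
The "try all rotations and see which one brings most known words" idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not PDFBox code: `RotationPicker`, `countKnownWords` and `bestRotation` are hypothetical names, the word list is assumed to be lowercase, and the actual text extraction is passed in as a function so the sketch stays independent of any PDF library. With PDFBox 2.x that function would presumably set the page rotation (the method appears to be `PDPage.setRotation(int)`, which is what "setRotate()" above likely refers to) and then run `PDFTextStripper.getText(document)`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.IntFunction;

/** Hypothetical helper (not part of PDFBox): picks the rotation whose extracted text contains the most known words. */
final class RotationPicker {

    /** The four page rotations to try, in degrees. */
    static final int[] ROTATIONS = {0, 90, 180, 270};

    /** Counts tokens of {@code text} that occur in {@code knownWords} (assumed lowercase). */
    static int countKnownWords(String text, Set<String> knownWords) {
        int hits = 0;
        // Split on anything that is not a letter; compare case-insensitively.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty() && knownWords.contains(token)) {
                hits++;
            }
        }
        return hits;
    }

    /**
     * Calls the extractor once per rotation and returns the rotation with the
     * highest known-word count. Ties keep the earliest rotation tried.
     */
    static int bestRotation(IntFunction<String> extractAtRotation, Set<String> knownWords) {
        int best = ROTATIONS[0];
        int bestScore = -1;
        for (int rotation : ROTATIONS) {
            int score = countKnownWords(extractAtRotation.apply(rotation), knownWords);
            if (score > bestScore) {
                bestScore = score;
                best = rotation;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Set<String> words = new HashSet<>(Arrays.asList("rotated", "text", "page"));
        // Stand-in extractor: pretends that only the 90-degree rotation yields readable text.
        IntFunction<String> fake = r -> r == 90 ? "Rotated text on a page" : "d e t a t o R";
        System.out.println(bestRotation(fake, words)); // prints 90
    }
}
```

A real extractor would need to wrap the checked `IOException` that PDFBox text extraction can throw, and a larger dictionary (or a language model, as mentioned earlier in the thread) would give a more reliable score than a three-word list.
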
On 25.09.2017 at 19:31, Allison, Timothy B. wrote:
Colleagues,
Any recommendations for extracting rotated text such as:
https://www.fsis.usda.gov/wps/wcm/connect/896bf55c-0d78-44a0-adfb-94f893eb0f72/GallagherEbelKause_74.pdf?MOD=AJPERES
?
Adobe DC gets reasonable text with "save as text". PDFBox's ExtractText (and
Tika) get some