Please see https://github.com/tesseract-ocr/tesseract/issues/83 and the other
PDF-related issues in the GitHub repo, which have similar discussions.
On 13-Jan-2017 10:15 PM, "James R Barlow" <j...@purplerock.ca> wrote:
> Tesseract cannot rasterize PDFs. It is fairly straightforward to write a
> PDF, as Tesseract does, but very complex to rasterize one.
> Programs like OCRmyPDF (which I develop) use Ghostscript, Tesseract and
> other tools to handle PDF to searchable PDF conversion.
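To make the workflow described above concrete, here is a minimal Python sketch of the commands involved: Ghostscript rasterizes the PDF pages to color images, and Tesseract turns each image into a searchable PDF. The function names, file names, and the 300 dpi setting are assumptions chosen for illustration, not anything taken from OCRmyPDF itself. Note that this approach re-renders the pages rather than reusing the images embedded in the original PDF; OCRmyPDF is designed to add a text layer to the existing PDF instead.

```python
# Sketch only: builds command lines, does not run them.
# Function names, file names and the dpi value are hypothetical.

def rasterize_cmd(pdf_path, out_pattern="page-%03d.png", dpi=300):
    """Ghostscript command that renders every PDF page to a 24-bit color PNG."""
    return [
        "gs",
        "-dNOPAUSE",                    # do not pause between pages
        "-dBATCH",                      # exit when the input is processed
        "-sDEVICE=png16m",              # 24-bit RGB PNG output, keeps color
        f"-r{dpi}",                     # rendering resolution in dpi
        f"-sOutputFile={out_pattern}",  # %03d is replaced by the page number
        pdf_path,
    ]


def ocr_cmd(image_path, out_base):
    """Tesseract command that writes <out_base>.pdf with an invisible text layer."""
    return ["tesseract", image_path, out_base, "pdf"]


def ocrmypdf_cmd(pdf_in, pdf_out):
    """OCRmyPDF command for the simplest case: input PDF, output PDF."""
    return ["ocrmypdf", pdf_in, pdf_out]
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed, for example `rasterize_cmd("scan.pdf")` followed by `ocr_cmd("page-001.png", "page-001")` for each page.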
> On Tuesday, January 10, 2017 at 9:34:57 PM UTC-8, Andreas Steibl wrote:
>> I have a scanned PDF and want to make a searchable PDF from it.
>> First I generate a black-and-white multipage TIFF, and with Tesseract I can
>> make a searchable PDF from that.
>> But is it somehow possible to integrate the original PDF images?
>> The generated TIFF does not have the same quality as the original
>> (the scanned image may be in color).
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to firstname.lastname@example.org.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit https://groups.google.com/d/
> For more options, visit https://groups.google.com/d/optout.