Tesseract cannot rasterize PDFs. It is fairly straightforward to write a PDF, as Tesseract does, but very complex to rasterize one.
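To illustrate that split of work, here is a minimal sketch of doing the two steps by hand: Ghostscript rasterizes each page to a color image, and Tesseract then writes a searchable PDF for each page image. It assumes `gs` and `tesseract` are installed and on the PATH. The helper names, the 300 dpi default, and the file names are hypothetical choices for illustration, not part of either tool.

```python
import subprocess
from pathlib import Path


def ghostscript_command(pdf_path, out_pattern, dpi=300):
    """Build a Ghostscript command that renders every page to a 24-bit color PNG."""
    return [
        "gs",
        "-dNOPAUSE",                      # do not pause between pages
        "-dBATCH",                        # exit when the input is processed
        "-sDEVICE=png16m",                # 24-bit RGB PNG output device
        f"-r{dpi}",                       # rendering resolution in dpi (assumed value)
        f"-sOutputFile={out_pattern}",    # %03d is replaced by the page number
        str(pdf_path),
    ]


def tesseract_command(image_path, output_base):
    """Build a Tesseract command that writes <output_base>.pdf with a text layer."""
    return ["tesseract", str(image_path), str(output_base), "pdf"]


def ocr_pdf_by_hand(pdf_path, work_dir):
    """Rasterize pdf_path in color, then OCR each page image to its own PDF."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: Ghostscript does the rasterizing that Tesseract cannot do.
    pattern = str(work_dir / "page-%03d.png")
    subprocess.run(ghostscript_command(pdf_path, pattern), check=True)

    # Step 2: Tesseract produces one searchable PDF per page image.
    outputs = []
    for image in sorted(work_dir.glob("page-*.png")):
        base = image.with_suffix("")      # e.g. page-001
        subprocess.run(tesseract_command(image, base), check=True)
        outputs.append(base.with_suffix(".pdf"))
    return outputs
```

Calling `ocr_pdf_by_hand("scan.pdf", "ocr_work")` would leave one searchable PDF per page in `ocr_work`. Merging those pages into one file, and reusing the original embedded images instead of re-rendered ones, are separate steps that this sketch does not attempt.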
Programs like OCRmyPDF (which I develop) use Ghostscript, Tesseract, and other tools to handle PDF to searchable PDF conversion.

On Tuesday, January 10, 2017 at 9:34:57 PM UTC-8, Andreas Steibl wrote:
> Hello
>
> I have a pdf (scanned) and now i make a searchable pdf from this
> First i generate a black/white multipage tif, and with tesseract i can
> make a searchable pdf.
>
> But is it somehow possible to integrate the original pdf images?
> because the generated tif has not the same quality like the original
> (maybe the scaned image is in color)