On Fri, 17 Oct 2008, James Tuttle wrote:

I wonder if any of you might have experience with creating text PDFs
from TIFFs.  I've been using tiffcp to stitch TIFFs together into a
single image and then using tiff2pdf to generate PDFs from the single
TIFF.  I've had to pass this image-based PDF to someone with Acrobat to
use its batch processing facility to OCR the text and save a text-based
PDF.  I wonder if anyone has suggestions for software I can integrate
into the script (Python on Linux) I'm using.

I don't, but I've used Acrobat's batch processing to do the OCR before, and let me suggest that you make sure to back up the files before running the batch.

I selected the wrong option, and instead of ending up with image+text, it stripped out the image and saved over the original files, wiping out a week's worth of scanning for me.
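
Since the script in question is Python on Linux, here is a minimal sketch of a backup step that could run before the files are handed over for batch OCR. The function name backup_scans, the default file patterns, and the directory names in the usage note are assumptions for illustration only; none of this comes from Acrobat or from the original script.

```python
import shutil
from pathlib import Path


def backup_scans(src_dir, backup_dir, patterns=("*.tif", "*.tiff", "*.pdf")):
    """Copy matching files from src_dir into backup_dir and return the copies made.

    Hypothetical helper: the name and default patterns are illustrative.
    Matching is case-sensitive on Linux, so add "*.TIF" etc. if needed.
    """
    src = Path(src_dir)
    dest = Path(backup_dir)
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for pattern in patterns:
        for path in sorted(src.glob(pattern)):
            target = dest / path.name
            if target.exists():
                # Refuse to overwrite an earlier backup; silently replacing
                # files is exactly the failure this step guards against.
                raise FileExistsError(f"backup already exists: {target}")
            shutil.copy2(path, target)  # copy2 also preserves timestamps
            copied.append(target)
    return copied
```

Something like backup_scans("scans", "scans_backup") before the batch run would leave an untouched copy of every TIFF and PDF, so a wrong save option only costs the time it takes to copy the files back.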

I've also never found a good way of editing the 'tags' that Acrobat generates. It marks up each line of the document as a new paragraph, and I couldn't find any good tools to merge the tags (although I was running an older version of Acrobat ... 6, I think).

-Joe
