Re: OCR Per Page Basis

TP Thu, 08 Mar 2012 00:29:17 -0800

On Wed, Mar 7, 2012 at 9:33 AM, Dmitri Silaev <[email protected]> wrote:
> No, at this time it is not possible to do via command line.


As a matter of fact, with the SVN version of tesseract at least (and
probably earlier versions), it is possible to tell tesseract to OCR a
particular page of a multipage TIFF file from the command line. For
example, run:

   tesseract.exe example_multipage.tif page4 config-page.txt

where the config file, config-page.txt, contains only the following
line (the page index counts from zero, so 3 selects the fourth page):

  tessedit_page_number    3

You'll see:

  Tesseract Open Source OCR Engine v3.02 with Leptonica
  Page 4 of 5

and page4.txt will then contain the OCRed text of the fourth "page" in
example_multipage.tif.

So just create "config-page.txt" dynamically, with the index of the page
you want to OCR.
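
Generating the config file on the fly could look roughly like the sketch below. It is only an illustration: it assumes Python 3 and a tesseract executable on the PATH, and the helper names (write_page_config, build_command, ocr_page) are made up for this example. The argument order and the tessedit_page_number line are taken from the command and config file shown above.

```python
import subprocess
import tempfile
from pathlib import Path


def write_page_config(config_path, page_index):
    """Write a config file that selects one page (index as used in the config)."""
    if page_index < 0:
        raise ValueError("page_index must be zero or greater")
    config_path = Path(config_path)
    # Same single line as in config-page.txt above.
    config_path.write_text(f"tessedit_page_number    {page_index}\n")
    return config_path


def build_command(image_path, output_base, config_path, executable="tesseract"):
    """Build the argument list in the same order as the example command."""
    return [executable, str(image_path), str(output_base), str(config_path)]


def ocr_page(image_path, page_index, output_base, executable="tesseract"):
    """Run tesseract on one page; the text should land in output_base + '.txt'."""
    with tempfile.TemporaryDirectory() as tmp:
        config_path = write_page_config(Path(tmp) / "config-page.txt", page_index)
        command = build_command(image_path, output_base, config_path, executable)
        # check=True raises CalledProcessError if tesseract exits with an error.
        subprocess.run(command, check=True)
    return Path(str(output_base) + ".txt")
```

Calling ocr_page("example_multipage.tif", 3, "page4") would then correspond to the command line above. On Windows you may need to pass executable="tesseract.exe" or a full path.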
