My bad, I had missed that feature. "tessedit_page_number" does indeed
let you specify a TIFF page. I can only add a bit of clarification:
the page number is zero-based, so a value of 3 selects the fourth
page. The default value of -1 instructs Tesseract to process all TIFF
pages.
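
For anyone who wants to script the config-file approach quoted below, here is
a minimal Python sketch. The helper names (write_page_config, build_command,
ocr_page) are hypothetical and only for illustration, and it assumes a
"tesseract" executable on the PATH that picks up a config file from the
current directory the same way the quoted command does. It converts a
one-based page number into the zero-based value Tesseract expects.

```python
import subprocess
from pathlib import Path


def write_page_config(config_path, page_index):
    """Write a config file that selects one zero-based TIFF page.

    A page_index of -1 is the default and means "process all pages".
    """
    if page_index < -1:
        raise ValueError("page_index must be -1 or a zero-based page number")
    Path(config_path).write_text(f"tessedit_page_number {page_index}\n")


def build_command(image, output_base, config_path, exe="tesseract"):
    """Build the same argument list as the quoted command-line example."""
    return [exe, str(image), str(output_base), str(config_path)]


def ocr_page(image, human_page, exe="tesseract"):
    """OCR a single page, given its one-based (human) page number.

    Assumption: exe is on the PATH and accepts a config file path as
    its third argument, as in the quoted example.
    """
    config_path = "config-page.txt"
    write_page_config(config_path, human_page - 1)  # convert to zero-based
    cmd = build_command(image, f"page{human_page}", config_path, exe)
    subprocess.run(cmd, check=True)
```

Calling ocr_page("example_multipage.tif", 4) would write
"tessedit_page_number 3" to config-page.txt and run the same command as in
the quoted example, assuming the executable behaves as described there.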

Warm regards,
Dmitri Silaev
www.CustomOCR.com



On Thu, Mar 8, 2012 at 12:28 PM, TP <[email protected]> wrote:
> On Wed, Mar 7, 2012 at 9:33 AM, Dmitri Silaev <[email protected]> wrote:
>> No, at this time it is not possible to do via command line.
>
> As a matter of fact with the SVN version of tesseract at least (and
> probably earlier versions), it is possible to tell tesseract to OCR a
> particular page in a multipage tiff file via the command line. For
> example, run:
>
>   tesseract.exe example_multipage.tif page4 config-page.txt
>
> where the config file, config-page.txt, only has the following in it:
>
>  tessedit_page_number    3
>
> You'll see:
>
>  Tesseract Open Source OCR Engine v3.02 with Leptonica
>  Page 4 of 5
>
> and page4.txt will then contain the OCRed text of the fourth "page" in
> example_multipage.tif.
>
> So just dynamically create "config-page.txt" with the page # you want to OCR.
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en

