Do you mean that you have an image with text and pictures, and you want to
exclude (ignore) the pictures in the OCR process?

If that is the case, then have a look at the segment_image function[1] in the
jbig2enc project[2].
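
To give a rough idea of what this kind of segmentation does, here is a toy
sketch in pure Python. It is not the jbig2enc code and it does not call
Leptonica; the function names (erode, seed_fill, split_text_and_pictures), the
seed_size value and the tiny test page are all made up for illustration. The
assumed idea is "seed and fill": erode the binarized page so that only large
solid blobs (pictures) leave seeds, grow those seeds back inside the original
foreground, and treat whatever is left as text. Leptonica has building blocks
for this (morphology operations and pixSeedfillBinary), but check [1] for what
segment_image really does.

```python
from collections import deque


def erode(img, k):
    """Mark a pixel only if the k x k box starting at it is all foreground."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h - k + 1):
        for x in range(w - k + 1):
            if all(img[y + dy][x + dx] for dy in range(k) for dx in range(k)):
                out[y][x] = 1
    return out


def seed_fill(seed, mask):
    """Grow seed pixels through 8-connected foreground pixels of mask."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    queue = deque(
        (y, x) for y in range(h) for x in range(w) if seed[y][x] and mask[y][x]
    )
    for y, x in queue:
        out[y][x] = 1
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not out[ny][nx]:
                    out[ny][nx] = 1
                    queue.append((ny, nx))
    return out


def split_text_and_pictures(binary, seed_size=3):
    """Return (text, pictures) as two binary images of the same size."""
    seeds = erode(binary, seed_size)      # only big solid blobs leave seeds
    pictures = seed_fill(seeds, binary)   # recover the full picture components
    text = [
        [b & (1 - p) for b, p in zip(brow, prow)]
        for brow, prow in zip(binary, pictures)
    ]
    return text, pictures


def render(img):
    """Turn a binary image into printable rows ('#' = dark pixel)."""
    return ["".join("#" if v else "." for v in row) for row in img]


# Hypothetical 12 x 6 page: thin "text" strokes top left, a solid "picture"
# with one ragged edge pixel bottom right.
page = [
    [1 if c == "#" else 0 for c in line]
    for line in (
        "#.#.##......",
        "#.#.#.......",
        "......#####.",
        "......######",
        "......#####.",
        "......#####.",
    )
]

text, pictures = split_text_and_pictures(page)
print("\n".join(render(text)))
```

The printed image contains only the thin strokes; the picture, including its
ragged edge pixel, ends up in the other image. On a real page you would do the
same on the binarized scan and pass only the text part (or the bounding boxes
of its regions) to Tesseract.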

If you have jbig2enc installed, you can try it on the test image[3] like this:

jbig2 -s -S -p -v Microfilm_Sample_2_jpeg.jpg
mv output.0000.png image.0000.png
pdf.py output >Microfilm_Sample_2.pdf



[1] https://github.com/agl/jbig2enc/blob/master/src/jbig2.cc#L105
[2] https://github.com/agl/jbig2enc
[3] http://www.digitalscanning.com/images/Samples/Microfilm_Sample_2_jpeg.jpg

Zdenko


On Thu, Jan 30, 2014 at 6:49 AM, Nick Porter <[email protected]> wrote:

> How can I use Leptonica to scan an image for regions of the image that
> contain text, then send the regions to Tesseract for recognition?
> I am developing in the iOS SDK, any suggestions are appreciated.
>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>

