Hi,

On Friday, 30 October 2015 16:39:17 UTC+1, Zack Cohen wrote:
>
> Thanks!
>

You can get it directly from the command line at almost no additional runtime cost:

$ tesseract page_152.png page_152 -l deu-frak+deu makebox hocr

This will output three files: page_152.txt, page_152.hocr and page_152.box.

With the data in the box-file you can cut out the areas from the image.
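
As a rough sketch of that step, assuming the usual box-file layout of one
glyph per line in the form "glyph left bottom right top page", with the
origin at the bottom-left corner of the image: the function names below are
made up for illustration, and the only real work is flipping the y-axis so
the rectangles match the top-left origin that most image libraries expect.

```python
def parse_box_line(line):
    """Split one box-file line into (glyph, left, bottom, right, top, page).

    Returns None for lines that do not have the assumed six fields.
    """
    parts = line.rsplit(maxsplit=5)
    if len(parts) != 6:
        return None
    glyph, left, bottom, right, top, page = parts
    return glyph, int(left), int(bottom), int(right), int(top), int(page)


def box_to_crop_rect(left, bottom, right, top, image_height):
    """Convert bottom-left-origin box coordinates to a top-left-origin
    rectangle (left, upper, right, lower)."""
    return (left, image_height - top, right, image_height - bottom)


def crop_rects(box_text, image_height):
    """Return a list of (glyph, (left, upper, right, lower)) tuples."""
    rects = []
    for line in box_text.splitlines():
        parsed = parse_box_line(line)
        if parsed is None:
            continue  # skip blank or malformed lines
        glyph, left, bottom, right, top, _page = parsed
        rects.append((glyph, box_to_crop_rect(left, bottom, right, top, image_height)))
    return rects


if __name__ == "__main__":
    # Made-up sample data for a 100 pixel high image, not real tesseract output.
    sample = "a 10 20 30 40 0\nb 35 20 50 45 0\n"
    for glyph, rect in crop_rects(sample, image_height=100):
        print(glyph, rect)
```

Each rectangle has the (left, upper, right, lower) order that, for example,
Pillow's Image.crop() takes, so it could be passed straight to a crop call
once the image height is known.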

> And then I assume I would use something like magick++ (which I am having a
> surprisingly hard time getting to work on mac) to crop the images
>

Exactly. And on a Mac you should use Homebrew to install ImageMagick.
 
HTH

Helmut Wollmersdorfer
