You should also consider the PAGE XML format. You can use this tool for the 
conversion: http://www.primaresearch.org/tools/TesseractOCRToPAGE
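
If you stay with hOCR instead, the word boxes can be pulled out with a few 
lines of Python. This is only a minimal sketch, assuming the usual hOCR layout 
where each word is a span with class "ocrx_word" and a title of the form 
"bbox x0 y0 x1 y1; x_wconf N". The names HocrWordParser, extract_words and 
SAMPLE are made up for illustration, and the sample markup is hand-written, 
not real Tesseract output.

```python
import re
from html.parser import HTMLParser

# Matches the four integer coordinates in an hOCR title attribute.
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Collects (text, bbox) for every span whose class includes 'ocrx_word'."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_word = False
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            match = BBOX_RE.search(attrs.get("title") or "")
            self._bbox = tuple(int(g) for g in match.groups()) if match else None
            self._text = []
            self._in_word = True

    def handle_data(self, data):
        # Only keep text that sits inside a word span.
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._in_word:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._in_word = False


def extract_words(hocr):
    """Return a list of (word, (x0, y0, x1, y1)) tuples from an hOCR string."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hand-written sample in the assumed hOCR shape (hypothetical data).
SAMPLE = (
    "<span class='ocr_line' title='bbox 10 10 200 30'>"
    "<span class='ocrx_word' title='bbox 10 10 60 30; x_wconf 95'>Total</span> "
    "<span class='ocrx_word' title='bbox 70 10 120 30; x_wconf 91'>1040</span>"
    "</span>"
)

if __name__ == "__main__":
    for word, bbox in extract_words(SAMPLE):
        print(word, bbox)
```

From there, mapping words to form fields is mostly a matter of comparing each 
bounding box against the known field regions of the form.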

On Monday, 13 July 2015 06:23:09 UTC+1, [email protected] wrote:
>
> I'm working on converting a large number of tax forms into structured 
> data. Is hOCR the best way to do this, or are there other ways? I would 
> imagine this is a problem that is at least partially solved.
>
> Thanks in advance! Tesseract is awesome :)
>
