On Tuesday, July 14, 2015 at 2:47:40 AM UTC-4, James Owers wrote:
>
> You should consider also using the PAGE format. You can use this tool for
> conversion: http://www.primaresearch.org/tools/TesseractOCRToPAGE
>
Most PAGE format tools aren't available as open source; they use custom licenses specific to the lab that produces them. Also, the primary thing that PAGE adds over hOCR (ground truth text) doesn't sound like it's needed here.

Tom

