hOCR is a convention for encoding OCR-related information in HTML. Every hOCR file is a valid HTML file, and most of the OCR-related information is encoded using regular HTML markup. An HTML file becomes an hOCR file through the addition of a small amount of meta information.
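To make that concrete, here is a rough sketch in Python, using only the standard library's ordinary HTML parser. The sample document is made up for illustration. The `ocr-system` and `ocr-capabilities` meta tags, the `ocr_page` and `ocr_line` class names, and the `bbox` property in the `title` attribute follow common hOCR usage as I understand it, while `HocrReader` and `parse_bbox` are hypothetical names of my own, not part of any tool.

```python
from html.parser import HTMLParser

# Made-up sample document. It is plain HTML; only the ocr-* meta tags,
# the ocr_* class names and the bbox titles make it hOCR.
SAMPLE = """<html>
<head>
<meta name="ocr-system" content="example-ocr 0.1">
<meta name="ocr-capabilities" content="ocr_page ocr_line">
</head>
<body>
<div class="ocr_page" title="bbox 0 0 800 600">
<span class="ocr_line" title="bbox 10 20 300 50">Hello world</span>
<span class="ocr_line" title="bbox 10 60 280 90">from hOCR</span>
</div>
</body>
</html>"""


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from a title attribute, or None if absent."""
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class HocrReader(HTMLParser):
    """Hypothetical helper: collects ocr-* meta tags and ocr_line elements."""

    def __init__(self):
        super().__init__()
        self.meta = {}        # e.g. {"ocr-system": "..."}
        self.lines = []       # list of (bbox, text) pairs
        self._line_tag = None  # tag name of the ocr_line currently open
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").startswith("ocr-"):
            self.meta[a["name"]] = a.get("content") or ""
        elif "ocr_line" in (a.get("class") or "").split():
            self._line_tag = tag
            self._bbox = parse_bbox(a.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._line_tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the line ends at the first matching end tag, so
        # nested word-level spans would need depth tracking.
        if self._line_tag is not None and tag == self._line_tag:
            self.lines.append((self._bbox, "".join(self._text).strip()))
            self._line_tag = None


reader = HocrReader()
reader.feed(SAMPLE)
print(reader.meta)
for bbox, text in reader.lines:
    print(bbox, text)
```

A consumer that does not care about OCR details can feed the same file to whatever HTML handling it already has and simply never look at the `class` and `title` attributes.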
There are two primary use cases:

-- If you're already generating HTML, you can embed OCR-related information in it in a form that other applications can understand.
-- If you're already using HTML as input (e.g. for text-to-speech), you can look for additional OCR-related information if you need it. If there is no OCR-specific information you need, you can ignore the hOCR-related markup entirely.

Generally speaking, just support HTML the way you would anyway, and only worry about hOCR if you really need a specific OCR-related feature. In that way, hOCR is different from (and much simpler than) other OCR formats, which require explicit and separate support.

Tom

On Jun 17, 9:14 am, "Romeyke, Andreas" <[email protected]> wrote:
> Hello,
>
> we discuss to support hOCR in our tools. To decide if hOCR should be
> supported, we need some information who is developing on tools with it.
> Because our development team is very small we are not able to support
> hOCR if there is no benefit for us.
>
> Which software supports hOCR with deep structure information? What kind
> of software exists to allow users to judge the quality of OCR *and*
> layout information? What about the actual state of hOCR in ocropus?
> Which OCR programs supports hOCR (I know about cuneiform, tesseract and
> ocropus)?
>
> --
> Andreas Romeyke
> - Abteilung Blindenschrift -
> Deutsche Zentralbücherei für Blinde zu Leipzig (DZB)
> Gustav-Adolf-Straße 7, 04105 Leipzig
> Tel: +49 341 7113-..., Fax: +49 341 7113-125
> Internet: www.dzb.de
> E-Mail: [email protected]
