I'd like to be able to pull the confidence values from the character recognition engine, ideally at the character level, though the word level would be sufficient. According to the hOCR specification (https://docs.google.com/document/d/1QQnIQtvdAC_8n92-LhwPcjtAUFwBlzE8EWnKAxlgVf0/preview), this information can be embedded in the hOCR output. However, the most recent version of ocropus (0.7) doesn't appear to include it: the only data I see in the hOCR output are the bbox coordinates. Are there any command-line switches that add more information to the hOCR output, specifically the confidence measures? Or can this only be achieved programmatically? If so, has anyone had any luck patching the code to provide this feature?
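To make the request concrete, here is a minimal sketch of how I'd expect to read word-level confidences if they were present. It assumes the hOCR file carries the `x_wconf` property from the spec in the `title` attribute of `ocrx_word` spans. The function and class names are hypothetical, and the markup in the usage comment is hand-written for illustration, not actual ocropus output.

```python
from html.parser import HTMLParser


def parse_title(title):
    """Split an hOCR title attribute ("name v1 v2; name v1") into a dict."""
    props = {}
    for part in title.split(";"):
        fields = part.split()
        if fields:
            props[fields[0]] = fields[1:]
    return props


class WordConfidenceParser(HTMLParser):
    """Collect (text, bbox, confidence) for each ocrx_word span.

    Assumes word spans do not contain nested <span> elements.
    """

    def __init__(self):
        super().__init__()
        self.words = []
        self._props = None  # properties of the word span currently open
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._props = parse_title(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._props is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._props is not None:
            bbox = tuple(int(v) for v in self._props.get("bbox", []))
            conf = self._props.get("x_wconf")
            # Confidence is None when the engine did not emit x_wconf.
            self.words.append(
                ("".join(self._text).strip(), bbox, float(conf[0]) if conf else None)
            )
            self._props = None


def extract_word_confidences(hocr):
    """Return a list of (text, bbox, confidence) tuples from an hOCR string."""
    parser = WordConfidenceParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hand-written example markup (an assumption, not ocropus output):
# <span class='ocrx_word' title='bbox 10 20 110 50; x_wconf 93'>Hello</span>
```

With the current ocropus output, a reader like this would return `None` for every confidence, since only `bbox` appears in the `title` attribute.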
-Elliot
