The current version of tesseract in Debian and Ubuntu writes its hOCR output with a different filename extension (.hocr) from the previous version (.html), without any change to the reported version number.
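To illustrate the effect on calling programs: since the version number does not reveal which extension to expect, a wrapper has to probe for both. The following is only a minimal sketch of that workaround. The function name `find_hocr_output` and the idea of passing the output base path are assumptions made for illustration, not part of tesseract itself.

```python
from pathlib import Path
from typing import Optional

# Extensions to probe, newer convention first. Both values come from the
# behaviour described above; nothing else about tesseract is assumed here.
HOCR_EXTENSIONS = (".hocr", ".html")


def find_hocr_output(output_base: str) -> Optional[Path]:
    """Return the hOCR file written for output_base, or None if none exists.

    output_base is the output path given to tesseract, without an extension
    (hypothetical helper, for illustration only).
    """
    for ext in HOCR_EXTENSIONS:
        candidate = Path(output_base + ext)
        if candidate.is_file():
            return candidate
    return None
```

A caller would run tesseract as usual and then call `find_hocr_output("page")` rather than hard-coding `page.html` or `page.hocr`. This works, but it is guesswork that a version bump would make unnecessary.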
Whilst API changes are inevitable, they really should be accompanied by a bump to the version number. Otherwise it is very difficult for programs which use tesseract to deal with such changes.

Please bump the version number (i.e. the output of tesseract --version) before every release.

Regards

Jeff

