Jeff's answer is probably the most important explanation, but some other reasons include:

- Tess supports more languages
- Tess is older
- Tess has a bigger, more well-developed community (partly because of all the other reasons)
- Tess is higher performance (from a resource utilization point of view, last time I checked)

Ocropus is/was pretty much a one-man project and was, as I understand it, designed to support its author's research. It also went through a significant rewrite as a result of a change in implementation strategy, and that discontinuity probably didn't help things. Because it's more modern and was designed as a toolkit to support research, it might lead to better OCR in the future, but it'd still have a hard time competing with the "unreasonable effectiveness of data" that Google can bring to bear with its large training corpora.

Tom

On Sunday, July 19, 2015 at 4:09:03 PM UTC-4, [email protected] wrote:
>
> This seems like a good explanation based off of everything I've learned
> over the last few days.
>
> On Friday, July 17, 2015 at 8:41:33 PM UTC-7, Jeff Breidenbach wrote:
>>
>> Tesseract is more complete in terms of 'throw me an arbitrary document
>> image and produce something useful'
>>

