On Thu, May 24, 2012 at 8:08 AM, nikolaykhl <[email protected]> wrote:
> I agree that Abbyy will do the job more accurately out of the box and is
> easier to get started with.
> You may also want to have a look at this article:
> http://www.splitbrain.org/blog/2010-06/15-linux_ocr_software_comparison
>
> This comparison is from 2010 and tesseract-ocr svn r402. The current
> revision is 725, so I guess there have been some improvements since that
> test ;-)
>
> On Wednesday, May 23, 2012 9:03:31 PM UTC+4, Scott Oom wrote:
>>
>> We are working on automated testing tools for applications and games.
>>
>> We want to be able to verify various text in the UIs in different
>> languages and have been experimenting with Tesseract OCR and having a
>> lot of fun with it.
>>
>> In 2007, Ray Smith mentioned that "Tesseract is now behind the leading
>> commercial engines in terms of its accuracy."
>>
>> What commercial engines are more accurate than Tesseract and in what
>> ways? Can Tesseract OCR approach the commercial engines with training
>> and adjusting of parameters or is it still behind?
>>

I would say it depends on your tasks and budget. E.g. in our local
Gutenberg project, Finereader is used for standard text. But for text in
Fraktur we used tesseract-ocr (I did custom training for it). The project
leader did not want to buy the special version of Finereader [1]...

On the other hand, I have not had good experience using tesseract to
identify bold and italic text...

[1] http://www.frakturschrift.de/en:start

--
Zdenko

