Hi Grant,

The percent accuracy depends on what kind of data you're looking to process. People who want to process large data sets with Tesseract typically train it for the specific domain they're going to use it for. The exception is high-quality scans with common fonts, where accuracy is around 95-98% for most people without any training.

Many people also post-process the OCR'ed text. For a point-and-click solution, try VietOCR, which uses Tesseract. Several people have gotten close to 100% accuracy with training.

Numbers tend to throw a wrench in things because they're typically formatted inconsistently. There is a standard EuroText test document on the website that shows which symbols Tesseract can process easily.

-Sven
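On actually working out a percentage: one common approach (a general OCR evaluation technique, not something built into Tesseract) is character accuracy. You compare the OCR output against a hand-checked transcription of the same page and count the character edits needed to turn one into the other. Below is a minimal Python sketch, assuming both texts are available as plain strings. The function names `levenshtein` and `char_accuracy` and the sample strings are made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def char_accuracy(ground_truth: str, ocr_output: str) -> float:
    """Character accuracy in percent, relative to the ground-truth length.
    Clamped at 0 so very noisy output does not go negative."""
    if not ground_truth:
        return 100.0 if not ocr_output else 0.0
    errors = levenshtein(ground_truth, ocr_output)
    return max(0.0, 1.0 - errors / len(ground_truth)) * 100.0


# Hypothetical example: the digit "1" misread as a lowercase "l".
truth = "Invoice 1234"
ocr = "Invoice l234"
print(f"{char_accuracy(truth, ocr):.1f}%")  # prints 91.7%
```

The same idea works at the word level by splitting both texts on whitespace and comparing lists of words instead of characters. Word accuracy is usually lower, since a single wrong character makes the whole word count as an error.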
On Fri, May 18, 2012 at 8:42 AM, Grant Fletcher wrote:
> Hi All,
>
> Does anyone have a set of sample data I can have to test, & in understanding
> how to work out a percentage accuracy of the OCR engine..
>
> Any assistance on this would be appreciated.
>
> Thanks
> Grant

