You can try the finetuned traineddata from the tutorial at https://github.com/Shreeshrii/tess4tutorial/tree/master/impact_from_full
Here are the results I get using those (first line of each pair) vs. the ones with eng.traineddata from tessdata_best (second line):

***** 2v2Xj ****
1 K 45
1 K45

***** 3VtsA ****
308 8
308 8

***** FxcEl ****
1 Ka
1a

***** gwrBt ****
23 B 13
238 13

***** hAJOM ****
1_C 15
1 C15

***** kATPl ****
20°F C 13
Fr C 13

***** Oj222 ****
12 0 1
120 1

***** rOexn ****
1 C 13
1013

***** UBqvX ****
34 E 1
34 E 1

***** unnamed ****
32 EC 9
32EC 9

***** Vwv5G ****
32 EC 5
32EC 5

On Tue, Aug 27, 2019 at 2:55 PM Shree Devi Kumar wrote:

> If all your images are in this bold thick font, fine tuning for impact may
> help with some of the recognition errors.
>
> On Tue, 27 Aug 2019, 14:42 Stephane Charette wrote:
>
>> I have a large number of images that contain a single line of
>> alphanumeric data. My scans so far have not been great, and I could use
>> some assistance.
>>
>> Several vars are turned off as recommended in the docs:
>>
>> key.push_back("load_system_dawg");
>> val.push_back("false");
>> key.push_back("load_freq_dawg");
>> val.push_back("false");
>>
>> These are set at initialization:
>>
>> tess->Init(nullptr, "eng", tesseract::OEM_DEFAULT, nullptr, 0, &key, &val, false);
>> tess->SetPageSegMode(tesseract::PageSegMode::PSM_SINGLE_LINE);
>>
>> Some images are close, such as this one:
>>
>> [image: "32 EC 5"]
>> ...which is interpreted as "SZ2EC 3".
>>
>> Others like this one return a blank string:
>>
>> [image: "30 B 9"]
>>
>> And then I have some like this one which is so close, but Tesseract
>> removes the spaces between the letters, so this example results in "1201":
>>
>> [image: "12 O 1"]
>>
>> I've posted my full .cpp test file and more example images showing the
>> problem on StackOverflow:
>> https://stackoverflow.com/questions/57670769/how-to-get-tesseract-to-recognize-these-alphanumeric-strings
>>
>> Thanks,
>>
>> Stéphane
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/f721e105-d0d6-4322-b9c5-6c5f2d487d06%40googlegroups.com
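
When comparing result pairs like the ones at the top of this message, it can help to separate pure spacing differences from actual character errors. The following is only a sketch of one way to do that; `strip_spaces` and `same_ignoring_spaces` are hypothetical helper names, not part of Tesseract.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

// Return a copy of s with all whitespace characters removed.
std::string strip_spaces(const std::string& s) {
    std::string out;
    std::copy_if(s.begin(), s.end(), std::back_inserter(out),
                 [](unsigned char c) { return !std::isspace(c); });
    return out;
}

// True when two OCR outputs differ only in their whitespace (or not at all).
bool same_ignoring_spaces(const std::string& a, const std::string& b) {
    return strip_spaces(a) == strip_spaces(b);
}
```

For example, `same_ignoring_spaces("1 K 45", "1 K45")` is true, so that pair differs only in spacing, while `same_ignoring_spaces("23 B 13", "238 13")` is false because one of the two outputs also has a different character.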

