You can try the fine-tuned traineddata from the tutorial at
https://github.com/Shreeshrii/tess4tutorial/tree/master/impact_from_full

Here are the results I get with those (first line of each pair) versus
eng.traineddata from tessdata_best (second line):

 ***** 2v2Xj ****
1 K 45
1 K45

 ***** 3VtsA ****
308 8
308 8

 ***** FxcEl ****
1 Ka
1a

 ***** gwrBt ****
23 B 13
238 13

 ***** hAJOM ****
1_C 15
1 C15

 ***** kATPl ****
20°F C 13
Fr C 13

 ***** Oj222 ****
12 0 1
120 1

 ***** rOexn ****
1 C 13
1013

 ***** UBqvX ****
34 E 1
34 E 1

 ***** unnamed ****
32 EC 9
32EC 9

 ***** Vwv5G ****
32 EC 5
32EC 5
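
If you want to put a number on comparisons like these, one option is a
character-level edit distance against the text you expect. The following is
only a minimal sketch: editDistance is a hypothetical helper of my own, not
part of Tesseract, and it compares bytes, so it is only meaningful for
plain ASCII strings.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a Tesseract API): Levenshtein distance between the
// expected text and an OCR result, counting single-character insertions,
// deletions, and substitutions. Operates on bytes, so use ASCII input only.
std::size_t editDistance(const std::string& expected, const std::string& actual) {
    // prev[j] is the distance between the first i-1 chars of `expected`
    // and the first j chars of `actual`; curr is the row being filled in.
    std::vector<std::size_t> prev(actual.size() + 1), curr(actual.size() + 1);
    for (std::size_t j = 0; j <= actual.size(); ++j) prev[j] = j;

    for (std::size_t i = 1; i <= expected.size(); ++i) {
        curr[0] = i;  // deleting the first i chars of `expected`
        for (std::size_t j = 1; j <= actual.size(); ++j) {
            const std::size_t subst =
                prev[j - 1] + (expected[i - 1] == actual[j - 1] ? 0 : 1);
            curr[j] = std::min({subst, prev[j] + 1, curr[j - 1] + 1});
        }
        std::swap(prev, curr);
    }
    return prev[actual.size()];
}
```

With the strings from this thread, editDistance("32 EC 5", "32EC 5") is 1
(the missing space) and editDistance("32 EC 5", "SZ2EC 3") is 4. Lower is
better, and 0 means an exact match.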


On Tue, Aug 27, 2019 at 2:55 PM Shree Devi Kumar <[email protected]>
wrote:

> If all your images are in this bold, thick font, fine-tuning for the Impact
> font may help with some of the recognition errors.
>
> On Tue, 27 Aug 2019, 14:42 Stephane Charette, <[email protected]>
> wrote:
>
>> I have a large number of images that contain a single line of
>> alphanumeric data.  My scans so far have not been great, and I could use
>> some assistance.
>>
>> Several vars are turned off as recommended in the docs:
>>
>>     key.push_back("load_system_dawg");
>>     val.push_back("false");
>>     key.push_back("load_freq_dawg");
>>     val.push_back("false");
>>
>>
>> These are set at initialization:
>>
>>     tess->Init(nullptr, "eng", tesseract::OEM_DEFAULT, nullptr, 0,
>>                &key, &val, false);
>>     tess->SetPageSegMode(tesseract::PageSegMode::PSM_SINGLE_LINE);
>>
>>
>> Some images are close, such as this one:
>>
>> [image: "32 EC 5"]
>> ...which is interpreted as "SZ2EC 3".
>>
>> Others, like this one, return a blank string:
>>
>> [image: "30 B 9"]
>> And then I have some, like this one, that are so close, except that
>> Tesseract removes the spaces between the characters, so this example
>> results in "1201":
>>
>> [image: "12 O 1"]
>> I've posted my full .cpp test file and more example images showing the
>> problem on StackOverflow:
>> https://stackoverflow.com/questions/57670769/how-to-get-tesseract-to-recognize-these-alphanumeric-strings
>>
>> Thanks,
>>
>> Stéphane
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/f721e105-d0d6-4322-b9c5-6c5f2d487d06%40googlegroups.com
>> <https://groups.google.com/d/msgid/tesseract-ocr/f721e105-d0d6-4322-b9c5-6c5f2d487d06%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

-- 

____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
