Fine-tuning on a Courier font, with training text similar to the image you
are recognizing and extra samples of the digit 5, should give better results.


On Sun, 28 Apr 2019, 20:19 RangerRick, <[email protected]> wrote:

> Ok. Now I have tried the "best" traineddata file (no difference) and
> removing the alpha layer (no difference). I even created a new, simpler
> bitmap using Courier New font (attached), which still fails.
>
> Tesseract just can't distinguish between the number 5 and an S.
>
>
> On Sunday, April 28, 2019 at 12:41:35 AM UTC-5, RangerRick wrote:
>>
>> Hi,
>>
>> I'm new to Tesseract, using latest version 4 executable on Windows 7.
>>
>> I'm converting Morse code (CW) from JPG into text using Tesseract. It works
>> almost perfectly, failing only on the number 5, which is usually misread
>> as an "S". Here's an example of the issue.
>>
>>
>> [image: output.jpg]
>>
>>
>> Here's how it's being interpreted:
>>
>>                                                                   3AMWA
>> DE FASMX QFSMXQ CQ CQ DE FSMXQ FSMXQ CQ DE FSMXQ ENSMAA I III FSMXQ FSMXQ
>> NHE K »
>>
>>
>> I have tried adjusting the various command-line parameters, but no joy. I
>> believe the font is Fontcraft Courier DemiBold, but that shouldn't matter.
>> In this case, the image is 96 DPI and 24 pixels tall (total, including
>> border).
>>
>> I started to try and retrain to optimize for this font, but that looks
>> like a pretty daunting task.
>>
>> Any guidance would be greatly appreciated.
>>
>> Rick
>>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/ab572776-22f8-4259-a7b4-ec6615d11bb4%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/ab572776-22f8-4259-a7b4-ec6615d11bb4%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
