[tesseract-ocr] Re: Recognition of "5" instead of "S"

RangerRick Sun, 28 Apr 2019 07:53:04 -0700

Ok. I have now tried the "best" traineddata file (no difference) and removing 
the alpha channel (no difference). I even created a new, simpler bitmap using 
the Courier New font (attached), and it still fails.


Tesseract just can't distinguish between the number 5 and an S.
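
For anyone who wants to repeat these experiments, the combinations of traineddata and command-line parameters can be scripted so each one is run the same way. The following is only a minimal sketch: the helper names `build_tesseract_cmd` and `run_ocr` are made up for illustration, the `tessdata_best` directory is a hypothetical path, and it assumes the `tesseract` 4.x executable is on the PATH. It uses `--psm 7` (treat the image as a single text line), `--dpi`, and `--tessdata-dir`; it is not claimed to fix the 5/S confusion.

```python
import subprocess
from typing import List, Optional


def build_tesseract_cmd(image_path: str,
                        psm: int = 7,
                        dpi: Optional[int] = None,
                        tessdata_dir: Optional[str] = None,
                        lang: str = "eng") -> List[str]:
    """Build the argument list for one tesseract run (hypothetical helper).

    "stdout" as the output base makes tesseract print the text
    instead of writing a .txt file.
    """
    cmd = ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]
    if dpi is not None:
        # Tell tesseract the resolution explicitly instead of relying
        # on the metadata stored in the image file.
        cmd += ["--dpi", str(dpi)]
    if tessdata_dir is not None:
        # Directory holding the traineddata to use, e.g. the "best" models.
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


def run_ocr(cmd: List[str]) -> str:
    """Run tesseract and return the recognized text (requires tesseract)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # Compare the default traineddata against an assumed tessdata_best folder.
    for data_dir in (None, "tessdata_best"):
        cmd = build_tesseract_cmd("output.jpg", dpi=96, tessdata_dir=data_dir)
        print(" ".join(cmd))
        # print(run_ocr(cmd))  # uncomment when tesseract is installed
```

Keeping the command construction separate from the call makes it easy to log exactly which parameters produced which output.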


On Sunday, April 28, 2019 at 12:41:35 AM UTC-5, RangerRick wrote:
>
> Hi,
>
> I'm new to Tesseract, using the latest version 4 executable on Windows 7.
>
> I'm converting Morse code CW from JPG into text using Tesseract. It works 
> almost perfectly, except for the number 5, which is usually misread as an 
> "S".  Here's an example of the issue.
>
>
> [image: output.jpg]
>
>
> Here's how it's being interpreted:
>
>                                                                   3AMWA DE 
> FASMX QFSMXQ CQ CQ DE FSMXQ FSMXQ CQ DE FSMXQ ENSMAA I III FSMXQ FSMXQ 
> NHE K »
>
>
> I have tried adjusting the various command line parameters but no joy. I 
> believe the font is Fontcraft Courier DemiBold, but that shouldn't matter.  
> In this case, the image is 96 DPI and 24 pixels tall (total, including 
> border).
>
> I started trying to retrain to optimize for this font, but that looks 
> like a pretty daunting task.
>
> Any guidance would be greatly appreciated.
>
> Rick
>
>
>
