Fine-tuning on the Courier font, with training text similar to the images you are recognizing and with more samples of the digit 5, should give better results.
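
As a rough illustration of what "training text with more samples of 5" could look like, here is a minimal Python sketch that generates CW-style lines with the digit 5 and the letter S both over-represented. The callsign shape, the 50% bias toward 5, and the function names are all assumptions made for this example. They are not part of Tesseract. The generated text would still have to be rendered in Courier and fed to the Tesseract training tools, which is not shown here.

```python
import random
import string

# Letter pool with extra "S" entries so that S appears often next to 5.
# The weighting is an arbitrary assumption for this sketch.
LETTERS = string.ascii_uppercase + "SSS"


def make_callsign(rng):
    """Build a hypothetical callsign: one letter, one digit, three letters."""
    prefix = rng.choice(string.ascii_uppercase)
    # Pick "5" about half of the time, otherwise any digit.
    digit = "5" if rng.random() < 0.5 else rng.choice(string.digits)
    suffix = "".join(rng.choice(LETTERS) for _ in range(3))
    return prefix + digit + suffix


def make_line(rng):
    """Build one CW-style line that repeats the callsign twice."""
    call = make_callsign(rng)
    return f"CQ CQ DE {call} {call} K"


def make_training_text(n_lines, seed=0):
    """Return n_lines of text, reproducible for a given seed."""
    rng = random.Random(seed)
    return "\n".join(make_line(rng) for _ in range(n_lines))


if __name__ == "__main__":
    print(make_training_text(5))
```

Saving the output to a text file gives a training text that resembles the Morse transcripts while showing the model many examples of 5 and S in the same context.
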
On Sun, 28 Apr 2019, 20:19 RangerRick, <[email protected]> wrote:

> Ok. Now I have tried the "best" traindata file (no difference) and
> removing the alpha layer (no difference). I even created a new, simpler
> bitmap using Courier New font (attached), which still fails.
>
> Tesseract just can't distinguish between the number 5 and an S.
>
> On Sunday, April 28, 2019 at 12:41:35 AM UTC-5, RangerRick wrote:
>>
>> Hi,
>>
>> I'm new to Tesseract, using latest version 4 executable on Windows 7.
>>
>> I'm converting Morse code CW from JPG into text using Tesseract. It works
>> almost right, just missing on the number 5, which is usually misinterpreted
>> as an "S". Here's an example of the issue.
>>
>> [image: output.jpg]
>>
>> Here's how it's being interpreted:
>>
>> 3AMWA
>> DE FASMX QFSMXQ CQ CQ DE FSMXQ FSMXQ CQ DE FSMXQ ENSMAA I III FSMXQ FSMXQ
>> NHE K »
>>
>> I have tried adjusting the various command line parameters but no joy. I
>> believe the font is Fontcraft Courier DemiBold, but shouldn't matter. In
>> this case, the image is 96 DPI and 24 pixels tall (total, including border).
>>
>> I started to try and retrain to optimize for this font, but that looks
>> like a pretty daunting task.
>>
>> Any guidance would be greatly appreciated.
>>
>> Rick
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/ab572776-22f8-4259-a7b4-ec6615d11bb4%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

