You can test with the finetuned traineddata file from https://github.com/Shreeshrii/tessdata_shreetest/blob/master/engmorse.traineddata
Download the file (use the raw file link) and run Tesseract with `-l engmorse`.

If you have not placed it in the tessdata directory identified by TESSDATA_PREFIX, also provide the path with `--tessdata-dir /path/to/finetuned/traineddata`.

ubuntu@tesseract-ocr:~/TEST$ tesseract morse.jpg - -l eng
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 125
3AMWA
DE FASMX QFSMXQ CQ CQ DE FS5MXQ FSMXQ CQ DE FSMXQ ENSMAR I III FSMXQ FSMXQ NHE K »

ubuntu@tesseract-ocr:~/TEST$ tesseract morse.jpg - -l engmorse --tessdata-dir ~/tesstutorial
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 125
3AMWA
DE FASMX QF5MXQ CQ CQ DE F5MXQ F5MXQ CQ DE F5MXQ ENS5MAA I III F5MXQ F5MXQ NHE K

On Mon, Apr 29, 2019 at 12:20 AM Shree Devi Kumar <[email protected]> wrote:

> Finetuning with Courier font with a training text similar to image you are
> recognizing with more samples of 5 will give better result.
>
>
> On Sun, 28 Apr 2019, 20:19 RangerRick, <[email protected]> wrote:
>
>> Ok. Now I have tried the "best" traindata file (no difference) and
>> removing the alpha layer (no difference). I even created a new, simpler
>> bitmap using Courier New font (attached), which still fails.
>>
>> Tesseract just can't distinguish between the number 5 and an S.
>>
>>
>> On Sunday, April 28, 2019 at 12:41:35 AM UTC-5, RangerRick wrote:
>>>
>>> Hi,
>>>
>>> I'm new to Tesseract, using latest version 4 executable on Windows 7.
>>>
>>> I'm converting Morse code CW from JPG into text using Tesseract. It
>>> works almost right, just missing on the number 5, which is usually
>>> misinterpreted as an "S". Here's an example of the issue.
>>>
>>>
>>> [image: output.jpg]
>>>
>>>
>>> Here's how it's being interpreted:
>>>
>>> 3AMWA
>>> DE FASMX QFSMXQ CQ CQ DE FSMXQ FSMXQ CQ DE FSMXQ ENSMAA I III FSMXQ FSMXQ
>>> NHE K »
>>>
>>>
>>> I have tried adjusting the various command line parameters but no joy. I
>>> believe the font is Fontcraft Courier DemiBold, but shouldn't matter. In
>>> this case, the image is 96 DPI and 24 pixels tall (total, including border).
>>>
>>> I started to try and retrain to optimize for this font, but that looks
>>> like a pretty daunting task.
>>>
>>> Any guidance would be greatly appreciated.
>>>
>>> Rick
>>>
>>>
>>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/ab572776-22f8-4259-a7b4-ec6615d11bb4%40googlegroups.com
>> For more options, visit https://groups.google.com/d/optout.
>>
>

--
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduWZbgJ5ex0CAQk3cs_%2BBBqwkLZtAesBfxbTs3dx_-x2MA%40mail.gmail.com.
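The reply above describes two ways to invoke the fine-tuned model: with `-l engmorse` alone when engmorse.traineddata sits in the directory named by TESSDATA_PREFIX, and with an extra `--tessdata-dir` otherwise. A minimal sketch of that choice follows. The function name `engmorse_cmd` and the directory arguments are hypothetical, and the sketch assumes TESSDATA_PREFIX points directly at the tessdata directory, as the reply implies. It only prints the command so you can inspect it before running it.

```shell
#!/bin/sh
# Sketch only: engmorse_cmd is a hypothetical helper, not part of Tesseract.

# Print the tesseract command for an image. --tessdata-dir is added only
# when engmorse.traineddata is not found under $TESSDATA_PREFIX.
#   $1  image file to recognize
#   $2  directory where engmorse.traineddata was downloaded
engmorse_cmd() {
    image=$1
    model_dir=$2
    if [ -n "${TESSDATA_PREFIX:-}" ] &&
       [ -f "${TESSDATA_PREFIX%/}/engmorse.traineddata" ]; then
        # Model is already in the default tessdata directory.
        printf 'tesseract %s - -l engmorse\n' "$image"
    else
        # Model lives elsewhere, so pass its directory explicitly.
        printf 'tesseract %s - -l engmorse --tessdata-dir %s\n' \
            "$image" "$model_dir"
    fi
}

# Example (prints the command without running it):
#   engmorse_cmd morse.jpg "$HOME/tesstutorial"
```

The printed command is not quoted, so paths containing spaces would need quoting by hand before the command is run.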

