Hi,
when you fine-tune the model (for example with ocrd-train), you can choose
to restrict the model output to a smaller set of characters. There is no
need to blacklist anything.

If you just want to locate the symbols, something like OpenCV matchTemplate
<https://docs.opencv.org/trunk/d4/dc6/tutorial_py_template_matching.html>,
or training an OpenCV
<https://www.pyimagesearch.com/2015/11/09/pedestrian-detection-opencv/> or
dlib <http://www.hackevolve.com/create-your-own-object-detector/> HOG
detector, may be more appropriate. Using Tesseract looks like a very
convoluted way to do it.

If you have multiple symbols, use multiple patterns or train multiple detectors.
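
To make the "one pattern per symbol" idea concrete, here is a minimal
sketch. It is not OpenCV code: it is a pure-Python sliding-window match
using the sum of squared differences, which is the idea behind
matchTemplate's squared-difference mode. The function names, the toy board,
and the max_ssd threshold are all assumptions made up for illustration; on
real images you would most likely call cv2.matchTemplate instead.

```python
# Illustrative only: pure-Python template matching by sum of squared
# differences (SSD). Images are lists of rows of grayscale ints.
# All names below are hypothetical, not part of any library.

def match_template(image, template):
    """Return (best_ssd, (row, col)) for the window with the lowest SSD."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            ssd = sum(
                (image[r + dr][c + dc] - template[dr][dc]) ** 2
                for dr in range(th)
                for dc in range(tw)
            )
            if best is None or ssd < best[0]:
                best = (ssd, (r, c))
    return best


def locate_symbols(image, templates, max_ssd=0):
    """Match one template per symbol; keep hits with SSD <= max_ssd."""
    found = {}
    for name, template in templates.items():
        score, pos = match_template(image, template)
        if score <= max_ssd:
            found[name] = pos
    return found


# Tiny synthetic "board": 0 = background, 9 = ink.
board = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 0, 0, 0, 0],
    [9, 9, 9, 0, 9, 9],
    [0, 9, 0, 0, 9, 9],
    [0, 0, 0, 0, 0, 0],
]
templates = {
    "plus": [[0, 9, 0], [9, 9, 9], [0, 9, 0]],
    "block": [[9, 9], [9, 9]],
    "bar": [[9, 9, 9, 9]],  # deliberately absent from the board
}
print(locate_symbols(board, templates))
```

Each symbol gets its own template, and a symbol that is not on the board is
simply not reported. Since you control the font and size, an exact or
near-exact match threshold may be enough; a looser max_ssd would be needed
for noisy images.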


Bye

Lorenzo



On Tue, 21 May 2019 at 16:37, Rafay Kalim <[email protected]> wrote:

> Hey, so I am trying to train a new Tesseract model to only recognize
> certain UTF-8 symbols as I want an OCR that only recognizes these symbols
> and not other English letters etc. I realize there are two ways I can do
> this - one is to fine tune Tesseract over the normal English model and then
> blacklist the English text or train a completely new model that only
> recognizes this text. I was wondering if I could get some input into which
> of these - or another method, is better for ease, time and accuracy.
>
> The context is I will have some various texts on a board and I want to
> recognize the locations of the symbols. However, I don't want to recognize
> any of the English or anything else as this may mess with my post
> processing. I have tried a few locations (like restricting where these
> symbols can be on the board and then only scanning the text in those
> strips) but I am not satisfied with the results. Additionally, I can also
> control the font and the size of the text on the board and everything else,
> except the actual codes.
>
> Thanks for the help!
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/3237ae86-db20-467c-bebc-6b45f854e799%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/3237ae86-db20-467c-bebc-6b45f854e799%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

