Hi, when you fine-tune the model (for example with ocrd-train) you can restrict the model's output to a smaller set of characters, so there is no need to blacklist anything.
If you just want to locate the symbols, something like OpenCV's matchTemplate <https://docs.opencv.org/trunk/d4/dc6/tutorial_py_template_matching.html>, or a trained OpenCV <https://www.pyimagesearch.com/2015/11/09/pedestrian-detection-opencv/> or dlib <http://www.hackevolve.com/create-your-own-object-detector/> HOG detector, may be more appropriate. Using Tesseract looks like a very convoluted way to do it. If you have multiple symbols, use multiple patterns or train multiple detectors.

Bye

Lorenzo

On Tue, 21 May 2019 at 16:37, Rafay Kalim <[email protected]> wrote:

> Hey, so I am trying to train a new Tesseract model to only recognize
> certain UTF-8 symbols as I want an OCR that only recognizes these symbols
> and not other English letters etc. I realize there are two ways I can do
> this - one is to fine tune Tesseract over the normal English model and then
> blacklist the English text or train a completely new model that only
> recognizes this text. I was wondering if I could get some input into which
> of these - or another method, is better for ease, time and accuracy.
>
> The context is I will have some various texts on a board and I want to
> recognize the locations of the symbols. However, I don't want to recognize
> any of the English or anything else as this may mess with my post
> processing. I have tried a few locations (like restricting where these
> symbols can be on the board and then only scanning the text in those
> strips) but I am not satisfied with the results. Additionally, I can also
> control the font and the size of the text on the board and everything else,
> except the actual codes.
>
> Thanks for the help!
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/3237ae86-db20-467c-bebc-6b45f854e799%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
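
For the template-matching suggestion in the reply, here is a minimal sketch of the idea in plain Python, so it runs without OpenCV installed. It slides each symbol pattern over the image and scores every window by the sum of squared differences, which is one of the comparison methods that matchTemplate offers (cv2.TM_SQDIFF). The function names, the toy 3x3 symbols, and the max_ssd threshold are all made up for illustration; on real photos you would more likely call matchTemplate itself and pick a non-zero threshold.

```python
from typing import Dict, List, Sequence, Tuple

Grid = Sequence[Sequence[int]]  # grayscale image as rows of pixel values


def ssd_at(image: Grid, template: Grid, top: int, left: int) -> int:
    """Sum of squared differences between the template and one image window."""
    return sum(
        (image[top + r][left + c] - template[r][c]) ** 2
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def find_symbol(image: Grid, template: Grid, max_ssd: int = 0) -> List[Tuple[int, int]]:
    """Return the (row, col) of every window whose SSD is at most max_ssd."""
    t_rows, t_cols = len(template), len(template[0])
    hits = []
    for top in range(len(image) - t_rows + 1):
        for left in range(len(image[0]) - t_cols + 1):
            if ssd_at(image, template, top, left) <= max_ssd:
                hits.append((top, left))
    return hits


def locate_all(image: Grid, templates: Dict[str, Grid],
               max_ssd: int = 0) -> Dict[str, List[Tuple[int, int]]]:
    """One pattern per symbol: run the same search once for each template."""
    return {name: find_symbol(image, tpl, max_ssd) for name, tpl in templates.items()}


def blank_board(rows: int, cols: int) -> List[List[int]]:
    """All-zero image standing in for an empty board."""
    return [[0] * cols for _ in range(rows)]


def paste(board: List[List[int]], template: Grid, top: int, left: int) -> None:
    """Copy the template into the board at (top, left)."""
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            board[top + r][left + c] = value


# Hypothetical 3x3 symbols standing in for the real codes on the board.
PLUS = [[0, 9, 0], [9, 9, 9], [0, 9, 0]]
CROSS = [[9, 0, 9], [0, 9, 0], [9, 0, 9]]

board = blank_board(8, 12)
paste(board, PLUS, 1, 2)
paste(board, CROSS, 4, 7)

found = locate_all(board, {"plus": PLUS, "cross": CROSS})
print(found)  # {'plus': [(1, 2)], 'cross': [(4, 7)]}
```

With max_ssd=0 only exact copies match. Since the font and size on the board are under your control, a small positive threshold should be enough to absorb camera noise, but that value would need tuning on real images.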

