Hello,

I'm using Tesseract (actually the tesseract.js port) to recognize short 6-character codes like this:

F65M0P

What are the optimal settings for this in terms of speed and correctness?

Things to note:
- Language is irrelevant.
- Codes are always 6 characters long, uppercase, and contain both digits and letters.
- The font was chosen for its prevalence in OCR contexts.
- The image consists of the code and nothing else.

I find Tesseract quite slow on these codes compared to larger texts, which I suspect is because it tries to force out a dictionary word. It also mostly prefers letters over digits, for example S instead of 5.

I am unfamiliar with Tesseract and OCR, and you might be unfamiliar with the js port. I don't think I can train the engine, but I can set options like language_model_penalty_non_dict_word.

Thanks for your help.
Martin
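
P.S. To make the constraints concrete, here is a rough sketch of the kind of settings object I have in mind. The parameter names are standard Tesseract parameters, but I am only assuming that the js port passes them through unchanged, and the values are untested guesses rather than known-good settings. The two dawg switches may need to be set at initialization rather than afterwards. `ocrParams` and `isValidCode` are names I made up for illustration.

```javascript
// Allowed characters: uppercase letters and digits, per the constraints above.
const LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const DIGITS = "0123456789";

// Assumption: the js port forwards these keys to Tesseract as-is.
const ocrParams = {
  // Restrict recognition to the 36 characters a code can contain.
  tessedit_char_whitelist: LETTERS + DIGITS,
  // Page segmentation mode 8 = treat the image as a single word.
  tessedit_pageseg_mode: "8",
  // Skip the dictionary word lists, since codes are not words.
  load_system_dawg: "0",
  load_freq_dawg: "0",
  // Do not penalize results that are not dictionary words.
  language_model_penalty_non_dict_word: "0",
  language_model_penalty_non_freq_dict_word: "0",
};

// Hypothetical helper: accept a result only if it matches the known format
// (exactly 6 characters, each an uppercase letter or a digit).
function isValidCode(text) {
  return /^[A-Z0-9]{6}$/.test(text.trim());
}
```

The format check would at least let me reject obviously wrong results, although it cannot catch an S that should have been a 5.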

