Hello, I have the following problem: Tesseract splits one word into two.
The image below shows the thresholded image with the recognition results; the yellow rectangles mark the detected words. The detected text is "Total $1 9,55" instead of "Total $19,55". Tesseract is clearly wrong to detect a word boundary between the "1" and the "9", and I see this error very frequently. Is there a setting among the hundreds of undocumented ones that defines the minimum width of a space character? Or is there a way to tell the word chopper that a space must be at least as wide as another character in the same column?
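To make the rule I have in mind concrete, here is a rough Python sketch of it as a post-processing step on the detected word boxes. It is not a Tesseract setting or API: the function name `merge_split_words`, the `min_gap_ratio` parameter, and the box coordinates in the usage note are all hypothetical and chosen only for illustration. It assumes each word on a line comes as a `(text, left, right)` tuple in pixels, sorted left to right.

```python
from statistics import median


def merge_split_words(words, min_gap_ratio=1.0):
    """Merge neighbouring words whose gap is narrower than a typical character.

    words: list of (text, left, right) tuples for ONE text line, sorted by left.
    min_gap_ratio: hypothetical tuning knob; a gap counts as a space only if it
    is at least this many typical character widths wide.
    """
    # Estimate a typical character width as box width / number of characters.
    widths = [(right - left) / len(text) for text, left, right in words if text]
    if not widths:
        return list(words)
    char_width = median(widths)

    merged = [words[0]]
    for text, left, right in words[1:]:
        prev_text, prev_left, prev_right = merged[-1]
        gap = left - prev_right
        if gap < min_gap_ratio * char_width:
            # Gap too narrow to be a real space: join with the previous word.
            merged[-1] = (prev_text + text, prev_left, right)
        else:
            merged.append((text, left, right))
    return merged
```

With made-up boxes such as `[("Total", 10, 110), ("$1", 150, 190), ("9,55", 196, 276)]`, the 6-pixel gap between "$1" and "9,55" is well below the estimated 20-pixel character width, so the two are joined into "$19,55" while "Total" stays separate. I would still prefer a setting inside Tesseract itself, since this only patches the output afterwards.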

