The "..." is formally called an "ellipsis". I could find nothing useful by Googling, except that somebody has tried using OpenCV object/feature detection to look for it. The only way I can imagine getting Tesseract to recognise an ellipsis is to train it with the 3 full stops enclosed in a single box in the box file, or something like that. But I'm not sure.
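To make the box file idea concrete, here is a rough, untested-against-Tesseract sketch in Python. It assumes the usual box file line layout of `symbol left bottom right top page`, and it assumes (I have not verified this) that the training tools will accept a multi-character symbol such as "..." in one box. The function names and the `max_gap` threshold are my own inventions for illustration, not part of Tesseract.

```python
from typing import List, Tuple

# One box file entry: symbol, left, bottom, right, top, page
Box = Tuple[str, int, int, int, int, int]


def parse_box_line(line: str) -> Box:
    """Parse one line of the assumed 'symbol left bottom right top page' layout."""
    symbol, left, bottom, right, top, page = line.strip().rsplit(" ", 5)
    return (symbol, int(left), int(bottom), int(right), int(top), int(page))


def format_box(box: Box) -> str:
    """Turn a box back into a box file line."""
    return " ".join(str(field) for field in box)


def merge_ellipses(boxes: List[Box], max_gap: int = 15) -> List[Box]:
    """Replace each run of three closely spaced '.' boxes with one '...' box.

    max_gap is a hypothetical pixel threshold: dots further apart than this
    are treated as separate full stops and left alone.
    """
    merged: List[Box] = []
    i = 0
    while i < len(boxes):
        run = boxes[i:i + 3]
        is_ellipsis = (
            len(run) == 3
            and all(b[0] == "." for b in run)
            and len({b[5] for b in run}) == 1              # same page
            and all(0 <= run[k + 1][1] - run[k][3] <= max_gap for k in range(2))
        )
        if is_ellipsis:
            merged.append((
                "...",
                min(b[1] for b in run),   # left
                min(b[2] for b in run),   # bottom
                max(b[3] for b in run),   # right
                max(b[4] for b in run),   # top
                run[0][5],                # page
            ))
            i += 3
        else:
            merged.append(boxes[i])
            i += 1
    return merged
```

You would read the existing box file line by line with `parse_box_line`, pass the list through `merge_ellipses`, write it back out with `format_box`, and then retrain from the edited file. Whether the resulting traineddata actually recognises "..." is exactly the part I'm not sure about.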
On Tuesday, 6 January 2015 19:27:24 UTC, Christen Møller wrote:
> Hi
>
> I have a problem with Tesseract - it simply ignores three dots in sequence: "...".
>
> So "It's ..." becomes "It's" and "9,1 ... 9,3 ... 9,7" becomes "9,19,39,7".
> Leaving very much manual work!
>
> Does anybody know how to make Tesseract recognize "..."?
>
> Best regards
>
> Christen Møller

