I joined for a similar but opposite reason: in my case, Tesseract is removing critical whitespace between non-dictionary words, and I am looking for tips on what to tweak in Tesseract's configuration to make it treat whitespace differently.
Anyone know?

Stéphane

On Wednesday, July 10, 2019 at 8:16:55 AM UTC-7, Timothy Snyder wrote:
> Hello all,
>
> Does anyone know of any config parameters that will increase the tolerance
> of whitespace between characters, i.e., increase the amount of whitespace
> needed to trigger word segmentation?
>
> I have many cases in my text where there is extra whitespace between
> characters, resulting in the segmentation of a single word into multiple
> words.
>
> Any suggestions would be appreciated!
>
> -Tim

