I joined for a similar but opposite reason: in my case, Tesseract is removing 
critical whitespace between non-dictionary words, and I am looking for tips 
on which of Tesseract's configuration parameters to tweak so that it treats 
whitespace differently.

Anyone know?

Stéphane


On Wednesday, July 10, 2019 at 8:16:55 AM UTC-7, Timothy Snyder wrote:
>
> Hello all,
>
> Does anyone know of any config parameters that will increase the tolerance 
> of whitespace between characters, i.e., increase the amount of whitespace 
> needed to trigger word segmentation?
>
> I have many cases in my text where there is extra whitespace between 
> characters, resulting in the segmentation of a single word into multiple 
> words.
>
> Any suggestions would be appreciated!
>
> -Tim
>

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/8e6878a9-655d-41c7-9d4d-bcb7dcfb6419%40googlegroups.com.
