If the box includes the surrounding space in every training sample, the model might not recognize the comma when it appears without that space (as in numbers: 23,334).
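For illustration, here is a minimal Python sketch of what a box without the surrounding space could look like. It assumes the glyph is available as a 0/1 bitmap; `tight_box` and `cell` are hypothetical names made up for this example, not part of Tesseract or its training tools, and the coordinates are plain image coordinates (origin at top-left), not the box-file format.

```python
def tight_box(bitmap):
    """Return (left, top, right, bottom) of the ink pixels, or None if blank.

    bitmap is a list of rows; each row is a list of 0 (background) or 1 (ink).
    right and bottom are exclusive.
    """
    # Rows and columns that contain at least one ink pixel.
    rows = [y for y, row in enumerate(bitmap) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(bitmap[0])) if any(row[x] for row in bitmap)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


# A fullwidth-style cell: the comma's ink sits in the lower-left area,
# and the rest of the cell is empty padding.
cell = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
]
box = tight_box(cell)
```

The resulting box covers only the ink, so the same comma shape is described the same way whether or not it is followed by padding.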
On Wednesday, October 18, 2023 at 5:29:13 AM UTC+3 Danny wrote:
> For purposes of training, I'm wondering if the box for a character should
> include the surrounding space.
>
> In particular for the CJK "FULLWIDTH COMMA", should the box be the red or
> green rectangle?

