Looks OK. The box dimensions need to match the bounding box of the text line in your TIFF. You can also extract the unicharset from the training text.
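As a rough illustration of the dimension check, here is a minimal Python sketch. It assumes each line-level entry has the form `WordStr <left> <bottom> <right> <top> <page> #<text>`, with the origin at the bottom-left of the image; that layout is my understanding of the format, not something confirmed in this thread. The helper names `parse_wordstr_line` and `box_fits_image` are hypothetical, and the image width and height are passed in by hand rather than read from the TIFF.

```python
def parse_wordstr_line(line):
    """Parse an assumed 'WordStr <left> <bottom> <right> <top> <page> #<text>' line."""
    # Split at the first '#': coordinates come before it, the transcription after.
    head, sep, text = line.rstrip("\n").partition("#")
    fields = head.split()
    if not sep or len(fields) != 6 or fields[0] != "WordStr":
        raise ValueError(f"not a WordStr line: {line!r}")
    left, bottom, right, top, page = (int(f) for f in fields[1:])
    return (left, bottom, right, top), page, text


def box_fits_image(box, width, height):
    """Return True if the box is non-empty and lies inside a width x height image."""
    left, bottom, right, top = box
    return 0 <= left < right <= width and 0 <= bottom < top <= height


# Hypothetical example line; the numbers are made up for illustration.
box, page, text = parse_wordstr_line("WordStr 0 0 1200 48 0 #The quick brown fox")
print(box_fits_image(box, 1200, 48))   # image as large as the box
print(box_fits_image(box, 1000, 48))   # image narrower than the box
```

If the second kind of check fails for a real file, the coordinates in the box file probably do not describe the line image they are paired with.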

On Thu, Oct 24, 2019, 15:00 Adam Funk <[email protected]> wrote:

> Hi,
>
> I'm a bit confused by some of the comments in the tesseract
> documentation, issues, and wiki about the addition of line-by-line
> training to tesseract 4. Is the attached box file valid for training
> tesseract 4.0.0?
>
> (I know that unicharset_extractor does not support WordStr yet, but I
> have found a way to get around that by recycling the unicharset from the
> standard English model.)
>
> Thanks,
> Adam

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduU4pD-EaCOGn48qA7F_QbtDukcbVEfjmo4Vgsa_SAXQYw%40mail.gmail.com.

