I have also updated the image so that everything uses the same font, size, etc., but tesseract still just says "Error during processing." with seemingly zero information as to why.
Has anyone ever experienced this? If I can't find anything else out, I guess I'll just have to step through the page processing code and add in a bunch of printf statements just to see where tesseract is blowing up, which seems a bit overkill.

-- James

On Sunday, July 1, 2018 at 3:13:27 PM UTC-5, James Lipham wrote:
>
> Good afternoon all!
>
> I'm running Tesseract v3.05.02 on OSX Sierra (installed via Homebrew), and
> I'm trying to train a custom dataset with some fairly small images that are
> programmatically generated from a dot matrix display.
>
> When running
> tesseract eng.dmd.exp0.tif eng.dmd.box nobatch box.train
>
> I get the following information:
>
> Tesseract Open Source OCR Engine v3.05.02 with Leptonica
> Page 1
> Detected 27 diacritics
> Error during processing.
>
> There is no additional information output to the console, so I really
> don't know what my error could be. I've looked and verified that the tif
> image doesn't have an alpha channel, and the box file appears to be in the
> appropriate format.
>
> Has anyone run into this before? I'm thinking it's something absurdly
> simple. I've attached both the TIF and box files I'm using.
>
> Thank you very very much!
>
> -- James
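Since "the box file appears to be in the appropriate format" is hard to confirm by eye, here is a minimal sketch of a sanity check that could rule the box file in or out before reaching for printf debugging. It assumes the Tesseract 3.x box layout of one entry per line, `symbol left bottom right top page`, with the origin at the bottom-left of the image. The function name `check_box_lines` is made up for illustration and is not part of Tesseract.

```python
def check_box_lines(lines):
    """Return (line_number, problem) tuples for box entries that look malformed.

    Assumed format per line: <symbol> <left> <bottom> <right> <top> <page>
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        fields = raw.split()
        if not fields:
            problems.append((number, "blank line"))
            continue
        if len(fields) != 6:
            problems.append((number, "expected 6 fields, found %d" % len(fields)))
            continue
        try:
            # Everything after the symbol should be an integer.
            left, bottom, right, top, page = (int(value) for value in fields[1:])
        except ValueError:
            problems.append((number, "non-integer coordinate or page"))
            continue
        if min(left, bottom, page) < 0:
            problems.append((number, "negative coordinate or page"))
        elif left >= right or bottom >= top:
            # Origin is bottom-left, so right > left and top > bottom.
            problems.append((number, "empty or inverted box"))
    return problems
```

It could be run against the attached box file with something like `check_box_lines(open("eng.dmd.exp0.box", encoding="utf-8"))` (the file name here is a guess based on the image name). An empty result only means the file is syntactically plausible; it says nothing about whether the boxes actually line up with the glyphs in the TIF.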