Not yet. I tried but failed. Like you, I'm waiting for the same API.

On Friday, August 11, 2017 at 6:08:05 AM UTC+8, Shoaib wrote:
>
> Hi everyone,
>
> I would like to train Tesseract on my own dataset of word images. I have
> bounding box information, but for whole words rather than per character.
> I referred to the following documentation on training Tesseract 4.0:
> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
>
> The documentation states that "*The boxes only need to be at the textline
> level. It is thus far easier to make training data from existing image
> data.*" But later in the wiki, the box format that allows boxes at the
> text line level is said to be not yet implemented ("*Box File Format -
> Second Option (NOT YET IMPLEMENTED)*"). I would therefore like to know
> whether there is any way to train Tesseract using only word-level bounding
> box information instead of character-level information.
>
> Thank you for your time.
>

