My training failed, and the final result is very bad. Maybe that is because I don't know how to handle the same character in different positions.

What you see looks like this: م , ئما , تىم , مور
What I actually write looks like this: م , ئما , تىم , مور

Can you see the one character that looks like an O? It is the same character, but when its position changes, its shape changes. I don't know what to do. I think this may be why the result is so terrible: the computer gets one character for training, but that character has four different shapes.
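To make the "one character, four shapes" point concrete, here is a minimal Python sketch using only the standard `unicodedata` module. It assumes the letter in question is ARABIC LETTER MEEM (U+0645), which Unicode also encodes as four positional presentation forms. The helper name `base_letter` is made up for illustration, and nothing here says how Tesseract itself treats these shapes. It only shows that the four shapes fold back to a single code point, which is probably what ends up as the label in a box file.

```python
import unicodedata

MEEM = "\u0645"  # ARABIC LETTER MEEM, the base letter as normally typed

# Arabic Presentation Forms-B code points for the four positional shapes of MEEM.
POSITIONAL_FORMS = {
    "isolated": "\uFEE1",
    "final": "\uFEE2",
    "initial": "\uFEE3",
    "medial": "\uFEE4",
}


def base_letter(ch):
    """Fold a presentation form back to its base letter (hypothetical helper).

    NFKC normalization applies the compatibility decomposition, which maps
    each positional form to the plain letter.
    """
    return unicodedata.normalize("NFKC", ch)


for position, form in POSITIONAL_FORMS.items():
    folded = base_letter(form)
    print(f"{position:8} U+{ord(form):04X} {unicodedata.name(form)} "
          f"-> U+{ord(folded):04X}")
```

If the box file labels every shape with the base letter, the trainer receives several visually different samples under one label, so the training image probably needs enough samples of each positional shape.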
Can anybody tell me what I need to do to train a language like this?

On Tuesday, January 15, 2013 at 9:16:04 PM UTC+8, gold snake wrote:
>
> My language is somewhat special. It is similar to Arabic script, but it differs from Arabic in some ways, really only in the shapes of the letters. It is also written right to left.
> I am following the standard tutorial:
> https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
>
> But when I finish and test, it cannot recognize text accurately.
> My steps are:
>
> tesseract as.kadas.exp0.tif as.kadas.exp0 batch.nochop makebox
> tesseract as.kadas.exp0.tif as.kadas.exp0 nobatch box.train
> unicharset_extractor as.kadas.exp0.box
> shapeclustering -F font_properties -U unicharset as.kadas.exp0.tr
> mftraining -F font_properties -U unicharset -O as.unicharset as.kadas.exp0.tr
> cntraining as.kadas.exp0.tr
>
> I don't have a word dictionary, so I skipped the dictionary steps.
> Then I renamed some files, adding the "as." prefix, and ran:
>
> combine_tessdata as.
>
> There were no errors up to and including the combine step, so I believe the process itself has no problem.
> When I test a picture whose content is 13, the result is: ئئ
> When I test any word, the result is just: ئ
>
> I also looked in D:\Little\Tesseract-OCR\tessdata and found these files:
>
> ara.cube.bigrams
> ara.cube.fold
> ara.cube.lm
> ara.cube.nn
> ara.cube.params
> ara.cube.size
> ara.cube.word-freq
> ara.traineddata
>
> I can't understand why the Arabic trained data is not just ara.traineddata. What are the other ara.* files? Are they necessary if I train my own language? And how can I find or create those files?
>
> Thanks very much.
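The "rename some files, add the as. prefix" step in the quoted message could be sketched roughly as below. This is only an illustration under assumptions: the file names `inttemp`, `normproto`, `pffmtable` and `shapetable` are taken to be the outputs of the training commands as the TrainingTesseract3 tutorial describes them, and `add_lang_prefix` is a hypothetical helper, not part of Tesseract.

```python
from pathlib import Path

# Assumed output names of mftraining / cntraining / shapeclustering,
# as described in the TrainingTesseract3 tutorial.
TRAINING_OUTPUTS = ("inttemp", "normproto", "pffmtable", "shapetable")


def add_lang_prefix(directory, lang, names=TRAINING_OUTPUTS):
    """Rename each existing training output to '<lang>.<name>' (hypothetical helper).

    Returns the new file names in the order they were renamed, so a missing
    output is easy to spot before running combine_tessdata.
    """
    renamed = []
    for name in names:
        src = Path(directory) / name
        if src.exists():
            dst = src.with_name(f"{lang}.{name}")
            src.rename(dst)
            renamed.append(dst.name)
    return renamed


# Example call (not executed here): add_lang_prefix(".", "as")
```

Checking the returned list against the expected names is a cheap way to confirm that every training step actually produced its file before combining.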

