Search the archive of the tesseract forums for "cube".

Zdenko

On Tue, Jan 15, 2013 at 2:16 PM, gold snake <[email protected]> wrote:
> My language is somewhat special. Its script is like Arabic, but it
> differs from Arabic in some ways, actually only in the shapes of the
> letters. It is also written right to left.
> I'm using the standard tutorial:
> https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
>
> But when I finish and test, it can't recognize the text accurately.
> My steps are:
>
> tesseract as.kadas.exp0.tif as.kadas.exp0 batch.nochop makebox
>
> tesseract as.kadas.exp0.tif as.kadas.exp0 nobatch box.train
>
> unicharset_extractor as.kadas.exp0.box
>
> shapeclustering -F font_properties -U unicharset as.kadas.exp0.tr
>
> mftraining -F font_properties -U unicharset -O as.unicharset as.kadas.exp0.tr
>
> cntraining as.kadas.exp0.tr
>
> I don't have a word dictionary, so I skipped some steps.
> Then I renamed some files, adding the "as." prefix, and ran:
>
> combine_tessdata as.
>
> There were no errors up to and including the combine step, so I'm
> sure the process itself has no problem.
> When I test a picture whose content is 13, the result is: ئئ
> When I test any words, the result is just ئ
>
> I also looked in D:\Little\Tesseract-OCR\tessdata and found these
> files:
>
> ara.cube.bigrams
> ara.cube.fold
> ara.cube.lm
> ara.cube.nn
> ara.cube.params
> ara.cube.size
> ara.cube.word-freq
> ara.traineddata
>
> I don't understand: why does the Arabic trained data consist of more
> than just ara.traineddata? What are the other ara.* files? Are they
> necessary if I'm training my own language?
> And how can I find or create those files?
>
> Thanks very much...
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
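
For anyone repeating the command sequence from the quoted message, it could be collected into a small script along these lines. This is only a sketch, not an official training tool: the function names (`training_commands`, `rename_plan`, `run_pipeline`) are made up for illustration, the `as` / `kadas` / `exp0` defaults are taken from the file names in the message, and the list of files to rename is an assumption about what the 3.02 tools write out. It defaults to a dry run that only prints the commands. Note that the box file produced by the first step normally has to be checked and corrected by hand before the `box.train` step is run.

```python
import shlex
import subprocess
from pathlib import Path


def training_commands(lang="as", font="kadas", exp=0):
    """Return the command lines from the quoted message, in order."""
    base = f"{lang}.{font}.exp{exp}"  # e.g. as.kadas.exp0
    tr = f"{base}.tr"
    return [
        # Step 1 writes a box file; it normally needs manual correction
        # before step 2 is run.
        ["tesseract", f"{base}.tif", base, "batch.nochop", "makebox"],
        ["tesseract", f"{base}.tif", base, "nobatch", "box.train"],
        ["unicharset_extractor", f"{base}.box"],
        ["shapeclustering", "-F", "font_properties", "-U", "unicharset", tr],
        ["mftraining", "-F", "font_properties", "-U", "unicharset",
         "-O", f"{lang}.unicharset", tr],
        ["cntraining", tr],
    ]


# Assumption: these are the un-prefixed files the tools leave in the
# working directory; adjust the list to whatever is actually produced.
GENERATED = ["inttemp", "pffmtable", "normproto", "shapetable"]


def rename_plan(lang="as", names=GENERATED):
    """Pair each generated file with its '<lang>.'-prefixed name."""
    return [(name, f"{lang}.{name}") for name in names]


def run_pipeline(lang="as", font="kadas", exp=0, dry_run=True):
    """Print every step; execute them only when dry_run is False."""
    for cmd in training_commands(lang, font, exp):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    for old, new in rename_plan(lang):
        print(f"rename {old} -> {new}")
        if not dry_run:
            Path(old).rename(new)
    final = ["combine_tessdata", f"{lang}."]
    print(shlex.join(final))
    if not dry_run:
        subprocess.run(final, check=True)
```

Calling `run_pipeline()` with the defaults prints the same commands as in the message without executing anything; `run_pipeline(dry_run=False)` would run them and stops at the first failing step because of `check=True`.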

