My language is somewhat special. Its script is like the Arabic script, but with some differences; really the only difference is the shape of the glyphs. It is also written right to left. I'm following the standard tutorial: https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
However, when I finish training and test the result, it does not recognize the text accurately. My steps are:

    tesseract as.kadas.exp0.tif as.kadas.exp0 batch.nochop makebox
    tesseract as.kadas.exp0.tif as.kadas.exp0 nobatch box.train
    unicharset_extractor as.kadas.exp0.box
    shapeclustering -F font_properties -U unicharset as.kadas.exp0.tr
    mftraining -F font_properties -U unicharset -O as.unicharset as.kadas.exp0.tr
    cntraining as.kadas.exp0.tr

I don't have a word dictionary, so I skipped the dictionary steps. Then I renamed some of the output files to add the "as." prefix and ran:

    combine_tessdata as.

There were no errors at any step, up to and including the combine step, so I believe the training itself ran without problems.

When I test a picture whose content is 13, the result is: ئئ
When I test any word, the result is just: ئ

I also looked in D:\Little\Tesseract-OCR\tessdata and found these files:

    ara.cube.bigrams
    ara.cube.fold
    ara.cube.lm
    ara.cube.nn
    ara.cube.params
    ara.cube.size
    ara.cube.word-freq
    ara.traineddata

I don't understand this. Why does the Arabic trained data consist of more than just ara.traineddata? What are the other ara.* files? Are they necessary if I am training my own language? And how can I find or create those files?

Thanks very much.
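
For reference, the steps above could be collected into one script. This is only a minimal sketch, not a verified recipe. The `run` and `train` helpers and the `DRY_RUN`, `LANG_CODE` and `BASE` variables are hypothetical names introduced here. The list of renamed files (inttemp, pffmtable, normproto, shapetable) is an assumption taken from the tutorial, since the steps above only say that some files were renamed. By default the script prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the training steps described in the post.
# LANG_CODE, BASE and DRY_RUN are hypothetical names, not Tesseract options.
LANG_CODE="${LANG_CODE:-as}"          # language prefix used in the post
BASE="${BASE:-as.kadas.exp0}"         # lang.font.expN base name of the training image
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

train() {
    # Generate a box file. In a real run, the box file should be checked and
    # corrected by hand before the next step; rerunning this step overwrites it.
    run tesseract "$BASE.tif" "$BASE" batch.nochop makebox
    # Produce the .tr feature file from the image and box file.
    run tesseract "$BASE.tif" "$BASE" nobatch box.train
    run unicharset_extractor "$BASE.box"
    run shapeclustering -F font_properties -U unicharset "$BASE.tr"
    run mftraining -F font_properties -U unicharset -O "$LANG_CODE.unicharset" "$BASE.tr"
    run cntraining "$BASE.tr"
    # Assumed rename step: add the language prefix to the training outputs.
    for f in inttemp pffmtable normproto shapetable; do
        run mv "$f" "$LANG_CODE.$f"
    done
    run combine_tessdata "$LANG_CODE."
}
```

Calling `train` after these definitions prints the eleven commands in order; setting `DRY_RUN=0` first would execute them, which requires the Tesseract training tools on the PATH and a `font_properties` file in the working directory.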

