Re: How training language like arab?

gold snake Fri, 18 Jan 2013 03:16:51 -0800

Thanks again, everybody.

On Tuesday, January 15, 2013 at 9:16:04 PM UTC+8, gold snake wrote:
>
> My language is somewhat special. Its script resembles Arabic, but the 
> letter shapes differ from the Arabic font; the shapes are actually the 
> only difference. It is also written right to left.
> I am following the standard tutorial: 
> https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
>
> But when I finished training and tested the result, it could not 
> recognize text accurately. 
> My steps were:
>
> tesseract as.kadas.exp0.tif as.kadas.exp0 batch.nochop makebox
>
> tesseract as.kadas.exp0.tif as.kadas.exp0 nobatch box.train
>
> unicharset_extractor as.kadas.exp0.box
>
> shapeclustering -F font_properties -U unicharset as.kadas.exp0.tr
>
> mftraining -F font_properties -U unicharset -O as.unicharset 
> as.kadas.exp0.tr
>
> cntraining as.kadas.exp0.tr
>
> I don't have a word dictionary, so I skipped the dictionary steps.
> Then I renamed some of the output files, adding the as. prefix:
>
> combine_tessdata as.
>
> There were no errors at any step through combine_tessdata, so I believe 
> the training process itself is fine.
> But when I test a picture whose content is 13, the result is: ئئ
> When I test any word, the result is just: ئ
>
>
>
> I also looked in D:\Little\Tesseract-OCR\tessdata and found these 
> files:
>
> ara.cube.bigrams
> ara.cube.fold
> ara.cube.lm
> ara.cube.nn
> ara.cube.params
> ara.cube.size
> ara.cube.word-freq
> ara.traineddata
>
> I don't understand why the Arabic trained data consists of more than just 
> ara.traineddata. What are the other ara.* files? Are they necessary if I 
> am training my own language?
> And how can I find or create those files?
>
> Thanks very much.
>
>
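For anyone following the same steps, the rename step described above ("add as. prefix") could be sketched roughly as below. This is only a sketch under assumptions: the helper name add_lang_prefix is hypothetical, and the four file names are the ones I understand the 3.0x training tutorial asks you to rename (as.unicharset is already written with its prefix by the mftraining -O option in the commands above).

```shell
#!/bin/sh
# Hypothetical helper: give the training outputs a "<lang>." prefix so that
# combine_tessdata can find them. Run it from the training directory.
# Assumption: inttemp, normproto, pffmtable and shapetable are the files to rename.
add_lang_prefix() {
    lang="$1"
    for f in inttemp normproto pffmtable shapetable; do
        if [ -f "$f" ]; then
            mv "$f" "$lang.$f"
        else
            # A missing file usually means an earlier training step failed.
            echo "missing: $f" >&2
        fi
    done
}

# Usage, from the directory holding the training outputs:
#   add_lang_prefix as
#   combine_tessdata as.
```

If any file is reported missing, it is probably worth rechecking the mftraining and cntraining output before combining.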

