The reason why Arabic has those files and your language does not is that Arabic is set up to use the "cube" feature to combine it with other languages, so you can do "-l ara+eng" and OCR a document with both Arabic and English. That training is harder, and not necessary if you mainly want to do monolingual documents.
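To make the "-l ara+eng" part concrete, here is a rough sketch of how you might assemble that command from a script. The function name and the file names (page.tif, page) are made up for illustration, and it assumes tesseract is on your PATH with both ara and eng language data installed. It only builds the command line; it does not run anything.

```python
import shlex


def tesseract_command(image, output_base, languages):
    """Build a tesseract command line that OCRs one image with several languages.

    Hypothetical helper for illustration only; not part of Tesseract itself.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    # Tesseract takes multiple languages joined with '+', e.g. "ara+eng".
    lang_arg = "+".join(languages)
    return ["tesseract", image, output_base, "-l", lang_arg]


# Example file names are assumptions, not taken from the thread.
cmd = tesseract_command("page.tif", "page", ["ara", "eng"])
print(shlex.join(cmd))  # prints: tesseract page.tif page -l ara+eng
```

For a monolingual document you would pass a single code, such as ["ara"], and none of the combined-language setup matters.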
And what Zdenko is saying is that your questions don't show that you've tried to solve the problem yourself. We're all professional programmers and we want to help people, but we don't have time to teach elementary web searching or programming. You seem to be a smart guy, but your questions appear to be lazy. You need to make an effort to solve the problems and then come to us for help, not ask us to solve them for you.

--Sven

On Wed, Jan 16, 2013 at 2:59 AM, gold snake <[email protected]> wrote:
> I can't found any answer for my question in this link.
> can you just tolk to me? Is have necessary to bully a rookie?
> please...
>
> On Wednesday, January 16, 2013 at 4:02:25 PM UTC+8, zdenop wrote:
>>
>> Really ;-)? I got 93 results. E.g.:
>>
>> https://groups.google.com/forum/#!msg/tesseract-ocr/0msQtTB_XrI/D1noel9GpPgJ
>> https://groups.google.com/d/topic/tesseract-ocr/tyV5_z65XMk/discussion
>> https://groups.google.com/d/msg/tesseract-ocr/R7UCx0oV3PA/GE7KJ_76kS0J
>>
>> Please honor time of people on this list...
>>
>> Zdenko
>>
>>
>> On Wed, Jan 16, 2013 at 8:18 AM, gold snake <[email protected]> wrote:
>>
>>> I can't found anything. common....
>>>
>>> On Tuesday, January 15, 2013 at 10:38:42 PM UTC+8, zdenop wrote:
>>>>
>>>> search archive of tesseract forums for cube.
>>>>
>>>> Zdenko
>>>>
>>>>
>>>> On Tue, Jan 15, 2013 at 2:16 PM, gold snake <[email protected]> wrote:
>>>>
>>>>> My language some special, just like arab font, but bitween arab font
>>>>> have some different, actually only different on shape of the font. and
>>>>> It's writing right to left too.
>>>>> I'm using standard tutorial :
>>>>> https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
>>>>>
>>>>> but when i'm finish and test, it can't be accurately identify.
>>>>> my step is :
>>>>>
>>>>> tesseract as.kadas.exp0.tif as.kadas.exp0 batch.nochop makebox
>>>>>
>>>>> tesseract as.kadas.exp0.tif as.kadas.exp0 nobatch box.train
>>>>>
>>>>> unicharset_extractor as.kadas.exp0.box
>>>>>
>>>>> shapeclustering -F font_properties -U unicharset as.kadas.exp0.tr
>>>>>
>>>>> mftraining -F font_properties -U unicharset -O as.unicharset as.kadas.exp0.tr
>>>>>
>>>>> cntraining as.kadas.exp0.tr
>>>>>
>>>>> I haven't words dict. so ... i'm not use some step.
>>>>> rename some file , add as. prefix
>>>>>
>>>>> combine_tessdata as.
>>>>>
>>>>> there is no any error until i'm combne, so i'm sure it's not have any
>>>>> problem.
>>>>> and when i'm test picture ,content is 13. the result is : ئئ
>>>>> when i'm test any words, the result just ئ
>>>>>
>>>>> and i'm find the D:\Little\Tesseract-OCR\tessdata , and i'm found
>>>>> some file :
>>>>>
>>>>> ara.cube.bigrams
>>>>> ara.cube.fold
>>>>> ara.cube.lm
>>>>> ara.cube.nn
>>>>> ara.cube.params
>>>>> ara.cube.size
>>>>> ara.cube.word-freq
>>>>> ara.traineddata
>>>>>
>>>>> and i can't understand. why the arab trainddata not only
>>>>> have ara.traineddata? what is any other arab.* file ?? and if i'm
>>>>> trainning my lanugage it's necessary??
>>>>> and how i cant find that file or create??
>>>>>
>>>>> thanks very much...
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To post to this group, send email to [email protected]
>>>>> To unsubscribe from this group, send email to
>>>>> tesseract-oc...@googlegroups.com
>>>>> For more options, visit this group at
>>>>> http://groups.google.com/group/tesseract-ocr?hl=en

--
"All that is gold does not glitter,
not all those who wander are lost;
the old that is strong does not wither,
deep roots are not reached by the frost.
From the ashes a fire shall be woken,
a light from the shadows shall spring;
renewed shall be blade that was broken,
the crownless again shall be king."

