Hi Eric,
I am using Tesseract 3.02 on Windows 7 to train Traditional Chinese data.
I have already trained an English script font. That worked and did improve
recognition. Now I am facing problems with Chinese training.
The detailed steps are below:

1. I entered the command below:
tesseract [lang].[fontname].exp[number].tif [lang].[fontname].exp[number] 
batch.nochop makebox

2. The .box file was generated.

3. I corrected the .box file by hand, because the generated .box file always
contains wrong characters.

4. Then I entered the command below:
tesseract [lang].[fontname].exp[number].tif [lang].[fontname].exp[number] 
nobatch box.train 

5. The command window showed *"Found 0 good blobs. 7 remaining unlabelled words 
deleted"* (I used 7 Chinese characters for training).
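
One thing that can be checked before step 4 is whether every line of the hand-edited .box file is still well formed, since a malformed or misplaced box is one possible cause of blobs going unlabelled (though not necessarily the cause here). Below is a minimal sketch in Python, assuming the Tesseract 3.x box layout of "symbol left bottom right top page" with the origin at the bottom-left corner of the image. The function names and the image width/height arguments are hypothetical helpers for illustration and are not part of Tesseract. The byte-order-mark check is also an assumption, based on some Windows editors prepending a BOM when saving UTF-8.

```python
# Sketch: sanity-check lines of a Tesseract 3.x .box file.
# Assumed line format: <symbol> <left> <bottom> <right> <top> <page>
# Coordinates are assumed to be in pixels with the origin at bottom-left.

def check_box_line(line, image_width, image_height):
    """Return a list of problems found in one box-file line (empty if none)."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 6:
        return ["expected 6 space-separated fields, got %d" % len(parts)]
    symbol = parts[0]
    try:
        left, bottom, right, top, page = (int(p) for p in parts[1:])
    except ValueError:
        return ["coordinates and page must be integers"]
    problems = []
    if not symbol:
        problems.append("empty symbol")
    if symbol.startswith("\ufeff"):
        # Assumption: a BOM left by an editor may confuse the trainer.
        problems.append("symbol starts with a byte order mark")
    if not (0 <= left < right <= image_width):
        problems.append("bad horizontal extent")
    if not (0 <= bottom < top <= image_height):
        problems.append("bad vertical extent")
    if page < 0:
        problems.append("negative page number")
    return problems


def check_box_text(text, image_width, image_height):
    """Check every non-empty line; return {line_number: problems}."""
    report = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        problems = check_box_line(line, image_width, image_height)
        if problems:
            report[number] = problems
    return report
```

To use it, read the .box file as UTF-8 text, pass its contents to check_box_text together with the pixel size of the matching .tif, and look at any line numbers that come back in the report.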

Could you share how you trained your data?
I look forward to your tips and reply.

Good day and many thanks.


Cheers,
Raccoon Tseng 



On Monday, August 2, 2010 at 2:42:43 PM UTC+8, Eric.yang wrote:
>
> Hi all. I'm currently using tesseract-2.04 to recognize Chinese on 
> Windows XP. 
> I read the introduction at http://code.google.com/p/tesseract-ocr/w/list, 
> but I ran into some problems during training. Here are the steps I 
> followed: 
>
> 1. tesseract 1.tif 1 batch.nochop makebox -------------- makes a txt file 
> 2. Rename 1.txt to 1.box, then use bbtesseract to adjust it. 
> 3. tesseract 1.tif junk nobatch box.train -------- makes 1.tr and 
> junk.txt 
> 4. mftraining scan.tr 
> 5. cnTraining scan.tr 
> 6. unicharset_extractor scan.box 
>
> OK, now I have inttemp / normproto / pffmtable / unicharset, but how do I 
> use them? 
> Did I do something wrong? 
>
> Thanks a lot!
