Also check the error messages: if you did not run shapeclustering, then mftraining should not produce any output (in version 3.02) ;-) It also looks like you forgot to rename the output files from the training tools! You need to follow the training wiki [1]!
[1] http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3

--
Zdenko

On Mon, Nov 12, 2012 at 3:42 PM, Mi Tran <[email protected]> wrote:
> Oh, sorry. I use tesseract 3.0.2, win7 32bit. The steps I followed:
>>
>> 1. Generate Training Images: eng.timesitalic.exp0.tif
>>
>> 2. Make Box Files: tesseract eng.timesitalic.exp0.tif
>> eng.timesitalic.exp0 batch.nochop makebox
>>
>> 3. Bootstrapping a new character set:
>>
>> tesseract eng.timesitalic.exp0.tif eng.timesitalic.exp0 -l eng
>> batch.nochop makebox
>>
>> 4. Run Tesseract for Training: tesseract eng.timesitalic.exp0.tif
>> eng.timesitalic.exp0 nobatch box.train
>>
>> 5. Compute the Character Set: unicharset_extractor
>> eng.timesitalic.exp0.box eng.timesitalic.exp1.box
>>
>> 6. Create the font_properties file, with the content: timesitalic 1 0 0 1 0,
>> and then run:
>>
>> mftraining -F font_properties -U unicharset -O eng.unicharset
>> eng.timesitalic.exp0.tr
>>
>> cntraining eng.timesitalic.exp0.tr eng.timesitalic.exp1.tr
>>
>> 7. Dictionary Data: create the frequent_words_list file and the words_list
>> file, then run:
>>
>> wordlist2dawg frequent_words_list lang.freq-dawg lang.unicharset
>> wordlist2dawg words_list lang.word-dawg lang.unicharset
>>
>> 8. Putting it all together: combine_tessdata eng.
>>
>> 9. Rename eng.traineddata to nom.traineddata.
>>
>> 10. Copy nom.traineddata into the tessdata directory.
>>
>> 11. Run: tesseract text.png out -l nom
>>
>> => Error: "tessdata_manager.SeekToStart<TESSDATE_INTERM>: Error: Assert
>> failed:in file ...\...\classify\adaptmatch.cpp, line 555"
>>
> Thanks.
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
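
The renaming step mentioned in the reply can be sketched roughly as follows. This is only an illustration, not a command from the wiki: the helper name `prefix_training_files` is hypothetical, and the list of output files (shapetable, inttemp, pffmtable, normproto) and the `eng` prefix are assumptions based on the 3.02 training instructions and on the `combine_tessdata eng.` call in the quoted steps.

```sh
#!/bin/sh
# Sketch: give the training tool outputs a language prefix so that
# "combine_tessdata eng." can find them. The file list below is an
# assumption for Tesseract 3.02; adjust it to what your tools wrote.

prefix_training_files() {
    lang="$1"     # language prefix, e.g. eng
    dir="$2"      # directory holding the training outputs
    missing=0
    for name in shapetable inttemp pffmtable normproto; do
        if [ -f "$dir/$name" ]; then
            mv "$dir/$name" "$dir/$lang.$name"
            echo "renamed $name -> $lang.$name"
        else
            # A missing file usually means the tool that writes it
            # did not run or failed, so report it instead of ignoring it.
            echo "missing $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `prefix_training_files eng .` in the training directory before `combine_tessdata eng.` would apply the prefix. Under the same assumption, the dictionary files from step 7 would need the `eng.` prefix rather than the literal `lang.` prefix.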

