> > Oh, sorry. I use Tesseract 3.0.2 on Windows 7 32-bit. The steps I followed:
> >
> > 1. Generate training images: eng.timesitalic.exp0.tif
> > 2. Make box files:
> >    tesseract eng.timesitalic.exp0.tif eng.timesitalic.exp0 batch.nochop makebox
> > 3. Bootstrap a new character set:
> >    tesseract eng.timesitalic.exp0.tif eng.timesitalic.exp0 -l eng batch.nochop makebox
> > 4. Run Tesseract for training:
> >    tesseract eng.timesitalic.exp0.tif eng.timesitalic.exp0 nobatch box.train
> > 5. Compute the character set:
> >    unicharset_extractor eng.timesitalic.exp0.box eng.timesitalic.exp1.box
> > 6. Create the font_properties file with the content "timesitalic 1 0 0 1 0", then run:
> >    mftraining -F font_properties -U unicharset -O eng.unicharset eng.timesitalic.exp0.tr
> >    cntraining eng.timesitalic.exp0.tr eng.timesitalic.exp1.tr
> > 7. Dictionary data: create the frequent_words_list and words_list files, then run:
> >    wordlist2dawg frequent_words_list lang.freq-dawg lang.unicharset
> >    wordlist2dawg words_list lang.word-dawg lang.unicharset
> > 8. Put it all together: combine_tessdata eng.
> > 9. Rename eng.traineddata to nom.traineddata.
> > 10. Copy nom.traineddata into the tessdata directory.
> > 11. Run recognition: tesseract text.png out -l nom
> >
> > Error: "tessdata_manager.SeekToStart<TESSDATE_INTERM>: Error: Assert failed:in file ...\...\classify\adaptmatch.cpp, line 555"
> >
> > Thanks.
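One thing that may be worth checking before step 8: combine_tessdata only picks up files whose names start with the prefix it is given ("eng." here), so any training output that is still unprefixed, or prefixed with "lang.", would be left out of the combined file. Below is a minimal Python sketch that lists which component files are absent. The helper name missing_components and the suffix list are assumptions for illustration, not part of Tesseract; adjust the list to whatever the training documentation for your version requires.

```python
from pathlib import Path

# Assumed set of component suffixes; this list is illustrative only and
# should be checked against the training documentation for your version.
ASSUMED_SUFFIXES = ("unicharset", "inttemp", "pffmtable", "normproto")


def missing_components(directory, lang, suffixes=ASSUMED_SUFFIXES):
    """Return names such as 'eng.inttemp' that do not exist in directory."""
    base = Path(directory)
    return [
        f"{lang}.{suffix}"
        for suffix in suffixes
        if not (base / f"{lang}.{suffix}").is_file()
    ]


if __name__ == "__main__":
    # Check the current directory for files with the "eng." prefix.
    for name in missing_components(".", "eng"):
        print("missing:", name)
```

Running it in the training directory prints one line per file that is absent under the "eng." prefix; no output means every file in the assumed list is present.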

