Some users have experienced problems sending (big?) files to the forum. I have not, but I usually don't send big files via e-mail. I would prefer that you upload the files to some online service, but I think running mftraining will fix it ;-).
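To make the mftraining point concrete, here is a minimal sketch of a check that could be run in the training directory before combine_tessdata. The helper name missing_components is hypothetical (it is not a Tesseract tool), and the list of five files is an assumption taken from the mv commands in the quoted log. The mftraining invocation in the comment follows the 3.02 wiki as I remember it, so please verify it against the wiki page before relying on it. If I have the 3.02 type numbering right, the -1 offsets for types 3 and 4 in the combine_tessdata output would be consistent with inttemp and pffmtable being absent.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): print the component files that
# are absent for the prefix given as $1, looking in the current directory.
# The five names below are an assumption taken from the "mv" commands in the
# quoted training log.
missing_components() {
    prefix=$1
    missing=""
    for part in unicharset inttemp pffmtable normproto shapetable; do
        [ -f "$prefix.$part" ] || missing="$missing $prefix.$part"
    done
    # Strip the leading space; prints an empty line when nothing is missing.
    echo "${missing# }"
}

# The step that does not appear in the quoted log (form assumed from the
# TrainingTesseract3 wiki for 3.02; it should write inttemp and pffmtable):
#   mftraining -F font_properties -U unicharset -O icr.unicharset eng.icr.exp0.tr
#
# Example use, run inside the training directory:
#   missing_components icr
```

If the helper prints anything, those files should be produced (and renamed with the icr. prefix) before running combine_tessdata icr.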
Zdenko

On Tue, Mar 5, 2013 at 4:30 PM, A. Naut <[email protected]> wrote:

> I appreciate the quick reply! I don't know why the files didn't attach,
> that's very odd - I will have to repost them when I am home and also
> investigate as to if/how I forgot the mftraining step, and if so if that
> solves the issue.
> Thanks!
>
>
> On Tuesday, March 5, 2013 3:21:00 AM UTC-5, zdenop wrote:
>
>> There are no attached data. Maybe try to use some online storage system
>> (google disk, skydrive, dropbox...) and send a link here.
>>
>> You stated you are following wiki instruction[1], but your log shows it is
>> not true - you did not run mftraining.
>>
>> [1]
>> http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
>>
>>
>> Zdenko
>>
>>
>> On Tue, Mar 5, 2013 at 4:14 AM, A. Naut <[email protected]> wrote:
>>
>>> I'm trying to train the attached files (Tesseract 3.02, following the
>>> instructions at
>>> http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3),
>>> and although I can complete the training process successfully I can't
>>> get tesseract to work with the produced traineddata file - I always receive
>>> the error:
>>>
>>> tessdata_manager.SeekToStart(TESSDATA_INTTEMP):Error:Assert failed:in
>>> file adaptmatch.cpp, line 555
>>>
>>> I have attached the .box, .tif, and font_properties file I used for
>>> training purposes. (Although the training instructions say to add .exp?
>>> after the font name in the font_properties file, when I use ocr.exp0 as the
>>> font name in that file the shape clustering then fails).
>>>
>>>
>>> The following is the process I use for producing the training file:
>>>
>>> ./tesseract eng.icr.exp0.tif eng.icr.exp0 nobatch box.train.stderr
>>> Tesseract Open Source OCR Engine v3.02.02 with Leptonica
>>> APPLY_BOXES:
>>> Boxes read from boxfile: 315
>>> Found 315 good blobs.
>>> Leaving 26 unlabelled blobs in 0 words. >>> TRAINING ... Font name = icr >>> Generated training data for 18 words >>> ./unicharset_extractor eng.icr.exp0.box >>> ./shapeclustering -F font_properties -U unicharset eng.icr.exp0.tr >>> Reading eng.icr.exp0.tr ... >>> Building master shape table >>> Computing shape distances... >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 
0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... >>> Stopped with 0 merged, min dist 999.000000 >>> Computing shape distances... 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >>> 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 >>> Distance = 0.007463: Stopped with 1 merged, min dist 0.101266 >>> Master shape_table:Number of shapes = 36 max unichars = 2 number with >>> multiple unichars = 1 >>> ./cntraining eng.icr.exp0.tr >>> Reading eng.icr.exp0.tr ... >>> Clustering ... 
>>>
>>> Writing normproto ...
>>> mv unichartset icr.unicharset
>>> mv shapetable icr.shapetable
>>> mv normproto icr.normproto
>>> mv pffmtable icr.pffmtable
>>> mv inttemp icr.inttemp
>>>
>>> ./combine_tessdata icr.
>>> TessdataManager combined tesseract data files.
>>> Offset for type 0 is -1
>>> Offset for type 1 is 140
>>> Offset for type 2 is -1
>>> Offset for type 3 is -1
>>> Offset for type 4 is -1
>>> Offset for type 5 is 2528
>>> Offset for type 6 is -1
>>> Offset for type 7 is -1
>>> Offset for type 8 is -1
>>> Offset for type 9 is -1
>>> Offset for type 10 is -1
>>> Offset for type 11 is -1
>>> Offset for type 12 is -1
>>> Offset for type 13 is 7841
>>> Offset for type 14 is -1
>>> Offset for type 15 is -1
>>> Offset for type 16 is -1
>>>
>>> --
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> tesseract-oc...@googlegroups.com
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> For more options, visit
>>> https://groups.google.com/groups/opt_out.

