I'm trying to train Tesseract 3.02 with the attached files, following the 
instructions at 
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3 , and 
although I can complete the training process successfully, I can't get 
tesseract to work with the produced traineddata file. I always receive the 
error:

tessdata_manager.SeekToStart(TESSDATA_INTTEMP):Error:Assert failed:in file 
adaptmatch.cpp, line 555

I have attached the .box, .tif, and font_properties files I used for 
training. (Although the training instructions say to add .exp? after the 
font name in the font_properties file, when I use ocr.exp0 as the font name 
in that file, the shape clustering then fails.)


The following is the process I use for producing the training file:

./tesseract eng.icr.exp0.tif eng.icr.exp0 nobatch box.train.stderr
Tesseract Open Source OCR Engine v3.02.02 with Leptonica
APPLY_BOXES:
   Boxes read from boxfile:     315
   Found 315 good blobs.
   Leaving 26 unlabelled blobs in 0 words.
TRAINING ... Font name = icr
Generated training data for 18 words
./unicharset_extractor  eng.icr.exp0.box
./shapeclustering -F font_properties -U unicharset eng.icr.exp0.tr
Reading eng.icr.exp0.tr ...
Building master shape table
Computing shape distances...
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances...
Stopped with 0 merged, min dist 999.000000
Computing shape distances...
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 
19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
Distance = 0.007463: Stopped with 1 merged, min dist 0.101266
Master shape_table:Number of shapes = 36 max unichars = 2 number with 
multiple unichars = 1
./cntraining  eng.icr.exp0.tr
Reading eng.icr.exp0.tr ...
Clustering ...

Writing normproto ...
mv unicharset icr.unicharset
mv shapetable icr.shapetable
mv normproto icr.normproto
mv pffmtable icr.pffmtable
mv inttemp icr.inttemp

./combine_tessdata icr.
TessdataManager combined tesseract data files.
Offset for type 0 is -1
Offset for type 1 is 140
Offset for type 2 is -1
Offset for type 3 is -1
Offset for type 4 is -1
Offset for type 5 is 2528
Offset for type 6 is -1
Offset for type 7 is -1
Offset for type 8 is -1
Offset for type 9 is -1
Offset for type 10 is -1
Offset for type 11 is -1
Offset for type 12 is -1
Offset for type 13 is 7841
Offset for type 14 is -1
Offset for type 15 is -1
Offset for type 16 is -1
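
To make the offsets above easier to scan, here is a minimal sketch that lists which type slots combine_tessdata reported as -1. The function name missing_types and the sample string are only illustrative; the code parses the text format shown above and assumes nothing about which component each type number stands for (that mapping would need to be checked against the Tesseract source).

```python
import re

def missing_types(log_text):
    """Return the type numbers whose offset is -1 in combine_tessdata output."""
    # Matches lines such as "Offset for type 3 is -1"
    pattern = re.compile(r"Offset for type\s+(\d+)\s+is\s+(-?\d+)")
    return [int(t) for t, off in pattern.findall(log_text) if int(off) == -1]

# A shortened copy of the output above, used only as sample input.
sample = """Offset for type 0 is -1
Offset for type 1 is 140
Offset for type 2 is -1
Offset for type 3 is -1
Offset for type 4 is -1
Offset for type 5 is 2528
"""

print(missing_types(sample))
```

Run against the full output above, this would flag every slot except types 1, 5, and 13.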
