Hello,

I attempted to run the following command:

src/training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng \
  --linedata_only --noextract_font_properties \
  --langdata_dir ~/tesstutorial/langdata \
  --tessdata_dir ~/tesstutorial/tesseract/tessdata \
  --output_dir ~/tesstutorial/engtrain

(which is copied from the TessTutorial section of the document *Training 
Tesseract 4.00*).

Everything seems to be going fine until it (spuriously?) generates an error 
message in the log file:

Rendered page 3355 to file /tmp/eng-2019-09-14.GmB/eng.Arial_Italic.exp0.tif
Rendered page 3370 to file /tmp/eng-2019-09-14.GmB/eng.Arial_Bold_Italic.exp0.tif
ERROR: Program text2image failed. Abort.
Rendered page 3367 to file /tmp/eng-2019-09-14.GmB/eng.Arial.exp0.tif
Rendered page 3356 to file /tmp/eng-2019-09-14.GmB/eng.Arial_Italic.exp0.tif
...

After this, training continues and then ends without copying anything 
out of the /tmp directory.  In my case, it generated 7 of the 8 box files 
(the one for Courier_New is missing), as a listing of 
/tmp/eng-2019-09-14.GmB shows:

dmaung@Rhinegeist1:~/Tesseract-git/tesseract$ ls -1 /tmp/eng-2019-09-14.GmB/
eng.Arial_Bold.exp0.box
eng.Arial_Bold.exp0.tif
eng.Arial_Bold_Italic.exp0.box
eng.Arial_Bold_Italic.exp0.tif
eng.Arial.exp0.box
eng.Arial.exp0.tif
eng.Arial_Italic.exp0.box
eng.Arial_Italic.exp0.tif
eng.Courier_New_Bold.exp0.box
eng.Courier_New_Bold.exp0.tif
eng.Courier_New_Bold_Italic.exp0.box
eng.Courier_New_Bold_Italic.exp0.tif
eng.Courier_New.exp0.tif
eng.Courier_New_Italic.exp0.box
eng.Courier_New_Italic.exp0.tif
tesstrain.log
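
To confirm which font is affected without reading the listing by eye, a small shell sketch like the one below compares the .tif and .box files in the temporary directory. The function name missing_boxes is made up for illustration and is not part of the Tesseract tooling; the sketch also assumes that every rendered font is supposed to end up with one .box file next to its .tif.

```shell
# Print the base name of every rendered .tif that has no matching .box file.
# Usage: missing_boxes DIR
# (missing_boxes is a hypothetical helper, not a Tesseract command.)
missing_boxes() {
  dir="$1"
  for tif in "$dir"/*.exp*.tif; do
    [ -e "$tif" ] || continue          # the glob matched nothing
    box="${tif%.tif}.box"              # same path, .box instead of .tif
    [ -e "$box" ] || basename "$tif" .tif
  done
}

# Example call, using the directory name from the log above:
# missing_boxes /tmp/eng-2019-09-14.GmB
```

For the listing above this should print only eng.Courier_New.exp0, which suggests that the failing text2image run is the one for Courier New.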

Can anyone suggest how to debug what is causing text2image to fail or how 
to get around it?
David



