It fails with the latest code. See https://github.com/tesseract-ocr/tesseract/issues/2748
Try with an older commit.

On Tue, Nov 5, 2019, 11:32 Khangaroo <[email protected]> wrote:

> Hi. I'm trying to create a fine-tuned model for Tesseract, but the
> tesstrain.sh script always appears to fail on "Phase E: Generating lstmf
> files". I get a rather vague error message for each Tesseract instance:
>
> Failed to read pages from /tmp/eng-2019-11-04.YNl/eng.Century_Schoolbook_L_Bold_Italic.exp0.tif
> Error during processing.
>
> I ran strace on one of the failed commands from tesstrain.sh and one line
> in particular stuck out:
>
> openat(AT_FDCWD, "/tmp/eng-2019-11-04.YOY/eng.NimbusSanNovDSemBol.exp0.uzn", O_RDONLY) = -1 ENOENT (No such file or directory)
>
> The only code I could find that referenced any uzn files in the entire
> repository was some code dedicated to reading it, not writing it. Is there
> any way around this?
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/72d94549-73d1-49a9-b51c-15a8fd2346a8%40googlegroups.com
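For anyone repeating the strace check described in the quoted message, here is a minimal sketch of a filter that pulls the "No such file" opens out of a trace. The function name `missing_files` and the regular expression are my own assumptions for illustration, not part of Tesseract or tesstrain.sh, and the pattern only covers the plain `open`/`openat` line format shown above.

```python
import re

# Hypothetical helper, not part of Tesseract: matches strace lines such as
#   openat(AT_FDCWD, "/path/file", O_RDONLY) = -1 ENOENT (No such file or directory)
_FAILED_OPEN = re.compile(
    r'open(?:at)?\((?:[^,"]+,\s*)?"(?P<path>[^"]*)".*\)\s*=\s*-1\s+ENOENT'
)


def missing_files(strace_lines):
    """Return paths that open()/openat() could not find, in order of first appearance."""
    seen = []
    for line in strace_lines:
        match = _FAILED_OPEN.search(line)
        if match and match.group("path") not in seen:
            seen.append(match.group("path"))
    return seen


# The strace line quoted in the message above, plus one successful open for contrast.
sample = [
    'openat(AT_FDCWD, "/tmp/eng-2019-11-04.YOY/eng.NimbusSanNovDSemBol.exp0.uzn", '
    'O_RDONLY) = -1 ENOENT (No such file or directory)',
    'openat(AT_FDCWD, "/etc/hosts", O_RDONLY) = 3',
]
print(missing_files(sample))
```

To use it on a real run, the trace can be saved to a file with `strace -f -o trace.txt` followed by the failing command, and the lines of that file passed to `missing_files`. A missing file in this list is only a lead: a program may probe for optional files, so an ENOENT is not necessarily the cause of the failure.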

