Dear friends,
I want to train the Tesseract LSTM engine on some scanned documents.
Since the scans are of poor quality, I generated the corresponding box
files with jTessBoxEditor. Neither the boxes nor the characters were
recognized well, so I had to correct them manually.
After a few days of work, I now have three files:
vie.timesnewromani.exp99.tif
vie.timesnewromani.exp99.box
vie.timesnewromani.exp99.tr
Now I need to convert them for LSTM training. To do that, I modified
tesstrain.sh as follows:
mkdir -p ${TRAINING_DIR}
tlog "\n=== Starting training for language '${LANG_CODE}'"
cp ~/tesstutorial/langdata/${LANG_CODE}/*.box ${TRAINING_DIR}
cp ~/tesstutorial/langdata/${LANG_CODE}/*.tif ${TRAINING_DIR}
source "$(dirname $0)/language-specific.sh"
set_lang_specific_parameters ${LANG_CODE}
I copied all three files to langdata/vie/,
but it seems that they were not copied to the temporary training folder.
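
To make the copy step easier to check on its own, here is a minimal standalone sketch of what the two cp lines are meant to do, followed by a check that every image has a matching box file. The directories are placeholders created with mktemp and the three files are empty stand-ins, so this is only an illustration under those assumptions and not the real paths or logic used by tesstrain.sh.

```shell
#!/bin/sh
set -eu

# Placeholder directories for illustration only (assumption);
# tesstrain.sh chooses its own TRAINING_DIR.
SRC_DIR="$(mktemp -d)"
TRAINING_DIR="$(mktemp -d)"
BASE="vie.timesnewromani.exp99"

# Empty stand-ins for the real files so the sketch runs anywhere.
touch "${SRC_DIR}/${BASE}.tif" "${SRC_DIR}/${BASE}.box" "${SRC_DIR}/${BASE}.tr"

# Same copy step as in the modified script, with quoted variables.
cp "${SRC_DIR}"/*.box "${TRAINING_DIR}"
cp "${SRC_DIR}"/*.tif "${TRAINING_DIR}"

# Count images that lack a matching box file after the copy.
missing=0
for tif in "${TRAINING_DIR}"/*.tif; do
  box="${tif%.tif}.box"
  if [ ! -f "${box}" ]; then
    echo "missing box for ${tif}" >&2
    missing=$((missing + 1))
  fi
done

# Show what actually landed in the training directory.
ls -l "${TRAINING_DIR}"
echo "images without a box file: ${missing}"
```

As far as I can tell, the two cp patterns only match *.box and *.tif, so the .tr file would stay behind in the source folder either way.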
Any advice would be appreciated.
Many thanks,
TuPM