Hi All,

I have started a project to do OCR on identity cards, and I am learning to 
train Tesseract models with custom fonts.

Please help me with the problem described below.

Steps so far:

1. git pull https://github.com/tesseract-ocr/tesseract
2. Followed the instructions for building the training tools up to the 
   command "sudo make training-install".
3. Downloaded eng.traineddata from 
   https://github.com/tesseract-ocr/tessdata_best into the tessdata folder.
4. Ran this command:

   src/training/tesstrain.sh --fonts_dir /usr/share/fonts \
     --fontlist "Arial Bold" --lang eng --linedata_only \
     --noextract_font_properties --langdata_dir ../langdata \
     --tessdata_dir ./tessdata --output_dir ~/tesstutorial/engtrain

It fails with this error:
=== Phase E: Generating lstmf files ===
Using TESSDATA_PREFIX=./tessdata
[Tue Oct 16 05:41:31 UTC 2018] /usr/bin/tesseract 
/tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0.tif 
/tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0 --psm 6 lstm.train
Tesseract Open Source OCR Engine v3.04.01 with Leptonica
fseek(data_file_, static_cast<size_t>(offset_table_[tessdata_type]), 
SEEK_SET) == 0:Error:Assert failed:in file ../ccutil/tessdatamanager.h, 
line 173
ERROR: /tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0.lstmf does not exist or is 
not readable

Why does the log show version 3.04.01 and not 4.0?
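
The log above shows /usr/bin/tesseract reporting v3.04.01, so one thing worth 
checking is which tesseract binary the shell resolves and what major version 
it reports. This is only a rough sketch: the helper name tess_major_version is 
made up for illustration, and it assumes "tesseract --version" prints the 
version number on its first line.

```shell
# Sketch: print the major version reported by a tesseract executable.
# Assumption: the first line of `tesseract --version` looks like "tesseract 4.0.0".
tess_major_version() {
    # $1: path or name of the executable (defaults to whatever is on PATH)
    bin="${1:-tesseract}"
    # Merge stderr into stdout in case the version banner is printed on stderr,
    # then keep only the first run of digits on the first line.
    "$bin" --version 2>&1 | head -n 1 | sed -E 's/^[^0-9]*([0-9]+).*/\1/'
}

if command -v tesseract >/dev/null 2>&1; then
    echo "binary: $(command -v tesseract)"
    echo "major:  $(tess_major_version)"
else
    echo "no tesseract found on PATH"
fi
```

If this prints a major version of 3, the tesstrain.sh run is probably picking 
up a system-wide package rather than the freshly built 4.0 binaries.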

Also, how do I obtain the custom font used on my identity cards?

Regards,

On Monday, 10 September 2018 15:05:15 UTC+5:30, [email protected] 
wrote:
>
>   Thank you Shreeshrii for reply!
>
> Manual customization of these files might be kind of annoying. If I need 
> to experiment with the dawg files and achieve something, I'll surely let 
> you know whether it makes any difference. Again, thank you for your help 
> and time :)
>
