Robert is pointing you in the right direction. Did you read the log you
posted here?
"Tesseract Open Source OCR Engine v3.04.01 with Leptonica"
You are mixing Tesseract versions, so problems are no surprise.
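
A quick way to confirm the mix-up is to check which binary is first on PATH and what version it reports. The following is only a minimal sketch, assuming a POSIX shell and that the training script resolves "tesseract" through PATH (the log shows /usr/bin/tesseract being run). The helper name tesseract_major is hypothetical and not part of Tesseract.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): read a version banner on
# stdin and print the major version number, e.g. 3 or 4.
tesseract_major() {
  sed -n 's/^[[:space:]]*[Tt]esseract[^0-9]*\([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Report the first tesseract on PATH, which is the one a script calling
# plain "tesseract" would run.
if command -v tesseract >/dev/null 2>&1; then
  bin=$(command -v tesseract)
  # Some releases print the version to stderr, so merge both streams.
  major=$("$bin" --version 2>&1 | tesseract_major)
  echo "$bin reports major version ${major:-unknown}"
  if [ "${major:-0}" -lt 4 ]; then
    echo "This binary predates 4.0 and cannot do LSTM training." >&2
  fi
else
  echo "No tesseract found on PATH." >&2
fi
```

If this reports a 3.x binary under /usr/bin while your source build went somewhere else (a default "make install" usually targets /usr/local), the old package is shadowing the new build.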

Zdenko


On Tue, 16 Oct 2018 at 8:26, Vinod Gattani <[email protected]> wrote:

> Hi,
> Typo: I meant "Why is the version not 4.0?"
> I installed using "git pull https://github.com/tesseract-ocr/tesseract".
> Then I followed the instructions on the training page.
>
> Regards
>
> On Tue, Oct 16, 2018 at 11:53 AM Robert Kamiński <
> [email protected]> wrote:
>
>> Hi,
>> "Why the version is 4.0." What do you mean by that? The log states
>> that it is v3.04: "Tesseract Open Source OCR Engine v3.04.01 with
>> Leptonica".
>> The problem might be that version 4 uses lstm files, whereas you have
>> version 3.04, which uses box files instead. Try checking the version
>> of the installed Tesseract. Also note that I'm not an expert here ^.^
>>
>>
>> On Tue, 16 Oct 2018 at 08:04, Vinod Gattani <[email protected]> wrote:
>>
>>> Hi All,
>>>
>>> I have started a project to do OCR on identity cards. I am learning to
>>> train Tesseract models with custom fonts.
>>>
>>> Please help me with this.
>>>
>>> Steps so far:
>>>
>>> 1. git pull https://github.com/tesseract-ocr/tesseract
>>> 2. Then I followed the training instructions up to the command "sudo
>>> make training-install".
>>> 3. Downloaded eng.traineddata from
>>> https://github.com/tesseract-ocr/tessdata_best into the tessdata folder
>>> 4. Ran the command "src/training/tesstrain.sh --fonts_dir /usr/share/fonts
>>> --fontlist "Arial Bold" --lang eng --linedata_only
>>>  --noextract_font_properties --langdata_dir ../langdata   --tessdata_dir
>>> ./tessdata --output_dir ~/tesstutorial/engtrain"
>>>
>>> It gives this error:
>>> === Phase E: Generating lstmf files ===
>>> Using TESSDATA_PREFIX=./tessdata
>>> [Tue Oct 16 05:41:31 UTC 2018] /usr/bin/tesseract
>>> /tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0.tif
>>> /tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0 --psm 6 lstm.train
>>> Tesseract Open Source OCR Engine v3.04.01 with Leptonica
>>> fseek(data_file_, static_cast<size_t>(offset_table_[tessdata_type]),
>>> SEEK_SET) == 0:Error:Assert failed:in file ../ccutil/tessdatamanager.h,
>>> line 173
>>> ERROR: /tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0.lstmf does not exist or
>>> is not readable
>>>
>>> Why the version is 4.0.
>>>
>>> Also, how do I download a custom font for my identity cards?
>>>
>>> Regards,
>>>
>>> On Monday, 10 September 2018 15:05:15 UTC+5:30, [email protected]
>>> wrote:
>>>>
>>>>   Thank you Shreeshrii for the reply!
>>>>
>>>> Manual customization of these files might be kind of annoying. If I
>>>> need to experiment with the dawg files and I achieve something, I'll
>>>> surely let you know if there is any difference. Again, thank you for your
>>>> help and time :)
>>>>
>>>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/279bc21a-199a-43be-b5d6-07bfdd2a833f%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/279bc21a-199a-43be-b5d6-07bfdd2a833f%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>

