Robert/ Zdenko

Yes, in the log I see version "3.04.01".

To install v4, I used the repository at
https://github.com/tesseract-ocr/tesseract.
I thought it contained Tesseract v4, as the README file says "Source code for
the new LSTM based 4.0 version is available from the master branch on GitHub."
So I did a git pull.

Steps:


   1. git pull https://github.com/tesseract-ocr/tesseract
   2. cd tesseract
   3. sudo apt-get install libicu-dev
   4. sudo apt-get install libpango1.0-dev
   5. sudo apt-get install libcairo2-dev
   6. sh autogen.sh
   7. sh ./configure
   8. make
   9. make training
   10. sudo make training-install
   11. Training Command gives the error as mentioned.

Also, when I do tesseract -v, I see 3.04.01 too.
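
A quick way to see which binary is being picked up is sketched below. It is an illustration, not something posted in the thread: the helper names `tess_major` and `list_tesseracts` are made up for this example, and it assumes a POSIX shell. The log above shows tesstrain.sh calling /usr/bin/tesseract, and the steps listed do not include a `sudo make install`, so the freshly built 4.0 binary may simply not be the one found first on PATH (an autotools build normally installs under /usr/local unless configured otherwise).

```shell
#!/bin/sh
# Print the major version number found in a Tesseract version banner,
# e.g. "Tesseract Open Source OCR Engine v3.04.01 with Leptonica".
tess_major() {
    printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\)\..*/\1/p'
}

# List every tesseract executable on PATH, in lookup order.
# The first one printed is the one a plain "tesseract" call will run.
list_tesseracts() {
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -x "$dir/tesseract" ]; then
            printf '%s\n' "$dir/tesseract"
        fi
    done
    IFS=$old_ifs
}

list_tesseracts

if command -v tesseract >/dev/null 2>&1; then
    # 2>&1 because the version text may be written to stderr.
    banner=$(tesseract -v 2>&1 | head -n 1)
    major=$(tess_major "$banner")
    if [ "${major:-0}" -ge 4 ]; then
        echo "OK for LSTM training: $banner"
    else
        echo "Too old for LSTM training: $banner"
    fi
else
    echo "no tesseract found on PATH"
fi
```

If more than one path is listed, the older packaged copy and the new build are both installed, and the one printed first wins.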

So, is there any other way of installing v4.0? Please let me know what I am
doing wrong.

Regards,
Vinod

On Tue, Oct 16, 2018 at 12:15 PM Zdenko Podobny <[email protected]> wrote:

> Robert is pointing you in the right direction. Did you read the log you
> posted here?
> "Tesseract Open Source OCR Engine v3.04.01 with Leptonica"
> You are mixing Tesseract versions, so the problems are no surprise.
>
> Zdenko
>
>
> On Tue, 16 Oct 2018 at 8:26, Vinod Gattani <[email protected]>
> wrote:
>
>> Hi,
>> Typo: "Why is the version not 4.0?"
>> I installed using "git pull https://github.com/tesseract-ocr/tesseract".
>> Then I followed the instructions on the training page.
>>
>> Regards
>>
>> On Tue, Oct 16, 2018 at 11:53 AM Robert Kamiński <
>> [email protected]> wrote:
>>
>>> Hi,
>>> "Why the version is 4.0." What do you mean by that? The log states
>>> that it's v3.04: "Tesseract Open Source OCR Engine v3.04.01 with
>>> Leptonica".
>>> The problem might be that version 4 uses lstm files, whereas you have
>>> version 3.04, which uses box files instead. Try checking the version of
>>> the installed Tesseract. Also note that I'm not an expert here ^.^
>>>
>>>
>>> On Tue, 16 Oct 2018 at 08:04, Vinod Gattani <[email protected]>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I have started a project to do OCR on Identity Cards. I am learning to
>>>> train tesseract models with custom fonts.
>>>>
>>>> Please help me on this.
>>>>
>>>> Steps till now:
>>>>
>>>> 1. git pull https://github.com/tesseract-ocr/tesseract
>>>> 2. Then I followed instructions on training package till command "sudo
>>>> make training-install".
>>>> 3. Downloaded eng.traineddata from
>>>> https://github.com/tesseract-ocr/tessdata_best into the tessdata folder
>>>> 4. Command " src/training/tesstrain.sh --fonts_dir /usr/share/fonts
>>>> --fontlist "Arial Bold" --lang eng --linedata_only
>>>>  --noextract_font_properties --langdata_dir ../langdata   --tessdata_dir
>>>> ./tessdata --output_dir ~/tesstutorial/engtrain"
>>>>
>>>> It is giving error:
>>>> === Phase E: Generating lstmf files ===
>>>> Using TESSDATA_PREFIX=./tessdata
>>>> [Tue Oct 16 05:41:31 UTC 2018] /usr/bin/tesseract
>>>> /tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0.tif
>>>> /tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0 --psm 6 lstm.train
>>>> Tesseract Open Source OCR Engine v3.04.01 with Leptonica
>>>> fseek(data_file_, static_cast<size_t>(offset_table_[tessdata_type]),
>>>> SEEK_SET) == 0:Error:Assert failed:in file ../ccutil/tessdatamanager.h,
>>>> line 173
>>>> ERROR: /tmp/tmp.4EGdp9wW57/eng.Arial_Bold.exp0.lstmf does not exist or
>>>> is not readable
>>>>
>>>> Why the version is 4.0.
>>>>
>>>> Also, how do we download a custom font for my identity cards?
>>>>
>>>> Regards,
>>>>
>>>> On Monday, 10 September 2018 15:05:15 UTC+5:30, [email protected]
>>>> wrote:
>>>>>
>>>>>   Thank you Shreeshrii for the reply!
>>>>>
>>>>> Manual customization of these files might be kind of annoying. If I
>>>>> need to experiment with the dawg files and achieve something, I'll
>>>>> surely let you know if there is any difference. Again, thank you for
>>>>> your help and time :)
>>>>>
>>>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/279bc21a-199a-43be-b5d6-07bfdd2a833f%40googlegroups.com
>>>> <https://groups.google.com/d/msgid/tesseract-ocr/279bc21a-199a-43be-b5d6-07bfdd2a833f%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
