Thanks a lot, Shree. I tried Tesseract 4.0, and the training works well 
until it reaches the LSTM training step, where it gets stuck. I am 
totally new to training, so I hope you don't mind if I ask silly 
questions. Do you know why it gets stuck? Also, would you call this training 
fine-tuning? I just want to improve the accuracy of the existing 
eng.langdata. 

<https://lh3.googleusercontent.com/-dWRkYql4AKA/W2k9PoNsndI/AAAAAAAAAOM/zWVkkPvUCT44moZPpvt6xgYFnQ0StwxUQCLcBGAs/s1600/Capture.PNG>



On Monday, August 6, 2018 at 10:26:12 PM UTC-7, shree wrote:
>
> The OCR-D scripts are geared towards Tesseract 4.0.x. You are trying to use 
> them with Tesseract 3.05.
>
> On Tue 7 Aug, 2018, 10:50 AM May, <[email protected]> wrote:
>
>> Hey Shree
>>
>> I also tried the original script from GitHub, but faced the same 
>> issue: the process gets stuck at unicharset_output.
>>
>>
>> <https://lh3.googleusercontent.com/-rFB69WQGLIg/W2krzHUjFfI/AAAAAAAAAOA/SZ4CEzUIEGMIhQUWXHfHMS9H4Yxk-ADGwCLcBGAs/s1600/Capture.PNG>
>>
>>
>> These are the versions:
>> tesseract 3.05.02
>>  leptonica-1.75.3
>>   libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : 
>> libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.2.0
>>
>>
>> On Thursday, August 2, 2018 at 8:52:38 PM UTC-7, shree wrote:
>>>
>>> Please use latest scripts from https://github.com/OCR-D/ocrd-train
>>>
>>> On Fri, Aug 3, 2018 at 4:41 AM May <[email protected]> wrote:
>>>
>>>>
>>>> <https://lh3.googleusercontent.com/-LnwUni4-lLw/W2OPUqJpn_I/AAAAAAAAANs/Xd_-CVCdiMk0cjMmxBpVgfOSU1JeAacAgCLcBGAs/s1600/Capture.PNG>
>>>>
>>>>
>>>>
>>>> <https://lh3.googleusercontent.com/-j3_B1CmVv9w/W2OPbuUYH3I/AAAAAAAAANw/xmBXrNakKuMHm2L9cj-K3sCXCjFxuF80QCLcBGAs/s1600/Capture.PNG>
>>>>
>>>>
>>>>
>>>> Here are the attached photos.
>>>>
>>>>
>>>> On Thursday, August 2, 2018 at 4:08:11 PM UTC-7, May wrote:
>>>>>
>>>>> Hey all,
>>>>>
>>>>> I am following Shree's script for OCR-D from the Google Groups thread on 
>>>>> ocrd-training (
>>>>> https://groups.google.com/forum/#!topic/tesseract-ocr/be4-rjvY2tQ). I 
>>>>> managed to get past the combine tessdata stage but got stuck at the 
>>>>> unicharset stage:
>>>>>  
>>>>>
>>>>>
>>>>> I have edited the script to point to my path:
>>>>>
>>>>> I do find a unicharset file named "unicharset", but none named 
>>>>> "my.unicharset". Removing "my." from the script did not solve the 
>>>>> problem either. Do you know what's causing the issue?
>>>>>
>>>>> Best
>>>>> May
>>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/tesseract-ocr/48347dd8-7b7e-4d0d-9cb5-b21e3ec23f31%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/tesseract-ocr/48347dd8-7b7e-4d0d-9cb5-b21e3ec23f31%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
>>> -- 
>>>
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>
>
