Hi! Do you have an answer yet? I am currently looking for one too :D

On Friday, 16 April 2021 at 12:27:57 UTC+7, [email protected] wrote:

> Thank you, that was helpful. So is it the same training set that was used to 
> create the default traineddata files available in the repo?
>
> On Thursday, 15 April 2021 at 15:46:07 UTC+5:30 shree wrote:
>
>> Use the langdata_lstm repo for LSTM training. It has a larger training text.
>>
>> On Thu, Apr 15, 2021, 00:52 Venkatapathy S <[email protected]> wrote:
>>
>>> Hi,
>>> I want to retrain Tesseract from scratch for a particular language. (I 
>>> have read as many resources as possible, including the warnings, from the 
>>> Tutorial <https://tesseract-ocr.github.io/tessdoc/>, GitHub 
>>> <https://github.com/tesseract-ocr/tesseract/issues/654#issuecomment-274574951>, 
>>> and this forum.) To begin, and to get familiar with the process, I 
>>> started with English. While going through the 
>>> langdata files (https://github.com/tesseract-ocr/langdata) for English, I 
>>> found that the training text contains only 72 lines. Is the training 
>>> text in the langdata repository just a sample, or is it 
>>> exactly the same set used to train the default eng.traineddata model 
>>> provided by Tesseract? Can someone help me with this, please?
>>>
>>> Regards,
>>> Venkat
>>> https://sites.google.com/view/venkatapathy/home
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/tesseract-ocr/5f588dfc-5c8b-400a-96c5-65c547f27d46n%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/tesseract-ocr/5f588dfc-5c8b-400a-96c5-65c547f27d46n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>

