You can extract the component files from a traineddata file with
combine_tessdata -u.

Then look at the extracted ben.config file to see whether it contains any
special layout settings.
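
In case it helps, here is a rough Python sketch of that step. It assumes
combine_tessdata is on your PATH and that ben.traineddata is in the current
directory. The helper names (unpack_command, read_config) and the "unpacked"
directory are made up for illustration. The second argument to
combine_tessdata -u is an output path prefix, so the components end up as
files such as unpacked/ben.config. Not every traineddata file carries a
config component, so the reader returns an empty string when the file is
missing.

```python
import subprocess
from pathlib import Path


def unpack_command(traineddata: str, out_dir: str) -> list:
    """Build the argv for `combine_tessdata -u <traineddata> <output prefix>`."""
    lang = Path(traineddata).stem             # "ben.traineddata" -> "ben"
    prefix = str(Path(out_dir) / (lang + "."))  # components are written as <prefix><name>
    return ["combine_tessdata", "-u", traineddata, prefix]


def read_config(out_dir: str, lang: str) -> str:
    """Return the text of <lang>.config, or "" if no such file was extracted."""
    path = Path(out_dir) / (lang + ".config")
    return path.read_text(encoding="utf-8") if path.exists() else ""


# Example use (needs Tesseract's training tools installed):
#   Path("unpacked").mkdir(exist_ok=True)
#   subprocess.run(unpack_command("ben.traineddata", "unpacked"), check=True)
#   print(read_config("unpacked", "ben"))
```

Running the same two calls on Bengali.traineddata and comparing the output
should show whether the two models ship different config settings.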

The LSTM training was done by Ray Smith at Google. My info is based on
whatever has been open sourced by them at Mithun.

On Tue, 4 Jun 2019, 23:18 Jennil Thiyam, <[email protected]> wrote:

> Shree, what segmentation algorithm is used in this Bengali OCR? I think
> the segmentation algorithm for English characters and for Bengali
> characters has to be different. Is BB Chaudhury's segmentation algorithm
> the one used?
>
> On Tue, Jun 4, 2019 at 5:41 PM Shree Devi Kumar <[email protected]>
> wrote:
>
>> ben is trained on Bengali text only; the Bengali script model is trained
>> on ben, asm and English.
>>
>>
>> https://github.com/tesseract-ocr/langdata_lstm/blob/master/script/Bengali.langs.txt
>>
>>
>> On Tue, 4 Jun 2019, 17:11 Jennil Thiyam, <[email protected]> wrote:
>>
>>> What is the difference between ben.traineddata and Bengali.traineddata?
>>> Some characters are not recognised by ben.traineddata but are
>>> recognised by Bengali.traineddata.
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/CAJxgooeAunWjMUSf%3D5aqj3-42uau6Xjo1V%3DvMfQFgD-9%3D_U71g%40mail.gmail.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/CAJxgooeAunWjMUSf%3D5aqj3-42uau6Xjo1V%3DvMfQFgD-9%3D_U71g%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>

