Please try asm.traineddata. It is for Assamese, which is written in the
Bengali script.
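
To check the linked unicharset files yourself, something like the sketch below may help. It assumes the first line of a unicharset holds the entry count and that every later line starts with the unichar followed by space-separated properties. The function name and the sample data are made up for illustration and are not real asm/ben unicharset content.

```python
import unicodedata


def unichars_containing(unicharset_text, target):
    """Return the unicharset entries whose unichar contains `target`."""
    hits = []
    # Assumption: line 1 is the entry count; each later line begins with
    # the unichar, followed by space-separated properties.
    for line in unicharset_text.splitlines()[1:]:
        fields = line.split(" ")
        if fields and target in fields[0]:
            hits.append(fields[0])
    return hits


WA = "\u09f1"  # the letter added to the training text quoted below
print(unicodedata.name(WA))

# Tiny made-up sample (hypothetical entries, heavily simplified)
sample = "3\nNULL 0 Common 0\n\u0995 1 Bengali 1\n\u09f1\u09be 1 Bengali 2\n"
print(unichars_containing(sample, WA))
```

For a real file, read it with `open(path, encoding="utf-8").read()` and pass the text in. An empty result would suggest that the model built from that unicharset has no output class for the letter.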

On Fri, 31 May 2019, 16:55 Jennil Thiyam, <[email protected]> wrote:

> How come this character is in here? It's not used in Bengali, and it is
> not recognized by the ben.traineddata model. The character is in the
> unicharset that I got after running tesstrain.sh.
> The character is pronounced "waa". I attached two pictures: the first,
> wa.png, is a screenshot of the unicharset from the link you gave, and
> the second, wa_11.png, is the unicharset that I got after running
> tesstrain.sh (after adding this new character to ben.training_text).
> The character is on line 35 (in wa.png) and line 79 (in wa_11.png).
>
> Please help me out
>
> On Fri, May 31, 2019 at 3:47 PM Shree Devi Kumar <[email protected]>
> wrote:
>
>> or in
>>
>> https://github.com/tesseract-ocr/langdata_lstm/blob/master/asm/asm.unicharset
>>
>>
>> On Fri, May 31, 2019 at 3:45 PM Shree Devi Kumar <[email protected]>
>> wrote:
>>
>>> Is your new character included in
>>>
>>>
>>> https://github.com/tesseract-ocr/langdata_lstm/blob/master/ben/ben.unicharset
>>>
>>>
>>> On Fri, May 31, 2019 at 3:22 PM Jennil Thiyam <[email protected]>
>>> wrote:
>>>
>>>> I followed the fine-tuning procedure described in the Tesseract 4
>>>> training documentation (the one that adds the plus-minus sign to
>>>> eng.traineddata) to train ben.traineddata, adding one character that is
>>>> not in the Bengali alphabet more than 30 times to ben.training_text.
>>>> After creating the starter training data and running lstmtraining, the
>>>> model failed to recognize the new character. In the plus-minus example,
>>>> the sign is reported to have been recognized.
>>>> Does anyone have any suggestions?
>>>> A sample of the training text is given below:
>>>> .....
>>>> লক্ষ্যমাত্রা নির্দেশ ধ্বংস কে
>>>> দেখতে শুধু লাইব্রেরী আশা স্বাগত থাং
>>>> শতাব্দী অন্ধ্রপ্রদেশ (িপিপিপ)
>>>> সন্ধান করে অভ্যুত্থানের প্রসিদ্ধ
>>>> ময়ূরের শুরু ইন্টারেস্টিং দলের ও
>>>> পুিলেশর খ্রিস্টপূর্ব আশা প্রদর্শিত
>>>> কহীং উইকিপিডিয়াতে এ্যান্ড 19 ইঞ্চি
>>>> আছে ০ লিখতে অর্পানেট পরে এেক
>>>> ভূঁইয়ার আছে করুন, গ্লোব সেপ্টেম্বর
>>>> প্রশ্ন,
>>>> *ৱু ৱূ ৱে ৱৈ ৱো ৱৌ ৱং*
>>>> *ৱ ৱা ৱি ৱী ৱু ৱূ ৱে ৱৈ ৱো ৱৌ ৱং*
>>>> *ৱ ৱা ৱি ৱী ৱু ৱূ ৱে ৱৈ ৱো ৱৌ ৱং*
>>>> *ৱ ৱা ৱি ৱী ৱু ৱূ ৱে ৱৈ ৱো ৱৌ ৱং*
>>>> *ৱ ৱা ৱি ৱী ৱু ৱূ ৱে ৱৈ ৱো ৱৌ ৱং*
>>>> ....
>>>> The underlined text (marked with asterisks here) shows the possible
>>>> forms that this new character can take. Is there any rule for adding
>>>> this new character to the training text?
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/CAJxgooeysg5AfzppAXjKpREOvH2Jnz14wksMUjhsjotMJxE3bA%40mail.gmail.com
>>>> <https://groups.google.com/d/msgid/tesseract-ocr/CAJxgooeysg5AfzppAXjKpREOvH2Jnz14wksMUjhsjotMJxE3bA%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>> .
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
>>> --
>>>
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>
>>
>>
>>

