[tesseract-ocr] The extra character is not recognized after fine tuning training

Jennil Thiyam Fri, 31 May 2019 02:53:31 -0700

I followed the procedure described in the Tesseract 4 training
documentation (the fine-tuning example that adds the plus-minus sign to
eng.traineddata) to train ben.traineddata. I added one character that is
not in the Bengali alphabet to ben.training_text, more than 30 times.
After creating the starter training data and running lstmtraining, the
model failed to recognize the new character, whereas in the plus-minus
example the sign is said to be recognized.
Does anyone have any suggestions?
A sample of the training_text is given below:
.....
লক্ষ্যমাত্রা নির্দেশ ধ্বংস কে
দেখতে শুধু লাইব্রেরী আশা স্বাগত থাং
শতাব্দী অন্ধ্রপ্রদেশ (িপিপিপ)
সন্ধান করে অভ্যুত্থানের প্রসিদ্ধ
ময়ূরের শুরু ইন্টারেস্টিং দলের ও
পুিলেশর খ্রিস্টপূর্ব আশা প্রদর্শিত
কহীং উইকিপিডিয়াতে এ্যান্ড 19 ইঞ্চি
আছে ০ লিখতে অর্পানেট পরে এেক
ভূঁইয়ার আছে করুন, গ্লোব সেপ্টেম্বর
প্রশ্ন,
*ৱু ৱূ ৱে ৱৈ ৱো ৱৌ ৱং*
*ৱ ৱা ৱি ৱী ৱু ৱূ ৱে ৱৈ ৱো ৱৌ ৱং*
*ৱ ৱা ৱি ৱী ৱু ৱূ ৱে ৱৈ ৱো ৱৌ ৱং*
*ৱ ৱা ৱি ৱী ৱু ৱূ ৱে ৱৈ ৱো ৱৌ ৱং*
*ৱ ৱা ৱি ৱী ৱু ৱূ ৱে ৱৈ ৱো ৱৌ ৱং*
....
The highlighted lines (marked with asterisks) show the possible forms that
this new character can take. Is there any rule for adding this new
character to the training text?
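
As a sanity check before running lstmtraining, the training text can be scanned to confirm how often the new character really occurs and on how many lines. The snippet below is only a minimal sketch: it assumes the new character is U+09F1 (ৱ), as in the sample above, and the helper name `count_char` is hypothetical, not part of Tesseract.

```python
# Minimal sketch: count occurrences of a newly added character in a
# training text. NEW_CHAR is assumed to be U+09F1, as in the sample above.
NEW_CHAR = "\u09f1"


def count_char(text, char=NEW_CHAR):
    """Return (total occurrences, number of lines containing char)."""
    lines = text.splitlines()
    total = sum(line.count(char) for line in lines)
    lines_with = sum(1 for line in lines if char in line)
    return total, lines_with


if __name__ == "__main__":
    # A few lines taken from the sample; in practice, read
    # ben.training_text with open(path, encoding="utf-8").
    sample = "প্রশ্ন,\nৱু ৱূ ৱে\nৱ ৱা ৱি ৱী\n"
    total, lines_with = count_char(sample)
    print(f"occurrences: {total}, lines containing it: {lines_with}")
```

This only verifies the text file itself. It does not show whether the character made it into the unicharset of the starter traineddata, which would need to be checked separately.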


