Thanks Shree, but if tesseract is open source, why can't the developers
answer questions? If I train my model on random text, how can I arrive at a
reliable accuracy figure? The model's accuracy would be random as well.
I want to know the reason for the conditions imposed on the training text, how much
For tesseract 3.05,
random text will work, but it is suggested to use character combinations
similar to those in the English training text.
It is unlikely you will get answers to your questions from the developers.
You can search past issues and questions in the forum and on GitHub.
Training for 3.05 does not take long, so run a few experiments.
Hi Shree, thanks for replying.
For tesseract *3.05.00*:
I had already checked that link. There they mention:
*"Make sure there are a minimum number of samples of each character. 10 is
good, but 5 is OK for rare characters.*
*There should be more samples of the more frequent characters - at least
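
The per-character minimums quoted above can be checked mechanically before training. Below is a minimal sketch, assuming the training text is already loaded as a Python string; the function name undersampled_chars and the choice to ignore whitespace are illustrative assumptions, not part of the tesseract tools. The default of 5 follows the quoted guideline for rare characters; pass 10 for the stricter check.

```python
from collections import Counter


def undersampled_chars(text, min_count=5):
    """Return {char: count} for characters seen fewer than min_count times.

    Hypothetical helper, not part of tesseract. Whitespace is skipped
    (an assumption) because it is not a character class to be trained.
    """
    counts = Counter(ch for ch in text if not ch.isspace())
    return {ch: n for ch, n in counts.items() if n < min_count}


if __name__ == "__main__":
    # Tiny stand-in for the contents of a real training text file.
    sample = "aaaaa bbbbb ccc"
    for ch, n in sorted(undersampled_chars(sample).items()):
        print(f"{ch!r} appears only {n} time(s)")
```

Any character this reports would need more samples added to the training text before running the training steps from the wiki page.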
see
https://github.com/tesseract-ocr/tesseract/wiki/Training-Tesseract-3.03%E2%80%933.05
ShreeDevi
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
On Sat, Apr 7, 2018 at 4:02 PM, Romil Mehla wrote:
> Thanks for
Thanks for your reply. I have read about tesseract 4.0, and Ray mentioned
how he used so many files to train tesseract 4.0, but I don't want to use
tesseract 4.0. I wanted to know about tesseract 3.05.00. From my
understanding, suppose for the eng language, the eng.training_text file is built
from
Just a word list is not enough for training text.
For tesseract 4.0.0 it needs to be representative of the text to be
recognized.
On Sat 7 Apr, 2018, 2:50 PM Romil Mehla, wrote:
> Is there any program to generate it ? i see ambiguous_words.cpp
> generating dictionary words
Is there any program to generate it? I see ambiguous_words.cpp generating
dictionary words and ambiguous words. Where is it used? Or can it be used
to build the unicharambigs file to generate rules?