Hi, I am confused about training image files.
I am trying to create more accurate Fraktur German language data files for
Tesseract, but I first need help understanding what the software requires
(the notes do not seem to address this properly, at least not to my
satisfaction).

Does a training image file have to have a complementary text file in the
font that most closely matches the script in the image file?
This is where I get stumped. I don't understand what the software requires,
and I notice that some people mention getting better results when they do
certain things, such as beginning a line with a capital letter. That adds
to my frustration at not knowing what the software requires or expects.

Any help in weeding out misconceptions I may have would be much
appreciated.

I would then like to make a better help file for training Tesseract.


Cheers

Richard


On Tuesday, May 1, 2012 9:22:06 PM UTC+10, Rufus wrote:
>
> I've trained a new language consisting of digits and a few special 
> characters in the font swiss921bt. I've been wondering whether my training 
> text is good enough, or should I use the standard training text (
> http://michaeljaylissner.com/files/standard-training-text.txt) ?
>
> My training text in swiss921bt font is in the attachment.
>
