Re: TRAINING ... Font name = UnknownFont.

Zdenko Podobný Sat, 24 Apr 2010 02:47:12 -0700

On 19.04.2010 09:05, MARTIN Pierre wrote:
> Hello Zdpo,
>
> As said in my mail on 13th of April, as an answer to Sriranga:
>
>   
>>> I am extremely thankful for the attachment. I could not understand the
>>> "OCRB font", which I don't have. Can I presume that any font can be used?
>>>       
>> Exactly. Basically, you'll have to create your custom language, which will 
>> still contain a certain number of fonts. Each font can be trained with 
>> multiple pictures. That's why the file names for the boxes are decomposed 
>> this way: xxx.FFFFF.ppp.box (xxx=language, FFFFF=font, ppp=page if you have 
>> multiple training pictures per font). This way the files are better organised.
>>     
> As you can see, the names of the input files used when training Tesseract 
> (especially the .tr files) determine the font names.
>
> This is visible in the source code too: if you search for 
> "CurrentFont" in the whole source code, you'll see what I mean.
>
> Pierre.
>
>   
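The naming scheme quoted above (xxx.FFFFF.ppp.box) can be illustrated with a small sketch. This is not Tesseract code; the function name, the class name and the example file names are hypothetical, and it assumes that none of the fields contains a dot and that the page field may be absent.

```python
from typing import NamedTuple, Optional


class TrainingName(NamedTuple):
    """Fields of a training file name of the form xxx.FFFFF.ppp.box."""
    language: str
    font: str
    page: Optional[str]  # None when there is only one picture for the font


def parse_training_name(filename: str) -> TrainingName:
    """Split a training file name into language, font and optional page.

    Hypothetical helper: assumes the fields themselves contain no dots.
    The final component (the extension, e.g. "box" or "tr") is ignored.
    """
    parts = filename.split(".")
    if len(parts) == 4:
        language, font, page, _ext = parts
        return TrainingName(language, font, page)
    if len(parts) == 3:
        language, font, _ext = parts
        return TrainingName(language, font, None)
    raise ValueError(f"unexpected training file name: {filename!r}")


if __name__ == "__main__":
    # Example names are made up for illustration only.
    print(parse_training_name("eng.ocrb.001.box"))
    print(parse_training_name("eng.ocrb.box"))
```

Under these assumptions the font field of the file name is what ends up identifying the font, which is why a file named without it would not carry a usable font name.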
When I ran tests on Linux, tesseract crashed. I tried to understand the
source code (and did some work with a debugger ;-) ) and I think there is
a bug (or at least the code does not handle all possible inputs
correctly). My experience (and a patch for my problems) can be found at
http://www.sk-spell.sk.cx/tesseract-ocr-en-language-training-300...


Zdenko

