There's no way to augment .traineddata files directly.
You'll need to go through the entire training procedure from scratch,
using both the old box/tiff files and your new ones.
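
As a rough illustration of what "from scratch" involves, here is a minimal
Python sketch that gathers the tif/box pairs from both sets and lists the
Tesseract 3.0x training commands in order. It only builds the command lines
and does not run anything. The helper names are made up for this example, and
the exact flags differ between Tesseract versions, so check everything against
the training documentation for your release.

```python
from pathlib import Path


def paired_training_pages(*dirs):
    """Collect base paths (extension stripped) that have both a .tif and a .box.

    Only the ".tif" spelling is matched; adjust the pattern if your images
    end in ".tiff".
    """
    bases = []
    for d in dirs:
        for tif in sorted(Path(d).glob("*.tif")):
            if tif.with_suffix(".box").exists():
                bases.append(tif.with_suffix(""))
    return bases


def training_commands(lang, bases):
    """Build (but do not run) the command lines for a full retraining pass.

    Assumption: Tesseract 3.0x-style training. Some intermediate steps are
    not modelled here and are only mentioned in the comments below.
    """
    cmds = []
    # One box.train run per page; each run writes <base>.tr
    for base in bases:
        cmds.append(["tesseract", f"{base}.tif", str(base), "nobatch", "box.train"])

    boxes = [f"{b}.box" for b in bases]
    trs = [f"{b}.tr" for b in bases]

    # Character set from ALL box files, old and new together
    cmds.append(["unicharset_extractor", *boxes])
    # Shape and character-normalisation clustering over ALL .tr files
    cmds.append(["mftraining", "-U", "unicharset", "-O", f"{lang}.unicharset", *trs])
    cmds.append(["cntraining", *trs])
    # Not modelled: renaming the outputs (inttemp, normproto, pffmtable, ...)
    # with the "<lang>." prefix and building the dictionary DAWGs.
    cmds.append(["combine_tessdata", f"{lang}."])
    return cmds
```

For example, calling paired_training_pages("deu-frak", "my-pages") with two
hypothetical directories and passing the result to
training_commands("deu-frak", ...) gives one list of commands covering both
sets of pages, which is the point: the old and new files go through the same
single run.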

Warm regards,
Dmitri Silaev

On Thu, Apr 14, 2011 at 1:54 AM, stinguin <[email protected]> wrote:
> Hi,
>
> I didn't expect so much feedback - many thanks for all your answers! I
> will try to put your tips (adding some new tif/box files and editing
> the wordlist) into practice :-) Is it possible to add my new box files
> (or, better, .tr files) to the existing deu-frak.traineddata directly?
> Or do I have to create a new .traineddata by using all the box files
> from Peter together with my new ones?
>
> Best wishes, Holger
>
>
> On 13 Apr., 19:03, Peter Alberti <[email protected]> wrote:
>> Hi
>>
>> deu-frak.traineddata is a file I created, so I'm happy to hear that
>> someone might want to improve it.
>> Actually, I've continued to work a little bit on it myself, and you
>> can get the files I'm using from
>>
>> https://github.com/paalberti/tesseract-dan-fraktur
>>
>> The files you find there ought to be a little bit better than the
>> deu-frak.traineddata available under downloads, but I haven't done
>> any proper testing yet, so your mileage may vary. Also, the tif/box
>> files in the dan-frak/ subdirectory might work slightly better than
>> those under deu-frak/ (Danish is the language I'm most interested
>> in), so if you want to retrain yourself, you might want to work with
>> those.
>>
>> The two most obvious improvements I can think of are to add some
>> tif/box files that look more like the texts you're OCR-ing, if
>> possible, and maybe to build a better wordlist (if I remember
>> correctly, the German one was a bit of a quick hack).
>>
>> Best regards, Peter.
>>
>> On 12 Apr., 22:09, stinguin <[email protected]> wrote:
>>
>> > Hi list,
>>
>> > I'm new to tesseract and hope that one of you can help me. I want
>> > to OCR some German texts which are typeset in Fraktur. The results
>> > from the existing language "deu-frak" are good, but not good
>> > enough. Is it possible to improve this language by training? If so,
>> > can someone explain how, step by step?
>> > I only know how to create a new language. Do you think I can improve
>> > the results by creating my own one? I think the deu-frak language is
>> > more than just a few box files, isn't it?
>>
>> > Thanks in advance
>
> --
> You received this message because you are subscribed to the Google Groups 
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/tesseract-ocr?hl=en.
>
>