There's no way to augment .traineddata files directly. You'll need to go through the entire training procedure from scratch, using both the old box/tiff files and your new ones.
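As a rough sketch of what "from scratch with both sets" means in practice, the Python below gathers the tif/box pairs from an old and a new directory and builds the command lines for one combined training run. It only builds the commands and does not run them. The directory layout and the function names `collect_pairs` and `training_commands` are hypothetical. The tool names and arguments follow my reading of the Tesseract 3.0x training instructions and should be checked against the version you have installed.

```python
from pathlib import Path


def collect_pairs(*dirs):
    """Return (tif, box) pairs from all given directories.

    Only images that have a matching .box file next to them are kept.
    """
    pairs = []
    for d in dirs:
        for tif in sorted(Path(d).glob("*.tif")):
            box = tif.with_suffix(".box")
            if box.exists():
                pairs.append((tif, box))
    return pairs


def training_commands(lang, pairs):
    """Build (but do not run) the commands for one full retraining pass.

    Assumption: Tesseract 3.0x-style training tools. Later steps such as
    renaming the outputs with the language prefix and building the
    dictionary files are left out of this sketch.
    """
    cmds = []
    # 1. Produce a .tr feature file for every tif/box pair, old and new alike.
    for tif, _ in pairs:
        base = str(tif.with_suffix(""))
        cmds.append(["tesseract", str(tif), base, "nobatch", "box.train"])

    boxes = [str(box) for _, box in pairs]
    trs = [str(tif.with_suffix(".tr")) for tif, _ in pairs]

    # 2. Character set from ALL box files, then clustering over ALL .tr files.
    cmds.append(["unicharset_extractor", *boxes])
    cmds.append(["mftraining", "-U", "unicharset", "-O", f"{lang}.unicharset", *trs])
    cmds.append(["cntraining", *trs])

    # 3. Pack everything into <lang>.traineddata (after the omitted renaming step).
    cmds.append(["combine_tessdata", f"{lang}."])
    return cmds
```

For example, `training_commands("deu-frak", collect_pairs("peter-files", "my-files"))` (both directory names made up) returns one `tesseract ... box.train` command per pair followed by the shared steps. The point is that every step after the first sees the old and the new files together, which is why the existing .traineddata cannot simply be extended.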
Warm regards,
Dmitri Silaev

On Thu, Apr 14, 2011 at 1:54 AM, stinguin <[email protected]> wrote:
> Hi,
>
> I didn't expect such feedback - many thanks for all your answers! I
> will try to put your tips (adding some new tif/box and editing the
> wordlist) into practice :-) Is it possible to add my new box (or
> better, .tr files) to the existing deu-frak.traineddata directly?
> Or do I have to create a new .traineddata by using all box files from
> Peter and the new ones from me?
>
> Best wishes, Holger
>
>
> On 13 Apr., 19:03, Peter Alberti <[email protected]> wrote:
>> Hi
>>
>> deu-frak.traineddata is a file I created, so I'm happy to hear that
>> someone might want to improve it.
>> Actually, I've continued to work a little bit on it myself, and you
>> can get the files I'm using from
>>
>> https://github.com/paalberti/tesseract-dan-fraktur
>>
>> The files you find there ought to be a little bit better than the
>> deu-frak.traineddata available under downloads, but I haven't done any
>> proper testing yet, so your mileage may vary. Also, the tif/box in the
>> dan-frak/ subdirectory might work slightly better than those under
>> deu-frak/ (Danish is the language I'm most interested in), so if you
>> want to retrain yourself, you might want to work with those.
>>
>> The two most obvious improvements I can think of are to add some
>> tif/box that look more like the texts you're OCR-ing, if possible, and
>> maybe to build a better wordlist (if I remember correctly, the German
>> one was a bit of a quick hack).
>>
>> Best regards, Peter.
>>
>> On 12 Apr., 22:09, stinguin <[email protected]> wrote:
>>
>> > Hi list,
>>
>> > I'm new to tesseract and hope that one of you can help me. I want
>> > to OCR some German texts which are typeset in Fraktur. The results
>> > from using the existing language "deu-frak" are good, but not good
>> > enough. Is it possible to improve this language by training? If so,
>> > can someone explain that step by step?
>> > I only know how to create a new language. Do you think I can improve
>> > the results by creating my own one? I think the deu-frak language is
>> > more than just a few box files, isn't it?
>>
>> > Thanks in advance
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.

