Hi Max,

Look at:

Extracts all component files from .traineddata:
combine_tessdata -u tessdata/ell.traineddata /home/$USER/temp/ell

combine_tessdata language_data_path_prefix (e.g. tessdata/eng.)
Combines all individual tessdata components (unicharset, DAWGs, classifier templates, ambiguities, language configs). The result will be a combined tessdata file lang_code.traineddata.

Hope it helps,
Oleg

On Mon, May 9, 2011 at 3:01 AM, Max Cantor <[email protected]> wrote:

> I was looking at that, but can't find the other component files in the
> source tree. Is there somewhere to get the component files for the
> eng.traineddata?
>
> Sorry if I'm missing something obvious...
>
> max
>
> On May 9, 2011, at 1:40 AM, zdenko podobny wrote:
>
> > see [1] or user-words on the same page.
> >
> > [1] http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3#Putting_it_all_together
> >
> > Zdenko
> >
> > On Sun, May 8, 2011 at 5:53 PM, Max Cantor <[email protected]> wrote:
> > Is there a way to set up a custom wordlist without going through the
> > entire retraining process? Our wordlists will change a bit at runtime, so
> > if there is an API variable to set, that would be perfect for us.
> >
> > Thanks,
> > Max
> >
> > Keep up the good work!

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
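Since the wordlists in question change at runtime, the two combine_tessdata invocations from Oleg's reply could be driven from a script. The following is only a rough sketch under stated assumptions: the helper names (unpack_command, combine_command, rebuild) are made up for illustration and are not part of Tesseract, and it assumes combine_tessdata is on the PATH. It uses only the two command forms quoted in the reply.

```python
import subprocess


def unpack_command(traineddata_path, output_prefix):
    """Argument list for: combine_tessdata -u <traineddata> <output prefix>."""
    return ["combine_tessdata", "-u", traineddata_path, output_prefix]


def combine_command(language_data_path_prefix):
    """Argument list for: combine_tessdata <language_data_path_prefix>."""
    return ["combine_tessdata", language_data_path_prefix]


def rebuild(traineddata_path, work_prefix, run=subprocess.check_call):
    """Unpack the components, then recombine them from the same prefix.

    `run` is injectable so the sequence can be inspected without
    Tesseract installed; by default each command is executed and a
    non-zero exit status raises CalledProcessError.
    """
    run(unpack_command(traineddata_path, work_prefix))
    # Hypothetical hook: replace or edit individual component files
    # (for example the word list) at this point, before recombining.
    run(combine_command(work_prefix))


# Example call (not executed here). The prefix is passed to the tool
# verbatim; the trailing dot is an assumption that follows the
# "tessdata/eng." form shown for the combine step:
#   rebuild("tessdata/ell.traineddata", "/home/max/temp/ell.")
```

Passing argument lists instead of a shell string means no shell is involved, so a variable such as $USER would have to be expanded by the caller (for instance with os.path.expandvars) before it reaches the helper.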

