OK, I feel a bit less bad now. combine_tessdata segfaults on both Ubuntu and OS X:
182:tess max$ combine_tessdata -u eng.traineddata eng
Extracting tessdata components from eng.traineddata
tesseract::TessdataManager::TessdataTypeFromFileName( filename, &type, &text_file):Error:Assert failed:in file tessdatamanager.cpp, line 241
Segmentation fault

This is Tesseract 3.00. It seems to have some problem with the traineddata suffix.

Thanks,
max

On May 9, 2011, at 3:30 PM, zdenko podobny wrote:

> no problem :-) I think you will like option "-o" too.
>
> Zdenko
>
> On Mon, May 9, 2011 at 8:27 AM, Max Cantor <[email protected]> wrote:
> I feel really dumb now. Sorry for the bother.
>
> Thanks, max
>
> On May 9, 2011, at 14:01, zdenko podobny <[email protected]> wrote:
>
>> Please try to read (to look is not enough ;-) ) [1]:
>>
>> // Specify option -u to unpack all the components to the specified path:
>> //
>> // combine_tessdata -u tessdata/eng.traineddata /home/$USER/temp/eng.
>> //
>> // This will create /home/$USER/temp/eng.* files with individual tessdata
>> // components from tessdata/eng.traineddata.
>> //
>>
>> [1] http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3#Putting_it_all_together
>>
>> On Mon, May 9, 2011 at 2:01 AM, Max Cantor <[email protected]> wrote:
>> I was looking at that, but can't find the other component files in the
>> source tree. Is there somewhere to get the component files for the
>> eng.traineddata?
>>
>> Sorry if I'm missing something obvious...
>>
>> max
>>
>> On May 9, 2011, at 1:40 AM, zdenko podobny wrote:
>>
>> > see [1] or user-words on the same page.
>> >
>> > [1] http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3#Putting_it_all_together
>> >
>> > Zdenko
>> >
>> > On Sun, May 8, 2011 at 5:53 PM, Max Cantor <[email protected]> wrote:
>> > Is there a way to set up a custom wordlist without going through the
>> > entire retraining process? Our wordlists will change a bit at runtime, so
>> > if there is an API variable to set, that would be perfect for us.
>> >
>> > Thanks,
>> > Max
>> >
>> > Keep up the good work!

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
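
For comparison with the wiki example quoted above: the command that crashed passes "eng" as the output prefix, while the wiki example passes a prefix ending in a dot ("/home/$USER/temp/eng."). Whether the missing dot is what trips the assertion is an assumption; nothing in this thread confirms it. The following is a minimal Python sketch of a helper that only builds the argument list in the wiki's form. The function name build_unpack_command is hypothetical and the sketch does not run combine_tessdata itself.

```python
from typing import List


def build_unpack_command(traineddata: str, out_prefix: str) -> List[str]:
    """Build the argv for `combine_tessdata -u` in the form the wiki shows.

    Hypothetical helper: it only assembles arguments, it runs nothing.
    """
    # Assumption (not confirmed in the thread): the output prefix should end
    # with "." as in the wiki example `/home/$USER/temp/eng.`.
    if not out_prefix.endswith("."):
        out_prefix += "."
    return ["combine_tessdata", "-u", traineddata, out_prefix]


if __name__ == "__main__":
    # Show the command line that would be run for the case from this thread.
    print(" ".join(build_unpack_command("eng.traineddata", "eng")))
```

The resulting list could be passed to subprocess.run on a machine where combine_tessdata is installed; that step is left out here because the behaviour of the 3.00 binary is exactly what is in question.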

