2010/8/1 Zdenko Podobný:
>
> On 28.07.2010 17:02, Jimmy O'Regan wrote:
>>
>>>>> I grepped the code and it seems to be looking for something called
>>>>> LANG.user-words, but that didn't seem to do anything -- I got the same
>>>>> garbled text when I ran Tesseract 3 the second time.
>>>>>
>>> Turns out T3 doesn't even access $LANG.user-words. I suspect it's looking
>>> for it in the traineddata file...
>>>
>> Hmm... probably... which is quite a stupid thing to do, really, but I
>> presume nobody in Google actually uses this, so it's probably quite
>> neglected.
>>
>> I'm toying with the idea of adding support for an actual *user* list -
>> i.e., that tesseract would check $HOME/.tesseract/lang.user-words -
>> because assuming a single user system that the user has full control over is
>> still a braindamaged assumption.
>>
> just idea: maybe this should be handled by environment variable. If I
> set up:
> export TESSDATA_PREFIX=~/.
> tesseract will try to get ALL files from "$HOME/.tessdata"
>
> Problem is that if tesseract did not find all needed files (e.g.
> eng.traineddata) in $TESSDATA_PREFIX it stops... (e.g. it will not look at
> "standard" installation directories like /usr/share/tessdata or
> /usr/local/share/tessdata).

Well, I was thinking of adding support for reading the user-words files
from a .tesseract directory, because they're not exactly 'user' files if
they have to be installed in a system directory; I guess checking in a dot
directory in $HOME would be OK. (I was also thinking that being able to
have a .tesseract/config for user-set variables might be A Good Thing.)
It's a pretty low-priority set of ideas, though.

> I tried to use "export TESSDATA_PREFIX=~/.:/usr/local/share/tessdata" but it
> did not worked (tesseract tried to open file
> "/home/zdeno/.:/usr/local/share/tessdatatessdata/eng.traineddata" that is not
> correct)

No; Tesseract uses only a single directory for language data, so
TESSDATA_PREFIX is not treated as a colon-separated search path.

> Zd.

--
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.

--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en.
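
For illustration, here is a rough Python sketch (not Tesseract code) of the
two behaviours discussed in this thread. traineddata_path mimics the path
construction implied by the error message quoted above, assuming the prefix
is simply concatenated with "tessdata/"; that is inferred from the observed
paths, not checked against the Tesseract source. find_first is a
hypothetical helper showing the kind of fallback lookup being proposed for
user-words files. Neither name exists in Tesseract.

```python
import os
from typing import List, Optional


def traineddata_path(prefix: str, lang: str) -> str:
    """Build the language-data path the way the thread's error suggests.

    Assumption: the prefix is glued to "tessdata/" by plain string
    concatenation, with no separator inserted and no splitting on ':'.
    """
    return prefix + "tessdata/" + lang + ".traineddata"


def find_first(filename: str, directories: List[str]) -> Optional[str]:
    """Hypothetical fallback search: return the first existing match.

    Directories are tried in order, so a per-user directory listed first
    takes precedence over system-wide ones.
    """
    for directory in directories:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None  # nothing found in any of the directories
```

With something like this, a call such as
find_first("eng.user-words", [os.path.expanduser("~/.tesseract"),
"/usr/local/share/tessdata"]) would prefer a per-user word list and fall
back to the system copy, which is the lookup order suggested above.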

